From kvn at openjdk.org Sun Jun 1 00:11:55 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 00:11:55 GMT Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully In-Reply-To: References: Message-ID: <_dcowgIjM5R9m4Ye0BNclWFRkNW_GisoCsrSAW4b0rI=.fced02a6-16ac-4a56-bd45-3e3b6e764bb5@github.com> On Sat, 31 May 2025 19:16:17 GMT, Ashutosh Mehra wrote: >> By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. >> >> Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. >> >> I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. >> >> The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` >> >> I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. >> >> I did small code cleanup/renaming. >> >> Tested: tier1-10 > > src/hotspot/share/code/aotCodeCache.cpp line 434: > >> 432: >> 433: if (((_flags & enableContendedPadding) != 0) != EnableContended) { >> 434: log_debug(aot, codecache, init)("AOT Code Cache disabled: it was created with EnableContended = %s", EnableContended ? "false" : "true"); > > This check says code cache is disabled, but we still return true. Same with other checks following this. Is that intentional? 
The rest of checks are for nmethods, UseCodeCaching. May be I should remove it to avoid confusion. > src/hotspot/share/code/aotCodeCache.cpp line 1011: > >> 1009: } >> 1010: case relocInfo::runtime_call_w_cp_type: >> 1011: log_debug(aot, codecache, reloc)("runtime_call_w_cp_type relocation is not unimplemented"); > > typo: "relocation is not unimplemented" -> "relocation is unimplemented" fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25525#discussion_r2118436203 PR Review Comment: https://git.openjdk.org/jdk/pull/25525#discussion_r2118436486 From kvn at openjdk.org Sun Jun 1 00:22:58 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 00:22:58 GMT Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully In-Reply-To: References: Message-ID: On Sat, 31 May 2025 19:50:04 GMT, Ashutosh Mehra wrote: >> By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. >> >> Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. >> >> I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. >> >> The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` >> >> I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. >> >> I did small code cleanup/renaming. 
>> 
>> Tested: tier1-10
> 
> src/hotspot/share/code/aotCodeCache.cpp line 985:
> 
>> 983: // ------------ process code and data --------------
>> 984: 
>> 985: #define BAD_ADDRESS_ID -2
> 
> Can you please add a comment to indicate why -1 is not used.
> From the comment in `id_for_address`, I guess it is because -1 is a valid id for representing jump to itself in static call stub. Is that correct?
> 
>     int id = -1;
>     if (addr == (address)-1) { // Static call stub has jump to itself
>       return id;
>     }

Yes, it is correct. I will add the comment:

    // Can't use -1. It is a valid value for the jump-to-itself destination
    // used by static call stub: see NativeJump::jump_destination().

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25525#discussion_r2118454632

From kvn at openjdk.org Sun Jun 1 00:28:15 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Sun, 1 Jun 2025 00:28:15 GMT
Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully
In-Reply-To: 
References: 
Message-ID: 

On Thu, 29 May 2025 18:45:11 GMT, Vladimir Kozlov wrote:

> By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run.
> 
> Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave.
> 
> I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive.
> 
> The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address.
These changes are in `AOTCodeAddressTable::id_for_address()` > > I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. > > I did small code cleanup/renaming. > > Tested: tier1-10 Thank you, @ashu-mehra. I addressed your comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25525#issuecomment-2926099925 From kvn at openjdk.org Sun Jun 1 00:28:15 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 00:28:15 GMT Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully [v2] In-Reply-To: References: Message-ID: > By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. > > Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. > > I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. > > The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` > > I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. > > I did small code cleanup/renaming. 
> > Tested: tier1-10 Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: address comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25525/files - new: https://git.openjdk.org/jdk/pull/25525/files/3399e5f9..497c141d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25525&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25525&range=00-01 Stats: 22 lines in 1 file changed: 2 ins; 18 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25525.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25525/head:pull/25525 PR: https://git.openjdk.org/jdk/pull/25525 From kvn at openjdk.org Sun Jun 1 00:29:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 00:29:50 GMT Subject: RFR: 8358230: Incorrect location for the assert for blob != nullptr in CodeBlob::create In-Reply-To: References: Message-ID: On Sat, 31 May 2025 20:45:56 GMT, Ashutosh Mehra wrote: > A trivial fix to moves the assert for `blob != nullptr` before any usage of the the `blob` Yes, it is trivial. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25566#pullrequestreview-2884824886 From asmehra at openjdk.org Sun Jun 1 01:05:09 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Sun, 1 Jun 2025 01:05:09 GMT Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully [v2] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 00:28:15 GMT, Vladimir Kozlov wrote: >> By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. >> >> Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. >> >> I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. 
For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. >> >> The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` >> >> I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. >> >> I did small code cleanup/renaming. >> >> Tested: tier1-10 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > address comments Marked as reviewed by asmehra (Committer). Thanks for addressing the comments. Looks good. ------------- PR Review: https://git.openjdk.org/jdk/pull/25525#pullrequestreview-2884902972 PR Comment: https://git.openjdk.org/jdk/pull/25525#issuecomment-2926206541 From asmehra at openjdk.org Sun Jun 1 01:08:01 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Sun, 1 Jun 2025 01:08:01 GMT Subject: Integrated: 8358230: Incorrect location for the assert for blob != nullptr in CodeBlob::create In-Reply-To: References: Message-ID: On Sat, 31 May 2025 20:45:56 GMT, Ashutosh Mehra wrote: > A trivial fix to moves the assert for `blob != nullptr` before any usage of the the `blob` This pull request has now been integrated. 
Changeset: 59dc8499 Author: Ashutosh Mehra URL: https://git.openjdk.org/jdk/commit/59dc849909c1edc892c94a27b0340fcf53db3a98 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod 8358230: Incorrect location for the assert for blob != nullptr in CodeBlob::create Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/25566 From iveresov at openjdk.org Sun Jun 1 03:03:51 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 1 Jun 2025 03:03:51 GMT Subject: RFR: 8357175: Failure to generate or load AOT code should be handled gracefully [v2] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 00:28:15 GMT, Vladimir Kozlov wrote: >> By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. >> >> Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. >> >> I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. >> >> The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` >> >> I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. >> >> I did small code cleanup/renaming. >> >> Tested: tier1-10 > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > address comments Marked as reviewed by iveresov (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25525#pullrequestreview-2884989001 From kvn at openjdk.org Sun Jun 1 03:59:55 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 03:59:55 GMT Subject: Integrated: 8357175: Failure to generate or load AOT code should be handled gracefully In-Reply-To: References: Message-ID: On Thu, 29 May 2025 18:45:11 GMT, Vladimir Kozlov wrote: > By default a failed AOT code should be discarded with UL message about it by request (`-Xlog:aot+codecache+*=debug`) and VM and AOT code processing should continue run. > > Unless we hit some catastrophic failure: OOM for example. This is similar how JIT compilers behave. > > I reordered VM configuration settings checking (`Config::verify()`) so that we switch off AOT code caching type which depends on these VM settings. For example, AOT adapters do not operate on oops - they are not affected by compressed oops settings/encoding. I removed `_objectAlignment` check because CDS already does this check when open archive. > > The AOT relocation processing for a blob will skip this blob when corresponding address is not found instead of bailing out VM in product mode. In debug VM it will issue assert so we know about missing address. These changes are in `AOTCodeAddressTable::id_for_address()` > > I kept `fatal()` in `AOTCodeAddressTable::for_address_for_id()` for incorrect ID we read from archive. The archive could be corrupted if ID is wrong. > > I did small code cleanup/renaming. > > Tested: tier1-10 This pull request has now been integrated. 

Changeset: e3eb089d
Author:    Vladimir Kozlov
URL:       https://git.openjdk.org/jdk/commit/e3eb089d47d62ae6feeba3dc6b3752a025e27bed
Stats:     130 lines in 2 files changed: 41 ins; 46 del; 43 mod

8357175: Failure to generate or load AOT code should be handled gracefully

Reviewed-by: iveresov, asmehra

-------------

PR: https://git.openjdk.org/jdk/pull/25525

From epeter at openjdk.org Sun Jun 1 05:37:08 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sun, 1 Jun 2025 05:37:08 GMT
Subject: RFR: 8344942: Template-Based Testing Framework [v61]
In-Reply-To: 
References: 
Message-ID: <2nsC6sfjkW6j7aMI9TwUgOM4qcyqQj03xGQ8WKfd2VU=.46a960b2-941e-40dc-917f-331aea0e6a70@github.com>

On Fri, 30 May 2025 07:54:53 GMT, Christian Hagedorn wrote:

>> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3
>>  - move verification
> 
> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 151:
> 
>> 149: "System.out.println(", arg, ");\n", // capture arg via lambda argument
>> 150: "System.out.println(#arg);\n", // capture arg via hashtag replacement
>> 151: "System.out.println(#{arg});\n", // capture arg via hashtag replacement with brackets
> 
> It's not clear here why one should use brackets. If there is an argument for those further down, then you can cross reference. Otherwise, it might need some explanation here.

I rewrote the whole section a little:

    // It would have been optimal to use Java String Templates to format
    // argument values into Strings. However, since these are not (yet)
    // available, the Template Framework provides two alternative ways of
    // formatting Strings:
    // 1) By appending to the comma-separated list of Tokens passed to body().
    //    Appending as a Token works whenever one has a reference to the Object
    //    in Java code. But often, this is rather cumbersome and looks awkward,
    //    given all the additional quotes and commas required. Hence, it
    //    is encouraged to only use this method when necessary.
    // 2) By hashtag replacements inside a single string. One can either
    //    use "#arg" directly, or use brackets "#{arg}". When possible, one
    //    should prefer avoiding the brackets, as they create additional
    //    noise. However, there are cases where they are useful, for
    //    example "#TYPE_CON" would be parsed as a hashtag replacement
    //    for the hashtag name "TYPE_CON", whereas "#{TYPE}_CON" is
    //    parsed as hashtag name "TYPE", followed by literal string "_CON".
    //    See also: generateWithHashtagAndDollarReplacements2
    //    There are two ways to define the value of a hashtag replacement:
    //    a) Capturing Template arguments as Strings.
    //    b) Using a "let" definition (see examples further down).
    // Which one should be preferred is a code style question. Generally, we
    // prefer the use of hashtag replacements because that allows easy use of
    // multiline strings (i.e. text blocks).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2118746980

From epeter at openjdk.org Sun Jun 1 05:45:08 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sun, 1 Jun 2025 05:45:08 GMT
Subject: RFR: 8344942: Template-Based Testing Framework [v61]
In-Reply-To: 
References: 
Message-ID: 

On Fri, 30 May 2025 08:08:20 GMT, Christian Hagedorn wrote:

>> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3
>>  - move verification
> 
> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 258:
> 
>> 256: 
>> 257: // Render templateClass to String.
>> 258: return templateClass.render(); > > When printing this, it starts at `var_2` and not `var_1`. Why is that? The `nextTemplateFrameId` starts at zero, and is incremented for every Template instantiation. The `templateClass` has `nextTemplateFrameId=1`. If there was any use of `$`, it would append `_1`. For `template1.asToken(1)` we have `nextTemplateFrameId=2` -> produces the `var_2`. Generally, the API does not make any guarantees about what id we give, it is just unique. Is that ok for you? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2118752174 From epeter at openjdk.org Sun Jun 1 05:45:08 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 05:45:08 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 05:41:29 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 258: >> >>> 256: >>> 257: // Render templateClass to String. >>> 258: return templateClass.render(); >> >> When printing this, it starts at `var_2` and not `var_1`. Why is that? > > The `nextTemplateFrameId` starts at zero, and is incremented for every Template instantiation. > The `templateClass` has `nextTemplateFrameId=1`. If there was any use of `$`, it would append `_1`. > For `template1.asToken(1)` we have `nextTemplateFrameId=2` -> produces the `var_2`. > Generally, the API does not make any guarantees about what id we give, it is just unique. > > Is that ok for you? Ah, I guess the comment above talks about `var_1, var_2 ...` hmm. I suppose I can add another comment for that in the test code. 
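
To make the numbering above concrete, here is a minimal standalone sketch of a per-use ID counter. It is not the Template Framework's implementation: the class `FrameIdSketch` and the method `use` are hypothetical names chosen for illustration, and the regex-based `$name` rewrite is an assumption. Only the counter name `nextTemplateFrameId` and the observed `var_2` come from the explanation above.

```java
// Hypothetical sketch, not the Template Framework: every template use takes
// the next ID, and each "$name" in its body is rewritten to "name_<id>".
class FrameIdSketch {
    private int nextTemplateFrameId = 0; // starts at zero, as described above

    // Registers one template use and renders its body with unique names.
    String use(String body) {
        int id = ++nextTemplateFrameId;
        // "\\$(\\w+)" matches "$var"; the replacement keeps the name and appends "_<id>".
        return body.replaceAll("\\$(\\w+)", "$1_" + id);
    }
}

class FrameIdSketchDemo {
    public static void main(String[] args) {
        FrameIdSketch renderer = new FrameIdSketch();
        // First use (ID=1) contains no "$", so no "_1" is visible in the output.
        System.out.println(renderer.use("public class InnerTest3 {"));
        // Second and third use: the same body yields distinct variable names.
        System.out.println(renderer.use("int $var = 1;")); // int var_2 = 1;
        System.out.println(renderer.use("int $var = 7;")); // int var_3 = 7;
    }
}
```

As in the explanation, the first visible suffix is `_2` simply because the enclosing use consumed ID 1 without ever using a dollar replacement.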
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2118752394 From epeter at openjdk.org Sun Jun 1 06:02:05 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 06:02:05 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v67] In-Reply-To: References: Message-ID: <3o8bVN9T_7p1h4miFfyUXDnyESEh4YAMzJhPcmE6XmI=.be1003c1-6867-4971-be12-1aa9389cf25e@github.com> > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. 
> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
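
The `#name` replacement listed above, and the difference between `#TYPE_CON` and `#{TYPE}_CON`, can be sketched in plain Java. This is only an illustration under assumptions: `HashtagSketch` and its `replace` method are hypothetical, the regex is a guess at reasonable name syntax (letters, digits, underscores), and the real framework may parse hashtags differently.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of hashtag replacement; not the Template Framework's parser.
class HashtagSketch {
    // Group 1 captures the name in "#{name}", group 2 the name in "#name".
    private static final Pattern HASHTAG = Pattern.compile("#\\{(\\w+)\\}|#(\\w+)");

    static String replace(String text, Map<String, String> values) {
        Matcher m = HASHTAG.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("no value for hashtag #" + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}

class HashtagSketchDemo {
    public static void main(String[] args) {
        Map<String, String> values = Map.of("TYPE", "int", "TYPE_CON", "42");
        // Without brackets the whole "TYPE_CON" is taken as the hashtag name.
        System.out.println(HashtagSketch.replace("#TYPE_CON", values));   // 42
        // With brackets only "TYPE" is replaced and "_CON" stays literal.
        System.out.println(HashtagSketch.replace("#{TYPE}_CON", values)); // int_CON
    }
}
```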

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  more improvements

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24217/files
  - new: https://git.openjdk.org/jdk/pull/24217/files/ea2bb65d..68b45b1c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=66
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=65-66

  Stats: 25 lines in 1 file changed: 21 ins; 1 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/24217.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217

PR: https://git.openjdk.org/jdk/pull/24217

From epeter at openjdk.org Sun Jun 1 06:02:05 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sun, 1 Jun 2025 06:02:05 GMT
Subject: RFR: 8344942: Template-Based Testing Framework [v61]
In-Reply-To: 
References: 
Message-ID: 

On Sun, 1 Jun 2025 05:42:37 GMT, Emanuel Peter wrote:

>> The `nextTemplateFrameId` starts at zero, and is incremented for every Template instantiation.
>> The `templateClass` has `nextTemplateFrameId=1`. If there was any use of `$`, it would append `_1`.
>> For `template1.asToken(1)` we have `nextTemplateFrameId=2` -> produces the `var_2`.
>> Generally, the API does not make any guarantees about what id we give, it is just unique.
>> 
>> Is that ok for you?
> 
> Ah, I guess the comment above talks about `var_1, var_2 ...` hmm. I suppose I can add another comment for that in the test code.

Wrote this:

    var templateClass = Template.make(() -> body(
        // The Template Framework API only guarantees that every Template use
        // has a unique ID. When using the Templates, all we need is that
        // variables from different Template uses do not conflict. But it can
        // be helpful to understand how the IDs are produced. The implementation
        // simply gives the first Template use the ID=1, and increments from there.
        //
        // In this example, the templateClass is the first Template use, and
        // has ID=1. We never use a dollar replacement here, so the code will
        // not show any "_1".
        """
        package p.xyz;

        public class InnerTest3 {
            public static void main() {
        """,
        // Second Template use: ID=2 -> var_2
        template1.asToken(1),
        // Third Template use: ID=3 -> var_3
        template1.asToken(7),
        // Fourth Template use with template2, no use of dollar, so
        // no "_4" shows up in the generated code. Internally, it
        // calls template1, which is the fifth Template use, with
        // ID = 5 -> var_5
        template2.asToken(2),
        // Sixth and Seventh Template use -> var_7
        template2.asToken(5),
        // Eighth Template use with template4 -> var_8.
        // Ninth Template use with internal call to template3,
        // The local "$var" turns to "var_9", but the Template
        // argument captured value = "var_8" from the outer
        // template use of $("var").
        template4.asToken(),
        """
            }
        }
        """
    ));

    // Render templateClass to String.
    return templateClass.render();

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2118771773

From epeter at openjdk.org Sun Jun 1 06:02:06 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sun, 1 Jun 2025 06:02:06 GMT
Subject: RFR: 8344942: Template-Based Testing Framework [v61]
In-Reply-To: 
References: 
Message-ID: 

On Fri, 30 May 2025 08:22:00 GMT, Christian Hagedorn wrote:

>> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3
>>  - move verification
> 
> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 306:
> 
>> 304: var myHook = new Hook("MyHook");
>> 305: 
>> 306: var template1 = Template.make("name", "value", (String name, Integer value) -> body(
> 
> One could generally think about using `_` for unused lambda parameters which I think is the common convention. But then I guess we would need to update the documentation about saying "name" and "String name" should be the same and make an exception for unused ones.

I don't know. I think it is better to keep the names duplicated. This gives the reader an easier visual aid to check which name has which type. What do you think?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2118774938

From epeter at openjdk.org Sun Jun 1 16:03:08 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Sun, 1 Jun 2025 16:03:08 GMT
Subject: RFR: 8344942: Template-Based Testing Framework [v61]
In-Reply-To: 
References: 
Message-ID: 

On Fri, 30 May 2025 08:39:14 GMT, Christian Hagedorn wrote:

>> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 358:
>> 
>>> 356: 
>>> 357: // We saw the use of custom hooks above, but now we look at the use of CLASS_HOOK and METHOD_HOOK
>>> 358: // from the Template Library.
>> 
>> Can you expand here on why it's better to use them instead of creating your own? Is it just readability/convenience?
> 
> Another question which is not evidently clear by following the examples: Can and should (not) you use the same hook inside the hook itself, i.e.:
> 
> Hooks.CLASS_HOOK.anchor(
>     Hooks.CLASS_HOOK.anchor(
>         // ...
> 
> This is probably not done on purpose but such a situation could arise when nesting more templates and suddenly one anchors the same hook again?

I extended the explanations:

    // We saw the use of custom hooks above, but now we look at the use of CLASS_HOOK and METHOD_HOOK.
    // By convention, we use the CLASS_HOOK for class scopes, and METHOD_HOOK for method scopes.
    // Whenever we open a class scope, we should anchor a CLASS_HOOK for that scope, and whenever we
    // open a method, we should anchor a METHOD_HOOK. Conversely, this allows us to check if we are
    // inside a class or method scope by querying "isAnchored". This convention helps us when building
    // a large library of Templates. But if you are writing your own self-contained set of Templates,
    // you do not have to follow this convention.
    //
    // Hooks are "re-entrant", that is we can anchor the same hook inside a scope that we already
    // anchored it previously. The "Hook.insert" always goes to the innermost anchoring of that
    // hook. There are cases where "re-entrant" Hooks are helpful such as nested classes, where
    // there is a class scope inside another class scope. Similarly, we can nest lambda bodies
    // inside method bodies, so also METHOD_HOOK can be used in such a "re-entrant" way.

We could consider having both "re-entrant" and "non-re-entrant" Hooks. But I'm not yet convinced it is a very useful feature. Sure, there could be some confusion with nested hooks. But I think that confusion is inherent to code generation, because we can also nest class and method/lambda scopes.

What do you think?
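
The "re-entrant" behaviour described above (an insert always goes to the innermost anchoring) can be modelled with a stack. The following is a minimal sketch under assumptions: `MiniHook` is a hypothetical stand-in and not the Template Framework's `Hook` class, and placing inserted code in front of the scope's own code is a simplification chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical miniature of a re-entrant hook; not the framework's Hook class.
class MiniHook {
    // One buffer per active anchoring; the head of the deque is the innermost one.
    private final Deque<StringBuilder> anchors = new ArrayDeque<>();

    boolean isAnchored() {
        return !anchors.isEmpty();
    }

    // Opens a scope, renders its body, and emits the inserted code before the body.
    String anchor(Supplier<String> body) {
        anchors.push(new StringBuilder());
        String code = body.get();
        String inserted = anchors.pop().toString();
        return inserted + code;
    }

    // Adds code to the innermost active anchoring of this hook.
    void insert(String code) {
        if (anchors.isEmpty()) {
            throw new IllegalStateException("hook is not anchored");
        }
        anchors.peek().append(code);
    }
}

class MiniHookDemo {
    static String render() {
        MiniHook classHook = new MiniHook();
        return classHook.anchor(() -> {
            classHook.insert("int outerField;\n"); // lands in the outer class scope
            String inner = classHook.anchor(() -> {
                classHook.insert("int innerField;\n"); // lands in the nested scope
                return "void innerMethod() {}\n";
            });
            return "class Inner {\n" + inner + "}\n";
        });
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

In this sketch `innerField` ends up inside `class Inner { ... }` while `outerField` stays in the enclosing scope, which is the nested-class situation mentioned above.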
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2119274873 From epeter at openjdk.org Sun Jun 1 16:03:10 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 16:03:10 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:57:44 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3 >> - move verification > > test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 454: > >> 452: // For every recursion depth, some fuel is automatically subtracted >> 453: // so that the fuel slowly depletes with the depth. >> 454: // We keep the recursion going until the fuel is depleted. > > You can also note here that if we forget to check the `fuel()`, the renderer causes a stack overflow because the recursion never ends. Good idea! Added. > test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 487: > >> 485: // in this scope, and in any nested scope, including nested Templates. This allows us to >> 486: // add some fields and registers in one Template, and later on, in another Template, we >> 487: // can access these fields and registers again with "dataNames()". > > What do you mean by "registers"? Hmm good question. I think I meant "variables". Changed it! > test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 596: > >> 594: @Override >> 595: public boolean isSubtypeOf(DataName.Type other) { >> 596: return other instanceof MyPrimitive(String n) && n == name(); > > Is `==` intended? Should it be `equals()`? Nice catch, fixed. Well it did not matter here, but it is good practice I guess. 
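
On the `==` versus `equals()` point: a small standalone sketch shows why `equals()` is the safer choice for Strings. The record below is a hypothetical stand-in for the tutorial's `MyPrimitive` (it does not implement `DataName.Type`), and it uses a plain type pattern instead of the record pattern from the reviewed line so that it compiles on older JDKs.

```java
// Hypothetical stand-in for the tutorial's MyPrimitive; not the framework type.
record MyPrimitive(String name) {
    // Reference comparison: true only if both names are the very same String object.
    boolean sameNameByIdentity(Object other) {
        return other instanceof MyPrimitive p && p.name() == name();
    }

    // Content comparison: true whenever both names contain the same characters.
    boolean sameNameByEquals(Object other) {
        return other instanceof MyPrimitive p && p.name().equals(name());
    }
}

class StringComparisonDemo {
    public static void main(String[] args) {
        MyPrimitive fromLiteral = new MyPrimitive("int");
        // new String(...) creates a distinct object with the same content.
        MyPrimitive fromCopy = new MyPrimitive(new String("int"));

        System.out.println(fromLiteral.sameNameByIdentity(fromCopy)); // false
        System.out.println(fromLiteral.sameNameByEquals(fromCopy));   // true
    }
}
```

When all names come from string literals, both variants agree because literals are interned, which is presumably why the original `==` "did not matter here".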
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2119278069 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2119276977 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2119278275 From epeter at openjdk.org Sun Jun 1 16:06:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 16:06:53 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v68] In-Reply-To: References: Message-ID: <2xkmqbUmlAvSV6SUym7pUeA_gwTDErFOMPuzTZ86TAI=.4a2da8f6-db82-46a0-b5b7-3f8fa4b30385@github.com> > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. 
> > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
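The idea behind `fuel` in the feature list above can be sketched with a toy recursion. This is an assumption-laden illustration, not the framework's API: `FuelDemo`, `FUEL_COST` and `expression` are hypothetical names, and the fixed per-level cost is chosen only to make the example deterministic. The point is that each level receives less fuel, and the explicit check is what ends the recursion.

```java
// Toy model of fuel-limited recursion; not the Template Framework API.
final class FuelDemo {
    // Hypothetical cost subtracted per recursion level.
    static final int FUEL_COST = 10;

    // Builds a nested expression until the fuel is used up.
    static String expression(int fuel) {
        if (fuel <= 0) {
            return "1"; // base case: out of fuel, emit a leaf
        }
        // Each level gets less fuel, so the recursion must terminate.
        return "(" + expression(fuel - FUEL_COST) + " + 1)";
    }

    public static void main(String[] args) {
        System.out.println(expression(30));
    }
}
```

Dropping the `fuel <= 0` check would make this recursion unbounded.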
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: more fixes from Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/68b45b1c..ab20c217 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=67 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=66-67 Stats: 19 lines in 1 file changed: 14 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Sun Jun 1 16:10:07 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 16:10:07 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 10:39:57 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3 >> - move verification > > Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! > > I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) @chhagedorn Thanks a lot for all the great suggestions! I now addressed everything except for: Issue with `$$var` and `$1var`. Similarly, we would have issues with `##name` and `#1name`. https://github.com/openjdk/jdk/pull/24217#discussion_r2115232385 (I'll have to do some more experiments with parsing.) 
These are issues where we could continue the conversation, unless you are satisfied with my answers: https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737 https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391 ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2927467228 From epeter at openjdk.org Sun Jun 1 16:14:06 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 16:14:06 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: <4MgAjHfzurYkWqrZ6ah81SwKah7IHR7okOxnq5gapb8=.b7b7bfc8-6dd7-4186-9839-b446c86f21a3@github.com> Message-ID: On Sat, 31 May 2025 11:48:39 GMT, Emanuel Peter wrote: >> @chhagedorn >> The current parsing/regex-ing is relatively simple. We only parse the "valid" cases, so the description above is still relevant. >> Your example `$1var` is not a valid pattern, so the regex does not match, and there is no replacement. Sadly, in Java `$1var` is a valid variable name, so there is some chance that the user makes a mistake and gets tripped up by this. >> >> If the user does a call to `let` or `$` with such a bad string `1var`, then they get a `RendererException`. >> >> The question is this: >> Should I really try to parse these "bad" patterns, just to validate them as well? All solutions I can think of are really complicated. Is it worth it? Or is it just a mistake by the user, and so the matching does not happen, and that is the user's problem? > > FYI: in `$$var`, the first `$` is not a valid pattern, so it is not replaced. But `$var` is, and so that part gets replaced. The result is `$var_1`, which sadly happens to also be valid Java code. I think I just need to rewrite the way I parse and replace the strings. Doing a simple regex with `replaceAll` does not work if we also want to allow "bad" patterns such as `$$var` to be parsed, because of ambiguity. My new idea: split the string by `#` and `$`.
The first string is just a regular string, because it has no `#` or `$` before it. But all others should start with either a `name` or `{name}` pattern. I should also do the `#` and `$` replacement in a single pass, so that we cannot have one replacement influence the other, i.e. that we have no "replacement injection" issues that may be confusing if anybody ever trips over it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2119291366 From epeter at openjdk.org Sun Jun 1 16:57:49 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Sun, 1 Jun 2025 16:57:49 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v69] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. 
> > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
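The single-pass `$`/`#` replacement idea from the review discussion earlier in this thread can be sketched as follows. This is not the code from the PR: `SinglePassReplacer`, its signature, the `name_id` renaming scheme and the accepted name syntax (a letter or underscore followed by letters, digits or underscores, optionally wrapped in `{}`) are all assumptions made for illustration. What it shows is that a single scan can reject patterns such as `$$var` and `$1var`, and that replacement values are never rescanned.

```java
import java.util.Map;

// Sketch of a single-pass "$name" / "#name" replacer; all names are hypothetical.
final class SinglePassReplacer {
    static String replace(String s, int id, Map<String, String> hashValues) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '$' && c != '#') {
                out.append(c); // ordinary character, copied as-is
                i++;
                continue;
            }
            // After '$' or '#', accept either name or {name}.
            int start = i + 1;
            boolean braces = start < s.length() && s.charAt(start) == '{';
            if (braces) {
                start++;
            }
            int end = start;
            while (end < s.length() && isNameChar(s.charAt(end), end == start)) {
                end++;
            }
            if (end == start) {
                throw new IllegalArgumentException("bad pattern at index " + i);
            }
            String name = s.substring(start, end);
            if (braces) {
                if (end >= s.length() || s.charAt(end) != '}') {
                    throw new IllegalArgumentException("missing '}' at index " + end);
                }
                end++;
            }
            if (c == '$') {
                out.append(name).append('_').append(id); // make the variable name unique
            } else {
                String value = hashValues.get(name);
                if (value == null) {
                    throw new IllegalArgumentException("no value for #" + name);
                }
                out.append(value); // appended verbatim, never rescanned
            }
            i = end;
        }
        return out.toString();
    }

    // Assumed name syntax: [A-Za-z_][A-Za-z0-9_]*
    private static boolean isNameChar(char ch, boolean first) {
        boolean letter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_';
        return letter || (!first && ch >= '0' && ch <= '9');
    }
}
```

Because the output is built in one scan, a `#name` value that itself contains `$x` is copied literally, which avoids the "replacement injection" concern raised in the discussion. Whether the actual implementation throws on bad patterns or handles them differently is not shown in this thread.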
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: wip refactor parsing dollar and hashtag ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/ab20c217..ccc132b5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=68 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=67-68 Stats: 52 lines in 1 file changed: 20 ins; 13 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From jbhateja at openjdk.org Sun Jun 1 17:26:07 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 1 Jun 2025 17:26:07 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: > This is a follow-up PR#22755 to improve Float16 operations inferencing. > > The existing scheme to detect Float16 operations for some operations is based on pattern matching which expects to receive inputs through ConvHF2F IR, this patch extends matching to accept constant floating point inputs within the Float16 value range. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Extending tests and review resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24179/files - new: https://git.openjdk.org/jdk/pull/24179/files/b44d62dc..4a491bef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=03-04 Stats: 181 lines in 4 files changed: 112 ins; 5 del; 64 mod Patch: https://git.openjdk.org/jdk/pull/24179.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24179/head:pull/24179 PR: https://git.openjdk.org/jdk/pull/24179 From jbhateja at openjdk.org Sun Jun 1 17:26:10 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 1 Jun 2025 17:26:10 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v4] In-Reply-To: <6PFX21b9eT5mQv8Ym7b_RuKNpnuQ5CVqhc8TKxstlYo=.eb7d9f85-5e49-4e8f-b17a-c8e3728e7624@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> <6PFX21b9eT5mQv8Ym7b_RuKNpnuQ5CVqhc8TKxstlYo=.eb7d9f85-5e49-4e8f-b17a-c8e3728e7624@github.com> Message-ID: <4kFfYPljgrRZSDgDmn4XbCB9iwnrETd0eFOxBSV-sVg=.422f1e5d-6182-4ad5-a509-3b1451a71dfc@github.com> On Wed, 28 May 2025 08:56:51 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains six commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352635 >> - Enabling some test points >> - Adding test points and some re-factoring >> - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8352635 >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352635 >> - 8352635: Improve inferencing of Float16 operations with constant inputs > > src/hotspot/share/opto/convertnode.cpp line 290: > >> 288: // If constant lie within Float16 value range, convert it to >> 289: // a half-float constant. >> 290: if (StubRoutines::hf2f(StubRoutines::f2hf(conF)) == conF) { > > How does this behave with `NaN` values? Do you have a test for that below? Extended coverage for NaNs; yes, we have new test points for them. > src/hotspot/share/opto/convertnode.cpp line 298: > >> 296: } else { >> 297: f16bOp = phase->transform(Float16NodeFactory::make(f32bOp->Opcode(), f32bOp->in(0), new_var_inp, new_con_inp)); >> 298: } > > Why is the order important here? A comment could help :) Addressed. > src/hotspot/share/opto/subnode.cpp line 566: > >> 564: // applicable to other floating point types. >> 565: // There are no known undefined, unspecified or implimentation specific >> 566: // behaviors w.r.t to floating point non-pointer subtraction. > > That sounds like we are not quite sure "no known" ... problems. Could there be any, or are we sure there are none? C++ follows IEEE 754 semantics for floating-point subtraction, and there is no specified undefined behavior related to it in the C++ standard.
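The round-trip test quoted from `convertnode.cpp` above has a direct Java-level analogue, sketched here with `Float.floatToFloat16` and `Float.float16ToFloat` (available since JDK 20). `Float16RoundTrip` and `isExactInFloat16` are hypothetical names, and this is not the HotSpot code. It only illustrates that a plain `==` comparison after the round trip is false for values that are not exactly representable in Float16, and also for `NaN`, since `NaN` compares unequal to everything.

```java
// Requires JDK 20 or later for Float.floatToFloat16 / Float.float16ToFloat.
final class Float16RoundTrip {
    // True if f survives a float -> Float16 -> float round trip unchanged.
    static boolean isExactInFloat16(float f) {
        short half = Float.floatToFloat16(f);
        return Float.float16ToFloat(half) == f;
    }

    public static void main(String[] args) {
        System.out.println(isExactInFloat16(32.125f));   // fits in Float16 precision
        System.out.println(isExactInFloat16(0.1f));      // needs rounding in Float16
        System.out.println(isExactInFloat16(Float.NaN)); // NaN never equals itself
    }
}
```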
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2119357586 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2119357694 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2119358354 From jbhateja at openjdk.org Sun Jun 1 17:26:10 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 1 Jun 2025 17:26:10 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v4] In-Reply-To: <-d846uXzYApO-CUq6peUgguY2YLpvG6ioAdVkN1wHG0=.94a09310-9d87-481c-b374-05ae99db0133@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> <6PFX21b9eT5mQv8Ym7b_RuKNpnuQ5CVqhc8TKxstlYo=.eb7d9f85-5e49-4e8f-b17a-c8e3728e7624@github.com> <-d846uXzYApO-CUq6peUgguY2YLpvG6ioAdVkN1wHG0=.94a09310-9d87-481c-b374-05ae99db0133@github.com> Message-ID: On Wed, 28 May 2025 09:09:46 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 320: >> >>> 318: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() / INEXACT_FP16); >>> 319: assertResult(Float.float16ToFloat(res), 32.125f, "testInexactFP16ConstantPatterns"); >>> 320: } >> >> Alignment is messed up by one space indentation. >> >> Can you add a comment why we are expecting none of the `HF` ops here? >> Are we expecting any other ops, maybe `F` ops? >> It could be good to check for that, so that we are sure that we get anything even close to our expectation. > > Same for the tests below :) Fixed the IR checks and indentation. >> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 363: >> >>> 361: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() / EXACT_FP16); >>> 362: assertResult(Float.float16ToFloat(res), 32.125f, "testExactFP16ConstantPatterns"); >>> 363: } >> >> Can we have a test that picks a random `FP16` value, and does result verification on it?
Because currently, you are testing the new pattern only with a few example values. > > And: your pattern matching allows the constant to be lhs or rhs, so you should add corresponding tests. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2119358477 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2119358543 From kvn at openjdk.org Sun Jun 1 21:23:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 1 Jun 2025 21:23:53 GMT Subject: RFR: 8358236: [AOT] Graal crashes when trying to use persisted MDOs In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote: > Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer. > Testing looks clean. Trivial. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25570#pullrequestreview-2886119546 From iveresov at openjdk.org Sun Jun 1 21:23:54 2025 From: iveresov at openjdk.org (Igor Veresov) Date: Sun, 1 Jun 2025 21:23:54 GMT Subject: Integrated: 8358236: [AOT] Graal crashes when trying to use persisted MDOs In-Reply-To: References: Message-ID: <2VQGaTWxeSr29uU3Ih3S5kF9l70w3xwlkHNG_pVFr7U=.3279eb7c-5bf8-4df1-8405-61b1678552d5@github.com> On Sun, 1 Jun 2025 19:01:27 GMT, Igor Veresov wrote: > Forgot to null out MethodData::_failed_speculations before snapshotting. As a result it gets restored with a dangling pointer. > Testing looks clean. This pull request has now been integrated. 
Changeset: 85e36d79 Author: Igor Veresov URL: https://git.openjdk.org/jdk/commit/85e36d79246913abb8b85c2be719670655d619ab Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8358236: [AOT] Graal crashes when trying to use persisted MDOs Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/25570 From epeter at openjdk.org Mon Jun 2 03:09:09 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 03:09:09 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v70] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. 
> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: dollar and hashtag parsing validatiaon ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/ccc132b5..21d3f507 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=69 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=68-69 Stats: 31 lines in 2 files changed: 26 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Mon Jun 2 03:30:24 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 03:30:24 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. 
> > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 91 commits: - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 - validation tests - dollar and hashtag parsing validatiaon - wip refactor parsing dollar and hashtag - more fixes from Christian - more improvements - more suggestions applied - good practice - rename template arguments - more from Christian - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 ------------- Changes: https://git.openjdk.org/jdk/pull/24217/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=70 Stats: 6683 lines in 27 files changed: 6683 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Mon Jun 2 03:30:24 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 03:30:24 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 10:39:57 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3 >> - move verification > > Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! > > I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. 
These are issues we could continue the conversation, unless you are satisfied with my answers: https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737 https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391 This is now ready for another review pass ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2928567671 From amitkumar at openjdk.org Mon Jun 2 03:37:57 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 2 Jun 2025 03:37:57 GMT Subject: RFR: 8353500: [s390x] Intrinsify Unsafe::setMemory [v5] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:32:30 GMT, Andrew Haley wrote: > What are all those `nopr`s for? Sorry that is old code; nops were inserted for the loop alignment; this is the newer stub code: - - - [BEGIN] - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - StubRoutines::unsafe_setmemory [0x000003ffa84b63c0, 0x000003ffa84b644c] (140 bytes) -------------------------------------------------------------------------------- BFD: unknown S/390 disassembler option: s390 .long 0x00000000 0x000003ffa84b63c0: vlvgb %v0,%r4,0 0x000003ffa84b63c6: vrepb %v0,%v0,0 0x000003ffa84b63cc: aghi %r3,-32 0x000003ffa84b63d0: jl 0x000003ffa84b63ec 0x000003ffa84b63d4: vst %v0,0(%r2) 0x000003ffa84b63da: vst %v0,16(%r2) 0x000003ffa84b63e0: aghi %r2,32 0x000003ffa84b63e4: aghi %r3,-32 0x000003ffa84b63e8: jhe 0x000003ffa84b63d4 0x000003ffa84b63ec: tmll %r3,16 0x000003ffa84b63f0: je 0x000003ffa84b63fe 0x000003ffa84b63f4: vst %v0,0(%r2) 0x000003ffa84b63fa: aghi %r2,16 0x000003ffa84b63fe: tmll %r3,8 0x000003ffa84b6402: je 0x000003ffa84b6410 0x000003ffa84b6406: vsteg %v0,0(%r2),0 0x000003ffa84b640c: aghi %r2,8 0x000003ffa84b6410: tmll %r3,7 0x000003ffa84b6414: je 0x000003ffa84b644a 0x000003ffa84b6418: tmll %r3,4 0x000003ffa84b641c: je 0x000003ffa84b642a 0x000003ffa84b6420: vstef %v0,0(%r2),0 0x000003ffa84b6426: aghi %r2,4 0x000003ffa84b642a: tmll %r3,2 0x000003ffa84b642e: je 0x000003ffa84b643c 
0x000003ffa84b6432: vsteh %v0,0(%r2),0 0x000003ffa84b6438: aghi %r2,2 0x000003ffa84b643c: tmll %r3,1 0x000003ffa84b6440: je 0x000003ffa84b644a 0x000003ffa84b6444: vsteb %v0,0(%r2),0 0x000003ffa84b644a: br %r14 -------------------------------------------------------------------------------- - - - [END] - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ------------- PR Comment: https://git.openjdk.org/jdk/pull/24480#issuecomment-2928591294 From epeter at openjdk.org Mon Jun 2 04:54:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 04:54:53 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v8] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 17:43:27 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolutions > > We can further constrain the value range bounds of bit compression and expansion once PR #17508 gets integrated. For now, I have developed the following draft demonstrates bound constraining with KnownBitLattice. > > > // > // Prototype of bit compress/expand value range computation > // using KnownBits infrastructure. > // > > #include > #include > #include > #include > > template > class KnownBitsLattice { > private: > U zeros; > U ones; > > public: > KnownBitsLattice(U lb, U ub); > > U getKnownZeros() { > return zeros; > } > > U getKnownOnes() { > return ones; > } > > long getKnownZerosCount() { > uint64_t count = 0; > asm volatile ("popcntq %1, %0 \n\t" : "=r"(count) : "r"(zeros) : "cc"); > return count; > } > > long getKnownOnesCount() { > uint64_t count = 0; > asm volatile ("popcntq %1, %0 \n\t" : "=r"(count) : "r"(ones) : "cc"); > return count; > } > > bool check_voilation() { > // A given bit cannot be both zero or one. 
> return (zeros & ones) != 0; > } > > bool is_MSB_KnownOneBitsSet() { > return (ones >> 63) == 1; > } > > bool is_MSB_KnownZeroBitsSet() { > return (zeros >> 63) == 1; > } > }; > > template > KnownBitsLattice::KnownBitsLattice(U lb, U ub) { > // To find KnownBitsLattice from a given value range > // we first find the common prefix b/w upper and lower > // bound, we then concertize known zeros and ones bit > // based on common prefix. > // e.g. > // lb = 00110001 > // ub = 00111111 > // common prefix = 0011XXXX > // knownbits.zeros = 11000000 > // knownbits.ones = 00110000 > // > // conversely, for a give knownbits value we can find > // lower and upper value ranges. > // e.g. > // knownbits.zeros = 0x00010001 > // knownbits.ones = 0x10001100 > // range.lo = knownbits.ones, this is because knownbits.ones are > // guaranteed to be one. > // range.hi = ~knownbits.zeros, this is an optimistic upper bound > // which assumes all unset knownbits.zero > // are ones. > // Thus in above example, > // range.lo = 0x8C > // range.hi = 0xEE > > U lzcnt = 0; > U common_prefix = lb ^ ub; > asm volatile ("lzcntq %1, %0 \n\t" : "=r"(lzcnt) : "r"(common_prefix) : "cc"); > U common_prefix_mask = lzcnt == 0 ? 0xFFFFFFFFFFFFFFFFL : ~((1ULL << (64 - lzcnt)) - 1); > zeros = (~lb) & common_prefix_mask; > ones = (lb) & c... @jatin-bhateja Nice! Yes I'm looking forward to reviewing all the KnownBits extensions! @jatin-bhateja Let me know whenever this is ready for another pass of reviews :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23947#issuecomment-2928741573 From rehn at openjdk.org Mon Jun 2 05:45:59 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 2 Jun 2025 05:45:59 GMT Subject: RFR: 8357968: RISC-V: Interpreter volatile reference stores with G1 are not sequentially consistent In-Reply-To: References: Message-ID: On Wed, 28 May 2025 16:47:06 GMT, Robbin Ehn wrote: > Hi please consider. 
> > As ref: https://github.com/openjdk/jdk/pull/25483 > As suggested in that PR - I removed these helpers as it's very hard to see that you get registers clobbered. > > Sanity tested, running t1. > > /Robbin Thanks all! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25502#issuecomment-2928896307 From rehn at openjdk.org Mon Jun 2 05:45:59 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 2 Jun 2025 05:45:59 GMT Subject: Integrated: 8357968: RISC-V: Interpreter volatile reference stores with G1 are not sequentially consistent In-Reply-To: References: Message-ID: On Wed, 28 May 2025 16:47:06 GMT, Robbin Ehn wrote: > Hi please consider. > > As ref: https://github.com/openjdk/jdk/pull/25483 > As suggested in that PR - I removed these helpers as it's very hard to see that you get registers clobbered. > > Sanity tested, running t1. > > /Robbin This pull request has now been integrated. Changeset: c5a1543e Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/c5a1543ee3e68775f09ca29fb07efd9aebfdb33e Stats: 27 lines in 1 file changed: 0 ins; 18 del; 9 mod 8357968: RISC-V: Interpreter volatile reference stores with G1 are not sequentially consistent Reviewed-by: eosterlund, fbredberg, shade, fyang ------------- PR: https://git.openjdk.org/jdk/pull/25502 From mchevalier at openjdk.org Mon Jun 2 06:53:50 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 2 Jun 2025 06:53:50 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 In-Reply-To: References: Message-ID: On Sat, 31 May 2025 02:59:48 GMT, SendaoYan wrote: > Hi, how was this bug found? It seems the original test case was generated by a fuzzing tool. It seems so, given what the initial reproducer looks like, but I'm not sure. The ticket was opened 3 years ago, so I'm not sure anyone remembers. If you want more context, maybe you can ask the initial reporter.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25551#issuecomment-2929087749 From jbhateja at openjdk.org Mon Jun 2 07:44:58 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 07:44:58 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1]. However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][3]). > > Two key areas require improvement: > 1. At the Java level, vector indices generated for range validation could be reused for the subsequent gather load operation on architectures with native vector instructions like AArch64 SVE. However, the current implementation prevents compiler reuse of these index vectors due to divergent control flow, potentially impacting performance. > 2. At the compiler IR level, the additional `offset` input for `LoadVectorGather`/`LoadVectorGatherMasked` with subword types increases IR complexity and complicates backend implementation. Furthermore, generating `add` instructions before each memory access negatively impacts performance. > > This patch refactors the implementation at both the Java level and compiler mid-end to improve efficiency and maintainability across different architectures. > > Main changes: > 1. Java-side API refactoring: > - Explicitly passes generated index vectors to hotspot, eliminating duplicate index vectors for gather load instructions on > architectures like AArch64. > 2. C2 compiler IR refactoring: > - Refactors `LoadVectorGather`/`LoadVectorGatherMasked` IR for subword types by removing the memory offset input and incorporating it into the memory base `addr` at the IR level. 
This simplifies backend implementation, reduces add operations, and unifies the IR across all types. > 3. Backend changes: > - Streamlines X86 implementation of subword gather operations following the removal of the offset input from the IR level. > > Performance: > The performance of the relative JMH improves up to 27% on a X86 AVX512 system. Please see the data below: > > Benchmark Mode Cnt Unit SIZE Before After Gain > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 64 53682.012 52650.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms 4096 908.312 935.269 1.02 > GatherOperationsBenchmark.micr... Hi @XiaohongGong , Looks good to me, thanks again for this re-factor !! Best Regards, Jatin ------------- Marked as reviewed by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25138#pullrequestreview-2887157235 From duke at openjdk.org Mon Jun 2 08:15:36 2025 From: duke at openjdk.org (Tom Shull) Date: Mon, 2 Jun 2025 08:15:36 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v2] In-Reply-To: References: Message-ID: > Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. > > To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. 
Tom Shull has updated the pull request incrementally with one additional commit since the last revision: format javadoc and update test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25498/files - new: https://git.openjdk.org/jdk/pull/25498/files/1f42f05f..0de1feae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=00-01 Stats: 16 lines in 4 files changed: 2 ins; 2 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25498/head:pull/25498 PR: https://git.openjdk.org/jdk/pull/25498 From shade at openjdk.org Mon Jun 2 08:18:53 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 08:18:53 GMT Subject: RFR: 8358169: Shenandoah/JVMCI: Export GC state constants In-Reply-To: References: Message-ID: On Fri, 30 May 2025 16:09:03 GMT, Roman Kennke wrote: > We need the GC state enum constants available in JVMCI. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25552#pullrequestreview-2887264825 From dbriemann at openjdk.org Mon Jun 2 08:27:56 2025 From: dbriemann at openjdk.org (David Briemann) Date: Mon, 2 Jun 2025 08:27:56 GMT Subject: RFR: 8357793: [PPC64] VM crashes with -XX:-UseSIGTRAP -XX:-ImplicitNullChecks [v2] In-Reply-To: References: <5XqAA3Z2G0uwOBkitUrqkG3Y68xtpRuvBwj_cEIFECs=.18259520-6f73-406f-a46f-fa025c12b303@github.com> Message-ID: On Wed, 28 May 2025 19:12:55 GMT, Martin Doerr wrote: >> In case of -XX:-UseSIGTRAP -XX:-ImplicitNullChecks, we use the manually selected entry. (The same is true for -XX:-TrapBasedNullChecks -XX:-ImplicitNullChecks.) >> We only need to use the correct NullPointerException entry in the compiler case. >> >> With this patch, the manually selected entry matches the one selected by `PosixSignals::pd_hotspot_signal_handler`. 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix bastore without ImplicitNullChecks. LGTM. Thanks ------------- Marked as reviewed by dbriemann (Author). PR Review: https://git.openjdk.org/jdk/pull/25504#pullrequestreview-2887294855 From mdoerr at openjdk.org Mon Jun 2 08:33:56 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:33:56 GMT Subject: RFR: 8357793: [PPC64] VM crashes with -XX:-UseSIGTRAP -XX:-ImplicitNullChecks [v2] In-Reply-To: References: <5XqAA3Z2G0uwOBkitUrqkG3Y68xtpRuvBwj_cEIFECs=.18259520-6f73-406f-a46f-fa025c12b303@github.com> Message-ID: On Wed, 28 May 2025 19:12:55 GMT, Martin Doerr wrote: >> In case of -XX:-UseSIGTRAP -XX:-ImplicitNullChecks, we use the manually selected entry. (The same is true for -XX:-TrapBasedNullChecks -XX:-ImplicitNullChecks.) >> We only need to use the correct NullPointerException entry in the compiler case. >> >> With this patch, the manually selected entry matches the one selected by `PosixSignals::pd_hotspot_signal_handler`. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix bastore without ImplicitNullChecks. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25504#issuecomment-2929418451 From mdoerr at openjdk.org Mon Jun 2 08:33:57 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 2 Jun 2025 08:33:57 GMT Subject: Integrated: 8357793: [PPC64] VM crashes with -XX:-UseSIGTRAP -XX:-ImplicitNullChecks In-Reply-To: <5XqAA3Z2G0uwOBkitUrqkG3Y68xtpRuvBwj_cEIFECs=.18259520-6f73-406f-a46f-fa025c12b303@github.com> References: <5XqAA3Z2G0uwOBkitUrqkG3Y68xtpRuvBwj_cEIFECs=.18259520-6f73-406f-a46f-fa025c12b303@github.com> Message-ID: On Wed, 28 May 2025 17:00:48 GMT, Martin Doerr wrote: > In case of -XX:-UseSIGTRAP -XX:-ImplicitNullChecks, we use the manually selected entry. 
(The same is true for -XX:-TrapBasedNullChecks -XX:-ImplicitNullChecks.) > We only need to use the correct NullPointerException entry in the compiler case. > > With this patch, the manually selected entry matches the one selected by `PosixSignals::pd_hotspot_signal_handler`. This pull request has now been integrated. Changeset: ba9f44c9 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/ba9f44c90fe8da2d97d67b6878ac2c0c14e35bd0 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod 8357793: [PPC64] VM crashes with -XX:-UseSIGTRAP -XX:-ImplicitNullChecks Reviewed-by: shade, dbriemann ------------- PR: https://git.openjdk.org/jdk/pull/25504 From duke at openjdk.org Mon Jun 2 08:39:31 2025 From: duke at openjdk.org (Tom Shull) Date: Mon, 2 Jun 2025 08:39:31 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v2] In-Reply-To: References: Message-ID: > This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. > > In addition, two methods are added to the BootstrapMethodInvocations: > 1. `void resolve()` > 2. `JavaConstant lookup()` > > The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. 
Tom Shull has updated the pull request incrementally with one additional commit since the last revision: reviewer feedback and update javadoc formatting ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25420/files - new: https://git.openjdk.org/jdk/pull/25420/files/519be178..60c39b5e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=00-01 Stats: 23 lines in 2 files changed: 3 ins; 1 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/25420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25420/head:pull/25420 PR: https://git.openjdk.org/jdk/pull/25420 From shade at openjdk.org Mon Jun 2 08:42:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 08:42:55 GMT Subject: RFR: 8357223: AArch64: Optimize interpreter profile updates [v2] In-Reply-To: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> References: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> Message-ID: <-gNhkdcFda-JXrWH4bpViukhPFnm0EyO71u1o2ZyV68=.0228af79-58ec-4fb5-9ca0-85148cc8365d@github.com> On Thu, 29 May 2025 23:04:25 GMT, Chad Rakoczy wrote: >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >> The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> >> The reasoning for this change is the same as the x86 version's PR: >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >>> >>> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. 
>> >> Additional testing: >> >> - [x] Linux aarch64 fastdebug tier 1/2/3/4 > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Address comments Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25512#pullrequestreview-2887351889 From rkennke at openjdk.org Mon Jun 2 08:59:56 2025 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 2 Jun 2025 08:59:56 GMT Subject: Integrated: 8358169: Shenandoah/JVMCI: Export GC state constants In-Reply-To: References: Message-ID: On Fri, 30 May 2025 16:09:03 GMT, Roman Kennke wrote: > We need the GC state enum constants available in JVMCI. This pull request has now been integrated. Changeset: eb9badd8 Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/eb9badd8a4ea6dca834525fd49429e2ce771a76c Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod 8358169: Shenandoah/JVMCI: Export GC state constants Reviewed-by: dnsimon, shade ------------- PR: https://git.openjdk.org/jdk/pull/25552 From galder at openjdk.org Mon Jun 2 09:17:52 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Mon, 2 Jun 2025 09:17:52 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong wrote: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. 
> > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. > > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 Marked as reviewed by galder (Author). 
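For readers following the quoted description: the "iteratively uncast until no `CastII` nodes remain" idea can be sketched with a toy model. This is only an illustration, not the HotSpot code; `Node`, `CastII`, `uncastOnce` and `uncastAll` below are hypothetical stand-ins, and the single-step variant is shown purely for contrast.

```java
public class UncastSketch {
    // Hypothetical, minimal stand-in for an IR node with a single input.
    public static class Node {
        public final String name;
        public final Node input;
        public Node(String name, Node input) {
            this.name = name;
            this.input = input;
        }
    }

    // Hypothetical stand-in for a range-constraining cast of an int value.
    public static final class CastII extends Node {
        public CastII(Node input) {
            super("CastII", input);
        }
    }

    // Strips at most one cast: with two stacked casts this still returns a CastII.
    public static Node uncastOnce(Node n) {
        return (n instanceof CastII) ? n.input : n;
    }

    // Strips casts repeatedly until the underlying node (e.g. the Phi) is reached.
    public static Node uncastAll(Node n) {
        while (n instanceof CastII) {
            n = n.input;
        }
        return n;
    }

    public static void main(String[] args) {
        Node phi = new Node("Phi", null);
        // Two consecutive casts, as produced by two range checks in the graph above.
        Node iv = new CastII(new CastII(phi));
        System.out.println("uncastOnce -> " + uncastOnce(iv).name);
        System.out.println("uncastAll  -> " + uncastAll(iv).name);
    }
}
```

With two stacked casts, only the looping variant reaches the `Phi`, which is the shape the loop recognition needs to see.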
------------- PR Review: https://git.openjdk.org/jdk/pull/25539#pullrequestreview-2887478434 From epeter at openjdk.org Mon Jun 2 10:30:51 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:30:51 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong wrote: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. > > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. 
> > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. > > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 2: > 1: /* > 2: * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. Can you please move the test to `test/hotspot/jtreg/compiler/loopopts`? The `irTests` directory was not the best idea, it makes more sense to have tests thematically grouped. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2120715051 From mchevalier at openjdk.org Mon Jun 2 10:37:11 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 2 Jun 2025 10:37:11 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: > ### Problem > > On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. For instance, that's what we get with the reproducer `Reduced.java` on the related issue: > > ; Load lFld into local x > ldr x11, [x10, #120] > ; popCountI > mov w11, w11 > mov v16.d[0], x11 > cnt v16.8b, v16.8b > addv b16, v16.8b > mov x13, v16.d[0] > ; [...] > ; store local x (which is believed to still contain lFld) into result > str x11, [x10, #128] > > > The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register reset the 32 higher bits. 
Short: the input is modified, but the implementation of `popCountI` doesn't declare it: > > instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ > match(Set dst (PopCountI src)); > effect(TEMP tmp); > [...] > %} > > > But then, why resetting the upper word of `x11`? It all starts with vector instructions: > > cnt v16.8b, v16.8b > addv b16, v16.8b > > The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing > > mov v16.s[0], w11 > > would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which > > mov w11, w11 > mov v16.d[0], x11 > > does, but by destroying `x11`. > > ### Solution > > Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. > > The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: > > mov v16.s[1], wzr ; Reset the 1-indexed word of v16, that is v16[32:63] <- 0 > mov v16.s[0], w11 ; Set the 0-ind... 
Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25551/files - new: https://git.openjdk.org/jdk/pull/25551/files/fb8d64d9..8318b50c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25551&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25551&range=00-01 Stats: 6 lines in 2 files changed: 0 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/25551.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25551/head:pull/25551 PR: https://git.openjdk.org/jdk/pull/25551 From mchevalier at openjdk.org Mon Jun 2 10:37:12 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 2 Jun 2025 10:37:12 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 In-Reply-To: References: Message-ID: <5KRLt28hn0r2ZL_M0Rdx7LOThZPIymChXhWGP7SVLXI=.0a0bc3f7-b81d-4271-8044-8431edd6196d@github.com> On Fri, 30 May 2025 15:33:14 GMT, Marc Chevalier wrote: > ### Problem > > On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. For instance, that's what we get with the reproducer `Reduced.java` on the related issue: > > ; Load lFld into local x > ldr x11, [x10, #120] > ; popCountI > mov w11, w11 > mov v16.d[0], x11 > cnt v16.8b, v16.8b > addv b16, v16.8b > mov x13, v16.d[0] > ; [...] > ; store local x (which is believed to still contain lFld) into result > str x11, [x10, #128] > > > The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register reset the 32 higher bits. 
Short: the input is modified, but the implementation of `popCountI` doesn't declare it: > > instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ > match(Set dst (PopCountI src)); > effect(TEMP tmp); > [...] > %} > > > But then, why resetting the upper word of `x11`? It all starts with vector instructions: > > cnt v16.8b, v16.8b > addv b16, v16.8b > > The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing > > mov v16.s[0], w11 > > would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which > > mov w11, w11 > mov v16.d[0], x11 > > does, but by destroying `x11`. > > ### Solution > > Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. > > The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: > > mov v16.s[1], wzr ; Reset the 1-indexed word of v16, that is v16[32:63] <- 0 > mov v16.s[0], w11 ; Set the 0-ind... I've changed the two `mov`s into a `fmovs` as suggested and adapted the format part. Tests seem happy. 
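As an aside for readers of the archive, the reason the upper 32 bits of the 64-bit lane must be zero before `cnt`/`addv` can be modelled in plain Java. This is a hedged illustration, not code from the PR: the class and method names are hypothetical, `Long.bitCount` stands in for the 8-byte vector count, and the constant is the one used in the regression test.

```java
public class PopCountUpperBits {
    // Models `cnt v.8b` followed by `addv`: counts set bits across the whole 64-bit lane.
    public static int countLane64(long lane) {
        return Long.bitCount(lane);
    }

    // Models a move that writes the low 32 bits and clears the upper 32 bits of the lane.
    public static long zeroExtend(int src) {
        return src & 0xFFFF_FFFFL;
    }

    public static void main(String[] args) {
        long lFld = 0xfedc_ba98_7654_3210L; // value from the regression test
        int src = (int) lFld;               // L2I: only the low word is the bitCount argument

        int expected = Integer.bitCount(src);
        int polluted = countLane64(lFld);            // upper word left in the lane
        int clean = countLane64(zeroExtend(src));    // upper word cleared first

        System.out.println("expected=" + expected
                + " clean=" + clean
                + " polluted=" + polluted);
    }
}
```

The point of the fix discussed above is to get the "clean" lane without overwriting the general-purpose source register on the way.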
------------- PR Comment: https://git.openjdk.org/jdk/pull/25551#issuecomment-2929946930 From mchevalier at openjdk.org Mon Jun 2 10:37:12 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 2 Jun 2025 10:37:12 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: On Sat, 31 May 2025 14:29:26 GMT, Andrew Haley wrote: >> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: >> >> Apply suggestions > > src/hotspot/cpu/aarch64/aarch64.ad line 7771: > >> 7769: ins_encode %{ >> 7770: __ mov($tmp$$FloatRegister, __ S, 1, zr); // tmp[32:63] <- 0 >> 7771: __ mov($tmp$$FloatRegister, __ S, 0, $src$$Register); // tmp[ 0:31] <- src > > "Where the entire 128-bit wide register is not fully utilized, the vector or scalar quantity is held in the least significant bits of the register, with the most significant bits being cleared to zero on a write." > > Suggestion: > > __ fmovs($tmp$$FloatRegister, $src$$Register); > > should do it. Yes! Nicer, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25551#discussion_r2120723694 From mchevalier at openjdk.org Mon Jun 2 10:37:12 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Mon, 2 Jun 2025 10:37:12 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: On Sat, 31 May 2025 03:11:28 GMT, SendaoYan wrote: >> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: >> >> Apply suggestions > > test/hotspot/jtreg/compiler/intrinsics/BitCountIAarch64PreservesArgument.java line 58: > >> 56: if (result != 0xfedc_ba98_7654_3210L) { >> 57: // Wrongly outputs the cut input 0x7654_3210 == 1985229328 >> 58: throw new RuntimeException("Wrong result. 
lFld=" + lFld + "; result=" + result); > > How about: > > > throw new RuntimeException("Wrong result. Expected result = " + lFld + "; Actual result = " + result); That looks better indeed. Applied. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25551#discussion_r2120724260 From epeter at openjdk.org Mon Jun 2 10:45:50 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:45:50 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: <1Er4nlGWx_yp6RIkqSo0PUk84lX50sTAGmGbnu4jokY=.74dc326e-9038-40d0-9b00-f5eaef1bd504@github.com> On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong wrote: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. > > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. 
> > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. > > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 @XiaohongGong Nice work! @chhagedorn And I quickly discussed it offline, and we think this is a good approach. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2929984802 From epeter at openjdk.org Mon Jun 2 10:49:51 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:49:51 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong wrote: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. 
> > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. > > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 57: > 55: out[i] = 0; > 56: } > 57: } You could also just use `Arrays.fill` test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 174: > 172: > 173: public static void main(String[] args) { > 174: TestFramework.runWithFlags("-XX:LoopUnrollLimit=0"); What is the reason for the flag here? Do you really need it? 
test/micro/org/openjdk/bench/vm/compiler/CountedLoopCastIV.java line 54: > 52: Random r = new Random(); > 53: start = r.nextInt(LEN >> 2); > 54: limit = r.nextInt(LEN >> 1, LEN - 3); Does this not mean that we use a different seed every time, and therefore the loop has different lengths, and so the results can be influenced accordingly? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2120762941 PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2120766290 PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2120770394 From epeter at openjdk.org Mon Jun 2 10:50:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:50:54 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> >>> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> >>> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> >>> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> >> >>> > Yes, I also observed such regression. >>> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers. >>> >>> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. 
Even better would be if we could do the other patch first, so we never even encounter a regression. >> >> Sounds good to me. Thanks! > >> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >> > >> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! > > Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! @XiaohongGong I reviewed https://github.com/openjdk/jdk/pull/25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2930007655 From epeter at openjdk.org Mon Jun 2 10:53:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 10:53:52 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Fri, 30 May 2025 07:43:29 GMT, Xiaohong Gong wrote: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. 
> This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. > > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. 
> > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 @XiaohongGong I suggest you change the title from: `8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times` to `8357726: C2 recognize loops with multiple casts in trip counter` or even: `8357726: C2 recognize loops with multiple casts in trip counter: phi -> CastII* -> AddI -> phi` ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2930020530 From epeter at openjdk.org Mon Jun 2 11:06:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 11:06:53 GMT Subject: RFR: 8356813: Improve Mod(I|L)Node::Value [v4] In-Reply-To: References: <2Jf_gfvRlKcmCFoQHp5T0WW_fU_yK5-0Z3z41f00-YU=.164be9f0-fae1-44bb-84c3-846d8c2c0db2@github.com> Message-ID: On Fri, 30 May 2025 07:26:13 GMT, Hannes Greule wrote: >> This change improves the precision of the `Mod(I|L)Node::Value()` functions. >> >> I reordered the structure a bit. First, we handle constants, afterwards, we handle ranges. The bottom checks seem to be excessive (`Type::BOTTOM` is covered by using `isa_(int|long)()`, the local bottom is just the full range). Given we can even give reasonable bounds if only one input has any bounds, we don't want to return early. >> The changes after that are commented. Please let me know if the explanations are good, or if you have any suggestions. >> >> ### Monotonicity >> >> Before, a 0 divisor resulted in `Type(Int|Long)::POS`. Initially I wanted to keep it this way, but that violates monotonicity during PhaseCCP. As an example, if we see a 0 divisor first and a 3 afterwards, we might try to go from `>=0` to `-2..2`, but the meet of these would be `>=-2` rather than `-2..2`. Using `Type(Int|Long)::ZERO` instead (zero is always in the resulting value if we cover a range). >> >> ### Testing >> >> I added tests for cases around the relevant bounds. 
I also ran tier1, tier2, and tier3 but didn't see any related failures after addressing the monotonicity problem described above (I'm having a few unrelated failures on my system currently, so separate testing would be appreciated in case I missed something). >> >> Please review and let me know what you think. >> >> ### Other >> >> The `UMod(I|L)Node`s were adjusted to be more in line with its signed variants. This change diverges them again, but similar improvements could be made after #17508. >> >> During experimenting with these changes, I stumbled upon a few things that aren't directly related to this change, but might be worth to further look into: >> - If the divisor is a constant, we will directly replace the `Mod(I|L)Node` with more but less expensive nodes in `::Ideal()`. Type analysis for these nodes combined is less precise, means we miss potential cases were this would help e.g., removing range checks. Would it make sense to delay the replacement? >> - To force non-negative ranges, I'm using `char`. I noticed that method parameters of sub-int integer types all fall back to `TypeInt::INT`. This seems to be an intentional change of https://github.com/openjdk/jdk/commit/200784d505dd98444c48c9ccb7f2e4df36dcbb6a. The bug report is private, so I can't really judge if that part is necessary, but it seems odd. > > Hannes Greule has updated the pull request incrementally with one additional commit since the last revision: > > Add randomized test src/hotspot/share/opto/divnode.cpp line 1206: > 1204: > 1205: //------------------------------Value------------------------------------------ > 1206: static const Type* mod_value(const PhaseGVN* phase, const Node* in1, const Node* in2, const BasicType bt, const Type* bottom) { You did choose the `bt` path here! I would add an assert that we only allow `T_INT` and `T_LONG` src/hotspot/share/opto/divnode.cpp line 1237: > 1235: // We don't need to check for min_jint % '-1' as its result is defined when using jlong. 
> 1236: if (i1->get_con_as_long(bt) == min_jlong && i2->get_con_as_long(bt) == -1) { > 1237: return TypeInteger::zero(bt); Is this correct? For `bt = T_INT` is this really equivalent? `i1->get_con() == min_jint` We might get `min_jint` back here. `i1->get_con_as_long(bt) == min_jlong` Would we not return `min_jint` here, and then the condition is false? Do we have an IR test for this? src/hotspot/share/opto/divnode.cpp line 1241: > 1239: return TypeInteger::make(i1->get_con_as_long(bt) % i2->get_con_as_long(bt), bt); > 1240: } > 1241: // The magnitude of the divisor is in range [1, 2^63]. You should probably also mention the `2^31` variant. src/hotspot/share/opto/divnode.cpp line 1247: > 1245: // JVMS lrem bytecode: "the magnitude of the result is always less than the magnitude of the divisor" > 1246: // "less than" means we can subtract 1 to get an inclusive upper bound in [0, 2^63-1] > 1247: jlong hi = static_cast<jlong>(divisor_magnitude - 1); Hmm, this also looks confusing for the `T_INT` case. What about `-5`, does that then not become `max_julong - 5`, but it should have been `max_juint - 1`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120802575 PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120800945 PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120801900 PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120805584 From epeter at openjdk.org Mon Jun 2 11:07:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 11:07:53 GMT Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v2] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 18:45:49 GMT, Zdenek Zambersky wrote: > (I have not changed JIRA as there is no info about fix. Should I add it there?)
Yes please, that is generally what we should do :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24262#issuecomment-2930075745 From epeter at openjdk.org Mon Jun 2 11:10:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 11:10:53 GMT Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v3] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 18:39:27 GMT, Zdenek Zambersky wrote: >> This change adds `-XX:+IgnoreUnrecognizedVMOptions` to problematic tests (or `@requires vm.compiler2.enabled` in one case), to prevent `Unrecognized VM option` failures on client VM. > > Zdenek Zambersky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > Fix of compiler tests for client VM Still looks reasonable. I'll run some testing now, please ping me again in 24h :) ------------- PR Review: https://git.openjdk.org/jdk/pull/24262#pullrequestreview-2887893389 From dnsimon at openjdk.org Mon Jun 2 11:11:52 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 11:11:52 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 08:15:36 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request incrementally with one additional commit since the last revision: > > format javadoc and update test Looks good to me. ------------- Marked as reviewed by dnsimon (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25498#pullrequestreview-2887897833 From dnsimon at openjdk.org Mon Jun 2 11:16:53 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 11:16:53 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v2] In-Reply-To: References: Message-ID: <1jDUbEJHRDYuT4RDOHlEeY5C4IWwwcenweFgZcwnUsU=.bc8d84ad-13bb-4b5a-9d02-de020301e3d6@github.com> On Mon, 2 Jun 2025 08:39:31 GMT, Tom Shull wrote: >> This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. >> >> In addition, two methods are added to the BootstrapMethodInvocations: >> 1. `void resolve()` >> 2. `JavaConstant lookup()` >> >> The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. > > Tom Shull has updated the pull request incrementally with one additional commit since the last revision: > > reviewer feedback and update javadoc formatting Looks good to me. Please enable GitHub Actions on your JDK fork. ------------- Marked as reviewed by dnsimon (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25420#pullrequestreview-2887908832 From hgreule at openjdk.org Mon Jun 2 11:33:53 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Mon, 2 Jun 2025 11:33:53 GMT Subject: RFR: 8356813: Improve Mod(I|L)Node::Value [v4] In-Reply-To: References: <2Jf_gfvRlKcmCFoQHp5T0WW_fU_yK5-0Z3z41f00-YU=.164be9f0-fae1-44bb-84c3-846d8c2c0db2@github.com> Message-ID: On Mon, 2 Jun 2025 10:58:45 GMT, Emanuel Peter wrote: >> Hannes Greule has updated the pull request incrementally with one additional commit since the last revision: >> >> Add randomized test > > src/hotspot/share/opto/divnode.cpp line 1237: > >> 1235: // We don't need to check for min_jint % '-1' as its result is defined when using jlong. >> 1236: if (i1->get_con_as_long(bt) == min_jlong && i2->get_con_as_long(bt) == -1) { >> 1237: return TypeInteger::zero(bt); > > Is this correct? For `bt = T_INT` is this really equivalent? > > `i1->get_con() == min_jint` > We might get `min_jint` back here. > > `i1->get_con_as_long(bt) == min_jlong` > Would we not return `min_jint` here, and then the condition is false? > > Do we have an IR test for this? This special case is only needed because `min_jlong % -1L` in C++ is UB (afaik) and the idiv instruction triggers a SIGFPE in that case. But `min_jint % -1L` *using long arithmetic* correctly produces 0. I think it would make sense to expand tests for constant folding, but I'll have to check if that actually gets called; see **Other** in the PR description (copied): > If the divisor is a constant, we will directly replace the Mod(I|L)Node with more, but less expensive, nodes in ::Ideal(). Type analysis for these nodes combined is less precise, which means we miss potential cases where this would help, e.g., removing range checks. Would it make sense to delay the replacement? So there's a chance this code was never called before...
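
To make the expected folded values concrete, here is a small Java-level sketch. It is not the HotSpot code: the class and method names are made up for illustration, and it relies on the fact that Java (unlike C++) defines the remainder for this corner case, so it only shows which results the constant folding is expected to produce. Note that an int dividend widened to long can never equal `Long.MIN_VALUE`, which is why only the long case would need a guard.

```java
// Illustrative only: hypothetical names, not part of divnode.cpp.
class RemainderMinValue {
    // Remainder computed with 64-bit arithmetic, as when int constants are widened first.
    static long remAsLong(long dividend, long divisor) {
        return dividend % divisor;
    }

    public static void main(String[] args) {
        // Long.MIN_VALUE % -1 is defined as 0 in Java; the equivalent C++ expression is the problematic one.
        System.out.println(remAsLong(Long.MIN_VALUE, -1L));
        // Integer.MIN_VALUE widened to long is -2^31, far from Long.MIN_VALUE, so no special case is needed.
        System.out.println(remAsLong(Integer.MIN_VALUE, -1));
        // An ordinary case: the sign of the result follows the dividend.
        System.out.println(remAsLong(-7, 3));
    }
}
```

The helper takes `long` parameters on purpose, so that an `int` argument is sign-extended at the call site, mirroring the "using long arithmetic" wording above.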
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120868235 From hgreule at openjdk.org Mon Jun 2 11:36:54 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Mon, 2 Jun 2025 11:36:54 GMT Subject: RFR: 8356813: Improve Mod(I|L)Node::Value [v4] In-Reply-To: References: <2Jf_gfvRlKcmCFoQHp5T0WW_fU_yK5-0Z3z41f00-YU=.164be9f0-fae1-44bb-84c3-846d8c2c0db2@github.com> Message-ID: <5FnA_gZNzRom3MBShwfbdCffeRGogf1cyKo0nF40c4I=.9db6f973-e6a5-4852-b82e-24ccc198bcb9@github.com> On Mon, 2 Jun 2025 11:01:29 GMT, Emanuel Peter wrote: >> Hannes Greule has updated the pull request incrementally with one additional commit since the last revision: >> >> Add randomized test > > src/hotspot/share/opto/divnode.cpp line 1247: > >> 1245: // JVMS lrem bytecode: "the magnitude of the result is always less than the magnitude of the divisor" >> 1246: // "less than" means we can subtract 1 to get an inclusive upper bound in [0, 2^63-1] >> 1247: jlong hi = static_cast<jlong>(divisor_magnitude - 1); > > Hmm, this also looks confusing for the `T_INT` case. What about `-5`, does that then not become `max_julong - 5`, but it should have been `max_juint - 1`? We use `g_uabs()` to get the absolute value, which shouldn't exceed 2^31 for int values (i.e., `g_uabs(min_jint) == 2^31`). So we should get into the right range here again. But I guess I can expand the comment to better explain that part.
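
The range argument above can be checked with a small Java-level sketch. This is not the HotSpot implementation: the class and method names are hypothetical, and it assumes that an int constant is sign-extended when widened to 64 bits, so the magnitude is taken of the widened value. Under that assumption, `-5` has magnitude 5 and the int minimum has magnitude 2^31.

```java
// Illustrative only: hypothetical names, mimicking "magnitude minus one" as an inclusive upper bound.
class DivisorMagnitude {
    // Magnitude of a divisor that has already been widened to 64 bits.
    // Math.abs(Long.MIN_VALUE) overflows and returns Long.MIN_VALUE, whose bit pattern is 2^63 when read as unsigned.
    static long magnitude(long divisor) {
        return Math.abs(divisor);
    }

    // The remainder's magnitude is strictly less than the divisor's, so subtract 1 for an inclusive bound.
    // For Long.MIN_VALUE the subtraction wraps to Long.MAX_VALUE, i.e. 2^63 - 1.
    static long inclusiveUpperBound(long divisor) {
        return magnitude(divisor) - 1;
    }

    public static void main(String[] args) {
        System.out.println(inclusiveUpperBound(-5));
        System.out.println(inclusiveUpperBound(Integer.MIN_VALUE));
        System.out.println(inclusiveUpperBound(Long.MIN_VALUE));
    }
}
```

With an int argument the bound never exceeds `Integer.MAX_VALUE`, which is the property the review question is about; whether the C++ code widens the same way is exactly what the comment in the patch should spell out.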
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120875453 From epeter at openjdk.org Mon Jun 2 11:46:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 11:46:54 GMT Subject: RFR: 8356813: Improve Mod(I|L)Node::Value [v4] In-Reply-To: <5FnA_gZNzRom3MBShwfbdCffeRGogf1cyKo0nF40c4I=.9db6f973-e6a5-4852-b82e-24ccc198bcb9@github.com> References: <2Jf_gfvRlKcmCFoQHp5T0WW_fU_yK5-0Z3z41f00-YU=.164be9f0-fae1-44bb-84c3-846d8c2c0db2@github.com> <5FnA_gZNzRom3MBShwfbdCffeRGogf1cyKo0nF40c4I=.9db6f973-e6a5-4852-b82e-24ccc198bcb9@github.com> Message-ID: On Mon, 2 Jun 2025 11:34:22 GMT, Hannes Greule wrote: >> src/hotspot/share/opto/divnode.cpp line 1247: >> >>> 1245: // JVMS lrem bytecode: "the magnitude of the result is always less than the magnitude of the divisor" >>> 1246: // "less than" means we can subtract 1 to get an inclusive upper bound in [0, 2^63-1] >>> 1247: jlong hi = static_cast<jlong>(divisor_magnitude - 1); >> >> Hmm, this also looks confusing for the `T_INT` case. What about `-5`, does that then not become `max_julong - 5`, but it should have been `max_juint - 1`? > > We use `g_uabs()` to get the absolute value, which shouldn't exceed 2^31 for int values (i.e., `g_uabs(min_jint) == 2^31`). So we should get into the right range here again. But I guess I can expand the comment to better explain that part. @SirYwell I'm not 100% sure here, so please correct me if I'm wrong. You are now always passing in a `jlong` value, so you always use `static inline julong g_uabs(jlong n) { return g_uabs((julong)n); }`, even for `T_INT`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2120898799 From epeter at openjdk.org Mon Jun 2 11:53:00 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 11:53:00 GMT Subject: RFR: 8347555: [REDO] C2: implement optimization for series of Add of unique value [v7] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:47:02 GMT, Kangcheng Xu wrote: >> @tabjy Thanks for your patience, this one took me longer than I wanted. I responded like this above: >> >>> Hmm, ok I see. Why don't you remove the asserts for now, and we see how clear the code looks now. I think I asked for the consistency check because I was confused by the previous code structure. Maybe it is ok now as it is. > > Ping @eme64 again for awareness. :) @tabjy > I could, at very least, try to swap LHS and RHS if no match is found I think that would be a good idea, and not very hard. You can just have a function `add_pattern(lhs, rhs)`, and then run it also with `add_pattern(rhs, lhs)` for **swapping**. Personally, I would have preferred a recursive algorithm, but that could have some compile time overhead. @chhagedorn Was a little more skeptical about the recursive algorithm. It seems the motivation for this change is the benchmark from here: ArithmeticCanonicalizationBenchmark https://ionutbalosin.com/2024/02/jvm-performance-comparison-for-jdk-21/#jit-compiler This benchmark is of course somewhat arbitrary, and so are now all of your added patterns. Having a most general solution would be nice, but maybe the recursive algorithm is too much, I'm not 100% sure. Of course we now still have cases that do not optimize/canonicalize, and so someone could write a benchmark for those cases still.. oh well. What I would like to see for **testing**: add some more patterns with IR rules. More that now optimize, and also a few that do not optimize, just so we have a bit of a sense what we are still missing. @rwestrel Filed this issue. 
I wonder: what do you think we should do here? How general should the optimization/canonicalization be? ------------- PR Comment: https://git.openjdk.org/jdk/pull/23506#issuecomment-2930295143 From aph at openjdk.org Mon Jun 2 11:56:52 2025 From: aph at openjdk.org (Andrew Haley) Date: Mon, 2 Jun 2025 11:56:52 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 10:37:11 GMT, Marc Chevalier wrote: >> ### Problem >> >> On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. For instance, that's what we get with the reproducer `Reduced.java` on the related issue: >> >> ; Load lFld into local x >> ldr x11, [x10, #120] >> ; popCountI >> mov w11, w11 >> mov v16.d[0], x11 >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> mov x13, v16.d[0] >> ; [...] >> ; store local x (which is believed to still contain lFld) into result >> str x11, [x10, #128] >> >> >> The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register reset the 32 higher bits. Short: the input is modified, but the implementation of `popCountI` doesn't declare it: >> >> instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ >> match(Set dst (PopCountI src)); >> effect(TEMP tmp); >> [...] >> %} >> >> >> But then, why resetting the upper word of `x11`? It all starts with vector instructions: >> >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> >> The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). 
There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing >> >> mov v16.s[0], w11 >> >> would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which >> >> mov w11, w11 >> mov v16.d[0], x11 >> >> does, but by destroying `x11`. >> >> ### Solution >> >> Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. >> >> The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: >> >> mov v16.s[1], wzr ... > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25551#pullrequestreview-2888056446 From chagedorn at openjdk.org Mon Jun 2 12:08:10 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Jun 2025 12:08:10 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 05:58:13 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 306: >> >>> 304: var myHook = new Hook("MyHook"); >>> 305: >>> 306: var template1 = Template.make("name", "value", (String name, Integer value) -> body( >> >> One could generally think about using `_` for unused lambda parameters which I think is the common convention. 
But then I guess we would need to update the documentation about saying "name" and "String name" should be the same and make an exception for unused ones. I don't know. > > I think it is better to keep the names duplicated. This gives the reader an easier visual aid to check which name has which type. What do you think? That's totally fine and easy to follow. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120948254 From jbhateja at openjdk.org Mon Jun 2 12:08:54 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 12:08:54 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v4] In-Reply-To: <6PFX21b9eT5mQv8Ym7b_RuKNpnuQ5CVqhc8TKxstlYo=.eb7d9f85-5e49-4e8f-b17a-c8e3728e7624@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> <6PFX21b9eT5mQv8Ym7b_RuKNpnuQ5CVqhc8TKxstlYo=.eb7d9f85-5e49-4e8f-b17a-c8e3728e7624@github.com> Message-ID: On Wed, 28 May 2025 09:15:31 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352635 >> - Enabling some test points >> - Adding test points and some re-factoring >> - Merge branch 'master' of https://github.com/openjdk/jdk into JDK-8352635 >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8352635 >> - 8352635: Improve inferencing of Float16 operations with constant inputs > > @jatin-bhateja That looks very promising, thanks for working on that! Hi @eme64 , Your comments have been addressed. 
Best Regards ------------- PR Comment: https://git.openjdk.org/jdk/pull/24179#issuecomment-2930355506 From yzheng at openjdk.org Mon Jun 2 12:11:57 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 2 Jun 2025 12:11:57 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v2] In-Reply-To: References: Message-ID: <0o43MdXkVHVU8JQIoBSQ-46j3jLJjvAEqARhk88aeEw=.a202168b-af3f-4ce4-b274-f1cbbd4295fa@github.com> On Mon, 2 Jun 2025 08:15:36 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request incrementally with one additional commit since the last revision: > > format javadoc and update test src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedObjectTypeImpl.java line 1079: > 1077: return List.of(); > 1078: } > 1079: return Collections.unmodifiableList(Arrays.asList(instanceMethods)); `return List.of(instanceMethods);` should work. 
We can then replace the above with `return List.of(runtime().compilerToVm.getAllMethods(this));` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25498#discussion_r2120952854 From chagedorn at openjdk.org Mon Jun 2 12:12:11 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Jun 2025 12:12:11 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Sun, 1 Jun 2025 15:56:18 GMT, Emanuel Peter wrote: >> Another question which is not evidently clear by following the examples: Can and should (not) you use the same hook inside the hook itself, i.e.: >> >> Hooks.CLASS_HOOK.anchor( >> Hooks.CLASS_HOOK.anchor( >> // ... >> >> This is probably not done on purpose but such a situation could arise when nesting more templates and suddenly one anchors the same hook again? > > I extended the explanations: > > ~ 397 // We saw the use of custom hooks above, but now we look at the use of CLASS_HOOK and METHOD_HOOK. > ~ 398 // By convention, we use the CLASS_HOOK for class scopes, and METHOD_HOOK for method scopes. > + 399 // Whenever we open a class scope, we should anchor a CLASS_HOOK for that scope, and whenever we > + 400 // open a method, we should anchor a METHOD_HOOK. Conversely, this allows us to check if we are > + 401 // inside a class or method scope by querying "isAnchored". This convention helps us when building > + 402 // a large library of Templates. But if you are writing your own self-contained set of Templates, > + 403 // you do not have to follow this convention. > + 404 // > + 405 // Hooks are "re-entrant", that is we can anchor the same hook inside a scope that we already > + 406 // anchored it previously. The "Hook.insert" always goes to the innermost anchoring of that > + 407 // hook. There are cases where "re-entrant" Hooks are helpful such as nested classes, where > + 408 // there is a class scope inside another class scope. 
Similarly, we can nest lambda bodies > + 409 // inside method bodies, so also METHOD_HOOK can be used in such a "re-entrant" way. > > > We could consider having both "re-entrant" and "non-re-entrant" Hooks. But I'm not yet convinced it is a very useful feature. Sure, there could be some confusion with nested hooks. But I think that confusion to code generation, because we can also nest class and method/lambda scopes. > > What do you think? The updated explanation is very good of making clear when we could/want to have nested hooks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120955442 From chagedorn at openjdk.org Mon Jun 2 12:18:10 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Jun 2025 12:18:10 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 10:39:57 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8344942-TemplateFramework-v3' of https://github.com/eme64/jdk into JDK-8344942-TemplateFramework-v3 >> - move verification > > Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! > > I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. 
> > These are issues where we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) > > This is now ready for another review pass. Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2930394221 From thartmann at openjdk.org Mon Jun 2 12:49:52 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 2 Jun 2025 12:49:52 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 10:37:11 GMT, Marc Chevalier wrote: >> ### Problem >> >> On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. For instance, that's what we get with the reproducer `Reduced.java` on the related issue: >> >> ; Load lFld into local x >> ldr x11, [x10, #120] >> ; popCountI >> mov w11, w11 >> mov v16.d[0], x11 >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> mov x13, v16.d[0] >> ; [...] >> ; store local x (which is believed to still contain lFld) into result >> str x11, [x10, #128] >> >> >> The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register resets the 32 higher bits.
Short: the input is modified, but the implementation of `popCountI` doesn't declare it: >> >> instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ >> match(Set dst (PopCountI src)); >> effect(TEMP tmp); >> [...] >> %} >> >> >> But then, why resetting the upper word of `x11`? It all starts with vector instructions: >> >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> >> The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing >> >> mov v16.s[0], w11 >> >> would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which >> >> mov w11, w11 >> mov v16.d[0], x11 >> >> does, but by destroying `x11`. >> >> ### Solution >> >> Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. >> >> The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: >> >> mov v16.s[1], wzr ... > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions Nice analysis, Marc! The fix looks good to me and I don't have a strong opinion about the print format. 
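The property the analysis above relies on can be sketched at the Java level. This is a minimal illustration only, not the `Reduced.java` reproducer from the issue: the class name `BitCountLowWord` and the helper `lowWordBitCount` are hypothetical, and the sketch merely states the expected semantics (only the low 32 bits are counted, and the 64-bit source value is left untouched). Whether the intrinsic is actually exercised depends on C2 compiling the method, which this sketch does not force.

```java
public class BitCountLowWord {
    // Hypothetical helper: narrows the long to its low 32 bits (an L2I
    // conversion) and counts the set bits of that int.
    static int lowWordBitCount(long value) {
        return Integer.bitCount((int) value);
    }

    public static void main(String[] args) {
        // Upper word is all ones, lower word has exactly 8 bits set.
        long original = 0xFFFF_FFFF_0000_00FFL;
        long local = original;

        int count = lowWordBitCount(local);

        // Only the low word may contribute to the count.
        if (count != 8) {
            throw new AssertionError("unexpected bit count: " + count);
        }
        // The source value must still carry its upper 32 bits afterwards.
        if (local != original) {
            throw new AssertionError("source value was modified");
        }
        System.out.println("count=" + count + ", value unchanged");
    }
}
```

A check of this shape only becomes interesting once the method has been JIT-compiled, so a real regression test would typically run it in a loop or under a test framework that forces compilation.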
------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25551#pullrequestreview-2888252754 From hgreule at openjdk.org Mon Jun 2 12:55:50 2025 From: hgreule at openjdk.org (Hannes Greule) Date: Mon, 2 Jun 2025 12:55:50 GMT Subject: RFR: 8356813: Improve Mod(I|L)Node::Value [v4] In-Reply-To: References: <2Jf_gfvRlKcmCFoQHp5T0WW_fU_yK5-0Z3z41f00-YU=.164be9f0-fae1-44bb-84c3-846d8c2c0db2@github.com> <5FnA_gZNzRom3MBShwfbdCffeRGogf1cyKo0nF40c4I=.9db6f973-e6a5-4852-b82e-24ccc198bcb9@github.com> Message-ID: On Mon, 2 Jun 2025 11:44:36 GMT, Emanuel Peter wrote: >> We use `g_uabs()` to get the absolute value, that shouldn't exceed 2^31 for int values (i.e., `g_uabs(min_jint) == 2^31`). So we should get into the right range here again. But I guess I can expand the comment to better explain that part. > > @SirYwell I'm not 100% sure here, so please correct me if I'm wrong. > You are now always passing in a `jlong` value, so you always use `static inline julong g_uabs(jlong n) { return g_uabs((julong)n); }`, even for `T_INT`. Yes that's correct, and it should still work due to how negation works for negative inputs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2121069875 From mhaessig at openjdk.org Mon Jun 2 12:58:53 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Mon, 2 Jun 2025 12:58:53 GMT Subject: RFR: 8354930: IGV: dump C2 graph before and after live range stretching In-Reply-To: References: Message-ID: On Wed, 28 May 2025 11:54:24 GMT, Manuel Hässig wrote: > This PR introduces a new phase `LIVE_RANGE_STRETCHING` that prints after live ranges have been stretched, if that happens at all. The phase `INITIAL_LIVENESS` is moved before live range stretching so we can compare the live ranges before and after stretching in IGV, which is useful for debugging why an oop suddenly belongs to an oop map.
> > ## Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/15299362485) > - [x] tier1 and tier1, plus additional Oracle internal testing for all Oracle supported platforms and OSs > - [x] verified that the new phase prints when it should in IGV and with `-XX:PrintPhaseLevel=4` Thank you for your reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25492#issuecomment-2930572102 From duke at openjdk.org Mon Jun 2 12:58:53 2025 From: duke at openjdk.org (duke) Date: Mon, 2 Jun 2025 12:58:53 GMT Subject: RFR: 8354930: IGV: dump C2 graph before and after live range stretching In-Reply-To: References: Message-ID: On Wed, 28 May 2025 11:54:24 GMT, Manuel Hässig wrote: > This PR introduces a new phase `LIVE_RANGE_STRETCHING` that prints after live ranges have been stretched, if that happens at all. The phase `INITIAL_LIVENESS` is moved before live range stretching so we can compare the live ranges before and after stretching in IGV, which is useful for debugging why an oop suddenly belongs to an oop map. > > ## Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/15299362485) > - [x] tier1 and tier1, plus additional Oracle internal testing for all Oracle supported platforms and OSs > - [x] verified that the new phase prints when it should in IGV and with `-XX:PrintPhaseLevel=4` @mhaessig Your change (at version df3c396f5a26658f6efbaf4f7a153f7214be5e57) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25492#issuecomment-2930573797 From chagedorn at openjdk.org Mon Jun 2 13:58:22 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Jun 2025 13:58:22 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 03:30:24 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g.
for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. 
>> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: > > - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 > - validation tests > - dollar and hashtag parsing validatiaon > - wip refactor parsing dollar and hashtag > - more fixes from Christian > - more improvements > - more suggestions applied > - good practice > - rename template arguments > - more from Christian > - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 I worked my way through the rest of the implementation. Impressive work Emanuel! I left some more mostly minor comments. But otherwise, this looks great! test/hotspot/jtreg/compiler/lib/template_framework/Code.java line 26: > 24: package compiler.lib.template_framework; > 25: > 26: import java.util.ArrayList; Unused: Suggestion: test/hotspot/jtreg/compiler/lib/template_framework/Code.java line 33: > 31: * All the {@link String}s are later collected in a {@link StringBuilder}. If we used a {@link StringBuilder} > 32: * directly to collect the {@link String}s, we could not as easily insert code at an "earlier" position, i.e. > 33: * reaching out to a {@link Hook#set}. Suggestion: * reaching out to a {@link Hook#anchor}. 
test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 37: > 35: * When a {@link Hook} is {@link Hook#set}, this separates the Template into an outer and inner > 36: * {@link CodeFrame}, ensuring that names that are {@link Template#addName}'d inside the inner frame > 37: * are only available inside that frame. Still references old method names. Suggestion: Suggestion: * The {@link CodeFrame} represents a frame (i.e. scope) of code, appending {@link Code} to the {@code 'codeList'} * as {@link Token}s are rendered, and adding names to the {@link NameSet}s with {@link Template#addStructuralName}/ * {@link Template#addDataName}. {@link Hook}s can be added to a frame, which allows code to be inserted at that * location later. When a {@link Hook} is {@link Hook#anchor}ed, it separates the Template into an outer and inner * {@link CodeFrame}, ensuring that names that are added inside the inner frame are only available inside that frame. test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 52: > 50: class CodeFrame { > 51: public final CodeFrame parent; > 52: private final List codeList = new ArrayList(); Suggestion: private final List codeList = new ArrayList<>(); test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 58: > 56: * The {@link NameSet} is used for variable and fields etc. > 57: */ > 58: final NameSet names; I think this can also be made private: Suggestion: private final NameSet names; test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 70: > 68: } else { > 69: // New NameSet, to make sure we have a nested scope for the names. > 70: this.names = new NameSet(parent.names); Indentation is off: Suggestion: this.names = parent.names; } else { // New NameSet, to make sure we have a nested scope for the names. 
this.names = new NameSet(parent.names); test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 92: > 90: /** > 91: * Creates a special frame, which has a {@link #parent} but uses the {@link NameSet} > 92: * from the parent frame, allowing {@link Template#defineName} to persist in the outer `defineName` -> `addName`? test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 96: > 94: * where we would possibly want to make field or variable definitions during the insertion > 95: * that are not just local to the insertion but affect the {@link CodeFrame} that we > 96: * {@link Hook#set} earlier and are now {@link Hook#insert}ing into. Suggestion: * {@link Hook#anchor} earlier and are now {@link Hook#insert}ing into. test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 118: > 116: } > 117: > 118: boolean hasHook(Hook hook) { Can be made private: Suggestion: private boolean hasHook(Hook hook) { test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 33: > 31: * count, list or even sample random {@link DataName}s. Every {@link DataName} has a {@link DataName.Type}, > 32: * so that sampling can be restricted to these types. > 33: * Suggestion: * *

test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 123: > 121: if (mutability == Mutability.IMMUTABLE && dn.mutable()) { return false; } > 122: if (subtype != null && !dn.type().isSubtypeOf(subtype)) { return false; } > 123: if (supertype != null && !supertype.isSubtypeOf(dn.type())) { return false; } I suggest to use the full term: Suggestion: if (!(name instanceof DataName dataName)) { return false; } if (mutability == Mutability.MUTABLE && !dataName.mutable()) { return false; } if (mutability == Mutability.IMMUTABLE && dataName.mutable()) { return false; } if (subtype != null && !dataName.type().isSubtypeOf(subtype)) { return false; } if (supertype != null && !supertype.isSubtypeOf(dataName.type())) { return false; } test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 134: > 132: * @return The filtered {@link View}. > 133: * @throws UnsupportedOperationException If this {@link View} was already filtered with > 134: * {@link subtypeOf} or {@link exactOf}. Also for links at methods below: Suggestion: * {@link #subtypeOf} or {@link #exactOf}. test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 144: > 142: > 143: /** > 144: * Create a filtered {@link View}, where all {@link DataName}s must be subtypes of {@code type}. Suggestion: * Create a filtered {@link View}, where all {@link DataName}s must be supertypes of {@code type}. test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 181: > 179: */ > 180: public DataName sample() { > 181: DataName n = (DataName)Renderer.getCurrent().sampleName(predicate()); Do you really need this cast? Can't you just return a `Name`. From the uses it seems that you only call interface methods from `Name` at the use-sites. test/hotspot/jtreg/compiler/lib/template_framework/Hook.java line 34: > 32: * "back" or to some outer scope, e.g. while generating code for a method, one can reach out > 33: * to the class scope to insert fields. > 34: * Suggestion: * *

test/hotspot/jtreg/compiler/lib/template_framework/Name.java line 35: > 33: * The name of the name, that can be used in code. > 34: * > 35: * @return The {@String} name of the name, that can be used in code. Suggestion: * @return The {@link String} name of the name, that can be used in code. test/hotspot/jtreg/compiler/lib/template_framework/Name.java line 54: > 52: int weight(); > 53: > 54: public interface Type { Implicitly public: Suggestion: interface Type { test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 38: > 36: */ > 37: class NameSet { > 38: static final Random RANDOM = Utils.getRandomInstance(); Suggestion: private static final Random RANDOM = Utils.getRandomInstance(); test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 58: > 56: > 57: private long weight(Predicate predicate) { > 58: long w = names.stream().filter(n -> predicate.check(n)).mapToInt(Name::weight).sum(); Suggestion: long w = names.stream().filter(predicate::check).mapToInt(Name::weight).sum(); test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 64: > 62: > 63: public int count(Predicate predicate) { > 64: int c = (int)names.stream().filter(n -> predicate.check(n)).count(); Suggestion: int c = (int)names.stream().filter(predicate::check).count(); test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 70: > 68: > 69: public boolean hasAny(Predicate predicate) { > 70: return names.stream().anyMatch(n -> predicate.check(n)) || Suggestion: return names.stream().anyMatch(predicate::check) || test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 77: > 75: List list = (parent != null) ? 
parent.toList(predicate) > 76: : new ArrayList<>(); > 77: list.addAll(names.stream().filter(n -> predicate.check(n)).toList()); Suggestion: list.addAll(names.stream().filter(predicate::check).toList()); test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 88: > 86: if (w <= 0) { > 87: return null; > 88: } Shouldn't the weight always be positive? test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 66: > 64: // another non-capturing group. > 65: "(?:\\{" + > 66: // capturing group for "name" inside of "{name}" Suggestion: // capturing group for "name" inside "{name}" test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 199: > 197: /** > 198: * Formats values to {@link String} with the goal of using them in Java code. > 199: * By default we use the overrides of {@link Object#toString}. Suggestion: * By default, we use the overrides of {@link Object#toString}. test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 266: > 264: case StringToken(String s) -> { > 265: renderStringWithDollarAndHashtagReplacements(s); > 266: } Suggestion: case StringToken(String s) -> renderStringWithDollarAndHashtagReplacements(s); test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 321: > 319: callerCodeFrame.addCode(currentCodeFrame.getCode()); > 320: currentCodeFrame = callerCodeFrame; > 321: } For readability: Suggestion: case HookInsertToken(Hook hook, TemplateToken templateToken) -> { // Switch to hook CodeFrame. CodeFrame callerCodeFrame = currentCodeFrame; CodeFrame hookCodeFrame = codeFrameForHook(hook); // Use a transparent nested CodeFrame. We need a CodeFrame so that the code generated // by the TemplateToken can be collected, and hook insertions from it can still // be made to the hookCodeFrame before the code from the TemplateToken is added to // the hookCodeFrame. 
// But the CodeFrame must be transparent, so that its name definitions go out to // the hookCodeFrame, and are not limited to the CodeFrame for the TemplateToken. currentCodeFrame = CodeFrame.makeTransparentForNames(hookCodeFrame); renderTemplateToken(templateToken); hookCodeFrame.addCode(currentCodeFrame.getCode()); // Switch back from hook CodeFrame to caller CodeFrame. currentCodeFrame = callerCodeFrame; } case TemplateToken templateToken -> { // Use a nested CodeFrame. CodeFrame callerCodeFrame = currentCodeFrame; currentCodeFrame = CodeFrame.make(currentCodeFrame); renderTemplateToken(templateToken); callerCodeFrame.addCode(currentCodeFrame.getCode()); currentCodeFrame = callerCodeFrame; } test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 324: > 322: case AddNameToken(Name name) -> { > 323: currentCodeFrame.addName(name); > 324: } Suggestion: case AddNameToken(Name name) -> currentCodeFrame.addName(name); test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 338: > 336: } > 337: > 338: private void renderStringWithDollarAndHashtagReplacements(String s) { Hard to grasp the logic of that method. But I trust you on that :-) I leave it up to you if you want to improve readability to extract some of the logic to separate methods such that this method becomes easier to understand. test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 33: > 31: * count, list or even sample random {@link StructuralName}s. Every {@link StructuralName} has a {@link StructuralName.Type}, > 32: * so that sampling can be restricted to these types. > 33: * Suggestion: * *

test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 47: > 45: */ > 46: public StructuralName { > 47: } Is this required? Is it not automatically added? Same for `DataName`. test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 68: > 66: */ > 67: boolean isSubtypeOf(StructuralName.Type other); > 68: } This is identical to `DataName.Type`. What is the benefit of having separate interfaces `DataName.Type` and `StructuralName.Type`? Couldn't we just move `isSubtypeOf()` directly to the `Name.Type` interface and use that one below and for the fields and expose that one instead to the user? This would mean that you can update all `DataName/StructuralName.Type` to `Name.Type`. I have not checked if this is fully possible but it just occurred to me when reviewing this duplicated interface now. test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 96: > 94: if (!(name instanceof StructuralName dn)) { return false; } > 95: if (subtype != null && !dn.type().isSubtypeOf(subtype)) { return false; } > 96: if (supertype != null && !supertype.isSubtypeOf(dn.type())) { return false; } Suggestion: if (!(name instanceof StructuralName structuralName)) { return false; } if (subtype != null && !structuralName.type().isSubtypeOf(subtype)) { return false; } if (supertype != null && !supertype.isSubtypeOf(structuralName.type())) { return false; } test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 107: > 105: * @return The filtered {@link View}. > 106: * @throws UnsupportedOperationException If this {@link View} was already filtered with > 107: * {@link subtypeOf} or {@link exactOf}. Same here and in methods below: Suggestion: * {@link #subtypeOf} or {@link #exactOf}. test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 117: > 115: > 116: /** > 117: * Create a filtered {@link View}, where all {@link StructuralName}s must be subtypes of {@code type}. 
Suggestion: * Create a filtered {@link View}, where all {@link StructuralName}s must be supertypes of {@code type}. test/hotspot/jtreg/compiler/lib/template_framework/TemplateBinding.java line 43: > 41: * Creates a new {@link TemplateBinding} that has no Template bound to it yet. > 42: */ > 43: public TemplateBinding() {} Can also be removed since it's the default constructor that is automatically added for you. Suggestion: test/hotspot/jtreg/compiler/lib/template_framework/Token.java line 31: > 29: > 30: /** > 31: * The {@link Template#body} and {@link Hook#set} are given a list of tokens, which are either Suggestion: * The {@link Template#body} and {@link Hook#anchor} are given a list of tokens, which are either test/hotspot/jtreg/compiler/lib/template_framework/Token.java line 74: > 72: case Float s -> outputList.add(new StringToken(Renderer.format(s))); > 73: case Boolean s -> outputList.add(new StringToken(Renderer.format(s))); > 74: case List l -> parseList(l, outputList); Not sure if we should use a raw `List` here. Would `List` work as well? Would then need to update `parseList(List inputList ...)` to `List` as well. test/hotspot/jtreg/compiler/lib/template_framework/library/Hooks.java line 32: > 30: */ > 31: public abstract class Hooks { > 32: private Hooks() {} // Avoid instantiation and need for documentation. With `abstract` you cannot call the constructor. But you could make `Hooks` final instead of abstract and keep the private constructor. 
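The last remark, preferring `final` with a private constructor over `abstract` for a pure holder class, can be sketched as follows. This is an illustration under assumptions, not the actual `Hooks` class from the PR: the class name `ExampleHooks` and its two string constants are hypothetical stand-ins. The point of the pattern is that `final` prevents subclassing while the private constructor prevents instantiation, whereas an `abstract` class could still be subclassed.

```java
import java.lang.reflect.Modifier;

// Hypothetical holder class, standing in for a library of shared constants.
public final class ExampleHooks {
    // Private constructor: no instances, and no public default constructor.
    private ExampleHooks() {}

    // Hypothetical constants; the real class holds Hook objects instead.
    public static final String CLASS_HOOK_NAME = "Class";
    public static final String METHOD_HOOK_NAME = "Method";

    public static void main(String[] args) {
        int modifiers = ExampleHooks.class.getModifiers();
        // A final class cannot be subclassed, and it is not abstract.
        System.out.println("final: " + Modifier.isFinal(modifiers));
        System.out.println("abstract: " + Modifier.isAbstract(modifiers));
    }
}
```

Because the constructor is declared explicitly, there is also no implicit default constructor for documentation tooling to flag.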
------------- PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2888138689 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121190567 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121189211 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121045420 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121041625 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121043407 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121047490 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121166922 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121167248 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121170840 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121066303 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121094470 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121065215 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121100321 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121117160 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121184120 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121002671 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121005423 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121050914 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121052214 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121054013 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121054604 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121054802 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121074504 PR Review Comment: 
https://git.openjdk.org/jdk/pull/24217#discussion_r2121194420 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121221834 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121206168 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121217757 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121220490 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121228275 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121119033 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121122026 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121143144 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121124204 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121124996 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121125964 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120995640 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120976577 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120983233 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2120989645 From epeter at openjdk.org Mon Jun 2 14:08:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:08:12 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 12:27:21 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/TemplateBinding.java line 43: > >> 41: * Creates a new {@link TemplateBinding} that has no Template bound to it yet. >> 42: */ >> 43: public TemplateBinding() {} > > Can also be removed since it's the default constructor that is automatically added for you. > Suggestion: If I do that, then `javadoc` complains: test/hotspot/jtreg/compiler/lib/template_framework/TemplateBinding.java:37: warning: use of default constructor, which does not provide a comment public class TemplateBinding { ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121254922 From mli at openjdk.org Mon Jun 2 14:13:51 2025 From: mli at openjdk.org (Hamlin Li) Date: Mon, 2 Jun 2025 14:13:51 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: On Wed, 28 May 2025 09:20:03 GMT, Emanuel Peter wrote: > @Hamlin-Li Thanks for working on this! @eme64 Sorry for the delayed reply, I've been on vacation. Thank you for having a look! > Can you please provide the the JMH benchmark results for your measurements? Sure, I have the data in https://github.com/openjdk/jdk/pull/25341, I can copy the data here. But it won't impact jmh result until https://github.com/openjdk/jdk/pull/25341 is pushed in. I'll add more jmh test and data for integral types. > It would also be good to have some IR tests, that cover the newly vectorized cases. You're right, will add more IR tests. 
> src/hotspot/cpu/riscv/matcher_riscv.hpp line 204: > >> 202: static bool supports_vectorize_cmove_bool_unconditionally() { >> 203: return true; >> 204: } > > Does RISCV support the use of any input vector element type, including 8bit, 16bit, 32bit and 64bit masks, and any elements we would be blending, incl `byte, short, char, int, long, HF, F, D`? > > Because it sounds you are promissing this really "unconditionally". Or what exactly do you mean by "unconditionally"? ( In this pr, it should return false for riscv too and be enabled in the riscv pr. I'll modify it. ) > Does RISCV support the use of any input vector element type, including 8bit, 16bit, 32bit and 64bit masks, and any elements we would be blending, incl byte, short, char, int, long, HF, F, D? Good question! I'll add some additional tests to double check and reflect this. I think the answer should be yes, i.e. on riscv all size of source inputs (comparing operands) and all size of dest outputs (blending result) are supported. But for HF, it's a bit special, the underlying payload is a short, so in theory it should be supported too, but it's not supported in this pr and the related riscv pr (https://github.com/openjdk/jdk/pull/25341). > Because it sounds you are promissing this really "unconditionally". Or what exactly do you mean by "unconditionally"? I mean it's really "unconditionally", but if you feel it's better to add an argument, like `supports_vectorize_cmove_bool_unconditionally(BasicType src, BasicType dst)`, I can do it. And I need to modify the `vectornode.cpp` as below too, I'll check it and modify this pr. ``` case Op_CMoveI: return (is_integral_type(bt) && bt != T_LONG ? Op_VectorBlend : 0); > src/hotspot/share/opto/superword.cpp line 2363: > >> 2361: VectorNode::is_vectorize_cmove_bool_unconditionally_supported()) { >> 2362: return true; >> 2363: } > > Can you please list which additional cases this now allows? 
> I suppose `D/F` comparison for the `Bool`, and then `D/F` inputs for `CMove`, but we can mismatch, e.g. compare `F` but blend `D`, right? Sure, I'll add this list. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25336#issuecomment-2930920880 PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2121272954 PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2121273227 From epeter at openjdk.org Mon Jun 2 14:13:56 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:13:56 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 12:54:35 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 88: > >> 86: if (w <= 0) { >> 87: return null; >> 88: } > > Shouldn't the weight always be positive? Yes. True. I sometimes just also cover negative values to be a bit more robust... but I can also change it if you prefer that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121271395 From epeter at openjdk.org Mon Jun 2 14:13:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:13:52 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v72] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. 
for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. 
> - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Christian Hagedorn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/cb7037e7..d8f66250 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=71 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=70-71 Stats: 19 lines in 4 files changed: 0 ins; 1 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Mon Jun 2 14:21:27 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:21:27 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v73] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). 
> > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. 
> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Christian Hagedorn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/d8f66250..30059e66 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=72 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=71-72 Stats: 23 lines in 5 files changed: 3 ins; 1 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Mon Jun 2 14:21:32 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:21:32 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: <5HBkwD7E-Kr7zr4jRiRrO_uFxE4gWwJYqw1XcKsFCPY=.756231da-cc4b-4db8-85d7-9db17894810e@github.com> On Mon, 2 Jun 2025 13:45:07 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... 
and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 266: > >> 264: case StringToken(String s) -> { >> 265: renderStringWithDollarAndHashtagReplacements(s); >> 266: } > > Suggestion: > > case StringToken(String s) -> renderStringWithDollarAndHashtagReplacements(s); I think I prefer the uniformity of the brackets as I have it. Would that be ok for you too? > test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 324: > >> 322: case AddNameToken(Name name) -> { >> 323: currentCodeFrame.addName(name); >> 324: } > > Suggestion: > > case AddNameToken(Name name) -> currentCodeFrame.addName(name); Like above: I like the uniformity of the brackets here. Is that ok for you to keep as is? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121288248 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121290027 From epeter at openjdk.org Mon Jun 2 14:24:16 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 2 Jun 2025 14:24:16 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 12:21:42 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... 
and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/Token.java line 74: > >> 72: case Float s -> outputList.add(new StringToken(Renderer.format(s))); >> 73: case Boolean s -> outputList.add(new StringToken(Renderer.format(s))); >> 74: case List l -> parseList(l, outputList); > > Not sure if we should use a raw `List` here. Would `List` work as well? Would then need to update `parseList(List inputList ...)` to `List` as well. What exactly do you think is the problem here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121300491 From shade at openjdk.org Mon Jun 2 15:00:05 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 15:00:05 GMT Subject: RFR: 8357473: Compilation spike leaves many CompileTasks in free list [v3] In-Reply-To: References: Message-ID: > See bug for more discussion. > > This PR implements the "all the way" solution by removing the free list completely. It complements https://github.com/openjdk/jdk/pull/25364, and can go either first, or second. We will remerge the other one once either integrates. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `compiler` > - [ ] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' into JDK-8357473-compile-task-free-list - Merge branch 'master' into JDK-8357473-compile-task-free-list - Also free the lock! 
- Comments and indenting - Basic deletion ------------- Changes: https://git.openjdk.org/jdk/pull/25409/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25409&range=02 Stats: 105 lines in 4 files changed: 13 ins; 57 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/25409.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25409/head:pull/25409 PR: https://git.openjdk.org/jdk/pull/25409 From shade at openjdk.org Mon Jun 2 15:00:06 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 2 Jun 2025 15:00:06 GMT Subject: RFR: 8357473: Compilation spike leaves many CompileTasks in free list [v2] In-Reply-To: References: Message-ID: On Fri, 23 May 2025 17:34:36 GMT, Aleksey Shipilev wrote: >> See bug for more discussion. >> >> This PR implements the "all the way" solution by removing the free list completely. It complements https://github.com/openjdk/jdk/pull/25364, and can go either first, or second. We will remerge the other one once either integrates. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `compiler` >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge branch 'master' into JDK-8357473-compile-task-free-list > - Also free the lock! > - Comments and indenting > - Basic deletion Re-merged with current master. Running tests now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25409#issuecomment-2931134142 From dfenacci at openjdk.org Mon Jun 2 15:13:29 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 2 Jun 2025 15:13:29 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 Message-ID: The test `compiler/startup/StartupOutput.java` starts **200 VMs in a loop** , this can lead to resource shortages on some (Windows) machines. 
There is no real need to run those VMs concurrently (their run is short and basically checks that the VM doesn't crash given a limited code cache).

Running them **sequentially** should be OK and should avoid running out of memory.

Testing: Tier1-3+

-------------

Commit messages:
 - JDK-8358129: remove compiler/startup/StartupOutput.java from ProblemList
 - Merge branch 'master' into JDK-8358129
 - JDK-8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406

Changes: https://git.openjdk.org/jdk/pull/25582/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25582&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8358129
  Stats: 11 lines in 2 files changed: 2 ins; 8 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/25582.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25582/head:pull/25582

PR: https://git.openjdk.org/jdk/pull/25582

From shade at openjdk.org Mon Jun 2 15:39:54 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 2 Jun 2025 15:39:54 GMT
Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote:

>> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2.
>>
>> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other's toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well.
>>
>> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \
>>     -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \
>>     -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done
>>
>> # Before
>> intx CICompilerCount = 2
>> intx CICompilerCount = 2
>> intx CICompilerCount = 2
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 4
>>
>> # After
>> intx CICompilerCount = 1
>> intx CICompilerCount = 1
>> intx CICompilerCount = 1
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 3
>> intx CICompilerCount = 4
>>
>>
>> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios.
>>
>> Additional testing:
>> - [x] New regression test passes with the fix, fails without it
>> - [x] GHA
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>
> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count
> - Better test, patch amendments
> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count
> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count
> - Unnecessary arch limitation
> - Simplify test
> - Adjust test bound
> - Fix

Any further comments / testing? Thanks.
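
For readers who want to see how the Before/After numbers in the quoted listing could arise, here is a rough Java sketch. It is not the HotSpot source and not the patch: the logarithmic scaling formula is an assumption about the heuristic, and `CompilerCountSketch`, `log2i` and `compilerCount` are hypothetical names. The only difference modeled between "before" and "after" is the lower bound on the thread count (2 versus 1).

```java
public class CompilerCountSketch {
    // floor(log2(x)) for x >= 1
    static int log2i(int x) {
        return 31 - Integer.numberOfLeadingZeros(x);
    }

    // Assumed heuristic: scale with log(cpus) * log(log(cpus)) * 3/2,
    // but never go below minCount compiler threads.
    static int compilerCount(int cpus, int minCount) {
        int logCpu = log2i(cpus);
        int logLogCpu = log2i(Math.max(logCpu, 1));
        return Math.max(logCpu * logLogCpu * 3 / 2, minCount);
    }

    public static void main(String[] args) {
        for (int cpus = 1; cpus <= 8; cpus++) {
            // minCount = 2 models the old floor, minCount = 1 the floor
            // for C1-only or C2-only modes after the fix.
            System.out.printf("cpus=%d before=%d after=%d%n",
                    cpus, compilerCount(cpus, 2), compilerCount(cpus, 1));
        }
    }
}
```

Under these assumptions only the 1..3 CPU rows change, which is consistent with the description in the quoted text: from 4 CPUs upward the scaled value already exceeds either floor.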
------------- PR Comment: https://git.openjdk.org/jdk/pull/24972#issuecomment-2931292775 From yzheng at openjdk.org Mon Jun 2 15:58:25 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 2 Jun 2025 15:58:25 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq Message-ID: While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 c4 e1 c1 73 f7 34 By setting the rex_w to WIG, the emitted bytes are: c5 c1 73 f7 34 ------------- Commit messages: - Use VEX2 prefix in Assembler::psllq Changes: https://git.openjdk.org/jdk/pull/25593/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25593&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358333 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25593.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25593/head:pull/25593 PR: https://git.openjdk.org/jdk/pull/25593 From yzheng at openjdk.org Mon Jun 2 16:04:57 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 2 Jun 2025 16:04:57 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: References: Message-ID: <6_9L_DiGyVYdZqzwGTLMKyTAUURjRwpwHvQYEgBMZVo=.3e7ac003-9721-43e0-b0cf-4ed89d67d431@github.com> On Mon, 2 Jun 2025 15:53:17 GMT, Yudi Zheng wrote: > While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. 
Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 > https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 > > > c4 e1 c1 73 f7 34 > > > By setting the rex_w to WIG, the emitted bytes are: > > > c5 c1 73 f7 34 @jatin-bhateja could you please review this trivial PR? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25593#issuecomment-2931397163 From galder at openjdk.org Mon Jun 2 16:40:51 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Mon, 2 Jun 2025 16:40:51 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: References: Message-ID: <-9AT4ja1WZHf_xLO6Uzl90zPJKG-KOHTyUG075CTxHE=.be43593a-ff3b-4e33-8a63-f1d02cce8836@github.com> On Mon, 2 Jun 2025 10:57:22 GMT, Damon Fenacci wrote: > The test `compiler/startup/StartupOutput.java` starts **200 VMs in a loop** , this can lead to resource shortages on some (Windows) machines. > > There is no real need to run those VMs concurrently (their run is short and basically check that the VM doesn't crash giving limited code cache). > > Running them **sequentially** should be OK and should avoid running out of memory. > > Testing: Tier1-3+ What impact has this change in the time the test takes to run? If it turns out to be too slow, maybe the processes could be run batches? 
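
To make the batching idea concrete, here is a minimal Java sketch. It is only an illustration, not code from the test: `BatchRunner` and `runInBatches` are hypothetical names, and in the real test each `Callable` would presumably wrap one short VM launch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchRunner {
    // Runs all tasks with at most batchSize of them in flight at once.
    // Each batch must finish completely before the next one starts.
    static <T> List<T> runInBatches(List<Callable<T>> tasks, int batchSize) throws Exception {
        List<T> results = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(batchSize);
        try {
            for (int start = 0; start < tasks.size(); start += batchSize) {
                int end = Math.min(start + batchSize, tasks.size());
                // invokeAll blocks until every task in this batch has completed.
                for (Future<T> f : pool.invokeAll(tasks.subList(start, end))) {
                    results.add(f.get()); // rethrows a task failure as ExecutionException
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

With `batchSize` set to 1 this degenerates to the sequential behavior proposed in the PR, so the batch size would be the only knob trading test run time against peak resource use.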
------------- PR Review: https://git.openjdk.org/jdk/pull/25582#pullrequestreview-2889190186 From dnsimon at openjdk.org Mon Jun 2 16:42:35 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 16:42:35 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v2] In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: > The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. > This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: still rely on get_nmethod_mirror in invalidate_nmethod_mirror ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25488/files - new: https://git.openjdk.org/jdk/pull/25488/files/0dcb1c78..7456988a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25488&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25488&range=00-01 Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25488.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25488/head:pull/25488 PR: https://git.openjdk.org/jdk/pull/25488 From never at openjdk.org Mon Jun 2 16:42:35 2025 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 2 Jun 2025 16:42:35 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v2] In-Reply-To: References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: On Mon, 2 Jun 2025 16:39:42 GMT, Doug Simon wrote: >> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. >> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > still rely on get_nmethod_mirror in invalidate_nmethod_mirror Marked as reviewed by never (Reviewer). 
src/hotspot/share/jvmci/jvmciRuntime.cpp line 807: > 805: oop nmethod_mirror = get_nmethod_mirror(nm, /* phantom_ref */ false); > 806: if (nmethod_mirror == nullptr) { > 807: return; minor typo ------------- PR Review: https://git.openjdk.org/jdk/pull/25488#pullrequestreview-2889188965 PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2121662248 From dnsimon at openjdk.org Mon Jun 2 16:42:36 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 16:42:36 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v2] In-Reply-To: References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: <_hhDZ4c8Kic5tPWMwNHv_G21vpCK2eKjTFJQOqW_nEw=.1c90159f-ee23-441c-b72f-283bd2b35c80@github.com> On Fri, 30 May 2025 16:05:23 GMT, Tom Rodriguez wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> still rely on get_nmethod_mirror in invalidate_nmethod_mirror > > src/hotspot/share/jvmci/jvmciRuntime.cpp line 801: > >> 799: >> 800: void JVMCINMethodData::invalidate_nmethod_mirror(nmethod* nm) { >> 801: if (_nmethod_mirror_index == -1) { > > This part is actually wrong as that's the first part of `get_nmethod_mirror` and we must always check that `get_nmethod_mirror` doesn't return nullptr. I'd assumed that the mirror was always non-null if `_nmethod_mirror_index != -1` but that's not true. The slot is reserved for all non-default nmethods and must stay around so that `translate` can work. 
Fixed: https://github.com/openjdk/jdk/pull/25488/commits/7456988a6fcab00bf13e602553f9e5a295d75b0f ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25488#discussion_r2121658848 From dnsimon at openjdk.org Mon Jun 2 16:48:16 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Jun 2025 16:48:16 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v3] In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: > The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. > This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: fixed typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25488/files - new: https://git.openjdk.org/jdk/pull/25488/files/7456988a..e02be82c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25488&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25488&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25488.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25488/head:pull/25488 PR: https://git.openjdk.org/jdk/pull/25488 From eosterlund at openjdk.org Mon Jun 2 16:59:57 2025 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 2 Jun 2025 16:59:57 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v3] In-Reply-To: References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: On Mon, 2 Jun 2025 16:48:16 GMT, Doug Simon wrote: >> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. >> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fixed typo Marked as reviewed by eosterlund (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25488#pullrequestreview-2889256264 From eastigeevich at openjdk.org Mon Jun 2 17:04:06 2025 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 2 Jun 2025 17:04:06 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v19] In-Reply-To: References: <17al0aeFhm0iZHoHHGiqB03RfPeSrIHIoZuapOHPuy4=.a2ff2d67-392b-40f0-b6d9-6e3a7f396e8a@github.com> Message-ID: On Fri, 30 May 2025 19:25:51 GMT, Tom Rodriguez wrote: > So this copying keeps the same compile_id, which sort of makes sense but it's also potentially confusing. What's the plan for how this interacts with flags like PrintNMethods and JVMTI code installation notification? This is done in nmethod::post_compiled_method which doesn't seem to be used on the new nmethod. If the reclamation of the old nmethod is performed in the normal fashion, we now have 2 nmethods alive with the same compile_id which could be confusing. But allocating a new compile_id breaks the connection to the original compile which seems bad too. As we are not compiling, `compile_id` should stay the same. Yes, we need to add some logging: `log_info(codecache)` and `PrintNMethods`. According to https://docs.oracle.com/en/java/javase/24/docs/specs/jvmti.html#CompiledMethodLoad, compile methods can be moved. We need to generate events if it happens: >If it is moved, the [CompiledMethodUnload](https://docs.oracle.com/en/java/javase/24/docs/specs/jvmti.html#CompiledMethodUnload) event is sent, followed by a new CompiledMethodLoad event. > we now have 2 nmethods alive with the same compile_id which could be confusing. If `compile_id` is interpreted as id of nmethod, it is confusing. Comment to `nmethod::_compile_id`: https://github.com/openjdk/jdk/blob/aea2837143289800cfbb7044de4f105e87e233ff/src/hotspot/share/code/nmethod.hpp#L259 According to it, it is id of a compilation task. In such case there should be no confusion. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23573#issuecomment-2931612988 From duke at openjdk.org Mon Jun 2 17:41:00 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 17:41:00 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v18] In-Reply-To: <62uTtu5i-RDdM1Lnk0i_2JXoNdbJzcn4CBXdCGBU3B0=.48748b12-0871-46b3-9754-b42943fdbad5@github.com> References: <2HApmZeeYmB9G5gttb7G9zKLyTMSQXwrXODoYgvYmQM=.743583e2-7918-4900-9dbd-7223917cf310@github.com> <0tmWzYMOS7jyjgoJL0mBMRywf6mCEBkSTQ7jdRE7Xtg=.5857550c-35e0-4cbb-8bd8-0542ae1b70a5@github.com> <62uTtu5i-RDdM1Lnk0i_2JXoNdbJzcn4CBXdCGBU3B0=.48748b12-0871-46b3-9754-b42943fdbad5@github.com> Message-ID: <5iiKPx9fgyc5pvJIzUggaL1XEPeijmRPqsMIC1MGa48=.9f380d7c-9504-4921-8daf-1543fc5b1cba@github.com> On Thu, 29 May 2025 11:24:46 GMT, Evgeny Astigeevich wrote: >> If we want to guarantee that a trampoline exists if `Assembler::reachable_from_branch_at` fails we would need to update Graal to use the check as well > >> The null trampoline check is needed because on debug builds branches of distance >2M will fall into the if (!Assembler::reachable_from_branch_at(addr(), x)) block but Graal would not have generated a trampoline for that call because it is still <128M. It is still safe to use that distance but it is just different than what HotSpot expects > > This logic looks strange to me. > You are saying that a trampoline is only null in case of Graal but dest is always valid in this case. > This is a bug in Graal: it always uses 128M branch range despite Hotspot can change the range to smaller values in debug builds. > When Graal fixes the bug you will have undefined behaviour in this place. > We must handle the situation where no trampoline is available. > Options: > 1. This is a bug in code generation. If the bug can be easy to reproduce with debug builds, use assert. If no, use guarantee. > 2. This is an expected case. We need to generate a trampoline. 
This can be complicated. > > I think it's a bug situation. Updated the code to have a guarantee check here instead. Filed the following which will fix Graal to use the same range check https://github.com/oracle/graal/issues/11291 https://bugs.openjdk.org/browse/JDK-8358096 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23573#discussion_r2121784842 From duke at openjdk.org Mon Jun 2 17:48:00 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 17:48:00 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v21] In-Reply-To: References: Message-ID: On Sat, 31 May 2025 20:57:40 GMT, Evgeny Astigeevich wrote: >> Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: >> >> Change to ImmutableDataReferences > > test/lib/jdk/test/whitebox/WhiteBox.java line 498: > >> 496: relocateNMethodFromMethod0(method, type); >> 497: } >> 498: public native void relocateNMethodFromAddr0(long address, int type); > > Why does the name have '0' at the end? I noticed that was a common trend with JNI functions. There is no `relocateNMethodFromAddr` so I suppose the '0' isn't necessary in this case ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23573#discussion_r2121796964 From jbhateja at openjdk.org Mon Jun 2 17:58:09 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 17:58:09 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: > Hi All, > > This bugfix patch fixes incorrect value computation for Integer/Long. compress APIs. > > Problems occur with a constant input and variable mask where the input's value is equal to the lower bound of the mask value., In this case, an erroneous value range estimation results in a constant value. 
Existing value routine first attempts to constant fold the compression operation if both input and compression mask are constant values; otherwise, it attempts to constrain the value range of result based on the upper and lower bounds of mask type. > > New IR test covers the issue reported in the bug report along with a case for value range based logic pruning. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Fix aarch64 failure ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23947/files - new: https://git.openjdk.org/jdk/pull/23947/files/4065fb9c..96ecbac1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23947&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23947&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23947.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23947/head:pull/23947 PR: https://git.openjdk.org/jdk/pull/23947 From asmehra at openjdk.org Mon Jun 2 18:37:02 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 2 Jun 2025 18:37:02 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor Message-ID: This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. 
------------- Commit messages: - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor Changes: https://git.openjdk.org/jdk/pull/25598/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25598&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358330 Stats: 17 lines in 3 files changed: 6 ins; 9 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25598.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25598/head:pull/25598 PR: https://git.openjdk.org/jdk/pull/25598 From jbhateja at openjdk.org Mon Jun 2 18:45:31 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 2 Jun 2025 18:45:31 GMT Subject: RFR: 8351645: C2: ExpandBitsNode::Ideal hits assert because of TOP input Message-ID: Bugfix patch adds missing safe type access checks in Expand/Compress Ideal transforms. Test mentioned in the bug report has been included along with the patch. Kindly review. Best Regards, Jatin ------------- Commit messages: - 8351645: C2: ExpandBitsNode::Ideal hits assert because of TOP input Changes: https://git.openjdk.org/jdk/pull/25586/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25586&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351645 Stats: 72 lines in 2 files changed: 70 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25586.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25586/head:pull/25586 PR: https://git.openjdk.org/jdk/pull/25586 From chagedorn at openjdk.org Mon Jun 2 18:54:12 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 2 Jun 2025 18:54:12 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 14:10:29 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 88: >> >>> 86: if (w <= 0) { >>> 87: return null; >>> 88: } >> >> Shouldn't the weight always be positive? > > Yes. True.
I sometimes just also cover negative values to be a bit more robust... but I can also change it if you prefer that. I guess if it's never negative, you can still cover it but maybe throw an exception instead? >> test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 266: >> >>> 264: case StringToken(String s) -> { >>> 265: renderStringWithDollarAndHashtagReplacements(s); >>> 266: } >> >> Suggestion: >> >> case StringToken(String s) -> renderStringWithDollarAndHashtagReplacements(s); > > I think I prefer the uniformity of the brackets as I have it. Would that be ok for you too? I don't have a strong preference, so I'm fine with it. >> test/hotspot/jtreg/compiler/lib/template_framework/TemplateBinding.java line 43: >> >>> 41: * Creates a new {@link TemplateBinding} that has no Template bound to it yet. >>> 42: */ >>> 43: public TemplateBinding() {} >> >> Can also be removed since it's the default constructor that is automatically added for you. >> Suggestion: > > If I do that, then `javadoc` complains: > > test/hotspot/jtreg/compiler/lib/template_framework/TemplateBinding.java:37: warning: use of default constructor, which does not provide a comment > public class TemplateBinding { Ah I see. Then you can leave it in. >> test/hotspot/jtreg/compiler/lib/template_framework/Token.java line 74: >> >>> 72: case Float s -> outputList.add(new StringToken(Renderer.format(s))); >>> 73: case Boolean s -> outputList.add(new StringToken(Renderer.format(s))); >>> 74: case List l -> parseList(l, outputList); >> >> Not sure if we should use a raw `List` here. Would `List` work as well? Would then need to update `parseList(List inputList ...)` to `List` as well. > > What exactly do you think is the problem here? My IDE advises against matching on the raw type `List`. As an alternative you can match on `List`.
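To make the raw-type point concrete, here is a minimal sketch. It assumes the alternative meant above is the unbounded wildcard `List<?>` (the type argument seems to have been lost in this archive), and the class and method names are hypothetical, not taken from the template framework. It needs a JDK with pattern matching for `switch` (JDK 21 or later).

```java
import java.util.List;

// Hypothetical illustration only; not code from the template framework.
final class RawListMatchSketch {
    // Dispatches on the runtime type of o using type patterns.
    static String describe(Object o) {
        return switch (o) {
            case String s -> "string:" + s;
            case Integer i -> "int:" + i;
            // Unbounded wildcard instead of the raw type List:
            // no raw-type warning, elements are still readable as Object.
            case List<?> l -> "list of " + l.size();
            default -> "other";
        };
    }
}
```

A method receiving the matched value would then take a `List<?>` parameter as well, which is the follow-up change to `parseList` mentioned in the quoted comment.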
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121921123 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121924453 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121918498 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2121917521 From duke at openjdk.org Mon Jun 2 19:12:19 2025 From: duke at openjdk.org (Tom Shull) Date: Mon, 2 Jun 2025 19:12:19 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v3] In-Reply-To: References: Message-ID: <23ZmqystynFesmzLG1GXybdGxQSChcwZQt4cM__DbYw=.ae8a7c39-7fc2-4e27-82a3-6f4e879debd9@github.com> > This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. > > In addition, two methods are added to the BootstrapMethodInvocations: > 1. `void resolve()` > 2. `JavaConstant lookup()` > > The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. 
Tom Shull has updated the pull request incrementally with two additional commits since the last revision: - commit to trigger testing - commit to trigger testing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25420/files - new: https://git.openjdk.org/jdk/pull/25420/files/60c39b5e..4d508fc4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=01-02 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25420/head:pull/25420 PR: https://git.openjdk.org/jdk/pull/25420 From asmehra at openjdk.org Mon Jun 2 19:20:51 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Mon, 2 Jun 2025 19:20:51 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 18:32:12 GMT, Ashutosh Mehra wrote: > This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. @vnkozlov can you please review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25598#issuecomment-2932112060 From duke at openjdk.org Mon Jun 2 20:36:37 2025 From: duke at openjdk.org (Tom Shull) Date: Mon, 2 Jun 2025 20:36:37 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v3] In-Reply-To: References: Message-ID: <7qRH8PFpSXJTshHNvxEMqEbc34N5wSnpknQaMUbWrCg=.6de4f71a-8c22-492f-b156-b25a07f3b428@github.com> > Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. 
> > To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. Tom Shull has updated the pull request incrementally with one additional commit since the last revision: return List.of() from getAllMethods ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25498/files - new: https://git.openjdk.org/jdk/pull/25498/files/0de1feae..ae81d46f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=01-02 Stats: 5 lines in 1 file changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25498/head:pull/25498 PR: https://git.openjdk.org/jdk/pull/25498 From duke at openjdk.org Mon Jun 2 21:08:50 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 21:08:50 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v22] In-Reply-To: References: Message-ID: > This PR introduces a new function to replace nmethods, addressing [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694). It enables the creation of new nmethods from existing ones, allowing method relocation in the code heap and supporting [JDK-8328186](https://bugs.openjdk.org/browse/JDK-8328186). > > When an nmethod is replaced, a deep copy is performed. The corresponding Java method is updated to reference the new nmethod, while the old one is marked as unused. The garbage collector handles final cleanup and deallocation. > > This change does not modify existing code paths and therefore does not benefit much from existing tests. New tests were created and confirmed to pass on x64/aarch64 for slowdebug/fastdebug/release. 
Chad Rakoczy has updated the pull request incrementally with six additional commits since the last revision: - Update ImmutableDataReferences - Update nm valid check - Remove 0 from relocateNMethodFromAddr0 - Update DeoptimizeRelocatedNMethod to call relocated function - Fix is_safe - Use ptrdiff_t instead of int ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23573/files - new: https://git.openjdk.org/jdk/pull/23573/files/9f753071..eed3d434 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=20-21 Stats: 19 lines in 9 files changed: 4 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/23573.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23573/head:pull/23573 PR: https://git.openjdk.org/jdk/pull/23573 From duke at openjdk.org Mon Jun 2 21:47:48 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 21:47:48 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v23] In-Reply-To: References: Message-ID: > This PR introduces a new function to replace nmethods, addressing [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694). It enables the creation of new nmethods from existing ones, allowing method relocation in the code heap and supporting [JDK-8328186](https://bugs.openjdk.org/browse/JDK-8328186). > > When an nmethod is replaced, a deep copy is performed. The corresponding Java method is updated to reference the new nmethod, while the old one is marked as unused. The garbage collector handles final cleanup and deallocation. > > This change does not modify existing code paths and therefore does not benefit much from existing tests. New tests were created and confirmed to pass on x64/aarch64 for slowdebug/fastdebug/release. 
Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: Update immutable_data_references naming ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23573/files - new: https://git.openjdk.org/jdk/pull/23573/files/eed3d434..b0dad665 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=21-22 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/23573.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23573/head:pull/23573 PR: https://git.openjdk.org/jdk/pull/23573 From duke at openjdk.org Mon Jun 2 21:52:02 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 21:52:02 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v3] In-Reply-To: References: <8l4e6nqzNukJ6st0fEkLwKqlF35stq_W9ph831eo8w4=.6cbb2172-b35a-4d27-bab7-1d104c9f993b@github.com> <3lvuXGbqkCDeGwkzDQtzhkbZGN1XgcTcuFfL0_TUPvA=.4ba152ea-b3ab-4339-a42c-03d78bfcc829@github.com> Message-ID: On Fri, 30 May 2025 23:21:41 GMT, Vladimir Kozlov wrote: >> I have created a RFE to move the immutable data from the nmethod to a separate class. [JDK-8358213](https://bugs.openjdk.org/browse/JDK-8358213) > > Thanks. Let's keep current changes as it is with small comment: > IMMUTABLE_DATA_REFERENCES is used with `sizeof()` in all places - consider using instead > ``` > #define IMMUTABLE_DATA_REFERENCES_SIZE sizeof(int) > ``` I updated it to hold the size based on @vnkozlov's suggestion ([source](https://github.com/chadrako/jdk/blob/b0dad6659047553ebee1387939e54ea817b31cb1/src/hotspot/share/code/nmethod.hpp#L172)) @dean-long Are you good with this change and moving the immutable data to a separate class in a separate issue ([JDK-8358213](https://bugs.openjdk.org/browse/JDK-8358213))?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23573#discussion_r2122223999 From duke at openjdk.org Mon Jun 2 22:43:50 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 22:43:50 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v24] In-Reply-To: References: Message-ID: > This PR introduces a new function to replace nmethods, addressing [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694). It enables the creation of new nmethods from existing ones, allowing method relocation in the code heap and supporting [JDK-8328186](https://bugs.openjdk.org/browse/JDK-8328186). > > When an nmethod is replaced, a deep copy is performed. The corresponding Java method is updated to reference the new nmethod, while the old one is marked as unused. The garbage collector handles final cleanup and deallocation. > > This change does not modify existing code paths and therefore does not benefit much from existing tests. New tests were created and confirmed to pass on x64/aarch64 for slowdebug/fastdebug/release. 
Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: Fix test copywrite ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23573/files - new: https://git.openjdk.org/jdk/pull/23573/files/b0dad665..4e80e359 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23573&range=22-23 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23573.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23573/head:pull/23573 PR: https://git.openjdk.org/jdk/pull/23573 From duke at openjdk.org Mon Jun 2 22:44:54 2025 From: duke at openjdk.org (duke) Date: Mon, 2 Jun 2025 22:44:54 GMT Subject: RFR: 8357223: AArch64: Optimize interpreter profile updates [v2] In-Reply-To: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> References: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> Message-ID: <0y-3unyZVrC4JTeHDkrRKfQFloudfzacKa99fG689aQ=.ad39a021-2f6f-460b-913a-3868633d35af@github.com> On Thu, 29 May 2025 23:04:25 GMT, Chad Rakoczy wrote: >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >> The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> >> The reasoning for this change is the same as the x86 version's PR: >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >>> >>> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. 
>> >> Additional testing: >> >> - [x] Linux aarch64 fastdebug tier 1/2/3/4 > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Address comments @chadrako Your change (at version 0a33652392d445fa0f10650edc5448168f823272) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25512#issuecomment-2932759213 From cslucas at openjdk.org Mon Jun 2 22:49:36 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 2 Jun 2025 22:49:36 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v3] In-Reply-To: References: Message-ID: > Please review this patch for adding more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. > > Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Address PR feedback. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25402/files - new: https://git.openjdk.org/jdk/pull/25402/files/f6c64755..2aabfa72 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25402&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25402&range=01-02 Stats: 7 lines in 1 file changed: 4 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25402.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25402/head:pull/25402 PR: https://git.openjdk.org/jdk/pull/25402 From duke at openjdk.org Mon Jun 2 22:59:59 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Mon, 2 Jun 2025 22:59:59 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v24] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 22:43:50 GMT, Chad Rakoczy wrote: >> This PR introduces a new function to replace nmethods, addressing [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694). It enables the creation of new nmethods from existing ones, allowing method relocation in the code heap and supporting [JDK-8328186](https://bugs.openjdk.org/browse/JDK-8328186). >> >> When an nmethod is replaced, a deep copy is performed. The corresponding Java method is updated to reference the new nmethod, while the old one is marked as unused. The garbage collector handles final cleanup and deallocation. >> >> This change does not modify existing code paths and therefore does not benefit much from existing tests. New tests were created and confirmed to pass on x64/aarch64 for slowdebug/fastdebug/release. > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Fix test copywrite Thanks for pointing out the missing JVMTI event publication. I'm currently looking into what's required to address that, along with JFR event publication that may also have been missed. I'd appreciate hearing others'
thoughts on how critical this is: should we treat it as a blocker for integration, or would it be acceptable to follow up with a separate issue? We're hoping to get this into JDK 25, as it would simplify both development and backporting of features related to hot code grouping. That said, if the consensus is that JVMTI/JFR support is essential upfront, this can be delayed until JDK 26. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23573#issuecomment-2932801568 From dlong at openjdk.org Mon Jun 2 23:53:59 2025 From: dlong at openjdk.org (Dean Long) Date: Mon, 2 Jun 2025 23:53:59 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v3] In-Reply-To: References: <8l4e6nqzNukJ6st0fEkLwKqlF35stq_W9ph831eo8w4=.6cbb2172-b35a-4d27-bab7-1d104c9f993b@github.com> <3lvuXGbqkCDeGwkzDQtzhkbZGN1XgcTcuFfL0_TUPvA=.4ba152ea-b3ab-4339-a42c-03d78bfcc829@github.com> Message-ID: On Mon, 2 Jun 2025 21:49:36 GMT, Chad Rakoczy wrote: >> Thanks. Let's keep current changes as it is with small comment: >> IMMUTABLE_DATA_REFERENCES is used with `sizeof()` in all places - consider using instead >> ``` >> #define IMMUTABLE_DATA_REFERENCES_SIZE sizeof(int) >> ``` > > I updated it to hold the size based on @vnkozlov suggestion ([source](https://github.com/chadrako/jdk/blob/b0dad6659047553ebee1387939e54ea817b31cb1/src/hotspot/share/code/nmethod.hpp#L172)) > > @dean-long Are you good with this change and moving the immutable data to a separate class in a separate issue ([JDK-8358213](https://bugs.openjdk.org/browse/JDK-8358213))? Sure, fine with me.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23573#discussion_r2122349147 From kvn at openjdk.org Tue Jun 3 00:35:00 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 00:35:00 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v24] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 22:57:19 GMT, Chad Rakoczy wrote: > We're hoping to get this into JDK 25, as it would simplify both development and backporting of features related to hot code grouping. That said, if the consensus is that JVMTI/JFR support is essential upfront, this can be delayed until JDK 26. I don't think this can be put into JDK 25. It is too late and the changes are not simple. And yes, JVMTI/JFR support is essential - you have to support all public functionality of the VM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23573#issuecomment-2932980633 From kvn at openjdk.org Tue Jun 3 01:31:56 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 01:31:56 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 18:32:12 GMT, Ashutosh Mehra wrote: > This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. I am not comfortable that you are changing code not used by AOT. Can you consider populating `CodeBlob::_asm_remarks` and `_dbg_strings` after calling `CodeBlob::create()`? Then you don't need temporary `AsmRemarks` and `DbgStrings`.
------------- PR Review: https://git.openjdk.org/jdk/pull/25598#pullrequestreview-2890312467 From xgong at openjdk.org Tue Jun 3 01:49:07 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 01:49:07 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> >>> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> >>> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> >>> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! >> >> >> >>> > Yes, I also observed such regression. >>> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers. >>> >>> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. Even better would be if we could do the other patch first, so we never even encounter a regression. >> >> Sounds good to me. Thanks! > >> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. 
>> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >> > >> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >> >> Sounds good to me. I will have a deep investigation for it. Thanks! > > Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! > @XiaohongGong I reviewed #25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? That's fine to me. Thanks for your review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2933082670 From xgong at openjdk.org Tue Jun 3 01:49:07 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 01:49:07 GMT Subject: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 01:45:57 GMT, Xiaohong Gong wrote: >>> > @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here. >>> > Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted): >>> > https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661 >>> > >>> > I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that. >>> >>> Sounds good to me. I will have a deep investigation for it. 
Thanks! >> >> Hi @eme64 @jatin-bhateja, I'v created a PR https://github.com/openjdk/jdk/pull/25539 to fix this issue. With this change, the performance regression can be fixed as well. Could you please take a look at that change and help to run the test on different X86 machines? Thanks a lot! > >> @XiaohongGong I reviewed #25539. Since it is a relatively simple patch, I suggest that we integrate that one first, and come back to this here later. Is that ok for you? > > That's fine to me. Thanks for your review! > Hi @XiaohongGong , Looks good to me, thanks again for this re-factor !! > > Best Regards, Jatin Thanks so much for your review @jatin-bhateja ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2933083694 From xgong at openjdk.org Tue Jun 3 01:51:54 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 01:51:54 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Mon, 2 Jun 2025 10:47:05 GMT, Emanuel Peter wrote: >> C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. >> This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer >> to the detailed discussion for a related performance issue from [1]. >> >> The ideal graph of such a loop typically looks like: >> >> >> /-----------| >> | | >> | ConI | >> loop | / / >> | | / / >> \ AddI / >> RangeCheck \ / | >> | \ / | >> IfTrue Phi | >> \ | | >> RangeCheck \ | | >> \ CastII / <- Range check #1 >> | | / >> IfTrue | | >> \ | | >> CastII | <- Range check #2 >> | / >> |-------/ >> >> >> >> For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. 
However, in above case, it is used >> by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. >> >> This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. >> >> Test: >> - Tested tier1, tier2, tier3, and no regressions are found. >> - An additional test case is added to verify the fix. >> >> Performance: >> Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: >> >> >> Benchmark Mode Cnt Unit Before After Gain >> CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 >> CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 >> >> >> We can also observe the similar uplift on a x86_64 machine. >> >> [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 > > test/micro/org/openjdk/bench/vm/compiler/CountedLoopCastIV.java line 54: > >> 52: Random r = new Random(); >> 53: start = r.nextInt(LEN >> 2); >> 54: limit = r.nextInt(LEN >> 1, LEN - 3); > > Does this not mean that we use a different seed every time, and therefore the loop has different lengths, and so the results can be influenced accordingly? Yes, I just want to make sure the loop length is different with each time running, so that it will not be influenced by some profiling related optimizations. 
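For readers following the quoted description, the loop shape it talks about (an induction variable narrowed by more than one range check before it is used) might look roughly like the minimal Java sketch below. The class and method names are hypothetical and this is not the test or benchmark from the PR; whether C2 really emits two consecutive `CastII` nodes for this exact source is an assumption.

```java
import java.util.Objects;

// Hypothetical illustration only; not the PR's test or benchmark code.
final class CountedLoopSketch {
    // Sums a[i] + b[i] over [start, limit). Each checkIndex call is an
    // explicit range check on the same induction variable i, so the
    // compiler may narrow i's type once per check.
    static int sumBoth(int[] a, int[] b, int start, int limit) {
        int sum = 0;
        for (int i = start; i < limit; i++) {
            Objects.checkIndex(i, a.length); // range check #1 on i
            Objects.checkIndex(i, b.length); // range check #2 on i
            sum += a[i] + b[i];
        }
        return sum;
    }
}
```

Calling `sumBoth` with `start` and `limit` drawn from a random source, as the benchmark setup quoted above does, only changes the trip count; the loop body stays the same.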
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2122445568 From xgong at openjdk.org Tue Jun 3 02:11:51 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 02:11:51 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Mon, 2 Jun 2025 10:45:58 GMT, Emanuel Peter wrote: >> C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. >> This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer >> to the detailed discussion for a related performance issue from [1]. >> >> The ideal graph of such a loop typically looks like: >> >> >> /-----------| >> | | >> | ConI | >> loop | / / >> | | / / >> \ AddI / >> RangeCheck \ / | >> | \ / | >> IfTrue Phi | >> \ | | >> RangeCheck \ | | >> \ CastII / <- Range check #1 >> | | / >> IfTrue | | >> \ | | >> CastII | <- Range check #2 >> | / >> |-------/ >> >> >> >> For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used >> by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. >> >> This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. >> >> Test: >> - Tested tier1, tier2, tier3, and no regressions are found. >> - An additional test case is added to verify the fix. 
>> >> Performance: >> Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: >> >> >> Benchmark Mode Cnt Unit Before After Gain >> CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 >> CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 >> >> >> We can also observe the similar uplift on a x86_64 machine. >> >> [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 > > test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 174: > >> 172: >> 173: public static void main(String[] args) { >> 174: TestFramework.runWithFlags("-XX:LoopUnrollLimit=0"); > > What is the reason for the flag here? Do you really need it? Thanks so much for your review! This flag prevents the loop being unrolled and splited (i.e. pre-main-post loop mode) as well. So that the compiler can get just one `CountedLoop` in the case, and we can make sure it is generated by the loop exactly. Checking the count of `CountedLoop` >0 is also fine to me without this flag. I just want to avoid any noise. WDYT? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2122468953 From jbhateja at openjdk.org Tue Jun 3 02:41:28 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 02:41:28 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. 
> > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/b79d8e35..9c249239 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=00-01 Stats: 54 lines in 5 files changed: 54 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From jbhateja at openjdk.org Tue Jun 3 02:47:57 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 02:47:57 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 02:41:28 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. 
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolutions Thanks, encoding logic is concentrated in integral instruction tests and is shared with corresponding long variants, extended APX coverage for BLS/R/MSK. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2933177986 From mhaessig at openjdk.org Tue Jun 3 03:21:59 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Tue, 3 Jun 2025 03:21:59 GMT Subject: Integrated: 8354930: IGV: dump C2 graph before and after live range stretching In-Reply-To: References: Message-ID: On Wed, 28 May 2025 11:54:24 GMT, Manuel H?ssig wrote: > This PR introduces a new phase `LIVE_RANGE_STRETCHING` that prints after live ranges have been stretched, if that happens at all. The phase `INITIAL_LIVENESS` is moved before live range stretching so we can compare the live ranges before and after stretching in IGV, which is useful for debugging why an oop suddenly belongs to an oop map. 
> > ## Testing > > - [x] [Github Actions](https://github.com/mhaessig/jdk/actions/runs/15299362485) > - [x] tier1 and tier1, plus additional Oracle internal testing for all Oracle supported platforms and OSs > - [x] verified that the new phase prints when it should in IGV and with `-XX:PrintPhaseLevel=4` This pull request has now been integrated. Changeset: 24edd3b2 Author: Manuel Hässig Committer: SendaoYan URL: https://git.openjdk.org/jdk/commit/24edd3b2c1324fd58575a6273e5cae17e3d6fbf5 Stats: 7 lines in 3 files changed: 5 ins; 2 del; 0 mod 8354930: IGV: dump C2 graph before and after live range stretching Reviewed-by: rcastanedalo, chagedorn ------------- PR: https://git.openjdk.org/jdk/pull/25492 From galder at openjdk.org Tue Jun 3 04:16:52 2025 From: galder at openjdk.org (Galder Zamarreño) Date: Tue, 3 Jun 2025 04:16:52 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other's toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well.
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Thanks for the fix and expanding the test case. ------------- Marked as reviewed by galder (Author). 
PR Review: https://git.openjdk.org/jdk/pull/24972#pullrequestreview-2890550967 From azeller at openjdk.org Tue Jun 3 04:44:53 2025 From: azeller at openjdk.org (Arno Zeller) Date: Tue, 3 Jun 2025 04:44:53 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: <-9AT4ja1WZHf_xLO6Uzl90zPJKG-KOHTyUG075CTxHE=.be43593a-ff3b-4e33-8a63-f1d02cce8836@github.com> References: <-9AT4ja1WZHf_xLO6Uzl90zPJKG-KOHTyUG075CTxHE=.be43593a-ff3b-4e33-8a63-f1d02cce8836@github.com> Message-ID: On Mon, 2 Jun 2025 16:38:36 GMT, Galder Zamarreño wrote: > What impact has this change in the time the test takes to run? If it turns out to be too slow, maybe the processes could be run in batches? I checked on one of our Windows machines - the test ran in 11 seconds before and took 48 seconds after this change. Looks fine to me. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25582#issuecomment-2933397555 From thartmann at openjdk.org Tue Jun 3 05:38:51 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 05:38:51 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: References: Message-ID: <-WFYyJVFxG0nhBwCJuRcpMZhBUtba6Nf1dHVrhNfFxU=.3b277750-d2ee-495f-8959-4837fcc0354b@github.com> On Mon, 2 Jun 2025 10:57:22 GMT, Damon Fenacci wrote: > The test `compiler/startup/StartupOutput.java` starts **200 VMs in a loop**, this can lead to resource shortages on some (Windows) machines. > > There is no real need to run those VMs concurrently (their run is short and basically checks that the VM doesn't crash given a limited code cache). > > Running them **sequentially** should be OK and should avoid running out of memory. > > Testing: Tier1-3+ Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25582#pullrequestreview-2890727368 From xgong at openjdk.org Tue Jun 3 05:39:53 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 3 Jun 2025 05:39:53 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> On Mon, 2 Jun 2025 10:51:21 GMT, Emanuel Peter wrote: > @XiaohongGong I suggest you change the title from: `8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times` to `8357726: C2 recognize loops with multiple casts in trip counter` or even: `8357726: C2 recognize loops with multiple casts in trip counter: phi -> CastII* -> AddI -> phi` Thanks for your suggestion! Sounds better to me. How about changing the title to `Improve C2 to recognize counted loops with multiple casts in trip counter` ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2933530238 From epeter at openjdk.org Tue Jun 3 05:55:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 05:55:12 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 18:48:56 GMT, Christian Hagedorn wrote: >> Yes. True. I sometimes just also cover negative values to be a bit more robust... but I can also change it if you prefer that. > > I guess if it's never negative, you can still cover it but maybe throw an exception instead? Ok, I'll add a check/exception :) >> What exactly do you think is the problem here? > > My IDE advises against matching on the raw type `List`. As an alternative you can match on `List`. Done, I must have been tired yesterday afternoon ? Thanks! 
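The two review points answered above can be sketched as follows. This is only an illustration, not code from the PR: the class and both helpers (`tokenCount`, `indent`) are hypothetical, and the wildcard form `List<?>` is an assumption about what the reviewer suggested, since the archive appears to have dropped the angle brackets from the quoted comment. The first helper matches on `List<?>` rather than the raw type; the second rejects a negative argument with an exception instead of silently tolerating it.

```java
import java.util.List;

final class ReviewPointsSketch {
    /** Hypothetical helper: a list counts as its size, anything else as one token. */
    static int tokenCount(Object token) {
        // Matching on List<?> avoids the raw-type warning that a plain List pattern triggers.
        if (token instanceof List<?> items) {
            return items.size();
        }
        return 1;
    }

    /** Hypothetical helper: a value that is never expected to be negative is checked explicitly. */
    static String indent(int depth) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be non-negative: " + depth);
        }
        return " ".repeat(depth);
    }

    public static void main(String[] args) {
        System.out.println(tokenCount(List.of("a", "b", "c")));
        System.out.println(tokenCount("single"));
        System.out.println("[" + indent(2) + "]");
    }
}
```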
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122782343 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122776155 From epeter at openjdk.org Tue Jun 3 05:55:16 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 05:55:16 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: <7jtDESfNoZ3zdEsSrVDsbtDk3nF2p96DlgdIBqfVAXI=.75feb7a9-31da-4f78-b420-f3abb9b2356c@github.com> On Mon, 2 Jun 2025 12:24:32 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/library/Hooks.java line 32: > >> 30: */ >> 31: public abstract class Hooks { >> 32: private Hooks() {} // Avoid instantiation and need for documentation. > > With `abstract` you cannot call the constructor. But you could make `Hooks` final instead of abstract and keep the private constructor. Good idea! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122777154 From epeter at openjdk.org Tue Jun 3 06:04:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:04:19 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:13:24 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/DataName.java line 181: > >> 179: */ >> 180: public DataName sample() { >> 181: DataName n = (DataName)Renderer.getCurrent().sampleName(predicate()); > > Do you really need this cast? Can't you just return a `Name`. From the uses it seems that you only call interface methods from `Name` at the use-sites. That would require `Name` to become public. I wanted to avoid that. I want the user to think that `DataName` and `StructuralName` are separate but parallel. But internally, they have a unified implementation with `Name`. An alternative would have been to use `NameSet` generically, once with `DataName` and `StructuralName`. But that would mean we could have a `DataName` with the same `name` as a `StructuralName`, because they would be in separate `NameSet`s. What do you think? > test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 47: > >> 45: */ >> 46: public StructuralName { >> 47: } > > Is this required? Is it not automatically added? Same for `DataName`. If I remove it, then `javadoc` complains that we are now using the default constructor, and that it does not have a comment for it... kinda strange but ok ? 
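For readers unfamiliar with the construct under discussion, here is a small sketch of a record whose compact canonical constructor has an empty body and exists only to carry a doc comment. `ExampleName` and its components are hypothetical and are not the framework's `StructuralName`; whether `javadoc` warns without the explicit constructor depends on the doclint settings in use, as described in the reply above.

```java
/**
 * Hypothetical record, used only to illustrate the pattern.
 *
 * @param name   the name of the entry
 * @param weight a weight used when sampling
 */
record ExampleName(String name, int weight) {
    /**
     * Creates a new {@link ExampleName}.
     * The body is intentionally empty: the compact canonical constructor still
     * assigns both components, and is spelled out only so it can be documented.
     */
    ExampleName {
    }

    public static void main(String[] args) {
        // prints "ExampleName[name=x, weight=1]"
        System.out.println(new ExampleName("x", 1));
    }
}
```

Behaviour is identical with or without the explicit constructor; the only difference is that the doc comment has somewhere to live.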
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122789086 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122796063 From epeter at openjdk.org Tue Jun 3 06:04:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:04:19 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 06:00:38 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/compiler/lib/template_framework/StructuralName.java line 47: >> >>> 45: */ >>> 46: public StructuralName { >>> 47: } >> >> Is this required? Is it not automatically added? Same for `DataName`. > > If I remove it, then `javadoc` complains that we are now using the default constructor, and that it does not have a comment for it... kinda strange but ok. So I'd rather keep it. It's a bit of unnecessary boilerplate, but I like having `javadoc` all happy and quiet. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122798094 From epeter at openjdk.org Tue Jun 3 06:11:10 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:11:10 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:31:57 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ...
and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 92: > >> 90: /** >> 91: * Creates a special frame, which has a {@link #parent} but uses the {@link NameSet} >> 92: * from the parent frame, allowing {@link Template#defineName} to persist in the outer > > `defineName` -> `addName`? Good catch! Left over from a previous refactoring. `javadoc` does not complain about it, because it seems it does not look at anything that is not `public` ? > I have not checked if this is fully possible but it just occurred to me when reviewing this duplicated interface now. It would require `Name` to be public. As I said above, I'd like to avoid that. We could of course detach `Name.Type` and make it its own interface `NameType`, and make that public. (Just calling it `Type` is a bit too generic, and may lead to name collisions later on). Having `Name.Type` private and `DataName.Type` and `StructuralName.Type` public means they are separate, and the user cannot use one for the other. Hence, the user cannot confuse them as easily. 
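The design argued for above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the framework's actual code: the outer class, the two enum types, the method names and the duplicate-name check are all hypothetical, and only the overall shape (a hidden shared `Name` with its own `Type`, plus separate public `DataName.Type` and `StructuralName.Type`) follows the description in this thread. It shows where the cast and the API duplication come from, and how one shared collection can keep a `DataName` and a `StructuralName` from using the same name.

```java
import java.util.ArrayList;
import java.util.List;

final class NameDesignSketch {
    /** Shared internal abstraction; private, so users only ever see the two public records. */
    private interface Name {
        String name();
        Type type();

        interface Type {
            boolean isSubtypeOf(Type other);
        }
    }

    /** Public view number one; its Type cannot be passed where a StructuralName.Type is expected. */
    public record DataName(String name, DataName.Type type) implements Name {
        public interface Type extends Name.Type {}
    }

    /** Public view number two, parallel to DataName but separate in the API. */
    public record StructuralName(String name, StructuralName.Type type) implements Name {
        public interface Type extends Name.Type {}
    }

    /** Hypothetical minimal type implementations, just so the sketch can run. */
    public enum SimpleDataType implements DataName.Type {
        INT;
        @Override public boolean isSubtypeOf(Name.Type other) { return other == this; }
    }

    public enum SimpleStructuralType implements StructuralName.Type {
        CLASS;
        @Override public boolean isSubtypeOf(Name.Type other) { return other == this; }
    }

    // One unified collection for both kinds (assumption: duplicates are rejected here).
    private final List<Name> names = new ArrayList<>();

    private void add(Name n) {
        for (Name existing : names) {
            if (existing.name().equals(n.name())) {
                throw new IllegalArgumentException("duplicate name: " + n.name());
            }
        }
        names.add(n);
    }

    private Name sample(Class<? extends Name> kind) {
        for (Name n : names) {
            if (kind.isInstance(n)) {
                return n;
            }
        }
        return null;
    }

    // Duplicated public entry points; the casts are the price of keeping Name hidden.
    public void addDataName(DataName n) { add(n); }
    public void addStructuralName(StructuralName n) { add(n); }
    public DataName sampleDataName() { return (DataName) sample(DataName.class); }
    public StructuralName sampleStructuralName() { return (StructuralName) sample(StructuralName.class); }

    public static void main(String[] args) {
        NameDesignSketch scope = new NameDesignSketch();
        scope.addDataName(new DataName("x", SimpleDataType.INT));
        System.out.println(scope.sampleDataName());
        try {
            scope.addStructuralName(new StructuralName("x", SimpleStructuralType.CLASS));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With two separate generic sets, one per kind, the second `add` in `main` would succeed and both kinds could share the name `x`, which is the collision the reply wants to rule out.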
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122809426 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122807687 From dnsimon at openjdk.org Tue Jun 3 06:22:57 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 3 Jun 2025 06:22:57 GMT Subject: RFR: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror [v3] In-Reply-To: References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: <4HicVmT-d5SBUtqg8Q2JqQSDhCXmzZdGMcc-CrCu8Bw=.7483a779-a992-4153-9a55-55dd20c7f029@github.com> On Mon, 2 Jun 2025 16:48:16 GMT, Doug Simon wrote: >> The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. >> This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fixed typo Thanks for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25488#issuecomment-2933646688 From dnsimon at openjdk.org Tue Jun 3 06:22:57 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 3 Jun 2025 06:22:57 GMT Subject: Integrated: 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror In-Reply-To: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> References: <37LbN00VRPqAt9LN8jx43xx3QGsF6jnPFS_OQLUa-0U=.687f6afe-d13a-4d03-af0c-ac91a9862b13@github.com> Message-ID: On Wed, 28 May 2025 10:28:38 GMT, Doug Simon wrote: > The point of the `phantom_ref` parameter (introduced by [JDK-8234359](https://bugs.openjdk.org/browse/JDK-8234359)) of `JVMCINMethodData::get_nmethod_mirror` is to avoid the special resurrection semantics of a phantom read when reading the field during GC, which is when `JVMCINMethodData::invalidate_nmethod_mirror` can be called. > This case can be handled directly in `JVMCINMethodData::invalidate_nmethod_mirror` and so the `phantom_ref` parameter can be removed. This pull request has now been integrated. 
Changeset: 6cfd4057 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/6cfd4057dce9262f54e71a3930e16da84aa0d9f1 Stats: 12 lines in 3 files changed: 0 ins; 4 del; 8 mod 8357619: [JVMCI] Revisit phantom_ref parameter in JVMCINMethodData::get_nmethod_mirror Reviewed-by: eosterlund, never ------------- PR: https://git.openjdk.org/jdk/pull/25488 From dnsimon at openjdk.org Tue Jun 3 06:32:54 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 3 Jun 2025 06:32:54 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v3] In-Reply-To: <7qRH8PFpSXJTshHNvxEMqEbc34N5wSnpknQaMUbWrCg=.6de4f71a-8c22-492f-b156-b25a07f3b428@github.com> References: <7qRH8PFpSXJTshHNvxEMqEbc34N5wSnpknQaMUbWrCg=.6de4f71a-8c22-492f-b156-b25a07f3b428@github.com> Message-ID: On Mon, 2 Jun 2025 20:36:37 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request incrementally with one additional commit since the last revision: > > return List.of() from getAllMethods Still looks good to me. ------------- Marked as reviewed by dnsimon (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25498#pullrequestreview-2890872094 From epeter at openjdk.org Tue Jun 3 06:37:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:37:53 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v74] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. 
> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. 
> - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: more for Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/30059e66..310d7d86 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=73 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=72-73 Stats: 93 lines in 5 files changed: 65 ins; 20 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Tue Jun 3 06:37:56 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:37:56 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 13:54:10 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 91 commits: >> >> - Merge branch 'master' into JDK-8344942-TemplateFramework-v3 >> - validation tests >> - dollar and hashtag parsing validatiaon >> - wip refactor parsing dollar and hashtag >> - more fixes from Christian >> - more improvements >> - more suggestions applied >> - good practice >> - rename template arguments >> - more from Christian >> - ... 
and 81 more: https://git.openjdk.org/jdk/compare/90d6ad01...cb7037e7 > > test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 338: > >> 336: } >> 337: >> 338: private void renderStringWithDollarAndHashtagReplacements(String s) { > > Hard to grasp the logic of that method. But I trust you on that :-) I leave it up to you if you want to improve readability to extract some of the logic to separate methods such that this method becomes easier to understand. I split out a part and added some more comments / examples. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2122847550 From epeter at openjdk.org Tue Jun 3 06:41:13 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 06:41:13 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 12:14:48 GMT, Christian Hagedorn wrote: >> Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! >> >> I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > >> @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. >> >> These are issues we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) >> >> This is now ready for another review pass ? > > Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! 
> > I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) @chhagedorn Thank you very much for the thorough review! I addressed all your comments. We might still want to have a conversation about the `Name` and `Name.Type`. I like the way I have it because it separates the `DataName` and `StructuralName` in the API (less user confusion), while having a unified implementation. But it does mean some casting and some API duplication. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2933725922 From thartmann at openjdk.org Tue Jun 3 06:42:52 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 06:42:52 GMT Subject: RFR: 8351635: C2 ROR/ROL: assert failed: Long constant expected In-Reply-To: References: Message-ID: On Wed, 28 May 2025 14:19:21 GMT, Jatin Bhateja wrote: > This bug fix patch relaxes the strict assertion check to allow other pattern matches for degenerated long vector ROL/ROR operations with non-constant scalar shift values. > > Kindly review and share feedback. > > Best Regards, > Jatin Looks good to me. I submitted testing and will report back once it passed. Could you please update the affects version accordingly? I assume this is a regression from [JDK-8271589](https://bugs.openjdk.org/browse/JDK-8271589)? test/hotspot/jtreg/compiler/vectorapi/TestVectorRotateScalarCount.java line 114: > 112: } > 113: > 114: Suggestion: ------------- Marked as reviewed by thartmann (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25493#pullrequestreview-2890886147 PR Review Comment: https://git.openjdk.org/jdk/pull/25493#discussion_r2122854451 From yzheng at openjdk.org Tue Jun 3 06:53:53 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 3 Jun 2025 06:53:53 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v3] In-Reply-To: <7qRH8PFpSXJTshHNvxEMqEbc34N5wSnpknQaMUbWrCg=.6de4f71a-8c22-492f-b156-b25a07f3b428@github.com> References: <7qRH8PFpSXJTshHNvxEMqEbc34N5wSnpknQaMUbWrCg=.6de4f71a-8c22-492f-b156-b25a07f3b428@github.com> Message-ID: On Mon, 2 Jun 2025 20:36:37 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request incrementally with one additional commit since the last revision: > > return List.of() from getAllMethods Marked as reviewed by yzheng (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25498#pullrequestreview-2890926150 From thartmann at openjdk.org Tue Jun 3 07:00:58 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 07:00:58 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 02:41:28 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. 
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolutions src/hotspot/cpu/x86/x86_64.ad line 10620: > 10618: instruct xorI_rReg_imm(rRegI dst, immI src, rFlagsReg cr) > 10619: %{ > 10620: predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); Suggestion: predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2122865710 From epeter at openjdk.org Tue Jun 3 07:12:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 07:12:53 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 14:10:54 GMT, Hamlin Li wrote: > I mean it's really "unconditionally", but if you feel it's better to add an argument, like supports_vectorize_cmove_bool_unconditionally(BasicType src, BasicType dst), I can do it. I think this would be good! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2122925805 From epeter at openjdk.org Tue Jun 3 07:19:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 07:19:54 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> Message-ID: On Tue, 3 Jun 2025 05:36:47 GMT, Xiaohong Gong wrote: > Thanks for your suggestion! Sounds better to me. How about changing the title to Improve C2 to recognize counted loops with multiple casts in trip counter ? @XiaohongGong Sounds good too :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2933851019 From dskantz at openjdk.org Tue Jun 3 07:22:25 2025 From: dskantz at openjdk.org (Daniel Skantz) Date: Tue, 3 Jun 2025 07:22:25 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations Message-ID: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> This PR updates a few tests to reintroduce testing of string concatenation optimizations since a few bugs have recently been identified in this area. Selection criteria: performed a text search on the test suite and identified tests for string concatenations or string optimizations that are not currently compiled with `-XDstringConcat=inline` and are not using StringBuilders explicitly. Testing: T1-4. Extra testing: ran the tests manually with `-XX:+OptimizeStringConcat` and verified that the tests are exercising string optimizations after the fix. 
------------- Commit messages: - fix up tests Changes: https://git.openjdk.org/jdk/pull/25610/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25610&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8357822 Stats: 98 lines in 7 files changed: 92 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25610.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25610/head:pull/25610 PR: https://git.openjdk.org/jdk/pull/25610 From epeter at openjdk.org Tue Jun 3 07:24:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 07:24:52 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: <6JheHO7RO7O4aEUlkwYvMAWra7NFZgdmRz4wBSnzA9c=.f6bfb95d-89a2-437a-9068-ed54099807d2@github.com> On Tue, 3 Jun 2025 01:48:47 GMT, Xiaohong Gong wrote: >> test/micro/org/openjdk/bench/vm/compiler/CountedLoopCastIV.java line 54: >> >>> 52: Random r = new Random(); >>> 53: start = r.nextInt(LEN >> 2); >>> 54: limit = r.nextInt(LEN >> 1, LEN - 3); >> >> Does this not mean that we use a different seed every time, and therefore the loop has different lengths, and so the results can be influenced accordingly? > > Yes, I just want to make sure the loop length is different with each time running, so that it will not be influenced by some profiling related optimizations. I see. Did you have any direct issues with profiling here? I'm worried that the trip count is really very variant here. We could have: start = 0 and limit = LEN - 3 -> count ~ LEN start = LEN/4 and limit = LEN/2 -> count = LEN/4 That is a 4x variance, am I right? Can you run the benchmark a few times, and see what the error term looks like? 
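To put rough numbers on the variance question above, here is a minimal sketch. It assumes a stride-1 loop running from `start` to `limit`, and it uses a made-up `LEN` of 1024, since the benchmark's real value is not quoted in this thread. The class and method names are illustrative only and are not part of the benchmark.

```java
import java.util.Random;

class TripCountSpread {
    // Hypothetical array length; the benchmark's real LEN is not shown in this thread.
    static final int LEN = 1024;

    // Mirrors the two quoted setup lines: start in [0, LEN/4), limit in [LEN/2, LEN-3).
    // Assumes a stride-1 loop from start to limit, so the trip count is limit - start.
    static int tripCount(Random r) {
        int start = r.nextInt(LEN >> 2);
        int limit = r.nextInt(LEN >> 1, LEN - 3);
        return limit - start;
    }

    // Smallest possible count: smallest limit minus largest start.
    static int minTripCount() {
        return (LEN >> 1) - ((LEN >> 2) - 1);
    }

    // Largest possible count: largest limit (the bound is exclusive) minus smallest start.
    static int maxTripCount() {
        return (LEN - 3 - 1) - 0;
    }

    public static void main(String[] args) {
        System.out.println("min trip count: " + minTripCount());
        System.out.println("max trip count: " + maxTripCount());
        // A fixed seed keeps the trip count identical across forks and runs.
        System.out.println("seeded trip count: " + tripCount(new Random(42)));
    }
}
```

With these assumptions the spread is close to the 4x mentioned above. One possible way to keep the inputs random-looking while making runs comparable would be a fixed seed, as in the last line of `main`; whether that interacts with profiling in the way the author wants to avoid is a separate question.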
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2122954605 From thartmann at openjdk.org Tue Jun 3 07:25:56 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 07:25:56 GMT Subject: RFR: 8355574: Fatal error in abort_verify_int_in_range due to Invalid CastII [v4] In-Reply-To: References: Message-ID: On Sun, 18 May 2025 07:06:41 GMT, Quan Anh Mai wrote: >> Hi, >> >> The issue here is that the `CastLLNode` is created before the actual check that ensures the range of the input. This patch fixes it by moving the creation to the correct place, which is under `inline_block`. I also noticed that the code there seems incorrect and confusing. `ArrayCopyNode::get_partial_inline_vector_lane_count` takes the length of the array, not the size in bytes. If you look into the method it will multiply `const_len` with `type2aelementbytes(bt)` to get the size in bytes of the array. In the runtime test, we compare `length << log2(type2bytes(bt))` with `ArrayOperationPartialInlineSize`. This seems confusing. Why don't we just compare `length` with `ArrayOperationPartialInlineSize / type2bytes(bt)`? It also unifies the test with the actual cast. >> >> Please take a look and leave your reviews, thanks a lot. > > Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision: > > - fix comment > - fix comment Just a reminder that since this is a P4, the fix would need to be integrated by RDP 2 on Thursday (June 5) this week (or we need to raise the priority).
------------- PR Comment: https://git.openjdk.org/jdk/pull/25284#issuecomment-2933871198 From epeter at openjdk.org Tue Jun 3 07:29:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 07:29:52 GMT Subject: RFR: 8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: <4JntkYt8QE4lSwWuvEVfqyp_EriMyVV-2YRZKwj6uZk=.e27f92e4-d8f0-49a1-b491-fe19b22141c3@github.com> On Tue, 3 Jun 2025 02:09:10 GMT, Xiaohong Gong wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestCountedLoopCastIV.java line 174: >> >>> 172: >>> 173: public static void main(String[] args) { >>> 174: TestFramework.runWithFlags("-XX:LoopUnrollLimit=0"); >> >> What is the reason for the flag here? Do you really need it? > > Thanks so much for your review! This flag prevents the loop from being unrolled and split (i.e. pre-main-post loop mode), so that the compiler gets just one `CountedLoop` in this case and we can be sure it is generated by exactly this loop. Checking that the count of `CountedLoop` is > 0 is also fine to me without this flag. I just want to avoid any noise. WDYT? I would suggest this: Have one run with `-XX:LoopUnrollLimit=0`, and one run without setting the flag. Write some comments about why you are setting the flag. You could then restrict your IR rule to `LoopUnrollLimit=0`. This is where you can most easily reproduce the multiple CastII problem, and it is the easiest to write an IR rule and explain. You should probably leave a comment as to why you set that flag. But maybe you also succeed in writing an IR rule for `LoopUnrollLimit > 0`, though it could be a little more noisy/complicated. It would just be nice to see that things work fine without having to set special flags.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2122963453 From mchevalier at openjdk.org Tue Jun 3 08:08:57 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Tue, 3 Jun 2025 08:08:57 GMT Subject: RFR: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 10:37:11 GMT, Marc Chevalier wrote: >> ### Problem >> >> On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. For instance, that's what we get with the reproducer `Reduced.java` on the related issue: >> >> ; Load lFld into local x >> ldr x11, [x10, #120] >> ; popCountI >> mov w11, w11 >> mov v16.d[0], x11 >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> mov x13, v16.d[0] >> ; [...] >> ; store local x (which is believed to still contain lFld) into result >> str x11, [x10, #128] >> >> >> The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register reset the 32 higher bits. Short: the input is modified, but the implementation of `popCountI` doesn't declare it: >> >> instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ >> match(Set dst (PopCountI src)); >> effect(TEMP tmp); >> [...] >> %} >> >> >> But then, why resetting the upper word of `x11`? It all starts with vector instructions: >> >> cnt v16.8b, v16.8b >> addv b16, v16.8b >> >> The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). 
There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing >> >> mov v16.s[0], w11 >> >> would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which >> >> mov w11, w11 >> mov v16.d[0], x11 >> >> does, but by destroying `x11`. >> >> ### Solution >> >> Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. >> >> The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: >> >> mov v16.s[1], wzr ... > > Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions Thanks @sendaoYan, @theRealAph and @TobiHartmann for review and nice suggestions! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25551#issuecomment-2934041591 From mchevalier at openjdk.org Tue Jun 3 08:08:59 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Tue, 3 Jun 2025 08:08:59 GMT Subject: Integrated: 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 In-Reply-To: References: Message-ID: On Fri, 30 May 2025 15:33:14 GMT, Marc Chevalier wrote: > ### Problem > > On Aarch64, using `Integer.bitCount` can modify its argument. The problem comes from the implementation of `popCountI` on Aarch64. 
For instance, that's what we get with the reproducer `Reduced.java` on the related issue: > > ; Load lFld into local x > ldr x11, [x10, #120] > ; popCountI > mov w11, w11 > mov v16.d[0], x11 > cnt v16.8b, v16.8b > addv b16, v16.8b > mov x13, v16.d[0] > ; [...] > ; store local x (which is believed to still contain lFld) into result > str x11, [x10, #128] > > > The instruction `mov w11, w11` is used to cut the 32 higher bits of `x11` since we use `popCountI` (from `Integer.bitCount`): on aarch64 (like other architectures), assigning the 32 lower bits of a register reset the 32 higher bits. Short: the input is modified, but the implementation of `popCountI` doesn't declare it: > > instruct popCountI(iRegINoSp dst, iRegIorL2I src, vRegF tmp) %{ > match(Set dst (PopCountI src)); > effect(TEMP tmp); > [...] > %} > > > But then, why resetting the upper word of `x11`? It all starts with vector instructions: > > cnt v16.8b, v16.8b > addv b16, v16.8b > > The `8b` specifies that it operates on the 8 lower bytes of `v16`, it would be nice to simply use `4b`, but that doesn't exist: vector instructions can only work on either the whole 128-bit register, or the 64 lower bits (by blocks of 1, 2, 4, 8 or 16 bytes). There is no suffix (and encoding) for a vector instruction to work only on the 32 lower bits, so not to pollute the bit count, we need to reset the 32 higher bits of `v16.d[0]` (aka `d16`), that is `v16.s[1]`, that is `v16[32:63]` in a more bit-explicit notation. Moreover, unlike with general purpose register doing > > mov v16.s[0], w11 > > would set `v16[0:31]` to `w11`, but not reset `v16[32:63]`. Which makes sense! Otherwise, using vector registers would be impractical if writing any piece would reset the rest... So we indeed need to set all of `v16[0:63]`, which > > mov w11, w11 > mov v16.d[0], x11 > > does, but by destroying `x11`. 
> > ### Solution > > Simply adding `USE_KILL src` in the effects would be nice, but unfortunately not possible: `iRegIorL2I` is an operand class (either a 32-bit register or a L2I of a 64-bit register) and those cannot be used in effect lists. > > The way I went for is rather not to modify the source, but rather do write the two lower words of `v16` we are interested in separately: > > mov v16.s[1], wzr ; Reset the 1-indexed word of v16, that is v16[32:63] <- 0 > mov v16.s[0], w11 ; Set the 0-ind... This pull request has now been integrated. Changeset: be923a8b Author: Marc Chevalier URL: https://git.openjdk.org/jdk/commit/be923a8b7229cb7a705e72ebbb3046e9f2085048 Stats: 84 lines in 2 files changed: 80 ins; 2 del; 2 mod 8353266: C2: Wrong execution with Integer.bitCount(int) intrinsic on AArch64 Reviewed-by: aph, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/25551 From jbhateja at openjdk.org Tue Jun 3 08:19:06 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 08:19:06 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v3] In-Reply-To: References: Message-ID: <1jna58ZtxrGgcqNt9FQf5Tl-rIo6YwTFYzavusVZGyA=.87513e10-77ba-436e-9d9e-b82f5041d368@github.com> > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. 
> > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/cpu/x86/x86_64.ad Thanks :-) Co-authored-by: Tobias Hartmann ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/9c249239..b5f69c8d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From jbhateja at openjdk.org Tue Jun 3 08:35:20 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 08:35:20 GMT Subject: RFR: 8351635: C2 ROR/ROL: assert failed: Long constant expected [v2] In-Reply-To: References: Message-ID: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> > This bug fix patch relaxes the strict assertion check to allow other pattern matches for degenerated long vector ROL/ROR operations with non-constant scalar shift values. > > Kindly review and share feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Update test/hotspot/jtreg/compiler/vectorapi/TestVectorRotateScalarCount.java Co-authored-by: Tobias Hartmann ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25493/files - new: https://git.openjdk.org/jdk/pull/25493/files/c68b7468..9ec47164 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25493&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25493&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25493.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25493/head:pull/25493 PR: https://git.openjdk.org/jdk/pull/25493 From shade at openjdk.org Tue Jun 3 08:36:56 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 08:36:56 GMT Subject: RFR: 8357223: AArch64: Optimize interpreter profile updates [v2] In-Reply-To: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> References: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> Message-ID: On Thu, 29 May 2025 23:04:25 GMT, Chad Rakoczy wrote: >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >> The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> >> The reasoning for this change is the same as the x86 version's PR: >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >>> >>> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. 
>> >> Additional testing: >> >> - [x] Linux aarch64 fastdebug tier 1/2/3/4 > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Address comments I would like @theRealAph to ack this before I sponsor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25512#issuecomment-2934132868 From wenanjian at openjdk.org Tue Jun 3 08:46:02 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Tue, 3 Jun 2025 08:46:02 GMT Subject: RFR: 8358105: RISC-V: Optimize interpreter profile updates Message-ID: The reason of this patch is same as the x86 and aarch64 but for riscv [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. 
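The overflow argument quoted above can be checked with back-of-the-envelope arithmetic. The sketch below is only an illustration: the increment rate of one billion updates per second is an assumption chosen for the example, not a measured interpreter figure, and the class and method names are made up.

```java
class CounterOverflowEstimate {
    static final double SECONDS_PER_YEAR = 365.25 * 24 * 3600;

    // Seconds until an unsigned counter of the given bit width wraps around,
    // assuming a steady rate of increments per second.
    static double secondsToOverflow(int bits, double incrementsPerSecond) {
        double capacity = Math.pow(2.0, bits);
        return capacity / incrementsPerSecond;
    }

    static double yearsToOverflow(int bits, double incrementsPerSecond) {
        return secondsToOverflow(bits, incrementsPerSecond) / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        double rate = 1e9; // assumed: one billion counter updates per second
        System.out.printf("32-bit counter wraps after %.1f seconds%n", secondsToOverflow(32, rate));
        System.out.printf("64-bit counter wraps after %.0f years%n", yearsToOverflow(64, rate));
    }
}
```

At that assumed rate a 32-bit counter wraps within seconds, while a 64-bit counter needs on the order of centuries, which is consistent with dropping the 64-bit overflow handling.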
------------- Commit messages: - delete useless func declare and add assert back - RISCV: Optimize interpreter profile updates Changes: https://git.openjdk.org/jdk/pull/25520/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25520&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358105 Stats: 33 lines in 2 files changed: 0 ins; 21 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/25520.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25520/head:pull/25520 PR: https://git.openjdk.org/jdk/pull/25520 From aph at openjdk.org Tue Jun 3 08:48:52 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 3 Jun 2025 08:48:52 GMT Subject: RFR: 8357223: AArch64: Optimize interpreter profile updates [v2] In-Reply-To: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> References: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> Message-ID: <_uKcfKe3J417r7ute1faRzhFqXC5xYVDvRscI8z4e5k=.d82eec06-2dc8-4961-9c40-a5f70bb3be11@github.com> On Thu, 29 May 2025 23:04:25 GMT, Chad Rakoczy wrote: >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >> The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> >> The reasoning for this change is the same as the x86 version's PR: >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >>> >>> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. 
>> >> Additional testing: >> >> - [x] Linux aarch64 fastdebug tier 1/2/3/4 > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Address comments Yes, that's a nice improvement. Most satisfying. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25512#pullrequestreview-2891342554 From shade at openjdk.org Tue Jun 3 08:58:05 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 08:58:05 GMT Subject: RFR: 8357223: AArch64: Optimize interpreter profile updates [v2] In-Reply-To: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> References: <7wo-_Wt-EiVGKgxMxU_MnTA8o1QQxH_LDtNzDShlOIY=.9c8093b7-ed4b-487d-afbe-5227362f1ade@github.com> Message-ID: On Thu, 29 May 2025 23:04:25 GMT, Chad Rakoczy wrote: >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >> The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> >> The reasoning for this change is the same as the x86 version's PR: >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >>> >>> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. >> >> Additional testing: >> >> - [x] Linux aarch64 fastdebug tier 1/2/3/4 > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Address comments Excellent, thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25512#issuecomment-2934211424 From duke at openjdk.org Tue Jun 3 08:58:06 2025 From: duke at openjdk.org (Chad Rakoczy) Date: Tue, 3 Jun 2025 08:58:06 GMT Subject: Integrated: 8357223: AArch64: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Wed, 28 May 2025 20:21:20 GMT, Chad Rakoczy wrote: > [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > > The aarch64 version of [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) > > The reasoning for this change is the same as the x86 version's PR: > >> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. >> >> Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. > > Additional testing: > > - [x] Linux aarch64 fastdebug tier 1/2/3/4 This pull request has now been integrated. 
Changeset: 44025276 Author: Chad Rakoczy Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/4402527683ed08eebf4953a9d83f72f64a5ff4fa Stats: 47 lines in 2 files changed: 0 ins; 37 del; 10 mod 8357223: AArch64: Optimize interpreter profile updates Reviewed-by: shade, aph ------------- PR: https://git.openjdk.org/jdk/pull/25512 From epeter at openjdk.org Tue Jun 3 09:28:03 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 09:28:03 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:20:57 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix aarch64 failure > > src/hotspot/share/opto/intrinsicnode.cpp line 288: > >> 286: // For constant mask strictly less than zero, maximum result value will be >> 287: // same as mask value with its sign bit flipped, assuming all but last read >> 288: // source bits are set to 1. > > Suggestion: > > // For constant mask strictly less than zero, the maximum result value will be > // the same as the mask value with its sign bit flipped, assuming all source bits but the last > // are set to 1. Honestly, I don't understand the sign flip... hmm > src/hotspot/share/opto/intrinsicnode.cpp line 298: > >> 296: // result.hi = 0xEFFFFFFF ^ 0x80000000 = 0x6FFFFFFF >> 297: // result.lo = 0x80000000 >> 298: // > > Same here: why not do a proper `if-else`, and add the comments to each scope directly? `Result.Hi` -> `result.hi` etc for consistency. 
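On the sign-flip question above, the following sketch may help. It uses the public `Integer.expand` and `Integer.compress` methods (available since JDK 19) on the `0xEFFFFFFF` mask from the quoted comment. It is not code from the patch: the class and helper names are hypothetical, and it only illustrates why the signed maximum for a negative mask is the mask with its sign bit cleared.

```java
class BitsRangeDemo {
    // Upper bound discussed in the review for Integer.expand with a negative
    // constant mask: the mask with its sign bit cleared.
    static int expandHiForNegativeMask(int mask) {
        return mask ^ Integer.MIN_VALUE;
    }

    // Source value whose deposited bits cover every set mask bit except the highest one.
    static int srcReachingHi(int mask) {
        int n = Integer.bitCount(mask);
        return (1 << (n - 1)) - 1;
    }

    // Source value whose only deposited bit lands on the highest set mask bit.
    static int srcReachingLo(int mask) {
        int n = Integer.bitCount(mask);
        return 1 << (n - 1);
    }

    public static void main(String[] args) {
        int mask = 0xEFFFFFFF; // negative constant mask from the quoted comment
        int hi = Integer.expand(srcReachingHi(mask), mask);
        int lo = Integer.expand(srcReachingLo(mask), mask);
        System.out.println("expand hi: 0x" + Integer.toHexString(hi));
        System.out.println("expand lo: 0x" + Integer.toHexString(lo));
        // Compress with a mask other than -1 can never set the sign bit.
        System.out.println("compress hi: 0x" + Integer.toHexString(Integer.compress(-1, mask)));
    }
}
```

For this mask the expand results are `0x6FFFFFFF` and `0x80000000`, matching the values in the quoted comment: any result with the sign bit set is negative, so the largest signed result is the one where every mask bit except the sign bit receives a 1.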
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123251808 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123233772 From epeter at openjdk.org Tue Jun 3 09:28:03 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 09:28:03 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 17:58:09 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This bugfix patch fixes incorrect value computation for Integer/Long.compress APIs. >> >> Problems occur with a constant input and variable mask where the input's value is equal to the lower bound of the mask value. In this case, an erroneous value range estimation results in a constant value. Existing value routine first attempts to constant fold the compression operation if both input and compression mask are constant values; otherwise, it attempts to constrain the value range of result based on the upper and lower bounds of mask type. >> >> New IR test covers the issue reported in the bug report along with a case for value range based logic pruning. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Fix aarch64 failure Thanks for all the comment updates! I had a few minutes to look into it, and will add more later! src/hotspot/share/opto/intrinsicnode.cpp line 267: > 265: // mask = 0xEFFFFFFF (constant mask) > 266: // result.hi = 0x7FFFFFFF > 267: // result.lo = 0 Should this not go inside the `CompressBits` scope? `Hi` -> `lo` `Result.Hi = popcount(1 << mask_bits - 1)` Does not look right. Is this not the wrong way around? Just repeating code here also does not make sense. Either give a reason in English, or just drop the duplication if it is indeed trivial.
I would also do the case distinction a bit clearer: If mask == -1 -> all ones -> just returns src: result.lo = type_min (happens if src = type_min) Question: does that not mean we could just return the input type of `src`? If mask != -1 -> at least one zero in mask -> result cannot be negative: result.lo = 0 But if we are doing this with the comments, then why not just create an `if-else` block, and add the comments inside each block? src/hotspot/share/opto/intrinsicnode.cpp line 272: > 270: int bitcount = population_count(static_cast(bt == T_INT ? maskcon & 0xFFFFFFFFL : maskcon)); > 271: hi = maskcon == -1L ? hi : (1UL << bitcount) - 1; > 272: lo = maskcon == -1L ? lo : 0L; It could be nice to have a proper `if-else` here, and add the comments to each scope, rather than above. That would allow you to avoid duplicating the code above in the comments. src/hotspot/share/opto/intrinsicnode.cpp line 274: > 272: lo = maskcon == -1L ? lo : 0L; > 273: } else { > 274: // Case A.2 bit expansion:- I would put the assert for `Op_ExpandBits` above, so that it is immediately clear that this matches. src/hotspot/share/opto/intrinsicnode.cpp line 278: > 276: // Result.Hi = mask, optimistically assuming all source bits > 277: // read starting from least significant bit positions are 1. > 278: // Result.Lo = 0 Suggestion: // Result.Lo = 0, because at least one bit in mask is zero. src/hotspot/share/opto/intrinsicnode.cpp line 288: > 286: // For constant mask strictly less than zero, maximum result value will be > 287: // same as mask value with its sign bit flipped, assuming all but last read > 288: // source bits are set to 1. Suggestion: // For constant mask strictly less than zero, the maximum result value will be // the same as the mask value with its sign bit flipped, assuming all source bits but the last // are set to 1. 
src/hotspot/share/opto/intrinsicnode.cpp line 298: > 296: // result.hi = 0xEFFFFFFF ^ 0x80000000 = 0x6FFFFFFF > 297: // result.lo = 0x80000000 > 298: // Same here: why not do a proper `if-else`, and add the comments to each scope directly? src/hotspot/share/opto/intrinsicnode.cpp line 300: > 298: // > 299: assert(opc == Op_ExpandBits, ""); > 300: hi = maskcon >= 0L ? maskcon : maskcon ^ lo; If you are already touching this line: `maskcon ^ lo` is really a bit hairy. It should really be `maskcon ^ type_min(bt)`, and then you add a comment right there that it is a sign flip. ------------- PR Review: https://git.openjdk.org/jdk/pull/23947#pullrequestreview-2891441258 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123225738 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123228042 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123229716 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123235763 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123240806 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123231362 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123248474 From dfenacci at openjdk.org Tue Jun 3 09:29:53 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Tue, 3 Jun 2025 09:29:53 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: References: <-9AT4ja1WZHf_xLO6Uzl90zPJKG-KOHTyUG075CTxHE=.be43593a-ff3b-4e33-8a63-f1d02cce8836@github.com> Message-ID: On Tue, 3 Jun 2025 04:42:23 GMT, Arno Zeller wrote: > > What impact has this change in the time the test takes to run? If it turns out to be too slow, maybe the processes could be run batches? > > I checked on one of our windows Machines - the test did run in 11 seconds before and took 48 seconds after this change. 
I quickly checked on our machines (few different architectures) and it went from a range between 2 and 10 seconds to a range between 6 and 23 seconds. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25582#issuecomment-2934342130 From fjiang at openjdk.org Tue Jun 3 09:29:59 2025 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 3 Jun 2025 09:29:59 GMT Subject: RFR: 8358105: RISC-V: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Thu, 29 May 2025 11:05:04 GMT, Anjian Wen wrote: > The reason of this patch is same as the x86 and aarch64 but for riscv > [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) > [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > >> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. Looks good! ------------- Marked as reviewed by fjiang (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25520#pullrequestreview-2891497710 From epeter at openjdk.org Tue Jun 3 09:31:55 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 09:31:55 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: <6tkCVQHc4bQQsHHt-VfuZi00vTUuyWwYT3gZGyFAAMA=.3cc07024-2d7f-4eaa-ac62-811532e50a75@github.com> On Tue, 3 Jun 2025 09:25:01 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/intrinsicnode.cpp line 288: >> >>> 286: // For constant mask strictly less than zero, maximum result value will be >>> 287: // same as mask value with its sign bit flipped, assuming all but last read >>> 288: // source bits are set to 1. >> >> Suggestion: >> >> // For constant mask strictly less than zero, the maximum result value will be >> // the same as the mask value with its sign bit flipped, assuming all source bits but the last >> // are set to 1. > > Honestly, I don't understand the sign flip... hmm Ah, you are just masking off the sign bit... right. Makes sense. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123265342 From shade at openjdk.org Tue Jun 3 09:38:09 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 09:38:09 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: > Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. > > Also tidied up some comments around it. > > Additional testing; > - [x] Linux x86_64 server fastdebug, `tier1 tier2` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into JDK-8357434-x86-profile-taken - Stale comment - Merge branch 'master' into JDK-8357434-x86-profile-taken - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25343/files - new: https://git.openjdk.org/jdk/pull/25343/files/816b7af7..f17feaa6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25343&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25343&range=01-02 Stats: 49352 lines in 792 files changed: 25581 ins; 15003 del; 8768 mod Patch: https://git.openjdk.org/jdk/pull/25343.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25343/head:pull/25343 PR: https://git.openjdk.org/jdk/pull/25343 From syan at openjdk.org Tue Jun 3 10:02:02 2025 From: syan at openjdk.org (SendaoYan) Date: Tue, 3 Jun 2025 10:02:02 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 10:57:22 GMT, Damon Fenacci wrote: > The test `compiler/startup/StartupOutput.java` starts **200 VMs in a loop** , this can lead to resource shortages on some (Windows) machines. > > There is no real need to run those VMs concurrently (their run is short and basically check that the VM doesn't crash giving limited code cache). > > Running them **sequentially** should be OK and should avoid running out of memory. 
> > Testing: Tier1-3+ test/hotspot/jtreg/compiler/startup/StartupOutput.java line 67: > 65: int reservedCodeCacheSizeInKb = initialCodeCacheSizeInKb + rand.nextInt(200); > 66: pb = ProcessTools.createLimitedTestJavaProcessBuilder("-XX:InitialCodeCacheSize=" + initialCodeCacheSizeInKb + "K", "-XX:ReservedCodeCacheSize=" + reservedCodeCacheSizeInKb + "k", "-version"); > 67: out = new OutputAnalyzer(pb.start()); Since we now start the VMs serially rather than concurrently, do we still need to start them 200 times? Maybe 20 times or 5 times is enough, or even just 1 time? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25582#discussion_r2123333185 From shade at openjdk.org Tue Jun 3 10:38:54 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 10:38:54 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v3] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 22:49:36 GMT, Cesar Soares Lucas wrote: >> Please review this patch for adding more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. >> >> Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Address PR feedback. Looks OK to me. This is diagnostic logging, so we do not have to be extra crisp about it. Let any other compiler folks review as well. ------------- Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25402#pullrequestreview-2891737056 From kwei at openjdk.org Tue Jun 3 10:50:39 2025 From: kwei at openjdk.org (Kuai Wei) Date: Tue, 3 Jun 2025 10:50:39 GMT Subject: RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v17] In-Reply-To: References: Message-ID: > In this patch, I extend the merge stores optimization to merge adjacent loads. Tier1 tests pass on my machine. > > The benchmark result of MergeLoadBench.java > AMD EPYC 9T24 96-Core Processor: >
> |name | -MergeLoads | +MergeLoads |delta|
> |---|---|---|---|
> |MergeLoadBench.getCharB |4352.150 |4407.435 | 55.29 |
> |MergeLoadBench.getCharBU |4075.320 |4084.663 | 9.34 |
> |MergeLoadBench.getCharBV |3221.302 |3221.528 | 0.23 |
> |MergeLoadBench.getCharC |2235.433 |2238.796 | 3.36 |
> |MergeLoadBench.getCharL |4363.244 |4372.281 | 9.04 |
> |MergeLoadBench.getCharLU |4072.550 |4075.744 | 3.19 |
> |MergeLoadBench.getCharLV |2227.825 |2231.612 | 3.79 |
> |MergeLoadBench.getIntB |11199.935 |6869.030 | -4330.90 |
> |MergeLoadBench.getIntBU |6853.862 |2763.923 | -4089.94 |
> |MergeLoadBench.getIntBV |306.953 |309.911 | 2.96 |
> |MergeLoadBench.getIntL |10426.843 |6523.716 | -3903.13 |
> |MergeLoadBench.getIntLU |6740.847 |2602.701 | -4138.15 |
> |MergeLoadBench.getIntLV |2233.151 |2231.745 | -1.41 |
> |MergeLoadBench.getIntRB |11335.756 |8980.619 | -2355.14 |
> |MergeLoadBench.getIntRBU |7439.873 |3190.208 | -4249.66 |
> |MergeLoadBench.getIntRL |16323.040 |7786.842 | -8536.20 |
> |MergeLoadBench.getIntRLU |7457.745 |3364.140 | -4093.61 |
> |MergeLoadBench.getIntRU |2512.621 |2511.668 | -0.95 |
> |MergeLoadBench.getIntU |2501.064 |2500.629 | -0.43 |
> |MergeLoadBench.getLongB |21175.442 |21103.660 | -71.78 |
> |MergeLoadBench.getLongBU |14042.046 |2512.784 | -11529.26 |
> |MergeLoadBench.getLongBV |606.448 |606.171 | -0.28 |
> |MergeLoadBench.getLongL |23142.178 |23217.785 | 75.61 |
> |MergeLoadBench.getLongLU |14112.972 |2237.659 | -11875.31 |
> |MergeLoadBench.getLongLV |2230.416 |2231.224 | 0.81 |
> |MergeLoadBench.getLongRB |21152.558 |21140.583 | -11.98 |
> |MergeLoadBench.getLongRBU |14031.178 |2520.317 | -11510.86 |
> |MergeLoadBench.getLongRL |23248.506 |23136.410 | -112.10 |
> |MergeLoadBench.getLongRLU |14125.032 |2240.481 | -11884.55 |
> |MergeLoadBench.getLongRU |3071.881 |3066.606 | -5.27 |
> |Merg...
Kuai Wei has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Merge remote-tracking branch 'origin/master' into dev/merge_loads - Move _merge_memops_checks into OrI/OrL - Fix test error after merging - Merge remote-tracking branch 'origin/master' into dev/merge_loads - Fix for comments - Fix build error on mac and windows - Add check flag for combine operator - Make MergeLoadInfoList an in-place growable array - Fix for comments - Merge remote-tracking branch 'origin/master' into dev/merge_loads - ... and 14 more: https://git.openjdk.org/jdk/compare/8674f491...bdaae3ee ------------- Changes: https://git.openjdk.org/jdk/pull/24023/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24023&range=16 Stats: 2727 lines in 17 files changed: 2677 ins; 0 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/24023.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24023/head:pull/24023 PR: https://git.openjdk.org/jdk/pull/24023 From jbhateja at openjdk.org Tue Jun 3 10:52:50 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 10:52:50 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:53:17 GMT, Yudi Zheng wrote: > While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis.
Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 > https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 > > > c4 e1 c1 73 f7 34 > > > By setting the rex_w to WIG, the emitted bytes are: > > > c5 c1 73 f7 34 LGTM. ------------- Marked as reviewed by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25593#pullrequestreview-2891782918 From mli at openjdk.org Tue Jun 3 11:26:52 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 3 Jun 2025 11:26:52 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 07:10:19 GMT, Emanuel Peter wrote: >> ( In this pr, it should return false for riscv too and be enabled in the riscv pr. I'll modify it. ) >> >>> Does RISCV support the use of any input vector element type, including 8bit, 16bit, 32bit and 64bit masks, and any elements we would be blending, incl byte, short, char, int, long, HF, F, D? >> >> Good question! I'll add some additional tests to double check and reflect this. >> >> I think the answer should be yes, i.e. on riscv all size of source inputs (comparing operands) and all size of dest outputs (blending result) are supported. >> But for HF, it's a bit special, the underlying payload is a short, so in theory it should be supported too, but it's not supported in this pr and the related riscv pr (https://github.com/openjdk/jdk/pull/25341). >> >>> Because it sounds you are promissing this really "unconditionally". Or what exactly do you mean by "unconditionally"? >> >> I mean it's really "unconditionally", but if you feel it's better to add an argument, like `supports_vectorize_cmove_bool_unconditionally(BasicType src, BasicType dst)`, I can do it. >> And I need to modify the `vectornode.cpp` as below too, I'll check it and modify this pr. 
>> ``` case Op_CMoveI: >> return (is_integral_type(bt) && bt != T_LONG ? Op_VectorBlend : 0); > >> I mean it's really "unconditionally", but if you feel it's better to add an argument, like supports_vectorize_cmove_bool_unconditionally(BasicType src, BasicType dst), I can do it. > > I think this would be good! There is an issue when the comparison is an unsigned one, e.g. `c[i] = Long.compareUnsigned(a[i], b[i]) > 0 ? 1.0 : 2.0;`, or `c[i] = (a[i] > b[i]) ? 1.0 : 2.0;` when a[]/b[] are char[]. Seems currently the unsigned comparison is not supported for superword vectorization? The unsigned information is lost, i.e. all the comparisons are just signed ones. I checked the generated code, and it seems that when VectorMaskCmp is matched, `BoolTest::unsigned_compare & cond` is always 0 in the passed-in `cond` parameter. (The Vector API supports unsigned ones, as it passes in `cond` with the `BoolTest::unsigned_compare` mask explicitly when the operator is one of UGE/UGT/ULE/ULT.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2123518238 From epeter at openjdk.org Tue Jun 3 11:31:25 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 11:31:25 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v75] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC).
> > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. 
> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: remove unnecessary Type.name() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/310d7d86..455cd434 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=74 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=73-74 Stats: 8 lines in 1 file changed: 0 ins; 7 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Tue Jun 3 11:43:48 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 11:43:48 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: <0ML8Y2hLfrZZ_NiRcltx_D4MoFqjwCJguJAxCIvJpHU=.1adc4a92-4f24-4037-8d2f-062b590d23b1@github.com> On Mon, 2 Jun 2025 12:14:48 GMT, Christian Hagedorn wrote: >> Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! >> >> I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > >> @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. 
>> >> These are issues we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) >> >> This is now ready for another review pass ? > > Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! > > I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) @chhagedorn Thanks a lot for taking the time for the offline meeting! I now updated the two little things we talked about. Looking forward to another round of review :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2934840238 From epeter at openjdk.org Tue Jun 3 11:43:48 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 11:43:48 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v76] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. 
This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: rename View -> FilteredSet ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/455cd434..fa3d086a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=75 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=74-75 Stats: 51 lines in 3 files changed: 5 ins; 0 del; 46 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From thartmann at openjdk.org Tue Jun 3 11:44:58 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 11:44:58 GMT Subject: RFR: 8351635: C2 ROR/ROL: assert failed: Long constant expected [v2] In-Reply-To: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> References: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> Message-ID: On Tue, 3 Jun 2025 08:35:20 GMT, Jatin Bhateja wrote: >> This bug fix patch relaxes the strict assertion check to allow other pattern matches for degenerated long vector ROL/ROR operations with non-constant scalar shift values. >> >> Kindly review and share feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update test/hotspot/jtreg/compiler/vectorapi/TestVectorRotateScalarCount.java > > Co-authored-by: Tobias Hartmann Looks good to me. All tests passed. ------------- Marked as reviewed by thartmann (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25493#pullrequestreview-2891957732 From roland at openjdk.org Tue Jun 3 11:46:55 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 11:46:55 GMT Subject: RFR: 8354383: C2: enable sinking of Type nodes out of loop [v3] In-Reply-To: References: Message-ID: On Tue, 27 May 2025 15:24:38 GMT, Christian Hagedorn wrote: > I'm not sure either, we would not to further investigate if we can find cases that benefit from it. Should we file an RFE either way? I filed: https://bugs.openjdk.org/browse/JDK-8358501 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25396#issuecomment-2934848953 From roland at openjdk.org Tue Jun 3 12:03:59 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 12:03:59 GMT Subject: RFR: 8347555: [REDO] C2: implement optimization for series of Add of unique value [v7] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:50:38 GMT, Emanuel Peter wrote: > What I would like to see for **testing**: add some more patterns with IR rules. More that now optimize, and also a few that do not optimize, just so we have a bit of a sense what we are still missing. > > @rwestrel Filed this issue. I wonder: what do you think we should do here? How general should the optimization/canonicalization be? Having a clearer view what optimizes and what doesn't as you suggest (and filing a bug to keep track of what's missing) sounds useful. Beyond that, I don't see why we wouldn't get what we have so far integrated. It can be improved or reworked down the road but it feels useful to me as it is. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23506#issuecomment-2934913822 From aph at openjdk.org Tue Jun 3 12:09:04 2025 From: aph at openjdk.org (Andrew Haley) Date: Tue, 3 Jun 2025 12:09:04 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v24] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 22:43:50 GMT, Chad Rakoczy wrote: >> This PR introduces a new function to replace nmethods, addressing [JDK-8316694](https://bugs.openjdk.org/browse/JDK-8316694). It enables the creation of new nmethods from existing ones, allowing method relocation in the code heap and supporting [JDK-8328186](https://bugs.openjdk.org/browse/JDK-8328186). >> >> When an nmethod is replaced, a deep copy is performed. The corresponding Java method is updated to reference the new nmethod, while the old one is marked as unused. The garbage collector handles final cleanup and deallocation. >> >> This change does not modify existing code paths and therefore does not benefit much from existing tests. New tests were created and confirmed to pass on x64/aarch64 for slowdebug/fastdebug/release. > > Chad Rakoczy has updated the pull request incrementally with one additional commit since the last revision: > > Fix test copywrite src/hotspot/cpu/aarch64/relocInfo_aarch64.cpp line 89: > 87: x = trampoline; > 88: } > 89: call->set_destination(x); I think I see what you're doing here, but it doesn't look right. At the very least it's a trap for maintainers, who don't expect the destination address to be discarded if the call doesn't reach. When the call doesn't reach, I believe you're fixing up an internal call to point to its target in the new copy of the code. But this isn't right when calls are PC relative, is it? In that case it makes more sense to leave the call instruction alone rather than rewrite it. 
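To make the PC-relative point concrete, here is a small arithmetic sketch in Java. It is not HotSpot code: the addresses, the `target` helper, and the `reaches` check are made up for illustration, and the only outside fact it relies on is that an AArch64 `bl` encodes a signed displacement of roughly +/-128 MB relative to the call site.

```java
public class PcRelativeCallSketch {
    // Approximate reach of an AArch64 BL: signed 26-bit word offset, i.e. +/-128 MB.
    static final long BL_REACH = 128L * 1024 * 1024;

    // A PC-relative call encodes only the displacement; the target is derived from it.
    static long target(long callSite, long displacement) {
        return callSite + displacement;
    }

    // True if dest is within direct-call range of callSite.
    static boolean reaches(long callSite, long dest) {
        long d = dest - callSite;
        return d >= -BL_REACH && d < BL_REACH;
    }

    public static void main(String[] args) {
        long oldBase = 0x1000_0000L;  // hypothetical start of the original nmethod
        long newBase = 0x5000_0000L;  // hypothetical start of the relocated copy
        long callOffset = 0x40;       // call instruction, relative to the blob start
        long internalOffset = 0x200;  // internal target, relative to the blob start

        long disp = internalOffset - callOffset; // what the instruction encodes

        // Copying the blob verbatim keeps disp, so the call lands in the new copy.
        long movedTarget = target(newBase + callOffset, disp);
        System.out.println(movedTarget == newBase + internalOffset); // true

        // An external destination stays put while the call site moves.
        long external = 0x1800_0000L; // hypothetical runtime stub
        System.out.println(reaches(oldBase + callOffset, external)); // true
        System.out.println(reaches(newBase + callOffset, external)); // false
    }
}
```

Under these assumptions, an internal PC-relative call needs no rewriting after a verbatim copy, while a call to an external destination may fall out of range and need a trampoline.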
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23573#discussion_r2123618495 From roland at openjdk.org Tue Jun 3 12:10:54 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 12:10:54 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) In-Reply-To: References: Message-ID: On Tue, 27 May 2025 08:11:27 GMT, Galder Zamarreño wrote: >> The test case has an out of loop `Store` with an `AddP` address >> expression that has other uses and is in the loop body. Schematically, >> only showing the address subgraph and the bases for the `AddP`s: >> >> >> Store#195 -> AddP#133 -> AddP#134 -> CastPP#110 >> -> CastPP#110 >> >> >> Both `AddP`s have the same base, a `CastPP` that's also in the loop >> body. >> >> That loop is a counted loop and only has 3 iterations so is fully >> unrolled. First, one iteration is peeled: >> >> >> /-> CastPP#110 >> Store#195 -> Phi#360 -> AddP#133 -> AddP#134 -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> The `AddP`s and `CastPP` are cloned (because in the loop body). As >> part of peeling, `PhaseIdealLoop::peeled_dom_test_elim()` is >> called. It finds the test that guards `CastPP#283` in the peeled >> iteration dominates and replaces the test that guards `CastPP#110` >> (the test in the peeled iteration is the clone of the test in the >> loop). That causes `CastPP#110`'s control to be updated to that of the >> test in the peeled iteration and to be yanked from the loop. So now >> `CastPP#283` and `CastPP#110` have the same inputs. >> >> Next unrolling happens: >> >> >> /-> CastPP#110 >> /-> AddP#400 -> AddP#401 -> CastPP#110 >> Store#195 -> Phi#360 -> Phi#477 -> AddP#133 -> AddP#134 -> CastPP#110 >> \ -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> `AddP`s are cloned once more but not the `CastPP`s because they are >> both in the peeled iteration now. A new `Phi` is added. >> >> Next igvn runs.
It's going to push the `AddP`s through the `Phi`s. >> >> Through `Phi#477`: >> >> >> >> /-> CastPP#110 >> Store#195 -> Phi#360 -> AddP#510 -> Phi#509 -> AddP#401 -> CastPP#110 >> \ -> AddP#134 -> CastPP#110 >> -> AddP#277 -> AddP#278 -> CastPP#283 >> -> CastPP#283 >> >> >> >> Through `Phi#360`: >> >> >> /-> AddP#134 -> CastPP#110 >> /-> Phi#509 -> AddP#401 -> CastPP#110 >> Store#195 -> AddP#516 -> Phi#515 -> AddP#278 -> CastPP#283 >> -> Phi#514 -> CastPP#283 >> ... > > test/hotspot/jtreg/compiler/c2/TestMismatchedAddPAfterMaxUnroll.java line 74: > >> 72: } >> 73: if (flag) { >> 74: if (flag2) { > > It looks a bit odd to have this if statement and the flag one. One would expect these to be dead code eliminated? Or does the DCE only happen after the problematic C2 crash? Might be useful to have some comment explaining the rationale for these if statements. What happens is that if a branch is never taken at runtime (as is the case here for `if (flag2) {`), that's captured by profile data. If C2 sees a never-taken branch, it doesn't parse it. So it is compiled to: `if (flag2) { uncommon_trap(unstable_if); }` As a result, C2 never sees that the branch is empty and that the while statement is useless. It's a missed optimization opportunity, but I find it useful for test cases and use that pattern. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25386#discussion_r2123621503 From dfenacci at openjdk.org Tue Jun 3 12:49:54 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Tue, 3 Jun 2025 12:49:54 GMT Subject: RFR: 8358129: compiler/startup/StartupOutput.java runs into out of memory on Windows after JDK-8347406 In-Reply-To: References: Message-ID: <1LnBHbsT5dzl9cE4ooMiSSRV0vGu4Y0qZiZ2rkKjdTY=.c9759e84-cf54-44eb-9922-0b9691822010@github.com> On Tue, 3 Jun 2025 09:57:37 GMT, SendaoYan wrote: >> The test `compiler/startup/StartupOutput.java` starts **200 VMs in a loop** , this can lead to resource shortages on some (Windows) machines.
>> >> There is no real need to run those VMs concurrently (their run is short and basically check that the VM doesn't crash giving limited code cache). >> >> Running them **sequentially** should be OK and should avoid running out of memory. >> >> Testing: Tier1-3+ > > test/hotspot/jtreg/compiler/startup/StartupOutput.java line 67: > >> 65: int reservedCodeCacheSizeInKb = initialCodeCacheSizeInKb + rand.nextInt(200); >> 66: pb = ProcessTools.createLimitedTestJavaProcessBuilder("-XX:InitialCodeCacheSize=" + initialCodeCacheSizeInKb + "K", "-XX:ReservedCodeCacheSize=" + reservedCodeCacheSizeInKb + "k", "-version"); >> 67: out = new OutputAnalyzer(pb.start()); > > SInce we start VMs from concurrently to serially, do we need start VMs 200 times anymore, maybe 20 times or 5 times is enough, or even just 1 time is enough? The goal is actually to test the VM startup with different (randomised) code cache sizes. So, I'm not sure that there is another way to do that other than starting a new VM every time. We could definitely try with a lower number but, as the whole test takes just a few seconds to run, I don't think it would make a big difference. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25582#discussion_r2123709317 From thartmann at openjdk.org Tue Jun 3 13:03:54 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 3 Jun 2025 13:03:54 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: References: Message-ID: <6GfLx7sBoN-0ffjlsnL0dZ-q-GKAje-1HNO-FQ__b4o=.7ace0d2f-85c0-42dc-b4a4-3f6749287da2@github.com> On Mon, 2 Jun 2025 15:53:17 GMT, Yudi Zheng wrote: > While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. 
Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 > https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 > > > c4 e1 c1 73 f7 34 > > > By setting the rex_w to WIG, the emitted bytes are: > > > c5 c1 73 f7 34 Looks good to me too (assuming you ran this through Oracle testing). ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25593#pullrequestreview-2892254436 From epeter at openjdk.org Tue Jun 3 13:05:51 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 13:05:51 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 11:23:47 GMT, Hamlin Li wrote: > Seems currently the unsigned comparison is not supported for superword vectorization? I think that currently only `float` and `double` for CMove is really implemented. Integer types are still to be added, see [JDK-8308841](https://bugs.openjdk.org/browse/JDK-8308841) C2 SuperWord: implement vectorization of integer CMove I hope we get to it soon, and then we can generally extend the combinations too. Like comparing `int`, but blending between `double`. Maybe it would be better if for now you focus just on the `D/F` cases that are already supported on x86 and aarch64?
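As a small illustration of why losing the unsigned flag matters for the comparisons raised in this thread, here is a minimal sketch. It is not taken from the PR; the class and method names are made up for the example. It feeds the same `long` bits into a signed and an unsigned comparison that each select a `double`, which is the scalar shape of the `Cmp -> Bool -> CMoveD` pattern being discussed.

```java
public class UnsignedCompareDemo {
    // Signed comparison selecting a double (hypothetical helper for illustration).
    static double selectSigned(long a, long b) {
        return (a > b) ? 1.0 : 2.0;
    }

    // Unsigned comparison of the same bit patterns.
    static double selectUnsigned(long a, long b) {
        return (Long.compareUnsigned(a, b) > 0) ? 1.0 : 2.0;
    }

    public static void main(String[] args) {
        long a = -1L; // all bits set: negative when signed, maximal when unsigned
        long b = 1L;
        System.out.println(selectSigned(a, b));   // signed: -1 > 1 is false
        System.out.println(selectUnsigned(a, b)); // unsigned: 0xFFFF...F > 1 is true
    }
}
```

If a vectorized version silently treated the unsigned comparison as signed, inputs like these would pick the wrong lane value.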
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2123751783 From epeter at openjdk.org Tue Jun 3 13:07:53 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 13:07:53 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: <6_9L_DiGyVYdZqzwGTLMKyTAUURjRwpwHvQYEgBMZVo=.3e7ac003-9721-43e0-b0cf-4ed89d67d431@github.com> References: <6_9L_DiGyVYdZqzwGTLMKyTAUURjRwpwHvQYEgBMZVo=.3e7ac003-9721-43e0-b0cf-4ed89d67d431@github.com> Message-ID: <-d9PNtM_vl1zXmkLMmxLjkp0MexZqaNjlXNmmyG7uTA=.0e5e57a4-101b-4628-86f5-332beb095e88@github.com> On Mon, 2 Jun 2025 16:02:30 GMT, Yudi Zheng wrote: >> While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 >> https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 >> >> >> c4 e1 c1 73 f7 34 >> >> >> By setting the rex_w to WIG, the emitted bytes are: >> >> >> c5 c1 73 f7 34 > > @jatin-bhateja could you please review this trivial PR? Thanks! @mur47x111 Testing was only run for tier1-3. It would be better if there was additional stress testing as well. 
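For readers who want to see how the two byte sequences quoted above relate, the following sketch decodes the prefix fields of both encodings. It assumes the VEX field layout from the Intel SDM (3-byte form `C4 [~R ~X ~B mmmmm] [W ~vvvv L pp]`, 2-byte form `C5 [~R ~vvvv L pp]`); the class and helper names are hypothetical and this is not HotSpot code.

```java
public class VexPrefixDemo {
    // Decodes a 2-byte (C5) or 3-byte (C4) VEX prefix into its logical fields.
    // R, X, B and vvvv are stored inverted in the encoding and are un-inverted here.
    static String decode(int... bytes) {
        int r, x, b, map, w, vvvv, l, pp;
        if (bytes[0] == 0xC5) {
            int p = bytes[1];
            r = (~p >> 7) & 1;
            x = 0;   // implied by the 2-byte form
            b = 0;   // implied by the 2-byte form
            map = 1; // implied 0F opcode map
            w = 0;   // the 2-byte form has no W bit
            vvvv = (~p >> 3) & 0xF;
            l = (p >> 2) & 1;
            pp = p & 3;
        } else if (bytes[0] == 0xC4) {
            int p1 = bytes[1];
            int p2 = bytes[2];
            r = (~p1 >> 7) & 1;
            x = (~p1 >> 6) & 1;
            b = (~p1 >> 5) & 1;
            map = p1 & 0x1F;
            w = (p2 >> 7) & 1;
            vvvv = (~p2 >> 3) & 0xF;
            l = (p2 >> 2) & 1;
            pp = p2 & 3;
        } else {
            throw new IllegalArgumentException("not a VEX prefix");
        }
        return String.format("R=%d X=%d B=%d map=%d W=%d vvvv=%d L=%d pp=%d",
                             r, x, b, map, w, vvvv, l, pp);
    }

    public static void main(String[] args) {
        System.out.println(decode(0xC4, 0xE1, 0xC1)); // prefix of c4 e1 c1 73 f7 34
        System.out.println(decode(0xC5, 0xC1));       // prefix of c5 c1 73 f7 34
    }
}
```

Under these assumptions the two prefixes differ only in the W bit (1 in the 3-byte form, absent in the 2-byte form), which is consistent with the description that marking the instruction as WIG lets the assembler pick the shorter encoding.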
------------- PR Comment: https://git.openjdk.org/jdk/pull/25593#issuecomment-2935143384 From chagedorn at openjdk.org Tue Jun 3 13:19:52 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 3 Jun 2025 13:19:52 GMT Subject: RFR: 8351635: C2 ROR/ROL: assert failed: Long constant expected [v2] In-Reply-To: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> References: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> Message-ID: On Tue, 3 Jun 2025 08:35:20 GMT, Jatin Bhateja wrote: >> This bug fix patch relaxes the strict assertion check to allow other pattern matches for degenerated long vector ROL/ROR operations with non-constant scalar shift values. >> >> Kindly review and share feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update test/hotspot/jtreg/compiler/vectorapi/TestVectorRotateScalarCount.java > > Co-authored-by: Tobias Hartmann Looks good to me, too. ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25493#pullrequestreview-2892322888 From epeter at openjdk.org Tue Jun 3 13:35:08 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 13:35:08 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: <2YFuLETRIRASPPjocbdhIGklH-45xnIVuY6cYrAdIzU=.84c661ff-faf8-49e8-9c05-056bb9a0fcab@github.com> On Mon, 2 Jun 2025 17:58:09 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This bugfix patch fixes incorrect value computation for Integer/Long. compress APIs. >> >> Problems occur with a constant input and variable mask where the input's value is equal to the lower bound of the mask value., In this case, an erroneous value range estimation results in a constant value. 
Existing value routine first attempts to constant fold the compression operation if both input and compression mask are constant values; otherwise, it attempts to constrain the value range of result based on the upper and lower bounds of mask type. >> >> New IR test covers the issue reported in the bug report along with a case for value range based logic pruning. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Fix aarch64 failure A few more comments about the first part. Will now dig into the case where `!mask_type->is_con()` next... src/hotspot/share/opto/intrinsicnode.cpp line 241: > 239: jlong lo = bt == T_INT ? min_jint : min_jlong; > 240: > 241: if(mask_type->is_con() && mask_type->get_con_as_long(bt) != -1L) { Now you removed the condition `mask_type->get_con_as_long(bt) != -1L`. Do you know why it was there in the first place? It seems to me that if `mask_type->get_con_as_long(bt) == -1L`, then we can just return the type of `src`, right? src/hotspot/share/opto/intrinsicnode.cpp line 292: > 290: // To compute minimum result value we assume all but last read source bit as zero, > 291: // this is because sign bit of result will always be set to 1 while other bit > 292: // corresponding to set mask bit should be zero. I don't understand, are you talking about `lo` if `mask < 0`? Don't we just keep `lo = type_min`, which is always ok? 
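The observation about a constant mask of `-1` can be checked at the Java level. The sketch below is only an illustration of the library semantics (it needs JDK 19 or later for `Integer.compress` and `Long.compress`) and is not the C2 code under review; the class and method names are made up.

```java
public class CompressAllOnesMask {
    // With every mask bit set, compress keeps all source bits in place,
    // so the result equals the source.
    static boolean identityHolds(int src) {
        return Integer.compress(src, -1) == src;
    }

    static boolean identityHolds(long src) {
        return Long.compress(src, -1L) == src;
    }

    public static void main(String[] args) {
        int[] samples = { 0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE };
        for (int s : samples) {
            System.out.println(s + " -> " + identityHolds(s));
        }
    }
}
```

Since the operation is the identity for this mask, returning the type of `src` in that case would be at least as precise as any range derived from the mask.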
------------- PR Review: https://git.openjdk.org/jdk/pull/23947#pullrequestreview-2892367256 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123819264 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123813921 From epeter at openjdk.org Tue Jun 3 13:35:10 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 13:35:10 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: <2YFuLETRIRASPPjocbdhIGklH-45xnIVuY6cYrAdIzU=.84c661ff-faf8-49e8-9c05-056bb9a0fcab@github.com> References: <2YFuLETRIRASPPjocbdhIGklH-45xnIVuY6cYrAdIzU=.84c661ff-faf8-49e8-9c05-056bb9a0fcab@github.com> Message-ID: <-qsTG7NyclV8PbQ1CsbHobu0bCwIK-6JvsMhmzmpVtg=.51d21506-9da9-4bf2-93d8-6907a6b54c5b@github.com> On Tue, 3 Jun 2025 13:28:36 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix aarch64 failure > > src/hotspot/share/opto/intrinsicnode.cpp line 241: > >> 239: jlong lo = bt == T_INT ? min_jint : min_jlong; >> 240: >> 241: if(mask_type->is_con() && mask_type->get_con_as_long(bt) != -1L) { > > Now you removed the condition `mask_type->get_con_as_long(bt) != -1L`. Do you know why it was there in the first place? > > It seems to me that if `mask_type->get_con_as_long(bt) == -1L`, then we can just return the type of `src`, right? This is a bug-fix for `CompressBitsNode::Value`, but this change also has an effect on `ExpandBitsNode::Value`, and that makes me a little nervous. For example: do we have enough test coverage for `expand`? It seems we did not have enough tests for `compress`, so probably also not for `expand`... 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123824471 From mli at openjdk.org Tue Jun 3 13:48:57 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 3 Jun 2025 13:48:57 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 13:03:39 GMT, Emanuel Peter wrote: >> There is some issue when the comparison is an unsigned one, e.g. `c[i] = Long.compareUnsigned(a[i], b[i]) > 0 ? 1.0 : 2.0;`, or `c[i] = (a[i] > b[i]) ? 1.0 : 2.0;` when a[]/b[] are char[]. >> >> Seems currently the unsigned comparison is not supported for superword vectorization? The unsigned information is lost, i.e. all the comparisons are just signed ones. >> I checked the generated code, and it seems when VectorMaskCmp is matched, `BoolTest::unsigned_compare & cond` is always 0 in the passed in `cond` parameter. >> (Vector API supports unsigned ones, as it passes in `cond` with `BoolTest::unsigned_compare` mask explicitly when the operator is in UGE/UGT/ULE/ULT.) > >> Seems currently the unsigned comparison is not supported for superword vectorization? > > I think that currently only `float` and `double` for CMove is really implemented. Integer types are still to be added, see [JDK-8308841](https://bugs.openjdk.org/browse/JDK-8308841) > C2 SuperWord: implement vectorization of integer CMove > I hope we get to it soon, and then we can generally extend the combinations too. Like comparing `int`, but blending between `double`. > > Maybe it would be better if for now you focus just on the `D/F` cases that are already supported on x86 and aarch64? Thanks for the information! I'll hold off these PRs until integer CMove vectorization is fully supported. At first, I also just planned to implement the CMoveF/D on riscv and let it be automatically vectorized based on current C2 implementation.
But I found some performance regressions for certain type combinations (please check `table 1` below). For some type combinations CMoveF/D cannot be vectorized because of the type size check in `SuperWord::is_velt_basic_type_compatible_use_def`; on the other hand, the scalar implementation of CMoveF/D on riscv explodes the generated code after loop unrolling (because of the complicated implementation on riscv). These two reasons lead to the performance regression in some cases.

table 1

Can be vectorized? | CMoveF | CMoveD
-- | -- | --
CmpI | V | X
CmpU | V | X
CmpL | X | V
CmpUL | X | V
CmpF | V | X
CmpD | X | V
CmpN | V | X
CmpP | X | V

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2123879603 From epeter at openjdk.org Tue Jun 3 13:52:56 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 13:52:56 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: References: Message-ID: <2eJGxfqvohND-ilZR_F1g-Bu4IBfKOHR3myHQVPNFcU=.ce4110a0-4946-4a8a-9eda-d8e54bc69bb6@github.com> On Tue, 3 Jun 2025 13:46:24 GMT, Hamlin Li wrote: >>> Seems currently the unsigned comparison is not supported for superword vectorization? >> >> I think that currently only `float` and `double` for CMove is really implemented. Integer types are still to be added, see [JDK-8308841](https://bugs.openjdk.org/browse/JDK-8308841) >> C2 SuperWord: implement vectorization of integer CMove >> I hope we get to it soon, and then we can generally extend the combinations too. Like comparing `int`, but blending between `double`. >> >> Maybe it would be better if for now you focus just on the `D/F` cases that are already supported on x86 and aarch64? > > Thanks for the information! > I'll hold off these prs until integer CMove vectorization is fully supported.
> > At first, I also just planned to implement the CMoveF/D on riscv and let it automatically vectorized based on current C2 implementation. > But, I found some performance regression in the cases of some type combination (please check the `table 1` below), the reason is that for some type combination cmoveF/D can not be vectorized, because of the type size check in `SuperWord::is_velt_basic_type_compatible_use_def`, on the other hand scalar implementation of CMoveF/D on riscv explode the generated code after loop unroll (because of the complicated implmentation on riscv). These 2 reasons will lead to the performance regression in some cases. > > table 1 > > Can be vectorized? | CMoveF | CMoveD > -- | -- | -- > CmpI | V | X > CmpU | V | X > CmpL | X | V > CmpUL | X | V > CmpF | V | X > CmpD | X | V > CmpN | V | X > CmpP | X | V > > Yes, getting this all right and with optimal performance is tricky... @jaskarth is working on https://github.com/openjdk/jdk/pull/23413, which will make changes to `SuperWord::is_velt_basic_type_compatible_use_def` ... so we also will have to see how this plays together... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2123889184 From mli at openjdk.org Tue Jun 3 13:58:52 2025 From: mli at openjdk.org (Hamlin Li) Date: Tue, 3 Jun 2025 13:58:52 GMT Subject: RFR: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) In-Reply-To: <2eJGxfqvohND-ilZR_F1g-Bu4IBfKOHR3myHQVPNFcU=.ce4110a0-4946-4a8a-9eda-d8e54bc69bb6@github.com> References: <2eJGxfqvohND-ilZR_F1g-Bu4IBfKOHR3myHQVPNFcU=.ce4110a0-4946-4a8a-9eda-d8e54bc69bb6@github.com> Message-ID: On Tue, 3 Jun 2025 13:50:07 GMT, Emanuel Peter wrote: >> Thanks for the information! >> I'll hold off these prs until integer CMove vectorization is fully supported. >> >> At first, I also just planned to implement the CMoveF/D on riscv and let it automatically vectorized based on current C2 implementation. 
>> But, I found some performance regression in the cases of some type combination (please check the `table 1` below), the reason is that for some type combination cmoveF/D can not be vectorized, because of the type size check in `SuperWord::is_velt_basic_type_compatible_use_def`, on the other hand scalar implementation of CMoveF/D on riscv explode the generated code after loop unroll (because of the complicated implmentation on riscv). These 2 reasons will lead to the performance regression in some cases. >> >> table 1 >> >> Can be vectorized? | CMoveF | CMoveD >> -- | -- | -- >> CmpI | V | X >> CmpU | V | X >> CmpL | X | V >> CmpUL | X | V >> CmpF | V | X >> CmpD | X | V >> CmpN | V | X >> CmpP | X | V >> >> > > Yes, getting this all right and with optimal performance is tricky... @jaskarth is working on https://github.com/openjdk/jdk/pull/23413, which will make changes to `SuperWord::is_velt_basic_type_compatible_use_def` ... so we also will have to see how this plays together... Yes, it is. Thank you for discussion! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25336#discussion_r2123908890 From asmehra at openjdk.org Tue Jun 3 14:13:52 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Jun 2025 14:13:52 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 01:29:04 GMT, Vladimir Kozlov wrote: > Can you consider populate CodeBlob::_asm_remarks and _dbg_strings after calling CodeBlob::create()? Then you don't need temporary AsmRemarks and DbgStrings. That would work as well. CodeBlob::_asm_remarks::_remarks need to be allocated memory explicitly in this case. I will add AsmRemarks::init() to allocate memory for and initialize AsmRemarks::_remarks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25598#issuecomment-2935464214 From shade at openjdk.org Tue Jun 3 14:15:01 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 14:15:01 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:38:09 GMT, Aleksey Shipilev wrote: >> Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. >> >> Also tidied up some comments around it. >> >> Additional testing; >> - [x] Linux x86_64 server fastdebug, `tier1 tier2` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Stale comment > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Fix Ran Linux x86_64 server fastdebug, `make test TEST=all`, no new failures. I think this is ready for integration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25343#issuecomment-2935466193 From epeter at openjdk.org Tue Jun 3 14:21:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 14:21:02 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v9] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 17:58:09 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This bugfix patch fixes incorrect value computation for Integer/Long. compress APIs. 
>> >> Problems occur with a constant input and variable mask where the input's value is equal to the lower bound of the mask value., In this case, an erroneous value range estimation results in a constant value. Existing value routine first attempts to constant fold the compression operation if both input and compression mask are constant values; otherwise, it attempts to constrain the value range of result based on the upper and lower bounds of mask type. >> >> New IR test covers the issue reported in the bug report along with a case for value range based logic pruning. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Fix aarch64 failure src/hotspot/share/opto/intrinsicnode.cpp line 314: > 312: // mask value for which iff all corresponding input bits are set then bit compression > 313: // will result in a -ve value, therefore this case negates the possibility of > 314: // strictly non-negative bit compression result. Grammar was a little off. I think we can say it in fewer words. Suggestion: // Case B.1 The mask value range includes -1, hence we may use all bits, // the result has the whole value range. src/hotspot/share/opto/intrinsicnode.cpp line 328: > 326: // optimistic upper bound of result i.e. all the bits other than leading zero bits > 327: // can be assumed holding 1 value. > 328: assert(mask_type->lo_as_long() >= 0, ""); Suggestion: assert(mask_type->lo_as_long() >= 0, ""); // Case B.3 Mask value range only includes non-negative values. Since all integral // types honours an invariant that TypeInteger._lo <= TypeInteger._hi, thus computing // leading zero bits of upper bound of mask value will allow us to ascertain // optimistic upper bound of result i.e. all the bits other than leading zero bits // can be assumed holding 1 value. Have the assert first, like a condition. 
Then the comments follow from it and make sense immediately. src/hotspot/share/opto/intrinsicnode.cpp line 339: > 337: // compression result will never be a -ve value and we can safely set the > 338: // lower bound of bit compression to zero. > 339: lo = result_bit_width == mask_bit_width ? lo : 0L; Fixed grammar a little. Suggestion: // If the number of bits required to for the mask value range is less than the // full bit width of the integral type, then the MSB bit is guaranteed to be zero, // thus the compression result will never be a -ve value and we can safely set the // lower bound of the bit compression to zero. lo = result_bit_width == mask_bit_width ? lo : 0L; src/hotspot/share/opto/intrinsicnode.cpp line 369: > 367: hi = src_type->hi_as_long() >= 0 ? src_type->hi_as_long() : hi; > 368: // Tightening upper bound of bit compression as per Rule 3. > 369: hi = result_bit_width < mask_bit_width ? MIN2((jlong)((1UL << result_bit_width) - 1L), hi) : hi; // As per Rule 1, bit compression packs the source bits corresponding to // set mask bits This says something, but does not really explain the rest of the sentence: // set mask bits, hence for a non-negative input, result of compression will // always be less that equal to input. Plus: input could be both `mask` or `src`. I would be specific, and just talk about `src`. Also: I don't really see how the conclusion in this sentence follows from its assumption. How exactly does some bits not participating really ensure that the value is not greater? // set. If a mask bit corresponding to set input bit is zero then that input bit will // not take part in bit compression, which means that maximum possible result value // can never be greater than non-negative input. I think I know what you are trying to say, it just sounds a little vague. It also smells like this `Lemma 1` might be easier proved by a proof of contradiction. I need to take a break now, but I'll see if I can come up with something a bit clearer later. 
Here my suggestion: Suggestion: if (src_type->hi_as_long() >= 0) { // Lemma 1: For strictly non-negative src, the result of the compression will never be // greater than src. // Proof: Since src is a non-negative value, its most significant bit is always 0. // Thus even if the corresponding MSB of the mask is one, the result will be a +ve // value. hi = src_type->hi_as_long(); } if (result_bit_width < mask_bit_width) { // Rule 3: // We can further constrain the upper bound of bit compression if the number of bits // which can be set to 1 is less than the maximum number of bits of integral type. hi = MIN2((jlong)((1UL << result_bit_width) - 1L), hi); } test/hotspot/jtreg/compiler/c2/gvn/TestBitCompressValueTransform.java line 307: > 305: } > 306: Asserts.assertEQ(0, res); > 307: } Like I mentioned in the email. I contributed this test, it would be nice if you could give me credits by making me a contributor to this issue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123857262 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123864703 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123881395 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123983125 PR Review Comment: https://git.openjdk.org/jdk/pull/23947#discussion_r2123992004 From epeter at openjdk.org Tue Jun 3 14:27:01 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 14:27:01 GMT Subject: RFR: 8350896: Integer/Long.compress gets wrong type from CompressBitsNode::Value [v8] In-Reply-To: References: Message-ID: On Fri, 30 May 2025 17:43:27 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolutions > > We can further constrain the value range bounds of bit compression and expansion once PR #17508 gets integrated. 
For now, I have developed the following draft demonstrates bound constraining with KnownBitLattice. > > > // > // Prototype of bit compress/expand value range computation > // using KnownBits infrastructure. > // > > #include > #include > #include > #include > > template > class KnownBitsLattice { > private: > U zeros; > U ones; > > public: > KnownBitsLattice(U lb, U ub); > > U getKnownZeros() { > return zeros; > } > > U getKnownOnes() { > return ones; > } > > long getKnownZerosCount() { > uint64_t count = 0; > asm volatile ("popcntq %1, %0 \n\t" : "=r"(count) : "r"(zeros) : "cc"); > return count; > } > > long getKnownOnesCount() { > uint64_t count = 0; > asm volatile ("popcntq %1, %0 \n\t" : "=r"(count) : "r"(ones) : "cc"); > return count; > } > > bool check_voilation() { > // A given bit cannot be both zero or one. > return (zeros & ones) != 0; > } > > bool is_MSB_KnownOneBitsSet() { > return (ones >> 63) == 1; > } > > bool is_MSB_KnownZeroBitsSet() { > return (zeros >> 63) == 1; > } > }; > > template > KnownBitsLattice::KnownBitsLattice(U lb, U ub) { > // To find KnownBitsLattice from a given value range > // we first find the common prefix b/w upper and lower > // bound, we then concertize known zeros and ones bit > // based on common prefix. > // e.g. > // lb = 00110001 > // ub = 00111111 > // common prefix = 0011XXXX > // knownbits.zeros = 11000000 > // knownbits.ones = 00110000 > // > // conversely, for a give knownbits value we can find > // lower and upper value ranges. > // e.g. > // knownbits.zeros = 0x00010001 > // knownbits.ones = 0x10001100 > // range.lo = knownbits.ones, this is because knownbits.ones are > // guaranteed to be one. > // range.hi = ~knownbits.zeros, this is an optimistic upper bound > // which assumes all unset knownbits.zero > // are ones. 
> // Thus in above example, > // range.lo = 0x8C > // range.hi = 0xEE > > U lzcnt = 0; > U common_prefix = lb ^ ub; > asm volatile ("lzcntq %1, %0 \n\t" : "=r"(lzcnt) : "r"(common_prefix) : "cc"); > U common_prefix_mask = lzcnt == 0 ? 0xFFFFFFFFFFFFFFFFL : ~((1ULL << (64 - lzcnt)) - 1); > zeros = (~lb) & common_prefix_mask; > ones = (lb) & c... @jatin-bhateja I think we are making progress, it seems to me now that the VM code is correct, at least as far as I can tell with visual inspection. We are still missing some additional tests, as I have asked for a few times already: https://github.com/openjdk/jdk/pull/23947#issuecomment-2853896251 We should do something like this: public static test(int mask, int src) { mask = Math.max(CON1, Math.min(CON2, mask)); src = Math.max(CON2, Math.min(CON4, src)); result = Integer.compress(src, mask); int sum = 0; if (sum > LIMIT_1) { sum += 1; } if (sum > LIMIT_2) { sum += 2; } if (sum > LIMIT_3) { sum += 4; } if (sum > LIMIT_4) { sum += 8; } if (sum > LIMIT_5) { sum += 16; } if (sum > LIMIT_6) { sum += 32; } if (sum > LIMIT_7) { sum += 64; } if (sum > LIMIT_8) { sum += 128; } return new int[] {sum, result}; } You could do the same pattern for `expand` too. Then you pick random values using `Generators.java` for all the `CON` and `LIMIT`. If we somehow produce a bad range, then the limit checks could constant fold wrongly, and then the `sum` would reflect this wrong result. Optimal would be to duplicate this pattern, and have one method that compiles, and one that runs in interpreter. That way, you can repeatedly call the methods with various `src` and `mask` values, and compare the output. 
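The test shape proposed above could be sketched roughly as follows. This is only an illustration under assumptions: the `CON*` and `LIMIT_*` values are arbitrary placeholders (the proposal is to draw them randomly), the second clamp is assumed to use a separate lower bound `CON3`, the limit checks are assumed to be meant on `result`, and a bit-by-bit reference loop stands in for the interpreter run, since a standalone program cannot pin one method to the interpreter and one to C2. It needs JDK 19 or later for `Integer.compress`.

```java
import java.util.Arrays;
import java.util.Random;

public class CompressRangeCheckSketch {
    // Placeholder constants (assumptions); a real test would randomize these.
    static final int CON1 = 0, CON2 = 0xFFFF;   // clamp range for mask
    static final int CON3 = -1000, CON4 = 1000; // clamp range for src
    static final int LIMIT_1 = -1, LIMIT_2 = 0, LIMIT_3 = 255, LIMIT_4 = 4096;

    // Bit-by-bit reference: pack the src bits selected by mask into the low bits.
    static int compressReference(int src, int mask) {
        int result = 0;
        int next = 0;
        for (int i = 0; i < 32; i++) {
            if (((mask >>> i) & 1) != 0) {
                result |= ((src >>> i) & 1) << next;
                next++;
            }
        }
        return result;
    }

    // Limit checks: a wrong value range could let a compiler fold one of
    // these comparisons incorrectly, which would show up in sum.
    static int[] check(int result) {
        int sum = 0;
        if (result > LIMIT_1) { sum += 1; }
        if (result > LIMIT_2) { sum += 2; }
        if (result > LIMIT_3) { sum += 4; }
        if (result > LIMIT_4) { sum += 8; }
        return new int[] { sum, result };
    }

    static int[] test(int mask, int src) {
        mask = Math.max(CON1, Math.min(CON2, mask));
        src = Math.max(CON3, Math.min(CON4, src));
        return check(Integer.compress(src, mask));
    }

    static int[] reference(int mask, int src) {
        mask = Math.max(CON1, Math.min(CON2, mask));
        src = Math.max(CON3, Math.min(CON4, src));
        return check(compressReference(src, mask));
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        for (int i = 0; i < 100_000; i++) {
            int mask = random.nextInt();
            int src = random.nextInt();
            if (!Arrays.equals(test(mask, src), reference(mask, src))) {
                throw new AssertionError("mismatch for mask=" + mask + " src=" + src);
            }
        }
        System.out.println("no mismatch found");
    }
}
```

The same structure would carry over to `Integer.expand` by swapping the intrinsic and the reference loop.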
------------- PR Comment: https://git.openjdk.org/jdk/pull/23947#issuecomment-2935548411 From mdoerr at openjdk.org Tue Jun 3 14:34:06 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Jun 2025 14:34:06 GMT Subject: RFR: 8354636: [PPC64] Clean up comments regarding frame manager Message-ID: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Trivial comment cleanup: Replace "frame manager" by "template interpreter". ------------- Commit messages: - 8354636: [PPC64] Clean up comments regarding frame manager Changes: https://git.openjdk.org/jdk/pull/25616/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25616&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8354636 Stats: 13 lines in 3 files changed: 0 ins; 2 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/25616.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25616/head:pull/25616 PR: https://git.openjdk.org/jdk/pull/25616 From epeter at openjdk.org Tue Jun 3 14:36:00 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 14:36:00 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v77] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). 
We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: add some hashes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/fa3d086a..6ef71270 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=76 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=75-76 Stats: 12 lines in 2 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From chagedorn at openjdk.org Tue Jun 3 14:36:06 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 3 Jun 2025 14:36:06 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v76] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 11:43:48 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. 
>> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > rename View -> FilteredSet Thanks for the update! 
Almost there, some last comments and then we're good to go :-) test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 95: > 93: * where we would possibly want to make field or variable definitions during the insertion > 94: * that are not just local to the insertion but affect the {@link CodeFrame} that we > 95: * {@link Hook#anchor} earlier and are now {@link Hook#insert}ing into. It complains that `addName` cannot be found. Suggestion to use `{@link Template#addDataName}/ * {@link Template#addStructuralName}` instead: Suggestion: * Creates a special frame, which has a {@link #parent} but uses the {@link NameSet} * from the parent frame, allowing {@link Template#addDataName}/ * {@link Template#addStructuralName} to persist in the outer frame when the current frame * is exited. This is necessary for {@link Hook#insert}, where we would possibly want to * make field or variable definitions during the insertion that are not just local to the * insertion but affect the {@link CodeFrame} that we {@link Hook#anchor} earlier and are * now {@link Hook#insert}ing into. test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 114: > 112: throw new RuntimeException("Internal error: Duplicate Hook in CodeFrame: " + hook.name()); > 113: } > 114: hookCodeLists.put(hook, new Code.CodeList(new ArrayList())); Suggestion: hookCodeLists.put(hook, new Code.CodeList(new ArrayList<>())); test/hotspot/jtreg/compiler/lib/template_framework/Hook.java line 26: > 24: package compiler.lib.template_framework; > 25: > 26: import java.util.List; Unused: Suggestion: test/hotspot/jtreg/compiler/lib/template_framework/Name.java line 29: > 27: import java.util.Map; > 28: import java.util.ArrayList; > 29: import java.util.List; Unused: Suggestion: test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 89: > 87: if (w < 0) { > 88: throw new RuntimeException("Negative weight not allowed: " + w); > 89: } I thought zero is also not allowed? 
test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 32: > 30: import java.util.regex.Matcher; > 31: import java.util.regex.Pattern; > 32: import java.util.stream.Stream; Some are unused: Suggestion: import java.util.List; import java.util.regex.MatchResult; import java.util.regex.Matcher; import java.util.regex.Pattern; test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 358: > 356: // If the character was not found, we want to have the rest of the > 357: // String s, so instead of "-1" take the end/length of the String. > 358: dollar = (dollar == -1) ? s.length() : dollar; `s.length()` could be called once before the loop and then reused inside the loop. test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 384: > 382: > 383: /** > 384: * We are parsing a part now. Befor the part, there was either a "#" or "$": Suggestion: * We are parsing a part now. Before the part, there was either a "#" or "$": ------------- PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2892355999 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123827132 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123829199 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123830659 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123806572 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123937058 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123820068 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124044533 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124038113 From chagedorn at openjdk.org Tue Jun 3 14:36:07 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 3 Jun 2025 14:36:07 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v71] In-Reply-To: References: Message-ID: 
<8ZmNt-_xwKXhDPfLL6U7EaOD4F0IwDn_2A4KB7DRze4=.182761fc-deb6-417b-948e-2bbf54bf3dab@github.com> On Tue, 3 Jun 2025 05:49:26 GMT, Emanuel Peter wrote: >> My IDE advises against matching on the raw type `List`. As an alternative you can match on `List`. > > Done, I must have been tired yesterday afternoon. Thanks! No worries! :-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2123839122 From amitkumar at openjdk.org Tue Jun 3 14:50:51 2025 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 3 Jun 2025 14:50:51 GMT Subject: RFR: 8354636: [PPC64] Clean up comments regarding frame manager In-Reply-To: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> References: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Message-ID: <8cSVDOqa_V2X0bF54lQRQTASvGyxuZSYiTf8kwTeb1k=.ad7a9b45-6836-4379-887a-60131e18d98a@github.com> On Tue, 3 Jun 2025 14:29:49 GMT, Martin Doerr wrote: > Trivial comment cleanup: Replace "frame manager" by "template interpreter". Looks good and trivial. ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/25616#pullrequestreview-2892828163 From roland at openjdk.org Tue Jun 3 15:03:58 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 15:03:58 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) In-Reply-To: References: Message-ID: On Wed, 28 May 2025 08:34:27 GMT, Emanuel Peter wrote: > My (somewhat limited) experience with delaying optimizations is that this can be quite brittle. You need to get the condition just right, otherwise it just happens again in some generalized case again - maybe you check for 1 level, and later it happens with 2 or more layers. I don't disagree with that. > I'm half-understanding the example you present.
Can you show the IR nodes for your last step: > > ``` > Store#195 -> AddP#516 -> AddP#544 -> CastPP#110 > -> CastPP#529 > ``` > > What exactly are the bases there? Your simplified drawings seem to show the flow of computation, but I cannot see what the bases are in it, right? You could enhance it, for example with `AddP#nnn(base:nnn)`. I think that would help me follow the example. In the example above, the `CastPP`s are the bases. So the simplified drawings mostly only show how the `AddP`s are chained and the bases. > Maybe some more full IR snippets could be helpful, maybe even IGV drawings. But that may be more work for you. I rarely use the IGV so, yeah, that would be more work. > I'm wondering if we could not have some other "cleanup" optimizations that fix up the bases. What are the assumptions about merging AddP's at a Phi? Is the base from before the Phi propagated to after the Phi? I'm missing some base understanding here to see through this ;) There is a cleanup already. It's `ConstraintCastNode::dominating_cast()`. It's run during igvn (but in the case of this failure igvn can't prove domination) and loop opts (but in the case of this failure, we are one pass of loop opts short of cleaning things up). So we would need an extra run of loop opts, which seems to be quite a bit of overhead for this sort of issue. That's why I went with the igvn delay fix even though it's fragile.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25386#issuecomment-2935784074 From roland at openjdk.org Tue Jun 3 15:03:59 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 15:03:59 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) In-Reply-To: References: Message-ID: <2q9RH_3nobpsee8aZzoqkkKPgUkjJuPCoH-LDV4roEs=.56385866-5158-45d2-826d-5ff8448284e9@github.com> On Wed, 28 May 2025 08:23:52 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/cfgnode.cpp line 2107: >> >>> 2105: } >>> 2106: return false; >>> 2107: } >> >> You check for a single level here. Could the same happen over multiple levels? > > If an update should come from further up, but has not propagated down? Right, possibly. I'm not 100% sure. I could check all the `Cast`s along the chain at the `Phi` instead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25386#discussion_r2124147201 From yzheng at openjdk.org Tue Jun 3 15:14:00 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 3 Jun 2025 15:14:00 GMT Subject: RFR: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:53:17 GMT, Yudi Zheng wrote: > While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 > https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 > > > c4 e1 c1 73 f7 34 > > > By setting the rex_w to WIG, the emitted bytes are: > > > c5 c1 73 f7 34 Thanks for the review! 
Passed tier1-3, hs-precheckin-comp, hs-comp-stress ------------- PR Comment: https://git.openjdk.org/jdk/pull/25593#issuecomment-2935865077 From yzheng at openjdk.org Tue Jun 3 15:14:01 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 3 Jun 2025 15:14:01 GMT Subject: Integrated: 8358333: Use VEX2 prefix in Assembler::psllq In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 15:53:17 GMT, Yudi Zheng wrote: > While porting the commit https://github.com/openjdk/jdk/commit/0df8c9684b8782ef830e2bd425217864c3f51784 to Graal, I noticed that the Assembler::psllq instruction is using the VEX3 prefix. This results in the instruction being unrecognizable by my outdated version of hsdis. Currently, HotSpot generates the following bytes for vpsllq xmm7, xmm7, 0x34 > https://github.com/openjdk/jdk/blob/0df8c9684b8782ef830e2bd425217864c3f51784/src/hotspot/cpu/x86/stubGenerator_x86_64_cbrt.cpp#L255 > > > c4 e1 c1 73 f7 34 > > > By setting the rex_w to WIG, the emitted bytes are: > > > c5 c1 73 f7 34 This pull request has now been integrated. Changeset: faf19abd Author: Yudi Zheng URL: https://git.openjdk.org/jdk/commit/faf19abd312ac461f9f74035fec61af7d834ffc1 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8358333: Use VEX2 prefix in Assembler::psllq Reviewed-by: jbhateja, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/25593 From roland at openjdk.org Tue Jun 3 15:21:52 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 15:21:52 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v9] In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> Message-ID: > An `Initialize` node for an `Allocate` node is created with a memory > `Proj` of adr type raw memory. 
In order for stores to be captured, the > memory state out of the allocation is a `MergeMem` with slices for the > various object fields/array element set to the raw memory `Proj` of > the `Initialize` node. If `Phi`s need to be created during later > transformations from this memory state, the `Phi` for a particular > slice gets its adr type from the type of the `Proj` which is raw > memory. If during macro expansion, the `Allocate` is found to have no > use and so can be removed, the `Proj` out of the `Initialize` is > replaced by the memory state on input to the `Allocate`. A `Phi` for > some slice for a field of an object will end up with the raw memory > state on input to the `Allocate` node. As a result, memory state at > the `Phi` is incorrect and incorrect execution can happen. > > The fix I propose is, rather than have a single `Proj` for the memory > state out of the `Initialize` with adr type raw memory, to use one > `Proj` per slice added to the memory state after the `Initialize`. Each > of the `Proj` should return the right adr type for its slice. For that > I propose having a new type of `Proj`: `NarrowMemProj` that captures > the right adr type. > > Logic for the construction of the `Allocate`/`Initialize` subgraph is > tweaked so the right adr type captured in its own `NarrowMemProj` is > added to the memory subgraph. Code that removes an allocation or moves > it also has to be changed so it correctly takes the multiple memory > projections out of the `Initialize` node into account. > > One tricky issue is that when EA splits types for a scalar replaceable > `Allocate` node: > > 1- the adr type captured in the `NarrowMemProj` becomes out of sync > with the type of the slices for the allocation > > 2- before EA, the memory state for one particular field out of the > `Initialize` node can be used for a `Store` to the just allocated > object or some other.
So we can have a chain of `Store`s, some to > the newly allocated object, some to some other objects, all of them > using the state of `NarrowMemProj` out of the `Initialize`. After > split unique types, the `NarrowMemProj` is for the slice of a > particular allocation. So `Store`s to some other objects shouldn't > use that memory state but the memory state before the `Allocate`. > > For that, I added logic to update the adr type of `NarrowMemProj` > during split unique types and update the memory input of `Store`s that > don't depend on the memory state ... Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/opto/library_call.cpp Co-authored-by: Emanuel Peter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24570/files - new: https://git.openjdk.org/jdk/pull/24570/files/43c6f822..c0a8ad21 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=07-08 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570 PR: https://git.openjdk.org/jdk/pull/24570 From roland at openjdk.org Tue Jun 3 15:21:54 2025 From: roland at openjdk.org (Roland Westrelin) Date: Tue, 3 Jun 2025 15:21:54 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> Message-ID: On Tue, 27 May 2025 09:08:02 GMT, Emanuel 
Peter wrote: >> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > test/hotspot/jtreg/compiler/macronodes/TestEliminationOfAllocationWithoutUse.java line 2: > >> 1: /* >> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > Is the copyright year accurate? It's your test that I took over and updated so you tell me: do you want the copyright updated? > test/hotspot/jtreg/compiler/macronodes/TestInitializingStoreCapturing.java line 2: > >> 1: /* >> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > Is the copyright year accurate? Same comment as above. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2124206114 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2124207503 From epeter at openjdk.org Tue Jun 3 15:27:23 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:27:23 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v76] In-Reply-To: References: Message-ID: <0YLlUg2GzBB8eo4d7V8-NZz_J7ZjGLqpOrNzxpiqVd0=.3cda28b0-5de0-4872-a9e4-4c10dba056a9@github.com> On Tue, 3 Jun 2025 13:30:47 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> rename View -> FilteredSet > > test/hotspot/jtreg/compiler/lib/template_framework/CodeFrame.java line 95: > >> 93: * where we would possibly want to make field or variable definitions during the insertion >> 94: * that are not just local to the insertion but affect the {@link CodeFrame} that we >> 95: * {@link Hook#anchor} earlier and are now {@link Hook#insert}ing into. > > It complains that `addName` cannot be found. 
Suggestion to use `{@link Template#addDataName}/ > * {@link Template#addStructuralName}` instead: > > Suggestion: > > * Creates a special frame, which has a {@link #parent} but uses the {@link NameSet} > * from the parent frame, allowing {@link Template#addDataName}/ > * {@link Template#addStructuralName} to persist in the outer frame when the current frame > * is exited. This is necessary for {@link Hook#insert}, where we would possibly want to > * make field or variable definitions during the insertion that are not just local to the > * insertion but affect the {@link CodeFrame} that we {@link Hook#anchor} earlier and are > * now {@link Hook#insert}ing into. Good catch! I got no complaints because `javadoc` does not look at private classes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124230818 From epeter at openjdk.org Tue Jun 3 15:36:03 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:36:03 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v78] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String.
And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Christian Hagedorn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/6ef71270..3efb9fc6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=77 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=76-77 Stats: 18 lines in 4 files changed: 1 ins; 10 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Tue Jun 3 15:36:05 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:36:05 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v76] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:05:14 GMT, Christian Hagedorn wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> rename View -> FilteredSet > > test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 89: > >> 87: if (w < 0) { >> 88: throw new RuntimeException("Negative weight not allowed: " + w); >> 89: } > > I thought zero is also not allowed? Well that should have been filtered out already earlier, when we added the individual names. Now we could have a total weight of zero if we have no names. Then we just return `null` here, and then throw an exception a little further up the use case chain, e.g. `DataName.sample` -> `throw new RendererException("No variable: " + mutability + msg1 + msg2 + ".");` This here is just a sanity check, hence I throw a `RuntimeException`, and not a `RendererException`. 
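To make the distinction in the reply above concrete, here is a minimal sketch of weighted sampling in which zero or negative weights are rejected when a name is added, and the sampling step only keeps a sanity check on the total. The class `WeightedSampleSketch`, the record `WeightedName` and the method `sample` are hypothetical; they only mirror the behavior described in the mail (an empty set has total weight zero and yields `null`) and are not the actual `NameSet` code.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not the framework's NameSet.
public class WeightedSampleSketch {
    // Assumption: individual weights are validated when a name is created,
    // so zero and negative weights never reach the sampling step.
    public record WeightedName(String name, int weight) {
        public WeightedName {
            if (weight <= 0) {
                throw new IllegalArgumentException("Weight must be positive: " + weight);
            }
        }
    }

    // Returns null if there are no names, so the caller can report a
    // user-facing error further up the call chain.
    public static WeightedName sample(List<WeightedName> names, Random random) {
        int total = 0;
        for (WeightedName n : names) {
            total += n.weight();
        }
        if (total < 0) {
            // Sanity check only: this would indicate an internal error.
            throw new RuntimeException("Negative weight not allowed: " + total);
        }
        if (total == 0) {
            return null; // no names available
        }
        int r = random.nextInt(total); // uniform in [0, total)
        for (WeightedName n : names) {
            if (r < n.weight()) {
                return n;
            }
            r -= n.weight();
        }
        throw new RuntimeException("Internal error: sampling fell through");
    }
}
```

With this split, a total of zero is a legitimate "nothing to sample" state rather than an invalid weight, which is why only the negative case is treated as an internal error.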
> test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 358: > >> 356: // If the character was not found, we want to have the rest of the >> 357: // String s, so instead of "-1" take the end/length of the String. >> 358: dollar = (dollar == -1) ? s.length() : dollar; > > `s.length()` could be called once before the loop and then reused inside the loop. You mean as a performance optimization? Is that not something we let the compiler do? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124241074 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124247590 From asmehra at openjdk.org Tue Jun 3 15:36:22 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Jun 2025 15:36:22 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor [v2] In-Reply-To: References: Message-ID: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> > This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. Ashutosh Mehra has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into JDK-8358330 - Address review comments Signed-off-by: Ashutosh Mehra - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25598/files - new: https://git.openjdk.org/jdk/pull/25598/files/6869d630..889286b9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25598&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25598&range=00-01 Stats: 7988 lines in 205 files changed: 4294 ins; 1061 del; 2633 mod Patch: https://git.openjdk.org/jdk/pull/25598.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25598/head:pull/25598 PR: https://git.openjdk.org/jdk/pull/25598 From asmehra at openjdk.org Tue Jun 3 15:36:23 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Tue, 3 Jun 2025 15:36:23 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor [v2] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 01:29:04 GMT, Vladimir Kozlov wrote: >> Ashutosh Mehra has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8358330 >> - Address review comments >> >> Signed-off-by: Ashutosh Mehra >> - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor >> >> Signed-off-by: Ashutosh Mehra > > I am not comfortable that you are changing code not used by AOT. > Can you consider populating `CodeBlob::_asm_remarks` and `_dbg_strings` after calling `CodeBlob::create()`? Then you don't need temporary `AsmRemarks` and `DbgStrings`. @vnkozlov I have updated the changes. Can you please review again?
------------- PR Comment: https://git.openjdk.org/jdk/pull/25598#issuecomment-2936000546 From never at openjdk.org Tue Jun 3 15:52:03 2025 From: never at openjdk.org (Tom Rodriguez) Date: Tue, 3 Jun 2025 15:52:03 GMT Subject: RFR: 8316694: Implement relocation of nmethod within CodeCache [v19] In-Reply-To: References: <17al0aeFhm0iZHoHHGiqB03RfPeSrIHIoZuapOHPuy4=.a2ff2d67-392b-40f0-b6d9-6e3a7f396e8a@github.com> Message-ID: On Mon, 2 Jun 2025 17:01:26 GMT, Evgeny Astigeevich wrote: > If it is moved, the [CompiledMethodUnload](https://docs.oracle.com/en/java/javase/24/docs/specs/jvmti.html#CompiledMethodUnload) event is sent, followed by a new CompiledMethodLoad event. > we now have 2 nmethods alive with the same compile_id which could be confusing. It's nice that the JVMTI docs considered this problem but the notifications will be sent in the reverse order given our current implementation. We will create a new nmethod while the old nmethod might still be alive, at least for the purposes of deopt. Since this PR doesn't actually perform any relocation, I'm not sure what the plan is here. The most aggressive thing that could be done is to invalidate all frames which have the old nmethod on stack, but that still leaves the nmethod live for the purposes of deopt. It would probably be ok to synthesize an unload after the deopt since there should be no actual execution in those nmethods, but you will then have to suppress the one that's normally done during nmethod::unlink. I agree that the docs are fairly clear that all of this is ok, but that doesn't mean that assumptions haven't been made about the current implementation. We just need to make sure we do something rational and that it's possible to understand from our output what was done. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23573#issuecomment-2936067701 From chagedorn at openjdk.org Tue Jun 3 15:53:00 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 3 Jun 2025 15:53:00 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short In-Reply-To: References: Message-ID: On Mon, 26 May 2025 07:15:31 GMT, Jasmine Karthikeyan wrote: > Hi all, > This patch fixes cases in SuperWord when compiling subword types where vectorized code would be given a narrower type than expected, leading to miscompilation due to truncation. This fix is a generalization of the same fix applied for `Integer.reverseBytes` in [JDK-8305324](https://bugs.openjdk.org/browse/JDK-8305324). The patch introduces a check for nodes that are known to tolerate truncation, so that any future cases of subword truncation will avoid creating miscompiled code. > > The patch reuses the existing logic to set the type of the vectors to int, which currently disables vectorization for the affected patterns entirely. Once [JDK-8342095](https://bugs.openjdk.org/browse/JDK-8342095) is merged and automatic casting support is added the autovectorizer should automatically insert casts to and from int, maintaining correctness. > > I've added an IR test that checks for correctly compiled outputs. Thoughts and reviews would be appreciated! @jaskarth Just to let you know, the fork is coming up this Thursday. But since this is a P3, we still got some time left in RDP 1 to get this fixed in JDK 25. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25440#issuecomment-2936069655 From epeter at openjdk.org Tue Jun 3 15:53:34 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:53:34 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v79] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. 
> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: more for Christian ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/3efb9fc6..3f0beb4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=78 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=77-78 Stats: 6 lines in 2 files changed: 4 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From epeter at openjdk.org Tue Jun 3 15:53:34 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:53:34 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: <63rjQWH6SMsAL2gNeZEmKADrHdr1BCf27oToad-qn2c=.32494c95-4bc4-4e34-bbb1-18c6c5020ce7@github.com> On Mon, 2 Jun 2025 12:14:48 GMT, Christian Hagedorn wrote: >> Thanks for all the updates and discussions! 
I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! >> >> I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > >> @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. >> >> These are issues where we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) >> >> This is now ready for another review pass. > > Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! > > I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) @chhagedorn Thanks for this batch of comments, they are all addressed! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2936069153 From epeter at openjdk.org Tue Jun 3 15:53:34 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:53:34 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v76] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 15:27:23 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/compiler/lib/template_framework/NameSet.java line 89: >> >>> 87: if (w < 0) { >>> 88: throw new RuntimeException("Negative weight not allowed: " + w); >>> 89: } >> >> I thought zero is also not allowed? > > Well that should have been filtered out already earlier, when we added the individual names. Now we could have a total weight of zero if we have no names.
Then we just return `null` here, and then throw an exception a little further up the use case chain, e.g. `DataName.sample` -> `throw new RendererException("No variable: " + mutability + msg1 + msg2 + ".");` > > This here is just a sanity check, hence I throw a `RuntimeException`, and not a `RendererException`. As asked for offline: added some more comments here. >> test/hotspot/jtreg/compiler/lib/template_framework/Renderer.java line 358: >> >>> 356: // If the character was not found, we want to have the rest of the >>> 357: // String s, so instead of "-1" take the end/length of the String. >>> 358: dollar = (dollar == -1) ? s.length() : dollar; >> >> `s.length()` could be called once before the loop and then reused inside the loop. > > You mean as a performance optimization? Is that not something we let the compiler do? As discussed offline: I made `s` final, so we are sure that it is not mutated and the length should be moved out of the loop by the compiler. It would also only be a very small performance impact, as we are doing things like `indexOf` here, which are much more expensive anyway. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124296212 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2124298706 From kvn at openjdk.org Tue Jun 3 15:56:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 15:56:54 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. 
The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. >> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Looks fine. I submitted testing. 
------------- PR Review: https://git.openjdk.org/jdk/pull/24972#pullrequestreview-2893164969 From epeter at openjdk.org Tue Jun 3 15:57:32 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 15:57:32 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v80] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. 
> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: fix whitespaces from applied suggestion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/3f0beb4a..72923879 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=79 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=78-79 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From kvn at openjdk.org Tue Jun 3 16:17:56 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 16:17:56 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v3] In-Reply-To: References: Message-ID: <7O4QTc1_uAcjWhyauZKWf0E1nwun5aq64sRDOFpB_YY=.1a0e7491-3d0a-458a-9ee0-caaf8c0217ee@github.com> On Mon, 2 Jun 2025 22:49:36 GMT, Cesar Soares Lucas wrote: >> Please review this patch for adding 
more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. >> >> Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Address PR feedback. src/hotspot/share/code/nmethod.cpp line 2131: > 2129: ResourceMark rm; > 2130: const char* name = method()->name()->as_C_string(); > 2131: const char* is_jvmci = ""; Please use `compiler_name()` instead. src/hotspot/share/code/nmethod.cpp line 2142: > 2140: } > 2141: #endif > 2142: log_debug(codecache)("Flushing nmethod %3d/" INTPTR_FORMAT ", level=%d, osr=%d, cold=%d, epoch=" UINT64_FORMAT ", cold_count=" UINT64_FORMAT ". " You can use `lt` here. src/hotspot/share/code/nmethod.cpp line 2145: > 2143: "Cache capacity: %zuKb, free space: %zuKb. %smethod %s", > 2144: _compile_id, p2i(this), _comp_level, is_osr_method(), is_cold(), _gc_epoch, CodeCache::cold_gc_count(), > 2145: codecache_capacity, codecache_free_space, is_jvmci, name); Swap `is_jvmci` and `name` so that method's names in output for nmethod not compiled by JVMCI (by C1) will be aligned. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25402#discussion_r2124355185 PR Review Comment: https://git.openjdk.org/jdk/pull/25402#discussion_r2124324589 PR Review Comment: https://git.openjdk.org/jdk/pull/25402#discussion_r2124331010 From kvn at openjdk.org Tue Jun 3 16:28:56 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 16:28:56 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:38:09 GMT, Aleksey Shipilev wrote: >> Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). 
The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. >> >> Also tidied up some comments around it. >> >> Additional testing; >> - [x] Linux x86_64 server fastdebug, `tier1 tier2` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Stale comment > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Fix Re-approved. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25343#pullrequestreview-2893267613 From kvn at openjdk.org Tue Jun 3 16:41:51 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 16:41:51 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor [v2] In-Reply-To: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> References: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> Message-ID: On Tue, 3 Jun 2025 15:36:22 GMT, Ashutosh Mehra wrote: >> This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. > > Ashutosh Mehra has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8358330 > - Address review comments > > Signed-off-by: Ashutosh Mehra > - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor > > Signed-off-by: Ashutosh Mehra Good. Let me test it. ------------- PR Review: https://git.openjdk.org/jdk/pull/25598#pullrequestreview-2893307151 From shade at openjdk.org Tue Jun 3 16:43:36 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 16:43:36 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:38:09 GMT, Aleksey Shipilev wrote: >> Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. >> >> Also tidied up some comments around it. >> >> Additional testing; >> - [x] Linux x86_64 server fastdebug, `tier1 tier2` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Stale comment > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Fix Still fine with it, @iwanowww? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25343#issuecomment-2936257590 From jbhateja at openjdk.org Tue Jun 3 17:03:58 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 17:03:58 GMT Subject: RFR: 8351635: C2 ROR/ROL: assert failed: Long constant expected [v2] In-Reply-To: References: <5k2J6AUT-a3B006J_ksxccQVxprZa21uqUbKTGkkby0=.5dfc4f2b-ad7a-4393-bf5e-efc246582c83@github.com> Message-ID: On Tue, 3 Jun 2025 11:42:07 GMT, Tobias Hartmann wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Update test/hotspot/jtreg/compiler/vectorapi/TestVectorRotateScalarCount.java >> >> Co-authored-by: Tobias Hartmann > > Looks good to me. All tests passed. Thanks @TobiHartmann , @chhagedorn ------------- PR Comment: https://git.openjdk.org/jdk/pull/25493#issuecomment-2936325533 From jbhateja at openjdk.org Tue Jun 3 17:04:00 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 17:04:00 GMT Subject: Integrated: 8351635: C2 ROR/ROL: assert failed: Long constant expected In-Reply-To: References: Message-ID: <5VS9_aH-pNteS0lAJK8NdwbhEqbguLwAnkjQwgX0dRg=.c7692617-7b17-45a4-866e-73eeab9887c8@github.com> On Wed, 28 May 2025 14:19:21 GMT, Jatin Bhateja wrote: > This bug fix patch relaxes the strict assertion check to allow other pattern matches for degenerated long vector ROL/ROR operations with non-constant scalar shift values. > > Kindly review and share feedback. > > Best Regards, > Jatin This pull request has now been integrated. 
Changeset: d7e58ac4 Author: Jatin Bhateja URL: https://git.openjdk.org/jdk/commit/d7e58ac480b06c6340a65e67731d8f6dc179acfb Stats: 129 lines in 2 files changed: 127 ins; 1 del; 1 mod 8351635: C2 ROR/ROL: assert failed: Long constant expected Reviewed-by: thartmann, chagedorn ------------- PR: https://git.openjdk.org/jdk/pull/25493 From epeter at openjdk.org Tue Jun 3 17:05:02 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 17:05:02 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> Message-ID: On Tue, 3 Jun 2025 15:17:35 GMT, Roland Westrelin wrote: >> test/hotspot/jtreg/compiler/macronodes/TestEliminationOfAllocationWithoutUse.java line 2: >> >>> 1: /* >>> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. >> >> Is the copyright year accurate? > > It's your test that I took over and updated so you tell me: do you want the copyright updated? Ah right. Well,
I guess you could write `2024, 2025` then :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2124444173 From epeter at openjdk.org Tue Jun 3 17:13:55 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 17:13:55 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v3] In-Reply-To: <1jna58ZtxrGgcqNt9FQf5Tl-rIo6YwTFYzavusVZGyA=.87513e10-77ba-436e-9d9e-b82f5041d368@github.com> References: <1jna58ZtxrGgcqNt9FQf5Tl-rIo6YwTFYzavusVZGyA=.87513e10-77ba-436e-9d9e-b82f5041d368@github.com> Message-ID: On Tue, 3 Jun 2025 08:19:06 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. >> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/cpu/x86/x86_64.ad > > Thanks :-) > > Co-authored-by: Tobias Hartmann @jatin-bhateja Thanks for looking into this! `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? 
Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? A code comment would be helpful. src/hotspot/cpu/x86/x86_64.ad line 10620: > 10618: instruct xorI_rReg_imm(rRegI dst, immI src, rFlagsReg cr) > 10619: %{ > 10620: predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? ------------- PR Review: https://git.openjdk.org/jdk/pull/25501#pullrequestreview-2893385310 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2124452416 From epeter at openjdk.org Tue Jun 3 17:23:57 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 17:23:57 GMT Subject: RFR: 8290892: C2: Intrinsify Reference.reachabilityFence [v3] In-Reply-To: References: Message-ID: On Fri, 23 May 2025 22:43:35 GMT, Vladimir Ivanov wrote: >> This PR introduces C2 support for `Reference.reachabilityFence()`. >> >> After [JDK-8199462](https://bugs.openjdk.org/browse/JDK-8199462) went in, it was discovered that C2 may break the invariant the fix relied upon [1]. So, this is an attempt to introduce proper support for `Reference.reachabilityFence()` in C2. C1 is left intact for now, because there are no signs yet it is affected. >> >> `Reference.reachabilityFence()` can be used in performance critical code, so the primary goal for C2 is to reduce its runtime overhead as much as possible. The ultimate goal is to ensure liveness information is attached to interfering safepoints, but it takes multiple steps to properly propagate the information through compilation pipeline without negatively affecting generated code quality. >> >> Also, I don't consider this fix as complete. It does fix the reported problem, but it doesn't provide any strong guarantees yet. 
In particular, since `ReachabilityFence` is CFG-only node, nothing explicitly forbids memory operations to float past `Reference.reachabilityFence()` and potentially reaching some other safepoints current analysis treats as non-interfering. Representing `ReachabilityFence` as memory barrier (e.g., `MemBarCPUOrder`) would solve the issue, but performance costs are prohibitively high. Alternatively, the optimization proposed in this PR can be improved to conservatively extend referent's live range beyond `ReachabilityFence` nodes associated with it. It would meet performance criteria, but I prefer to implement it as a followup fix. >> >> Another known issue relates to reachability fences on constant oops. If such constant is GCed (most likely, due to a bug in Java code), similar reachability issues may arise. For now, RFs on constants are treated as no-ops, but there's a diagnostic flag `PreserveReachabilityFencesOnConstants` to keep the fences. I plan to address it separately. >> >> [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/ref/Reference.java#L667 >> "HotSpot JVM retains the ref and does not GC it before a call to this method, because the JIT-compilers do not have GC-only safepoints." >> >> Testing: >> - [x] hs-tier1 - hs-tier8 >> - [x] hs-tier1 - hs-tier6 w/ -XX:+StressReachabilityFences -XX:+VerifyLoopOptimizations >> - [x] java/lang/foreign microbenchmarks > > Vladimir Ivanov has updated the pull request incrementally with one additional commit since the last revision: > > renaming src/hotspot/share/opto/c2_globals.hpp line 83: > 81: \ > 82: product(bool, StressReachabilityFences, false, DIAGNOSTIC, \ > 83: "Randomly insert ReachabilityFence nodes") \ Drive-by sniping: what about a hello-world test where you test out these flags? test/hotspot/jtreg/compiler/c2/TestReachabilityFence.java line 38: > 36: * @summary Tests to ensure that reachabilityFence() correctly keeps objects from being collected prematurely. 
> 37: * @modules java.base/jdk.internal.misc > 38: * @run main/othervm -Xbatch compiler.c2.TestReachabilityFence What about some extra runs where you use your new flags? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2124474939 PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2124476770 From jbhateja at openjdk.org Tue Jun 3 17:24:52 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 17:24:52 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v3] In-Reply-To: References: <1jna58ZtxrGgcqNt9FQf5Tl-rIo6YwTFYzavusVZGyA=.87513e10-77ba-436e-9d9e-b82f5041d368@github.com> Message-ID: On Tue, 3 Jun 2025 17:07:30 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Update src/hotspot/cpu/x86/x86_64.ad >> >> Thanks :-) >> >> Co-authored-by: Tobias Hartmann > > src/hotspot/cpu/x86/x86_64.ad line 10620: > >> 10618: instruct xorI_rReg_imm(rRegI dst, immI src, rFlagsReg cr) >> 10619: %{ >> 10620: predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); > > The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? Hey, the AD file change enables AndN inferencing; the test expects the compiler to emit that instruction and has hardcoded encoding checks in place to verify it. So both the encoding and the AD file changes are necessary to fix this failing test. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2124477130 From chagedorn at openjdk.org Tue Jun 3 17:39:31 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 3 Jun 2025 17:39:31 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v80] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 15:57:32 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests.
E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. 
>> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > fix whitespaces from applied suggestion Thanks a lot for addressing everything and all the interesting and insightful discussions - also learnt quite a lot :-) There is nothing left to say other than: Ship it! (if the other reviewers also agree with the latest changes) ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2893451928 From sviswanathan at openjdk.org Tue Jun 3 17:49:34 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 3 Jun 2025 17:49:34 GMT Subject: RFR: 8351645: C2: ExpandBitsNode::Ideal hits assert because of TOP input In-Reply-To: References: Message-ID: <2EnHipwji6WU4sYmsbJAZSGpSmbhREXUq9f7V-ka6AI=.e5f13037-a796-41b9-8feb-a5ffcb9bc1b6@github.com> On Mon, 2 Jun 2025 11:53:23 GMT, Jatin Bhateja wrote: > Bugfix patch adds missing safe type access checks in Expand/Compress Ideal transforms. > The test mentioned in the bug report has been included along with the patch. > > Kindly review. > > Best Regards, > Jatin Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25586#pullrequestreview-2893459874 From jbhateja at openjdk.org Tue Jun 3 17:49:37 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 17:49:37 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: Message-ID: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> On Tue, 3 Jun 2025 02:43:42 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolutions > > Thanks, encoding logic is concentrated in integral instruction tests and is shared with corresponding long variants, extended APX coverage for BLS/R/MSK. > @jatin-bhateja Thanks for looking into this! > > `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` > > The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? > > Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? > > A code comment would be helpful. We are tightening the predicate check so that under no circumstances we pick this pattern during the reduction phase of instruction selection on account of having lower cost. There is a generic pattern (xorI_rReg_imm) for all integral immediate values, and then there is a special pattern for Xor with -1 (fxorI_rReg_im1), which is needed for AndN inferencing. 
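For readers following along, here is a minimal Java sketch (not part of the patch) of the source-level shape discussed above. Xor with -1 flips every bit, so `(a ^ -1) & b` is the and-not expression that AndN inferencing targets. The class and method names are hypothetical, and whether C2 actually emits an ANDN instruction for them depends on the CPU and the flags in use.

```java
public class AndnSketch {
    // Xor with -1 flips every bit, so the result equals the bitwise complement ~a.
    static int flipAllBits(int a) {
        return a ^ -1;
    }

    // The and-not shape for int operands: (a ^ -1) & b, i.e. ~a & b.
    static int andNot(int a, int b) {
        return (a ^ -1) & b;
    }

    // The same shape for long operands.
    static long andNot(long a, long b) {
        return (a ^ -1L) & b;
    }

    public static void main(String[] args) {
        System.out.println(flipAllBits(0) == ~0);
        System.out.println(Integer.toBinaryString(andNot(0b1100, 0b1010)));
        System.out.println(Long.toHexString(andNot(0xF0L, 0xFFL)));
    }
}
```

The sketch only shows the value-level equivalence; it says nothing about instruction selection cost, which is what the predicate change in the PR addresses.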
------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2936412986 From duke at openjdk.org Tue Jun 3 17:52:30 2025 From: duke at openjdk.org (Tom Shull) Date: Tue, 3 Jun 2025 17:52:30 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v4] In-Reply-To: References: Message-ID: <5OVd27HKtqnWu4vn0VnDAWLdWk0iTEstxqhnt9XJ5xU=.efb8b1eb-9998-4caa-844d-e8af7765d3b2@github.com> > This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. > > In addition, two methods are added to the BootstrapMethodInvocations: > 1. `void resolve()` > 2. `JavaConstant lookup()` > > The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into JDK-8357660 - commit to trigger testing - commit to trigger testing - reviewer feedback and update javadoc formatting - complete changes - commit review suggestion Co-authored-by: Douglas Simon - commit review suggestion Co-authored-by: Douglas Simon - change to allow both indys and condys to be looked up all at once - address reviewer feedback - style fixes and add testing to TestDynamicConstants. - ... 
and 1 more: https://git.openjdk.org/jdk/compare/7bf6d3ed...e0707fb8 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25420/files - new: https://git.openjdk.org/jdk/pull/25420/files/4d508fc4..e0707fb8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=02-03 Stats: 63637 lines in 1081 files changed: 34781 ins; 18003 del; 10853 mod Patch: https://git.openjdk.org/jdk/pull/25420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25420/head:pull/25420 PR: https://git.openjdk.org/jdk/pull/25420 From duke at openjdk.org Tue Jun 3 17:52:39 2025 From: duke at openjdk.org (Tom Shull) Date: Tue, 3 Jun 2025 17:52:39 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v4] In-Reply-To: References: Message-ID: > Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. > > To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into JDK-8357987 - return List.of() from getAllMethods - format javadoc and update test - implement getAllMethods - address reviewer feedback - Add Support for Retrieving All Non-Static Methods of a ResolvedJavaType. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25498/files - new: https://git.openjdk.org/jdk/pull/25498/files/ae81d46f..606f3619 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25498&range=02-03 Stats: 63826 lines in 1089 files changed: 34970 ins; 18003 del; 10853 mod Patch: https://git.openjdk.org/jdk/pull/25498.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25498/head:pull/25498 PR: https://git.openjdk.org/jdk/pull/25498 From kvn at openjdk.org Tue Jun 3 17:52:47 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 17:52:47 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences In-Reply-To: References: Message-ID: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> On Tue, 27 May 2025 17:26:59 GMT, Manuel Hässig wrote: > ## Summary > > On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules. However, the generated code contains an additional `lea` with an unused result: > > ; OptoAssembly > 03d decode_heap_oop_not_null R8,R10 > 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 > > ; x86 > 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused > 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset > > > This PR adds a peephole optimization to remove such redundant `lea`s. > > ## The Issue in Detail > > The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`.
After matching, this becomes > > LoadN -> decodeHeapOop_not_null -> leaP* > ______________________________? > > where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode. Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`: > > https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897 > > On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers. > > This leaves us with a handful of possible solutions: > 1. implement narrow bases for derived oops in oop maps, > 2. perform some dead code elimination after we know which oops are part of oop maps, > 3. add a peephole optimization to simply remove unused `lea`s. > > Option 1 would have been ideal in the sense, that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the register allocator. However, rewriting the oop map machinery to remove a... src/hotspot/cpu/x86/peephole_x86_64.cpp line 244: > 242: // the DecodeN. However, after matching the DecodeN is added back as the base for the leaP*, > 243: // which is nessecary if the oop derived by the leaP* gets added to an OopMap, because OopMaps > 244: // cannot contain derived oops with narrow oops as a base. Am I correct to assume that if it is referenced in an OopMap (which is a side table), it will be referenced by some Safepoint node in the graph?
src/hotspot/cpu/x86/peephole_x86_64.cpp line 255: > 253: // This peephole recognizes graphs of the shape as shown above, ensures that the result of the > 254: // decode is only used by the derived oop and removes that decode if this is the case. Futher, > 255: // multipe leaP*s can have the same decode as their base. This peephole will remove the decode Typo `multipe` src/hotspot/cpu/x86/peephole_x86_64.cpp line 267: > 265: // | / \ > 266: // leaP* MachProj (leaf) > 267: // In this case where te common parent of the leaP* and the decode is one MemToRegSpill Copy Typo: `te` src/hotspot/cpu/x86/peephole_x86_64.cpp line 268: > 266: // leaP* MachProj (leaf) > 267: // In this case where te common parent of the leaP* and the decode is one MemToRegSpill Copy > 268: // away, this peephole can als recognize the decode as redundant and also remove the spill copy Typo: `als` src/hotspot/cpu/x86/peephole_x86_64.cpp line 270: > 268: // away, this peephole can als recognize the decode as redundant and also remove the spill copy > 269: // if that is only used by the decode. > 270: bool Peephole::lea_remove_redundant(Block* block, int block_index, PhaseCFG* cfg_, PhaseRegAlloc* ra_, Why do you need `_` suffix? src/hotspot/cpu/x86/peephole_x86_64.cpp line 324: > 322: > 323: // Ensure the MachProj is in the same block as the decode and the lea. > 324: if (!block->contains(proj)) { Should we check `proj == nullptr` ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124504924 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124512685 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124513810 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124514534 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124516088 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2124528016 From jbhateja at openjdk.org Tue Jun 3 18:07:34 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 3 Jun 2025 18:07:34 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v4] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/b5f69c8d..e332f191 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=02-03 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From cslucas at openjdk.org Tue Jun 3 18:52:45 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 18:52:45 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v4] In-Reply-To: References: Message-ID: > Please review this patch for adding more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. > > Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. 
Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Refactoring: use compiler_name(), LogStream ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25402/files - new: https://git.openjdk.org/jdk/pull/25402/files/2aabfa72..489d8eee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25402&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25402&range=02-03 Stats: 14 lines in 1 file changed: 0 ins; 8 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/25402.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25402/head:pull/25402 PR: https://git.openjdk.org/jdk/pull/25402 From shade at openjdk.org Tue Jun 3 18:52:45 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 18:52:45 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:47:37 GMT, Cesar Soares Lucas wrote: >> Please review this patch for adding more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. >> >> Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring: use compiler_name(), LogStream Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/25402#pullrequestreview-2893698653 From never at openjdk.org Tue Jun 3 19:22:22 2025 From: never at openjdk.org (Tom Rodriguez) Date: Tue, 3 Jun 2025 19:22:22 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:52:39 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357987 > - return List.of() from getAllMethods > - format javadoc and update test > - implement getAllMethods > - address reviewer feedback > - Add Support for Retrieving All Non-Static Methods of a ResolvedJavaType. Looks good. ------------- Marked as reviewed by never (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25498#pullrequestreview-2893798998 From duke at openjdk.org Tue Jun 3 19:38:19 2025 From: duke at openjdk.org (duke) Date: Tue, 3 Jun 2025 19:38:19 GMT Subject: RFR: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 17:52:39 GMT, Tom Shull wrote: >> Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. 
However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. >> >> To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. > > Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357987 > - return List.of() from getAllMethods > - format javadoc and update test > - implement getAllMethods > - address reviewer feedback > - Add Support for Retrieving All Non-Static Methods of a ResolvedJavaType. @teshull Your change (at version 606f36196a7bd12abfc76c55b141d712cc613f42) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25498#issuecomment-2936879630 From kvn at openjdk.org Tue Jun 3 19:39:20 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 3 Jun 2025 19:39:20 GMT Subject: RFR: 8357600: Patch nmethod flushing message to include more details [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:52:45 GMT, Cesar Soares Lucas wrote: >> Please review this patch for adding more details to nmethod flushing message. These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. >> >> Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring: use compiler_name(), LogStream Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25402#pullrequestreview-2893842755 From duke at openjdk.org Tue Jun 3 19:41:22 2025 From: duke at openjdk.org (Tom Shull) Date: Tue, 3 Jun 2025 19:41:22 GMT Subject: Integrated: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType In-Reply-To: References: Message-ID: On Wed, 28 May 2025 15:55:39 GMT, Tom Shull wrote: > Currently from ResolvedJavaType one can retrieve all declared methods, static methods, and constructors of the given type. However, internally in HotSpot there are also VM-internal methods, such as overpass methods, associated with a given type which we cannot access via the API. > > To correct this, we should add a new method which enables VM-internal methods, such as overpass methods, to be accessed. This pull request has now been integrated. Changeset: e235b61a Author: Tom Shull Committer: Doug Simon URL: https://git.openjdk.org/jdk/commit/e235b61a8bb70462921c09d197adc4b60267d327 Stats: 103 lines in 11 files changed: 102 ins; 0 del; 1 mod 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType Reviewed-by: dnsimon, yzheng, never ------------- PR: https://git.openjdk.org/jdk/pull/25498 From cslucas at openjdk.org Tue Jun 3 19:53:32 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 19:53:32 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported Message-ID: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue has proven a bit challenging, so this PR only addresses the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. Tested with JTREG tier1-3 and Renaissance on Linux x64.
------------- Commit messages: - Bailout of Conv2BNode::ideal on unknown input type. Changes: https://git.openjdk.org/jdk/pull/25627/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25627&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358534 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25627.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25627/head:pull/25627 PR: https://git.openjdk.org/jdk/pull/25627 From shade at openjdk.org Tue Jun 3 20:02:16 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 20:02:16 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. Looks good to me. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25627#pullrequestreview-2893932492 From shade at openjdk.org Tue Jun 3 20:21:24 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 3 Jun 2025 20:21:24 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. Wait, hold on. The rule is to wait for 24 hours for Hotspot changes and have at least 2 Reviewers. Unless the change is trivial. This one is simple, but not trivial. So stand by if anyone would ask to back it out. @vnkozlov, @TobiHartmann -- FYI, there was a process snag, keep an eye on testing. 
^^^ ------------- PR Comment: https://git.openjdk.org/jdk/pull/25627#issuecomment-2937056595 PR Comment: https://git.openjdk.org/jdk/pull/25627#issuecomment-2937060106 From cslucas at openjdk.org Tue Jun 3 20:21:24 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 20:21:24 GMT Subject: Integrated: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. This pull request has now been integrated. 
Changeset: 704b5990 Author: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/704b5990a750719ca927e156553db7982637e590 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/25627 From mdoerr at openjdk.org Tue Jun 3 20:53:21 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 3 Jun 2025 20:53:21 GMT Subject: RFR: 8354636: [PPC64] Clean up comments regarding frame manager In-Reply-To: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> References: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Message-ID: On Tue, 3 Jun 2025 14:29:49 GMT, Martin Doerr wrote: > Trivial comment cleanup: Replace "frame manager" by "template interpreter". Thanks for the review! GHA for Windows needs update. (Known issue.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25616#issuecomment-2937154838 From epeter at openjdk.org Tue Jun 3 21:10:38 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 3 Jun 2025 21:10:38 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: <2d4uleP_nUgeF02l9KHzJMsoBNLfp0IrXyoZnTm4CXY=.274f4ead-4028-458e-ade8-148a79d2f8c8@github.com> On Mon, 2 Jun 2025 12:14:48 GMT, Christian Hagedorn wrote: >> Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! >> >> I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > >> @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. 
I also added tests for it. >> >> These are issues on which we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) >> >> This is now ready for another review pass > > Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! > > I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) @chhagedorn Thank you very much! This was surely my biggest patch, and most deeply reviewed. An intense but rewarding experience. And I really learned so much from so many of the contributors here :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2937209219 From duke at openjdk.org Tue Jun 3 21:52:22 2025 From: duke at openjdk.org (duke) Date: Tue, 3 Jun 2025 21:52:22 GMT Subject: Withdrawn: 8344116: C2: remove slice parameter from LoadNode::make In-Reply-To: References: Message-ID: On Wed, 26 Mar 2025 15:18:25 GMT, Zihao Lin wrote: > This patch removes the slice parameter from LoadNode::make > > Mentioned in https://github.com/openjdk/jdk/pull/21834#pullrequestreview-2429164805 > > Hi team, I am new, I'd appreciate any guidance. Thanks a lot! This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/24258 From cslucas at openjdk.org Tue Jun 3 23:42:21 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 3 Jun 2025 23:42:21 GMT Subject: Integrated: 8357600: Patch nmethod flushing message to include more details In-Reply-To: References: Message-ID: On Thu, 22 May 2025 22:40:51 GMT, Cesar Soares Lucas wrote: > Please review this patch for adding more details to nmethod flushing message.
These details are particularly important when investigating interaction of JVMCI compiled code and code cache flushing heuristics. > > Tested on Linux x64 with JTREG tier1-3 using fastdebug and release builds. This pull request has now been integrated. Changeset: 23450651 Author: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/2345065166c56a958365a6362af356e7c95fcaff Stats: 13 lines in 1 file changed: 9 ins; 0 del; 4 mod 8357600: Patch nmethod flushing message to include more details Reviewed-by: shade, kvn ------------- PR: https://git.openjdk.org/jdk/pull/25402 From kvn at openjdk.org Wed Jun 4 00:57:20 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 00:57:20 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Testing mostly passed. Only macOS-aarch64 left (linux-aarch64 passed). I think it is fine to integrate without waiting. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24972#pullrequestreview-2894516007 From fyang at openjdk.org Wed Jun 4 01:40:22 2025 From: fyang at openjdk.org (Fei Yang) Date: Wed, 4 Jun 2025 01:40:22 GMT Subject: RFR: 8358105: RISC-V: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Thu, 29 May 2025 11:05:04 GMT, Anjian Wen wrote: > The reason of this patch is same as the x86 and aarch64 but for riscv > [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) > [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > >> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. Thanks. My local tier1-2 test on linux-riscv64 is clean. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25520#pullrequestreview-2894576442 From wenanjian at openjdk.org Wed Jun 4 01:56:22 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Wed, 4 Jun 2025 01:56:22 GMT Subject: RFR: 8358105: RISC-V: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:27:39 GMT, Feilong Jiang wrote: >> The reason of this patch is same as the x86 and aarch64 but for riscv >> [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) >> [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) >> >>> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. 
I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. > > Looks good! @feilongjiang @RealFYang Thanks for your review and comments! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25520#issuecomment-2938043407 From duke at openjdk.org Wed Jun 4 01:56:22 2025 From: duke at openjdk.org (duke) Date: Wed, 4 Jun 2025 01:56:22 GMT Subject: RFR: 8358105: RISC-V: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Thu, 29 May 2025 11:05:04 GMT, Anjian Wen wrote: > The reason of this patch is same as the x86 and aarch64 but for riscv > [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) > [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > >> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. @Anjian-Wen Your change (at version 3dcba0d22bb3747b6ab3590c42a0b07e80a3555b) is now ready to be sponsored by a Committer.
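For a sense of scale behind the quoted overflow argument, here is a rough back-of-the-envelope sketch. The class name, the helper `yearsToOverflow`, and the rate of one billion increments per second are assumptions chosen only for illustration; nothing here is taken from the patch or from measurements.

```java
public class CounterOverflowEstimate {
    // Average length of a year in seconds (365.25 days).
    private static final double SECONDS_PER_YEAR = 365.25 * 24 * 3600;

    // Years of continuous incrementing needed to wrap a counter with the
    // given number of bits, at a constant (assumed) increment rate.
    static double yearsToOverflow(int bits, double incrementsPerSecond) {
        return Math.pow(2.0, bits) / incrementsPerSecond / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        double assumedRate = 1e9; // hypothetical: one increment per nanosecond
        System.out.printf("32-bit counter: %.3g years%n", yearsToOverflow(32, assumedRate));
        System.out.printf("64-bit counter: %.3g years%n", yearsToOverflow(64, assumedRate));
    }
}
```

Under this assumed rate the 32-bit counter wraps within seconds, while the 64-bit one needs centuries. A faster assumed rate shrinks the second figure but it stays far beyond any realistic process lifetime.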
------------- PR Comment: https://git.openjdk.org/jdk/pull/25520#issuecomment-2938044093 From wenanjian at openjdk.org Wed Jun 4 02:06:21 2025 From: wenanjian at openjdk.org (Anjian Wen) Date: Wed, 4 Jun 2025 02:06:21 GMT Subject: Integrated: 8358105: RISC-V: Optimize interpreter profile updates In-Reply-To: References: Message-ID: On Thu, 29 May 2025 11:05:04 GMT, Anjian Wen wrote: > The reason of this patch is same as the x86 and aarch64 but for riscv > [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946) > [JDK-8357223](https://bugs.openjdk.org/browse/JDK-8357223) > >> First, we carry the implementation for counter decrements without using them. This is dead code, and can be purged. Second, we care about overflows for 64-bit for some reason. I think this is a reminiscent of 32-bit x86 support, where we can plausibly have 32-bit counter overflow in a reasonable timeframe. But for 64-bit counter, we need tens of years of constantly bashing the counter to get it to overflow. No other profile counter update code, e.g. in C1, cares about this. This pull request has now been integrated. Changeset: 939521b8 Author: Anjian Wen Committer: Feilong Jiang URL: https://git.openjdk.org/jdk/commit/939521b8e4120357108220d177228b683af3334f Stats: 33 lines in 2 files changed: 0 ins; 21 del; 12 mod 8358105: RISC-V: Optimize interpreter profile updates Reviewed-by: fjiang, fyang ------------- PR: https://git.openjdk.org/jdk/pull/25520 From vlivanov at openjdk.org Wed Jun 4 03:25:22 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 4 Jun 2025 03:25:22 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:38:09 GMT, Aleksey Shipilev wrote: >> Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). 
The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. >> >> Also tidied up some comments around it. >> >> Additional testing; >> - [x] Linux x86_64 server fastdebug, `tier1 tier2` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Stale comment > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Fix Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25343#pullrequestreview-2894916789 From kvn at openjdk.org Wed Jun 4 04:08:24 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 04:08:24 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3] In-Reply-To: References: Message-ID: <7wXlPpo5ZSbeh3RmNIKUQaum4UIAg-pINoVUixEFRvw=.53242223-85cb-426d-bd45-9b846ce472aa@github.com> On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix All my testing passed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24972#issuecomment-2938390939 From jkarthikeyan at openjdk.org Wed Jun 4 04:28:38 2025 From: jkarthikeyan at openjdk.org (Jasmine Karthikeyan) Date: Wed, 4 Jun 2025 04:28:38 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: References: Message-ID: > Hi all, > This patch fixes cases in SuperWord when compiling subword types where vectorized code would be given a narrower type than expected, leading to miscompilation due to truncation. This fix is a generalization of the same fix applied for `Integer.reverseBytes` in [JDK-8305324](https://bugs.openjdk.org/browse/JDK-8305324). The patch introduces a check for nodes that are known to tolerate truncation, so that any future cases of subword truncation will avoid creating miscompiled code. > > The patch reuses the existing logic to set the type of the vectors to int, which currently disables vectorization for the affected patterns entirely. Once [JDK-8342095](https://bugs.openjdk.org/browse/JDK-8342095) is merged and automatic casting support is added the autovectorizer should automatically insert casts to and from int, maintaining correctness. > > I've added an IR test that checks for correctly compiled outputs. Thoughts and reviews would be appreciated! 
Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision: Reformat, add comments and char tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25440/files - new: https://git.openjdk.org/jdk/pull/25440/files/8d1a8174..da692994 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25440&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25440&range=00-01 Stats: 144 lines in 2 files changed: 136 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25440.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25440/head:pull/25440 PR: https://git.openjdk.org/jdk/pull/25440 From jkarthikeyan at openjdk.org Wed Jun 4 04:28:39 2025 From: jkarthikeyan at openjdk.org (Jasmine Karthikeyan) Date: Wed, 4 Jun 2025 04:28:39 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: References: Message-ID: On Mon, 26 May 2025 07:56:17 GMT, Quan Anh Mai wrote: >> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision: >> >> Reformat, add comments and char tests > > src/hotspot/share/opto/superword.cpp line 2496: > >> 2494: int opc = in->Opcode(); >> 2495: return opc == Op_AddI || opc == Op_SubI || opc == Op_MulI || opc == Op_AndI || opc == Op_OrI || opc == Op_XorI >> 2496: || opc == Op_ReverseBytesS || opc == Op_ReverseBytesUS; > > Are you sure? I don't think you can truncate a `ReverseByteS` to a `byte`. This is a good observation, thank you! I've fixed it so that it checks for short/char in this case. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125605047 From jkarthikeyan at openjdk.org Wed Jun 4 04:28:39 2025 From: jkarthikeyan at openjdk.org (Jasmine Karthikeyan) Date: Wed, 4 Jun 2025 04:28:39 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 07:33:36 GMT, Emanuel Peter wrote: >> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision: >> >> Reformat, add comments and char tests > > src/hotspot/share/opto/superword.cpp line 2553: > >> 2551: const Type* vt = vtn; >> 2552: int op = in->Opcode(); >> 2553: if (!can_subword_truncate(in)) { > > It seems `can_subword_truncate` does not cover `VectorNode::is_shift_opcode`, is that correct? Maybe we are missing IR tests that catch this, scary! In this case since the condition is negated, the old condition should still be true since it falls through to the `return false` at the end of the function. Previously, it checked for a small list of nodes as requiring truncation handling which allows nodes that were not whitelisted to slip through and produce incorrect code. Now, we check for a larger group of nodes that we know do not need handling for truncation, so that any nodes not whitelisted will still produce valid code, but will not vectorize (until #23413). > test/hotspot/jtreg/compiler/vectorization/TestSubwordTruncation.java line 64: > >> 62: >> 63: // Shorts >> 64: > > Suggestion: > > > Nit: you don't have a similar comment for other types, so just drop it here too ;) Further in the file I used `// Bytes` to separate the byte methods, as well as the new char methods I added in the new commit. 
I think it helps navigating the file at a glance, at least in my emacs editor :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125606868 PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125607475 From jkarthikeyan at openjdk.org Wed Jun 4 04:34:15 2025 From: jkarthikeyan at openjdk.org (Jasmine Karthikeyan) Date: Wed, 4 Jun 2025 04:34:15 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short In-Reply-To: References: Message-ID: On Wed, 28 May 2025 07:46:12 GMT, Emanuel Peter wrote: >> Hi all, >> This patch fixes cases in SuperWord when compiling subword types where vectorized code would be given a narrower type than expected, leading to miscompilation due to truncation. This fix is a generalization of the same fix applied for `Integer.reverseBytes` in [JDK-8305324](https://bugs.openjdk.org/browse/JDK-8305324). The patch introduces a check for nodes that are known to tolerate truncation, so that any future cases of subword truncation will avoid creating miscompiled code. >> >> The patch reuses the existing logic to set the type of the vectors to int, which currently disables vectorization for the affected patterns entirely. Once [JDK-8342095](https://bugs.openjdk.org/browse/JDK-8342095) is merged and automatic casting support is added the autovectorizer should automatically insert casts to and from int, maintaining correctness. >> >> I've added an IR test that checks for correctly compiled outputs. Thoughts and reviews would be appreciated! > > And just for good measure: should we also add tests for `char`? Thanks a lot for your review @eme64! I've pushed a commit that should address the reviews, and fix the GHA failures. I've added char tests as well. Regarding long operations, I don't think it's possible for this code path to encounter them. 
Earlier in the function, this logic is guarded with `vtn->basic_type() == T_INT` so I think only integer nodes need to be added to the list. Let me know what you think! @chhagedorn Thanks for the reminder! It might be good to run some testing so that we can get it tested and reviewed before the RDP1 cutoff. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25440#issuecomment-2938450410 From jkarthikeyan at openjdk.org Wed Jun 4 04:34:16 2025 From: jkarthikeyan at openjdk.org (Jasmine Karthikeyan) Date: Wed, 4 Jun 2025 04:34:16 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: <9d8EW1k2YAwyyeLvIG5Fnqpjx-3PdrnBq_bildM8jsE=.a3fd55f7-5586-4302-8a14-c2d251cf6fe4@github.com> References: <9d8EW1k2YAwyyeLvIG5Fnqpjx-3PdrnBq_bildM8jsE=.a3fd55f7-5586-4302-8a14-c2d251cf6fe4@github.com> Message-ID: On Wed, 28 May 2025 07:41:09 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/superword.cpp line 2496: >> >>> 2494: int opc = in->Opcode(); >>> 2495: return opc == Op_AddI || opc == Op_SubI || opc == Op_MulI || opc == Op_AndI || opc == Op_OrI || opc == Op_XorI >>> 2496: || opc == Op_ReverseBytesS || opc == Op_ReverseBytesUS; >> >> A switch might look nicer here, and be easier to extend later on ;) > > This list is a little scary... how do we know that we have all cases in it, and we are not getting regressions because we are missing some? A switch is a good idea, it'll definitely make the code easier to read. I've made the change in the recent commit. As for the cases, I ended up running the compiler unit tests and modifying the list until there were no test failures. Since we check for nodes that do not need truncation handling, any other nodes will automatically default to require truncation handling and fall back to not vectorizing. 
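To illustrate why some int operations tolerate subword truncation and others do not, here is a minimal Java sketch. The "narrow" methods are a hypothetical model of what an 8-bit vector lane would compute; they are an assumption for illustration and not the code C2 emits, and the class and method names are invented here.

```java
public class SubwordTruncationDemo {
    // Java semantics: operands are promoted to int, then the result is narrowed.
    static byte addWide(byte a, byte b) {
        return (byte) (a + b);
    }

    // Assumed model of an 8-bit lane: only the low 8 bits take part.
    static byte addNarrow(byte a, byte b) {
        return (byte) (((a & 0xFF) + (b & 0xFF)) & 0xFF);
    }

    // Java semantics: leading zeros are counted on the 32-bit promoted value.
    static byte nlzWide(byte a) {
        return (byte) Integer.numberOfLeadingZeros(a);
    }

    // Assumed model of an 8-bit lane: leading zeros counted within 8 bits.
    static byte nlzNarrow(byte a) {
        int v = a & 0xFF;
        return (byte) (v == 0 ? 8 : Integer.numberOfLeadingZeros(v) - 24);
    }

    // Returns true if add gives the same result in both models for all inputs.
    static boolean addAgreesEverywhere() {
        for (int a = Byte.MIN_VALUE; a <= Byte.MAX_VALUE; a++) {
            for (int b = Byte.MIN_VALUE; b <= Byte.MAX_VALUE; b++) {
                if (addWide((byte) a, (byte) b) != addNarrow((byte) a, (byte) b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("add agrees for all byte pairs: " + addAgreesEverywhere());
        System.out.println("nlz wide(1)   = " + nlzWide((byte) 1));
        System.out.println("nlz narrow(1) = " + nlzNarrow((byte) 1));
    }
}
```

In this model, addition depends only on the low bits, so narrowing the inputs first is harmless. `numberOfLeadingZeros` depends on the high bits of the promoted int, which is why an operation like it has to stay off a truncation-tolerant list.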
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125610158 From bulasevich at openjdk.org Wed Jun 4 04:40:08 2025 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 4 Jun 2025 04:40:08 GMT Subject: RFR: 8358183: [JVMCI] crash accessing nmethod::jvmci_name in CodeCache::aggregate Message-ID: Zero out _mutable_data_size, _relocation_size and _metadata_size in purge() so that after purge jvmci_data_size() returns 0 and print_heapinfo() won't touch an invalid _metadata. ------------- Commit messages: - 8358183: [JVMCI] crash accessing nmethod::jvmci_name in CodeCache::aggregate Changes: https://git.openjdk.org/jdk/pull/25608/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25608&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358183 Stats: 3 lines in 2 files changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25608/head:pull/25608 PR: https://git.openjdk.org/jdk/pull/25608 From epeter at openjdk.org Wed Jun 4 05:46:22 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 05:46:22 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 04:28:38 GMT, Jasmine Karthikeyan wrote: >> Hi all, >> This patch fixes cases in SuperWord when compiling subword types where vectorized code would be given a narrower type than expected, leading to miscompilation due to truncation. This fix is a generalization of the same fix applied for `Integer.reverseBytes` in [JDK-8305324](https://bugs.openjdk.org/browse/JDK-8305324). The patch introduces a check for nodes that are known to tolerate truncation, so that any future cases of subword truncation will avoid creating miscompiled code.
>> >> The patch reuses the existing logic to set the type of the vectors to int, which currently disables vectorization for the affected patterns entirely. Once [JDK-8342095](https://bugs.openjdk.org/browse/JDK-8342095) is merged and automatic casting support is added the autovectorizer should automatically insert casts to and from int, maintaining correctness. >> >> I've added an IR test that checks for correctly compiled outputs. Thoughts and reviews would be appreciated! > > Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision: > > Reformat, add comments and char tests @jaskarth Thanks for the updates! I'll run some testing now :) src/hotspot/share/opto/superword.cpp line 2519: > 2517: > 2518: // Default to disallowing vector truncation > 2519: return false; I was wondering: We could have an assert here that lists all operations that cannot be truncated? So if a new operation is added, then we will catch that it is not handled here yet, and we can add tests, and either allow it to truncate, or add it to the list of non-truncatable operations. src/hotspot/share/opto/superword.cpp line 2579: > 2577: Node* load = in->in(1); > 2578: // For certain operations such as shifts and abs(), use the size of the load if it exists > 2579: if ((VectorNode::is_shift_opcode(op) || op == Op_AbsI) && load->is_Load() && Can you say a little more about this? What about `Op_ReverseBytesI`, did that not previously also get through here? 
------------- PR Review: https://git.openjdk.org/jdk/pull/25440#pullrequestreview-2895284484 PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125689822 PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125694632 From epeter at openjdk.org Wed Jun 4 05:46:23 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 05:46:23 GMT Subject: RFR: 8350177: C2 SuperWord: Integer.numberOfLeadingZeros, numberOfTrailingZeros, reverse and bitCount have input types wrongly turncated for byte and short [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 05:37:09 GMT, Emanuel Peter wrote: >> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision: >> >> Reformat, add comments and char tests > > src/hotspot/share/opto/superword.cpp line 2519: > >> 2517: >> 2518: // Default to disallowing vector truncation >> 2519: return false; > > I was wondering: > We could have an assert here that lists all operations that cannot be truncated? > So if a new operation is added, then we will catch that it is not handled here yet, and we can add tests, and either allow it to truncate, or add it to the list of non-truncatable operations. > Earlier in the function, this logic is guarded with vtn->basic_type() == T_INT so I think only integer nodes need to be added to the list. Let me know what you think! That sounds reasonable. And that would mean we only have to add `int` operation to that assert. And if anybody ever relaxes that `vtn->basic_type() == T_INT` check, then they would immediately run into that assert. Would be nice. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25440#discussion_r2125696763 From shade at openjdk.org Wed Jun 4 06:05:26 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 06:05:26 GMT Subject: RFR: 8357434: x86: Simplify Interpreter::profile_taken_branch [v3] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 09:38:09 GMT, Aleksey Shipilev wrote: >> Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. >> >> Also tidied up some comments around it. >> >> Additional testing; >> - [x] Linux x86_64 server fastdebug, `tier1 tier2` >> - [x] Linux x86_64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Stale comment > - Merge branch 'master' into JDK-8357434-x86-profile-taken > - Fix Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25343#issuecomment-2938682685 From shade at openjdk.org Wed Jun 4 06:05:27 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 06:05:27 GMT Subject: Integrated: 8357434: x86: Simplify Interpreter::profile_taken_branch In-Reply-To: References: Message-ID: On Wed, 21 May 2025 08:23:14 GMT, Aleksey Shipilev wrote: > Noticed that `Interpreter::profile_taken_branch` has the same `sbbptr` pattern we have eliminated with [JDK-8356946](https://bugs.openjdk.org/browse/JDK-8356946). The same logic applies here: the counter is 64-bit, never practically overflows, and no other code cares about it. > > Also tidied up some comments around it. 
> > Additional testing; > - [x] Linux x86_64 server fastdebug, `tier1 tier2` > - [x] Linux x86_64 server fastdebug, `all` This pull request has now been integrated. Changeset: b918dc84 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/b918dc84ec8364321a5a6d9f6835edcb1d9ad62f Stats: 17 lines in 3 files changed: 0 ins; 12 del; 5 mod 8357434: x86: Simplify Interpreter::profile_taken_branch Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/25343 From rehn at openjdk.org Wed Jun 4 06:07:26 2025 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 4 Jun 2025 06:07:26 GMT Subject: RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v8] In-Reply-To: References: <5e1o1xtN0ZdQZGJi2aVmgCEApW625koeE9F53VhDi5E=.2390045d-844e-4800-8d4b-075a2a3a8793@github.com> Message-ID: On Mon, 5 May 2025 18:10:02 GMT, Yuri Gaevsky wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> change slli+add sequence to shadd > > As you can expect I am trying to implement the following code with RVV: > > for (; i + (N-1) < cnt; i += N) { > h = 31^^N * h > + 31^^(N-1) * val[i + 0] > + 31^^(N-2) * val[i + 1] > ... > + 31^^1 * val[i + (N-2)] > + 31^^0 * val[i + (N-1)]; > } > for (; i < cnt; i++) { > h = 31 * h + val[i]; > } > > where `N` is a number of processing array elements in "chunk". > IIUC, the main issue with your approach is "reverse" order of array elements versus preloaded `31^^X` coeffs WHEN the remaining number of elems is less than `N`, say `M=N-1`. > > h = 31^^M * h > + 31^^(M-1) * val[i + 0] > + 31^^(M-2) * val[i + 1] > ... > + 31^^1 * val[i + (M-2)] > + 31^^0 * val[i + (M-1)]; > > or returning to our `N` for clarity > > h = 31^^(N-1) * h > + 31^^(N-2) * val[i + 0] > + 31^^(N-3) * val[i + 1] > ...
> + 31^^1 * val[i + (N-3)] > + 31^^0 * val[i + (N-2)]; > > Now we need to "slide down" preloaded multiplier coeffs in designated vector register by one (as `M=N-1`) to be in "sync" with `val[i + X]` (may be move them into temporary VR in the process), and moreover, DO this operation IFF the remaining `cnt` is less than `N` (==>an additional check on every iteration). That's probably acceptable only at tail phase as one-time operation but NOT inside of main loop... @ygaevsky @RealFYang how can we proceed? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-2938689119 From epeter at openjdk.org Wed Jun 4 06:09:21 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:09:21 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: On Sun, 1 Jun 2025 17:26:07 GMT, Jatin Bhateja wrote: >> This is a follow-up PR#22755 to improve Float16 operations inferencing. >> >> The existing scheme to detect Float16 operations for some operations is based on pattern matching which expects to receive inputs through ConvHF2F IR, this patch extends matching to accept constant floating point inputs within the Float16 value range. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Extending tests and review resolutions @jatin-bhateja Thanks for the updates! I have a first batch of comments about the test :) https://github.com/openjdk/jdk/pull/24179#discussion_r2111355331 Here I asked for this: > And: your pattern matching allows the constant to be lhs or rhs, so you should add corresponding tests. You commented "Done." underneath. Where did you add these tests exactly?
Because I only see these patterns: res += Float.floatToFloat16(RANDOM1_VAR.floatValue() + RANDOM2.floatValue()); and not these res += Float.floatToFloat16(RANDOM2.floatValue() + RANDOM1_VAR.floatValue()); (except for maybe a single case where one was flipped, see question below) test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 63: > 61: > 62: private static Generator genF = G.uniformFloats(0.0f, 70000.0f); > 63: private static Generator genHF = G.uniformFloat16s(Float.floatToFloat16(-2000.0f), Float.floatToFloat16(2000.0f)); Is there a good reason to only take the uniform distribution? https://github.com/openjdk/jdk/blob/4a491bef6636441f14fc8bbdedf65063fce038bd/test/hotspot/jtreg/compiler/lib/generators/Generators.java#L102-L105 test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 335: > 333: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() - INEXACT_FP16); > 334: res += Float.floatToFloat16(INEXACT_FP16 * POSITIVE_ZERO_VAR.floatValue()); > 335: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() / INEXACT_FP16); Why is the mul case flipped here? test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 363: > 361: @Check(test="testSNaNFP16ConstantPatterns") > 362: public void checkSNaNFP16ConstantPatterns(short actual) throws Exception { > 363: TestFramework.deoptimize(TestFloat16ScalarOperations.class.getMethod("testSNaNFP16ConstantPatterns")); Oh wow, I have never seen this pattern used. Cool idea! Do you know what impact this has on test runtime? test/hotspot/jtreg/compiler/lib/generators/Generators.java line 622: > 620: > 621: /** > 622: * Fill the array with shorts using the distribution of nextDouble. Suggestion: * Fill the array with shorts using the distribution of the generator. There are actually a few other cases that seem to be wrong in this file. Would you mind changing them? 
------------- PR Review: https://git.openjdk.org/jdk/pull/24179#pullrequestreview-2895308949 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125709341 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125713317 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125718345 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125706470 From epeter at openjdk.org Wed Jun 4 06:09:21 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:09:21 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: On Wed, 4 Jun 2025 05:54:36 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Extending tests and review resolutions > > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 63: > >> 61: >> 62: private static Generator genF = G.uniformFloats(0.0f, 70000.0f); >> 63: private static Generator genHF = G.uniformFloat16s(Float.floatToFloat16(-2000.0f), Float.floatToFloat16(2000.0f)); > > Is there a good reason to only take the uniform distribution? > > https://github.com/openjdk/jdk/blob/4a491bef6636441f14fc8bbdedf65063fce038bd/test/hotspot/jtreg/compiler/lib/generators/Generators.java#L102-L105 What about `NaN` and `infty` etc? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125711380 From epeter at openjdk.org Wed Jun 4 06:33:23 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:33:23 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: On Sun, 1 Jun 2025 17:26:07 GMT, Jatin Bhateja wrote: >> This is a follow-up PR#22755 to improve Float16 operations inferencing. >> >> The existing scheme to detect Float16 operations for some operations is based on pattern matching which expects to receive inputs through ConvHF2F IR, this patch extends matching to accept constant floating point inputs within the Float16 value range. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Extending tests and review resolutions And some comments about the VM code :) Looks like we are making good progress here, thanks again for all the work you put in! src/hotspot/share/opto/convertnode.cpp line 281: > 279: conF = binopF->in(2); > 280: varS = binopF->in(1)->in(1); > 281: } Suggestion: if (Float16NodeFactory::is_float32_binary_oper(in(1)->Opcode())) { Node* binopF = in(1); // Check if the incoming binary operation has one floating point constant // input and the other input is a half precision to single precision upcasting node. // We land here because a prior HalfFloat to Float conversion promotes // an integral constant holding Float16 value to a floating point constant. // i.e. 
ConvHF2F ConI(short) => ConF Node* conF = nullptr; Node* varS = nullptr; if (binopF->in(1)->is_Con() && binopF->in(2)->Opcode() == Op_ConvHF2F) { conF = binopF->in(1); varS = binopF->in(2)->in(1); } else if (binopF->in(2)->is_Con() && binopF->in(1)->Opcode() == Op_ConvHF2F) { conF = binopF->in(2); varS = binopF->in(1)->in(1); } I think it is better to have the variables just before they are assigned. They are not needed in the scope outside the if at the top here anyway. src/hotspot/share/opto/convertnode.cpp line 294: > 292: // Conditions under which floating point constant can be considered for a pattern match. > 293: // 1. Constant must lie within Float16 value range, this will ensure that > 294: // we don't unintentially round off float constant to enforce a pattern match. What do you mean by `enforce a pattern match`? Are you just trying to say that we have to be careful with the pattern matching here, and we cannot just round off the float constant? Do you have an example where that rounding would lead to issues? src/hotspot/share/opto/convertnode.cpp line 302: > 300: // results into a quiet NaN but preserves the significand bits of signaling NaN. > 301: // c. Pattern being matched includes a Float to Float16 conversion after binary > 302: // expression, this downcast will still preserve significand bits of binary32 NaN. Suggestion: // 2. If a constant value is one of the valid IEEE 754 binary32 NaN bit patterns // then it's safe to consider it for pattern match because of the following reasons: // a. As per section 2.8 of JVMS, Java Virtual Machine does not support // signaling NaN value. // b. Any signaling NaN which takes part in a non-comparison expression // results in a quiet NaN but preserves the significand bits of signaling NaN. // c. The pattern being matched includes a Float to Float16 conversion after binary // expression, this downcast will still preserve the significand bits of binary32 NaN. 
src/hotspot/share/opto/convertnode.cpp line 304: > 302: // expression, this downcast will still preserve significand bits of binary32 NaN. > 303: bool isnan = ((*reinterpret_cast(&con) & 0x7F800000) == 0x7F800000) && > 304: ((*reinterpret_cast(&con) & 0x7FFFFF) != 0); Why are you hand-crafting this check here? Is there not some predefined function to do this check? ------------- Changes requested by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24179#pullrequestreview-2895350075 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125731408 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125737423 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125741224 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125743503 From epeter at openjdk.org Wed Jun 4 06:33:26 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:33:26 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: On Wed, 4 Jun 2025 06:12:41 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Extending tests and review resolutions > > src/hotspot/share/opto/convertnode.cpp line 281: > >> 279: conF = binopF->in(2); >> 280: varS = binopF->in(1)->in(1); >> 281: } > > Suggestion: > > if (Float16NodeFactory::is_float32_binary_oper(in(1)->Opcode())) { > Node* binopF = in(1); > // Check if the incoming binary operation has one floating point constant > // input and the other input is a half precision to single precision upcasting node. > // We land here because a prior HalfFloat to Float conversion promotes > // an integral constant holding Float16 value to a floating point constant. > // i.e. 
ConvHF2F ConI(short) => ConF > Node* conF = nullptr; > Node* varS = nullptr; > if (binopF->in(1)->is_Con() && binopF->in(2)->Opcode() == Op_ConvHF2F) { > conF = binopF->in(1); > varS = binopF->in(2)->in(1); > } else if (binopF->in(2)->is_Con() && binopF->in(1)->Opcode() == Op_ConvHF2F) { > conF = binopF->in(2); > varS = binopF->in(1)->in(1); > } > > I think it is better to have the variables just before they are assigned. They are not needed in the scope outside the if at the top here anyway. You make it sound like this is the only way we get here: // We land here because a prior HalfFloat to Float conversion promotes // an integral constant holding Float16 value to a floating point constant. // i.e. ConvHF2F ConI(short) => ConF Could this pattern not be created directly with Java code? So maybe rephrase it to "For example, e.g."? > src/hotspot/share/opto/convertnode.cpp line 304: > >> 302: // expression, this downcast will still preserve significand bits of binary32 NaN. >> 303: bool isnan = ((*reinterpret_cast(&con) & 0x7F800000) == 0x7F800000) && >> 304: ((*reinterpret_cast(&con) & 0x7FFFFF) != 0); > > Why are you hand-crafting this check here? Is there not some predefined function to do this check? Does `g_isnan` not work here? 
If not, add a comment why :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125733195 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2125753763 From epeter at openjdk.org Wed Jun 4 06:38:22 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:38:22 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> References: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> Message-ID: On Tue, 3 Jun 2025 17:28:07 GMT, Jatin Bhateja wrote: >> Thanks, encoding logic is concentrated in integral instruction tests and is shared with corresponding long variants, extended APX coverage for BLS/R/MSK. > >> @jatin-bhateja Thanks for looking into this! >> >> `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` >> >> The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? >> >> Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? >> >> A code comment would be helpful. > > We are tightening the predicate check so that under no circumstances we pick this pattern during the reduction phase of instruction selection on account of having lower cost. There is a generic pattern (xorI_rReg_imm) for all integral immediate values, and then there is a special pattern for Xor with -1 (fxorI_rReg_im1), which is needed for AndN inferencing. @jatin-bhateja I'll wait with testing, until someone from Intel gives this the approval. 
Feel free to ping me for that once we are there :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2938775593 From epeter at openjdk.org Wed Jun 4 06:38:25 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:38:25 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:07:34 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. >> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolutions src/hotspot/cpu/x86/x86_64.ad line 11326: > 11324: instruct xorL_rReg_imm(rRegL dst, immL32 src, rFlagsReg cr) > 11325: %{ > 11326: predicate(!UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); Could you add a similar comment here, like for all the others, please :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2125760840 From rrich at openjdk.org Wed Jun 4 06:44:25 2025 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 4 Jun 2025 06:44:25 GMT Subject: RFR: 8354636: [PPC64] Clean up comments regarding frame manager In-Reply-To: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> References: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Message-ID: On Tue, 3 Jun 2025 14:29:49 GMT, Martin Doerr wrote: > Trivial comment cleanup: Replace "frame manager" by "template interpreter". Looks good. Thanks, Richard. ------------- Marked as reviewed by rrich (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25616#pullrequestreview-2895426967 From epeter at openjdk.org Wed Jun 4 06:49:25 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 06:49:25 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: <6F6gsbyRSrJ7_XHXoMh8j15Mog2DMec5DaOCVAdcdFQ=.0799c52f-275c-4303-8bd8-05f341c20ae0@github.com> On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. 
Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. @JohnTortugo If you integrate before 24h, you need to explicitly say that it is trivial, and the reviewer needs to agree. We in europe were sleeping and did not even have a chance to look at it. src/hotspot/share/opto/convertnode.cpp line 86: > 84: return nullptr; > 85: } > 86: @JohnTortugo Would it not have been better to put this check inside the `else` branch? ------------- PR Review: https://git.openjdk.org/jdk/pull/25627#pullrequestreview-2895442512 PR Review Comment: https://git.openjdk.org/jdk/pull/25627#discussion_r2125783037 From thartmann at openjdk.org Wed Jun 4 06:58:27 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 4 Jun 2025 06:58:27 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <6F6gsbyRSrJ7_XHXoMh8j15Mog2DMec5DaOCVAdcdFQ=.0799c52f-275c-4303-8bd8-05f341c20ae0@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> <6F6gsbyRSrJ7_XHXoMh8j15Mog2DMec5DaOCVAdcdFQ=.0799c52f-275c-4303-8bd8-05f341c20ae0@github.com> Message-ID: On Wed, 4 Jun 2025 06:45:50 GMT, Emanuel Peter wrote: >> `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. >> >> Tested with JTREG tier1-3 and Renaissance on Linux x64. 
> > src/hotspot/share/opto/convertnode.cpp line 86: > >> 84: return nullptr; >> 85: } >> 86: > > @JohnTortugo Would it not have been better to put this check inside the `else` branch? I agree, the `return nullptr;` should have been added below the assert in the else branch, no check required. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25627#discussion_r2125814466 From thartmann at openjdk.org Wed Jun 4 06:58:24 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 4 Jun 2025 06:58:24 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. Also, please don't file JBS issues (without a subcomponent) and integrate them directly. The component triaging teams should at least get a chance to properly triage the issue and set priority etc. Especially when getting close to the rampdown phases, this is required to determine if an issue is even eligible, potentially only with approval, to be fixed in the current release or needs to be deferred. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25627#issuecomment-2938828992 From thartmann at openjdk.org Wed Jun 4 07:03:20 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 4 Jun 2025 07:03:20 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> Message-ID: On Tue, 3 Jun 2025 19:48:37 GMT, Cesar Soares Lucas wrote: > `Conv2BNode::ideal` segfaults in release builds when the type of `in(1)` is not INT or PTR. Creating a small test case to reproduce the issue is being a bit challenging so this PR only address the issue by bailing out of the method if the input type is unsupported. This other ticket https://bugs.openjdk.org/browse/JDK-8357885 will address creating a regression test for the problem. > > Tested with JTREG tier1-3 and Renaissance on Linux x64. I think this is ok for now, assuming that [JDK-8357885](https://bugs.openjdk.org/browse/JDK-8357885) will clean this up with a full fix and a regression test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25627#issuecomment-2938849202 From jbhateja at openjdk.org Wed Jun 4 07:10:16 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 07:10:16 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> References: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> Message-ID: On Tue, 3 Jun 2025 17:28:07 GMT, Jatin Bhateja wrote: >> Thanks, encoding logic is concentrated in integral instruction tests and is shared with corresponding long variants, extended APX coverage for BLS/R/MSK. > >> @jatin-bhateja Thanks for looking into this! 
>> >> `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` >> >> The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? >> >> Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? >> >> A code comment would be helpful. > > We are tightening the predicate check so that under no circumstances we pick this pattern during the reduction phase of instruction selection on account of having lower cost. There is a generic pattern (xorI_rReg_imm) for all integral immediate values, and then there is a special pattern for Xor with -1 (fxorI_rReg_im1), which is needed for AndN inferencing. > @jatin-bhateja I'll wait with testing, until someone from Intel gives this the approval. Feel free to ping me for that once we are there :) Hi @eme64 , I am in the process of updating this version with some more changes, please hold off on your test runs for a while :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2938870073 From epeter at openjdk.org Wed Jun 4 07:28:16 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 07:28:16 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 14:59:23 GMT, Roland Westrelin wrote: > In the example above, the CastPPs are the bases. Aaaah, ok now it makes a little more sense to me :) > > Maybe some more full IR snippets could be helpful, maybe even IGV drawings. But that may be more work for you. > > I rarely use the IGV so, yeah, that would be more work. Then what about just the dump of the relevant IR nodes in text form? That is what I meant by `full IR snippets` ;) Is there any (reasonable) way to push the `CastPP` through the `AddP` here? I guess that may mean duplicating some `AddP` in some cases...
But it could also give an opportunity for the `CastPP` to common further up that way. What do you think? It is hard for me to see through it without looking at some examples of the IR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25386#issuecomment-2938917626 From roland at openjdk.org Wed Jun 4 07:32:20 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 4 Jun 2025 07:32:20 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> Message-ID: On Tue, 27 May 2025 08:17:38 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/opto/escape.cpp line 4804: > >> 4802: assert(n->is_Initialize(), "We only push projections of Initialize"); >> 4803: if (use->as_Proj()->_con == TypeFunc::Memory) { // Ignore precedent edge >> 4804: memnode_worklist.append_if_missing(use); > > Do you know why we are using a `GrowableArray` here? Would a `UnikeNodeList` not serve us better since we are always doing `append_if_missing`, which essentially has to scan the whole `GrowableArray`? It's not clear to me. I filed: https://bugs.openjdk.org/browse/JDK-8358560 as a follow up. 
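A minimal sketch of the cost difference raised in the question above, written with standard C++ containers rather than HotSpot's own classes. `ScanningWorklist` and `UniqueWorklist` are hypothetical names for illustration only; they are not the `GrowableArray` API or any existing HotSpot type, and `int` stands in for a node pointer.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical worklist that scans all entries on every append,
// similar in spirit to append_if_missing on a plain array.
struct ScanningWorklist {
  std::vector<int> items;

  // Returns true if the value was added. Cost: O(n) per call.
  bool append_if_missing(int v) {
    if (std::find(items.begin(), items.end(), v) != items.end()) {
      return false;
    }
    items.push_back(v);
    return true;
  }
};

// Hypothetical worklist that keeps insertion order in a vector and
// membership in a hash set, so the duplicate check is O(1) on average.
struct UniqueWorklist {
  std::vector<int> items;
  std::unordered_set<int> seen;

  // Returns true if the value was added.
  bool push(int v) {
    if (!seen.insert(v).second) {
      return false;  // already present, nothing to do
    }
    items.push_back(v);
    return true;
  }
};
```

Both variants yield the same entries in the same order; they differ only in how the duplicate check is paid for, which matters once the worklist grows large.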
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2125886722 From epeter at openjdk.org Wed Jun 4 07:39:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 07:39:19 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations In-Reply-To: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> References: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> Message-ID: On Tue, 3 Jun 2025 07:17:47 GMT, Daniel Skantz wrote: > This PR updates a few tests to reintroduce testing of string concatenation optimizations since a few bugs have recently been identified in this area. > > Selection criteria: performed a text search on the test suite and identified tests for string concatenations or string optimizations that are not currently compiled with `-XDstringConcat=inline` and are not using StringBuilders explicitly. > > Testing: T1-4. > > Extra testing: ran the tests manually with `-XX:+PrintOptimizeStringConcat` and verified that the tests are exercising string optimizations after the fix. @danielogh Thanks for looking into this and finding more tests! Looks reasonable to me. I'm not super familiar with string optimizations, so it would be good if a second reviewer knew a little more. But it looks at least like a good step in the right direction from what I can see :) test/hotspot/jtreg/compiler/c2/Test7046096.java line 36: > 34: /* > 35: * @test id=stringConcatInline > 36: * @bug 7046096 Suggestion: * @bug 7046096 8357822 I'd add the new number here. But probably optional. ------------- Marked as reviewed by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25610#pullrequestreview-2895609510 PR Review Comment: https://git.openjdk.org/jdk/pull/25610#discussion_r2125886046 From mhaessig at openjdk.org Wed Jun 4 07:47:37 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 07:47:37 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v80] In-Reply-To: References: Message-ID: <18z8xy6zbC5dWMAzveQOankso6vWI2yj4b4EpsCS3lg=.f176a82c-4660-4500-8369-8976088d3758@github.com> On Tue, 3 Jun 2025 15:57:32 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. 
>> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > fix whitespaces from applied suggestion I had a look at the changes since my last review. They look excellent. Especially good to see the tutorial improving even further. ------------- Marked as reviewed by mhaessig (Author). 
PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2895650620 From roland at openjdk.org Wed Jun 4 07:48:22 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 4 Jun 2025 07:48:22 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> Message-ID: On Tue, 27 May 2025 08:50:51 GMT, Emanuel Peter wrote: >> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/opto/multnode.cpp line 48: > >> 46: ProjNode* MultiNode::proj_out_or_null(uint which_proj) const { >> 47: assert((Opcode() != Op_If && Opcode() != Op_RangeCheck) || which_proj == (uint)true || which_proj == (uint)false, "must be 1 or 0"); >> 48: assert(number_of_projs(which_proj) <= 1, "only when there's a single projection"); > > Does this hold for all `MultiNode`s under all circumstances? Or should we consider returning `nullptr` in this case? So you're suggesting that this could return `nullptr` so the caller could then test for `nullptr` and have some fallback logic? I would stick with the assert: if C2 crashes because of this, I think will be easier to diagnose an assert than an unexpected `nullptr` return. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2125914278 From kwei at openjdk.org Wed Jun 4 07:53:20 2025 From: kwei at openjdk.org (Kuai Wei) Date: Wed, 4 Jun 2025 07:53:20 GMT Subject: RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v4] In-Reply-To: <9ABhENoZtR76wmsgRmzeEceDvCvoflfCcbDbK8H2rso=.e351f63f-1331-4e2e-8a02-763a8c0c4f70@github.com> References: <96Ny_BPjRCbNlD14DNDUOuQ0IX-F8hx21gxQKVfim9M=.d502019a-27ed-4a35-81ef-bc2aec5e7557@github.com> <_IhK2U23lIUOtBKOt-WMxQ3L7b2t26RzclJRdqbIgms=.3ef9a630-f99c-4de7-994a-bcabf912230b@github.com> <9ABhENoZtR76wmsgRmzeEceDvCvoflfCcbDbK8H2rso=.e351f63f-1331-4e2e-8a02-763a8c0c4f70@github.com> Message-ID: On Thu, 22 May 2025 07:03:13 GMT, Emanuel Peter wrote: >> @eme64 @wenshao I have a little change to this PR. I will send it soon. Thanks for your patience. > > @kuaiwei I'm not in a rush with this one. I'd rather we have a good design and be reasonably sure that it is correct, rather than rush it now and having to do extra cycles fixing things later ;) Hi @eme64 , I tried to use pattern matching for `MergePrimitiveLoads::has_no_merge_load_combine_below()`, but it runs into some difficulty. Mergeable operators can be linked in different ways, for example: 1) (((item1 Or item2) Or item3) Or item4) 2) ((item1 Or item2) Or (item3 Or item4)) ... To check whether the next `Or` operator is a valid last operator of the combine chain, we may have to check all of its inputs recursively. I didn't find a good way to resolve this. If you have a better idea, I will look into it. I think it's easier to mark the combine operator as checked. It works this way: * If the combine operator being checked has a successor combine operator that has not been checked yet, we do not optimize it and let the successor have a chance to be optimized. * If we try to merge but fail, we mark the operator as `checked` and add its inputs to the GVN worklist, so that its input operators can be checked.
I added comments to `MergePrimitiveLoads::has_no_merge_load_combine_below()` to describe the design. To reduce the memory size of `AddNode`, I removed the flag from `AddNode` and added 2 virtual functions:
```c++
// Check if this node is checked by merge_memops phase
virtual bool is_merge_memops_checked() const { return false; };
virtual void set_merge_memops_checked(bool v) { ShouldNotReachHere(); };
```
The flag, `_merge_memops_checked`, is only added in OrINode and OrLNode. Could you help to check the design and code? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24023#issuecomment-2938986831 From xgong at openjdk.org Wed Jun 4 08:00:07 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 4 Jun 2025 08:00:07 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter [v2] In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer > to the detailed discussion for a related performance issue from [1]. > > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations.
Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. > > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Address review comments on jtreg and jmh tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25539/files - new: https://git.openjdk.org/jdk/pull/25539/files/796e96f7..afe6b2df Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25539&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25539&range=00-01 Stats: 376 lines in 3 files changed: 194 ins; 179 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/25539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25539/head:pull/25539 PR: https://git.openjdk.org/jdk/pull/25539 From epeter at openjdk.org Wed Jun 4 08:01:25 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:01:25 GMT Subject: RFR: 8351645: C2: ExpandBitsNode::Ideal hits assert because of TOP input In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:53:23 GMT, Jatin Bhateja wrote: > Bugfix patch adds missing safe type access checks in Expand/Compress Ideal transforms. 
> Test mentioned in the bug report has been included along with the patch. > > Kindly review. > > Best Regards, > Jatin @jatin-bhateja Thanks for looking into this. I just sanity checked the implementation of `compress` ... and the code there is exactly the same! I just took the reproducer and replaced `expand` with `compress` ... and got another assert.

    public class Test {
        public static long[] array_0 = fill(new long[10000]);
        public static long[] array_2 = fill(new long[10000]);

        public static long[] fill(long[] a) {
            for (int i = 0; i < a.length; i++) {
                a[i] = 1;
            }
            return a;
        }

        public static long one = 1L;
        static final long[] GOLD = test();

        public static long[] test() {
            long[] out = new long[10000];
            for (int i = 0; i < out.length; i++) {
                long y = array_0[i] % one;
                long x = (array_2[i] | 4294967298L) << -7640610671680100954L;
                out[i] = Long.compress(y, x);
            }
            return out;
        }

        public static void main(String[] args) {
            for (int i = 0; i < 10_000; i++) {
                test();
            }
            long[] res = test();
            for (int i = 0; i < 10_000; i++) {
                if (res[i] != GOLD[i]) {
                    throw new RuntimeException("value mismatch: " + res[i] + " vs " + GOLD[i]);
                }
            }
        }
    }

With: `java -Xbatch -XX:CompileCommand=compileonly,Test::test* -XX:+StressIGVN -XX:RepeatCompilation=100 Test.java` Can you please also fix that here, and add regression tests for `Integer/Long.compress`? You should also update the PR title accordingly. src/hotspot/share/opto/intrinsicnode.cpp line 196: > 194: Node* mask = in(2); > 195: if (bottom_type()->isa_int()) { > 196: if (mask->Opcode() == Op_LShiftI && phase->type(mask->in(1))->isa_int() && phase->type(mask->in(1))->is_int()->is_con()) { Why not just check for `top` at the beginning of the function?
Just like here: 311 const Type* CompressBitsNode::Value(PhaseGVN* phase) const { 312 const Type* t1 = phase->type(in(1)); 313 const Type* t2 = phase->type(in(2)); 314 if (t1 == Type::TOP || t2 == Type::TOP) { 315 return Type::TOP; 316 } Of course in `Ideal` you would have to return `nullptr` instead, and wait for `Value` to clean it up. That has the benefit that you only need to check it in one place, and then any new optimization we might add in the future does not also have to deal with `top`. test/hotspot/jtreg/compiler/intrinsics/Test8351645.java line 1: > 1: /* I would put the test under `test/hotspot/jtreg/compiler/c2/gvn/TestExpandTopInput.java` Because this is not per se about an intrinsic, more about the `gvn` optimization failing. test/hotspot/jtreg/compiler/intrinsics/Test8351645.java line 28: > 26: * @bug 8351645 > 27: * @summary C2: ExpandBitsNode::Ideal hits assert because of TOP input > 28: * @run main/othervm -Xbatch -Xmx128m compiler.intrinsics.Test8351645 What is the reason for the flags here? Do you really need them? I guess `-Xbatch` could make sense, just to make sure the method is compiled. And you did not need `-XX:CompileCommand=compileonly,Test::test*` to reproduce this, right? Just want to be sure that inlining is not somehow creating issues here. test/hotspot/jtreg/compiler/intrinsics/Test8351645.java line 53: > 51: long y = array_0[i] % one; > 52: long x = (array_2[i] | 4294967298L) << -7640610671680100954L; > 53: out[i] = Long.expand(y, x); Can you please also add a test for `Integer.expand`? Because it seems that your fix addresses not just the `long` but also the `int` case, right? ------------- Changes requested by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25586#pullrequestreview-2895648186 PR Review Comment: https://git.openjdk.org/jdk/pull/25586#discussion_r2125911301 PR Review Comment: https://git.openjdk.org/jdk/pull/25586#discussion_r2125935617 PR Review Comment: https://git.openjdk.org/jdk/pull/25586#discussion_r2125916907 PR Review Comment: https://git.openjdk.org/jdk/pull/25586#discussion_r2125912495 From xgong at openjdk.org Wed Jun 4 08:06:17 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 4 Jun 2025 08:06:17 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> Message-ID: On Tue, 3 Jun 2025 07:17:32 GMT, Emanuel Peter wrote: >>> @XiaohongGong I suggest you change the title from: `8357726: C2 fails to recognize the counted loop when induction variable range is changed multiple times` to `8357726: C2 recognize loops with multiple casts in trip counter` or even: `8357726: C2 recognize loops with multiple casts in trip counter: phi -> CastII* -> AddI -> phi` >> >> Thanks for your suggestion! Sounds better to me. How about changing the title to `Improve C2 to recognize counted loops with multiple casts in trip counter` ? > >> Thanks for your suggestion! Sounds better to me. How about changing the title to Improve C2 to recognize counted loops with multiple casts in trip counter ? > > @XiaohongGong Sounds good too :) Hi @eme64, I've updated the IR test and the JMH benchmark based on your comments. Could you please review whether it looks fine to you? Thanks for all your suggestions!
The following shows the performance data of the new JMH test on Grace (the performance gain is almost the same on my x64 machine):

Benchmark                       Mode   Cnt  limit  Unit   Before       Error (99.9%)  After        Error (99.9%)  Gain
CountedLoopCastIV.loop_iv_int   thrpt  30   1024   ops/s  1225620.536  39505.158362   5778120.132  4781.602088    4.71
CountedLoopCastIV.loop_iv_int   thrpt  30   1536   ops/s  830600.832   14758.561182   3839404.338  3362.727083    4.62
CountedLoopCastIV.loop_iv_int   thrpt  30   2048   ops/s  618114.174   36999.511727   2890853.495  416.969862     4.67
CountedLoopCastIV.loop_iv_long  thrpt  30   1024   ops/s  1063902.078  4616.608855    1314828.963  1267.470199    1.23
CountedLoopCastIV.loop_iv_long  thrpt  30   1536   ops/s  714538.178   630.085477     870801.472   753.347684     1.21
CountedLoopCastIV.loop_iv_long  thrpt  30   2048   ops/s  536724.086   131.313178     652775.363   539.107806     1.21

The error term is larger as before. But I don't think this is caused by the large variance of loop iterations. Does the new benchmark look fine to you? Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2939030428

From epeter at openjdk.org Wed Jun 4 08:06:20 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Wed, 4 Jun 2025 08:06:20 GMT
Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v3]
In-Reply-To: References: Message-ID:

On Wed, 28 May 2025 18:39:27 GMT, Zdenek Zambersky wrote:

>> This change adds ` -XX:-IgnoreUnrecognizedVMOptions` to problematic tests (or `@requires vm.compiler2.enabled` in one case), to prevent failures `Unrecognized VM option` on client VM.
>
> Zdenek Zambersky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>
> Fix of compiler tests for client VM

@zzambers Thanks for doing this work!

The tests pass :green_circle:

-------------

Marked as reviewed by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24262#pullrequestreview-2895714887 From epeter at openjdk.org Wed Jun 4 08:07:39 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:07:39 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v80] In-Reply-To: <18z8xy6zbC5dWMAzveQOankso6vWI2yj4b4EpsCS3lg=.f176a82c-4660-4500-8369-8976088d3758@github.com> References: <18z8xy6zbC5dWMAzveQOankso6vWI2yj4b4EpsCS3lg=.f176a82c-4660-4500-8369-8976088d3758@github.com> Message-ID: On Wed, 4 Jun 2025 07:44:35 GMT, Manuel Hässig wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> fix whitespaces from applied suggestion > > I had a look at the changes since my last review. They look excellent. > Especially good to see the tutorial improving even further. @mhaessig Thank you very much for having another look! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2939033431 From duke at openjdk.org Wed Jun 4 08:07:40 2025 From: duke at openjdk.org (Tom Shull) Date: Wed, 4 Jun 2025 08:07:40 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v5] In-Reply-To: References: Message-ID: > This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. > > In addition, two methods are added to the BootstrapMethodInvocations: > 1. `void resolve()` > 2. `JavaConstant lookup()` > > The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. Tom Shull has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into JDK-8357660 - Merge remote-tracking branch 'origin/master' into JDK-8357660 - commit to trigger testing - commit to trigger testing - reviewer feedback and update javadoc formatting - complete changes - commit review suggestion Co-authored-by: Douglas Simon - commit review suggestion Co-authored-by: Douglas Simon - change to allow both indys and condys to be looked up all at once - address reviewer feedback - ... and 2 more: https://git.openjdk.org/jdk/compare/826fea84...c7f5c1a7 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25420/files - new: https://git.openjdk.org/jdk/pull/25420/files/e0707fb8..c7f5c1a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25420&range=03-04 Stats: 3303 lines in 64 files changed: 2485 ins; 442 del; 376 mod Patch: https://git.openjdk.org/jdk/pull/25420.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25420/head:pull/25420 PR: https://git.openjdk.org/jdk/pull/25420 From epeter at openjdk.org Wed Jun 4 08:12:18 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:12:18 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter [v2] In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Wed, 4 Jun 2025 08:00:07 GMT, Xiaohong Gong wrote: >> C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. >> This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer >> to the detailed discussion for a related performance issue from [1]. 
>> >> The ideal graph of such a loop typically looks like: >> >> >> /-----------| >> | | >> | ConI | >> loop | / / >> | | / / >> \ AddI / >> RangeCheck \ / | >> | \ / | >> IfTrue Phi | >> \ | | >> RangeCheck \ | | >> \ CastII / <- Range check #1 >> | | / >> IfTrue | | >> \ | | >> CastII | <- Range check #2 >> | / >> |-------/ >> >> >> >> For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used >> by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. >> >> This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. >> >> Test: >> - Tested tier1, tier2, tier3, and no regressions are found. >> - An additional test case is added to verify the fix. >> >> Performance: >> Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: >> >> >> Benchmark Mode Cnt Unit Before After Gain >> CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 >> CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 >> >> >> We can also observe the similar uplift on a x86_64 machine. >> >> [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments on jtreg and jmh tests test/hotspot/jtreg/compiler/loopopts/TestCountedLoopCastIV.java line 190: > 188: } else { > 189: TestFramework.run(); > 190: } I would recommend checking that there is no "unexpected" input here. 
Suggestion: if (args != null && args.length > 0 && args[0].equals("DisableUnroll")) { TestFramework.runWithFlags("-XX:LoopUnrollLimit=0"); } else { if (args.length != 0) { throw new RuntimeException("Unexpected args"); } TestFramework.run(); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25539#discussion_r2125967112 From epeter at openjdk.org Wed Jun 4 08:15:18 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:15:18 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> Message-ID: On Wed, 4 Jun 2025 08:04:00 GMT, Xiaohong Gong wrote: >>> Thanks for your suggestion! Sounds better to me. How about changing the title to Improve C2 to recognize counted loops with multiple casts in trip counter ? >> >> @XiaohongGong Sounds good too :) > > Hi @eme64 , I'v updated the IR test and JMH based on your comments. Could you please help review whether it's fine to you. Thanks for all your suggestion! 
> > Following shows the performance data of the new JMH test on Grace (the performance gain is almost the same on my x64 machine): > > Benchmark Mode Cnt limit Unit Before Error (99.9%) After Error (99.9%) Gain > CountedLoopCastIV.loop_iv_int thrpt 30 1024 ops/s 1225620.536 39505.158362 5778120.132 4781.602088 4.71 > CountedLoopCastIV.loop_iv_int thrpt 30 1536 ops/s 830600.832 14758.561182 3839404.338 3362.727083 4.62 > CountedLoopCastIV.loop_iv_int thrpt 30 2048 ops/s 618114.174 36999.511727 2890853.495 416.969862 4.67 > CountedLoopCastIV.loop_iv_long thrpt 30 1024 ops/s 1063902.078 4616.608855 1314828.963 1267.470199 1.23 > CountedLoopCastIV.loop_iv_long thrpt 30 1536 ops/s 714538.178 630.085477 870801.472 753.347684 1.21 > CountedLoopCastIV.loop_iv_long thrpt 30 2048 ops/s 536724.086 131.313178 652775.363 539.107806 1.21 > > > The error term is larger as before. But I don't think this is caused by the large variance of loop iterations. Does the new benchmark look fine to you? Thanks! @XiaohongGong Nice, thanks for the updates! Especially the IR rules and reduction in JMH benchmark variance, excellent :) Please ping me again once you have addressed my comment above, and then I can run some internal testing for you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2939057374 From dfenacci at openjdk.org Wed Jun 4 08:18:19 2025 From: dfenacci at openjdk.org (Damon Fenacci) Date: Wed, 4 Jun 2025 08:18:19 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3] In-Reply-To: References: Message-ID: <0AIqaIvewAyRN2mTf8rkMpl1m7Wcm6BJRzNQq84C9j4=.71353c09-76a7-41e2-9be2-30eaaa9eff29@github.com> On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. 
>> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. >> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Very thorough test! Thanks a lot @shipilev! ------------- Marked as reviewed by dfenacci (Committer). PR Review: https://git.openjdk.org/jdk/pull/24972#pullrequestreview-2895749955 From mdoerr at openjdk.org Wed Jun 4 08:35:20 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 4 Jun 2025 08:35:20 GMT Subject: RFR: 8354636: [PPC64] Clean up comments regarding frame manager In-Reply-To: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> References: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Message-ID: On Tue, 3 Jun 2025 14:29:49 GMT, Martin Doerr wrote: > Trivial comment cleanup: Replace "frame manager" by "template interpreter". Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/25616#issuecomment-2939112946 From mdoerr at openjdk.org Wed Jun 4 08:35:20 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 4 Jun 2025 08:35:20 GMT Subject: Integrated: 8354636: [PPC64] Clean up comments regarding frame manager In-Reply-To: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> References: <28IlBh9k0o4RZMbIstYTCl8c0rfIIqVqyPXeXFyx1Ik=.1d4919d2-2437-4c81-8d30-75128b0a0afb@github.com> Message-ID: On Tue, 3 Jun 2025 14:29:49 GMT, Martin Doerr wrote: > Trivial comment cleanup: Replace "frame manager" by "template interpreter". This pull request has now been integrated. 
Changeset: ab235000 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/ab235000349bfd268e80a7cb99bf07a229406119 Stats: 13 lines in 3 files changed: 0 ins; 2 del; 11 mod 8354636: [PPC64] Clean up comments regarding frame manager Reviewed-by: amitkumar, rrich ------------- PR: https://git.openjdk.org/jdk/pull/25616 From epeter at openjdk.org Wed Jun 4 08:44:19 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 08:44:19 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences In-Reply-To: References: Message-ID: On Tue, 27 May 2025 17:26:59 GMT, Manuel Hässig wrote: > ## Summary > > On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules. However, the generated code contains an additional `lea` with an unused result: > > ; OptoAssembly > 03d decode_heap_oop_not_null R8,R10 > 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 > > ; x86 > 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused > 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset > > > This PR adds a peephole optimization to remove such redundant `lea`s. > > ## The Issue in Detail > > The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes > > LoadN -> decodeHeapOop_not_null -> leaP* > ______________________________? > > where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode.
Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`: > > https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897 > > On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers. > > This leaves us with a handful of possible solutions: > 1. implement narrow bases for derived oops in oop maps, > 2. perform some dead code elimination after we know which oops are part of oop maps, > 3. add a peephole optimization to simply remove unused `lea`s. > > Option 1 would have been ideal in the sense, that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the register allocator. However, rewriting the oop map machinery to remove a... Drive by comment ;) test/micro/org/openjdk/bench/vm/compiler/x86/RedundantLeaPeephole.java line 33: > 31: @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS) > 32: @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS) > 33: @Fork(value = 3, jvmArgsAppend = {"-Xms1g", "-Xmx1g"}) Ha, what did you need these args for? Could be nice to have a little comment in the code. 
------------- PR Review: https://git.openjdk.org/jdk/pull/25471#pullrequestreview-2895825725 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126028316 From chagedorn at openjdk.org Wed Jun 4 09:02:18 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 4 Jun 2025 09:02:18 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter [v2] In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: On Wed, 4 Jun 2025 08:00:07 GMT, Xiaohong Gong wrote: >> C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. >> This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. Please refer >> to the detailed discussion for a related performance issue from [1]. >> >> The ideal graph of such a loop typically looks like: >> >> >> /-----------| >> | | >> | ConI | >> loop | / / >> | | / / >> \ AddI / >> RangeCheck \ / | >> | \ / | >> IfTrue Phi | >> \ | | >> RangeCheck \ | | >> \ CastII / <- Range check #1 >> | | / >> IfTrue | | >> \ | | >> CastII | <- Range check #2 >> | / >> |-------/ >> >> >> >> For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used >> by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. >> >> This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. >> >> Test: >> - Tested tier1, tier2, tier3, and no regressions are found. >> - An additional test case is added to verify the fix. 
>> >> Performance: >> Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: >> >> >> Benchmark Mode Cnt Unit Before After Gain >> CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 >> CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 >> >> >> We can also observe the similar uplift on a x86_64 machine. >> >> [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments on jtreg and jmh tests Looks good to me, too! ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25539#pullrequestreview-2895901032 From epeter at openjdk.org Wed Jun 4 09:06:17 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 09:06:17 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> Message-ID: On Wed, 4 Jun 2025 08:04:00 GMT, Xiaohong Gong wrote: >>> Thanks for your suggestion! Sounds better to me. How about changing the title to Improve C2 to recognize counted loops with multiple casts in trip counter ? >> >> @XiaohongGong Sounds good too :) > > Hi @eme64 , I'v updated the IR test and JMH based on your comments. Could you please help review whether it's fine to you. Thanks for all your suggestion! 
> > Following shows the performance data of the new JMH test on Grace (the performance gain is almost the same on my x64 machine): > > Benchmark Mode Cnt limit Unit Before Error (99.9%) After Error (99.9%) Gain > CountedLoopCastIV.loop_iv_int thrpt 30 1024 ops/s 1225620.536 39505.158362 5778120.132 4781.602088 4.71 > CountedLoopCastIV.loop_iv_int thrpt 30 1536 ops/s 830600.832 14758.561182 3839404.338 3362.727083 4.62 > CountedLoopCastIV.loop_iv_int thrpt 30 2048 ops/s 618114.174 36999.511727 2890853.495 416.969862 4.67 > CountedLoopCastIV.loop_iv_long thrpt 30 1024 ops/s 1063902.078 4616.608855 1314828.963 1267.470199 1.23 > CountedLoopCastIV.loop_iv_long thrpt 30 1536 ops/s 714538.178 630.085477 870801.472 753.347684 1.21 > CountedLoopCastIV.loop_iv_long thrpt 30 2048 ops/s 536724.086 131.313178 652775.363 539.107806 1.21 > > > The error term is larger as before. But I don't think this is caused by the large variance of loop iterations. Does the new benchmark look fine to you? Thanks! @XiaohongGong Let's please delay this until after Thursday, so that this does not go into JDK25 yet, and we have more time to fix it if something goes wrong down the line. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2939218743 From xgong at openjdk.org Wed Jun 4 09:16:53 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 4 Jun 2025 09:16:53 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter [v3] In-Reply-To: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> Message-ID: > C2 compiler fails to recognize counted loops when the induction variable is constrained by multiple consecutive `CastII` nodes. > This prevents optimizations like range check elimination, loop unrolling and auto-vectorization for these loops. 
Please refer > to the detailed discussion for a related performance issue from [1]. > > The ideal graph of such a loop typically looks like: > > > /-----------| > | | > | ConI | > loop | / / > | | / / > \ AddI / > RangeCheck \ / | > | \ / | > IfTrue Phi | > \ | | > RangeCheck \ | | > \ CastII / <- Range check #1 > | | / > IfTrue | | > \ | | > CastII | <- Range check #2 > | / > |-------/ > > > > For a counted loop, the loop induction variable (i.e `Phi`) should be the input of `AddI` ideally. However, in above case, it is used > by two consecutive `CastII` nodes generated by two different range check operations. Compiler should skip all such kind of `CastII` when recognizing a counted loop. > > This patch modifies the counted loop recognition code to iteratively uncast the loop `iv` until no `CastII` nodes remain, enabling proper counted loop recognition even when the induction variable undergoes multiple range constraint operations. > > Test: > - Tested tier1, tier2, tier3, and no regressions are found. > - An additional test case is added to verify the fix. > > Performance: > Here is the performance gain on a NVIDIA Grace machine which is an AArch64 architecture: > > > Benchmark Mode Cnt Unit Before After Gain > CountedLoopCastIV.loop_iv_int thrpt 30 ops/s 941482.597 4389292.439 4.66 > CountedLoopCastIV.loop_iv_long thrpt 30 ops/s 884563.232 1441485.455 1.62 > > > We can also observe the similar uplift on a x86_64 machine. 
> > [1] https://github.com/openjdk/jdk/pull/25138#issuecomment-2892720654 Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Address reivew comments on IR test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25539/files - new: https://git.openjdk.org/jdk/pull/25539/files/afe6b2df..08538543 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25539&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25539&range=01-02 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25539/head:pull/25539 PR: https://git.openjdk.org/jdk/pull/25539 From xgong at openjdk.org Wed Jun 4 09:17:00 2025 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 4 Jun 2025 09:17:00 GMT Subject: RFR: 8357726: Improve C2 to recognize counted loops with multiple casts in trip counter In-Reply-To: References: <-SKyhptjFPhuOPflySOZXJloR_Vgr4sC-xB5dSQXxZU=.fd6922bc-2498-4f4e-873a-999f82cd0a1a@github.com> <698Q9LoBFMdDFBnBVAB8FYiI0U-abyXms26RLoMv5Xc=.f21b9a25-8f64-412c-b37a-553f0a13192e@github.com> Message-ID: On Wed, 4 Jun 2025 09:03:21 GMT, Emanuel Peter wrote: > @XiaohongGong Let's please delay this until after Thursday, so that this does not go into JDK25 yet, and we have more time to fix it if something goes wrong down the line. Sure. That makes sense to me. Thanks! BTW, I've updated the test according to your comment. So could you please help run all the tests? Thanks again!
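The iterative uncast described in the quoted PR description can be pictured with a toy model. The sketch below is plain Java, not HotSpot code: `Node`, `Op`, `uncastOnce`, `uncastAll`, and `feedsFromPhi` are hypothetical names invented here, and the real C2 change operates on ideal-graph nodes in C++. It only shows the shape of the loop, which keeps stepping through `CastII`-like nodes until something else (ideally the `Phi`) is reached.

```java
// Toy model only; every name here is an assumption made for illustration.
class UncastSketch {
    enum Op { PHI, CAST_II, ADD_I }

    static final class Node {
        final Op op;
        final Node in; // single data input, enough for this sketch

        Node(Op op, Node in) {
            this.op = op;
            this.in = in;
        }
    }

    // Strip at most one cast: not enough when two range checks each add a cast.
    static Node uncastOnce(Node n) {
        return (n != null && n.op == Op.CAST_II) ? n.in : n;
    }

    // Strip casts repeatedly until a non-cast node is reached.
    static Node uncastAll(Node n) {
        while (n != null && n.op == Op.CAST_II) {
            n = n.in;
        }
        return n;
    }

    // True when the add's input is the Phi once all casts are looked through.
    static boolean feedsFromPhi(Node add) {
        if (add == null || add.op != Op.ADD_I) {
            return false;
        }
        Node base = uncastAll(add.in);
        return base != null && base.op == Op.PHI;
    }
}
```

For a chain `Phi -> CastII -> CastII -> AddI`, `uncastOnce` on the add's input still returns a `CastII`, while `uncastAll` reaches the `Phi`, which mirrors the two-range-check case in the diagram.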
------------- PR Comment: https://git.openjdk.org/jdk/pull/25539#issuecomment-2939250321 From mhaessig at openjdk.org Wed Jun 4 09:18:22 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 4 Jun 2025 09:18:22 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences In-Reply-To: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> References: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> Message-ID: On Tue, 3 Jun 2025 17:44:07 GMT, Vladimir Kozlov wrote: >> ## Summary >> >> On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules. However, the generated code contains an additional `lea` with an unused result: >> >> ; OptoAssembly >> 03d decode_heap_oop_not_null R8,R10 >> 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 >> >> ; x86 >> 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused >> 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset >> >> >> This PR adds a peephole optimization to remove such redundant `lea`s. >> >> ## The Issue in Detail >> >> The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes >> >> LoadN -> decodeHeapOop_not_null -> leaP* >> ______________________________? >> >> where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode.
Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`: >> >> https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897 >> >> On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers. >> >> This leaves us with a handful of possible solutions: >> 1. implement narrow bases for derived oops in oop maps, >> 2. perform some dead code elimination after we know which oops are part of oop maps, >> 3. add a peephole optimization to simply remove unused `lea`s. >> >> Option 1 would have been ideal in the sense, that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the regi... > > src/hotspot/cpu/x86/peephole_x86_64.cpp line 270: > >> 268: // away, this peephole can als recognize the decode as redundant and also remove the spill copy >> 269: // if that is only used by the decode. >> 270: bool Peephole::lea_remove_redundant(Block* block, int block_index, PhaseCFG* cfg_, PhaseRegAlloc* ra_, > > Why do you need `_` suffix? I don't really need them. I only matched the signature of the other peephole functions. We could remove the underline for all peepholes. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126104773 From mhaessig at openjdk.org Wed Jun 4 09:18:23 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 4 Jun 2025 09:18:23 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:39:57 GMT, Emanuel Peter wrote: >> ## Summary >> >> On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules. However, the generated code contains an additional `lea` with an unused result: >> >> ; OptoAssembly >> 03d decode_heap_oop_not_null R8,R10 >> 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 >> >> ; x86 >> 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused >> 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset >> >> >> This PR adds a peephole optimization to remove such redundant `lea`s. >> >> ## The Issue in Detail >> >> The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes >> >> LoadN -> decodeHeapOop_not_null -> leaP* >> ______________________________? >> >> where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode.
>> Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`:
>>
>> https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897
>>
>> On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers.
>>
>> This leaves us with a handful of possible solutions:
>> 1. implement narrow bases for derived oops in oop maps,
>> 2. perform some dead code elimination after we know which oops are part of oop maps,
>> 3. add a peephole optimization to simply remove unused `lea`s.
>>
>> Option 1 would have been ideal in the sense, that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the regi...
>
> test/micro/org/openjdk/bench/vm/compiler/x86/RedundantLeaPeephole.java line 33:
>
>> 31: @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>> 32: @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>> 33: @Fork(value = 3, jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
>
> Ha, what did you need these args for? Could be nice to have a little comment in the code.

This is what I gather to be good practice from @shipilev's [blog post about JMH benchmarks](https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_benchmark). It ensures a consistent heap size across machines and runs because the `StoreN` benchmarks are sensitive to different GCs and heap layouts. But these are my first JMH benchmarks, so I appreciate any input.
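
One way to double-check that the fixed heap flags actually reach the JVM is to print what the runtime reports as its maximum heap. The following is only a minimal sketch: the class `HeapReport` and its method are hypothetical names and are not part of the benchmark under review. It assumes nothing beyond `Runtime.maxMemory()`; the reported value may be somewhat below the `-Xmx` setting depending on the collector in use.

```java
// Hypothetical helper for illustration; not part of RedundantLeaPeephole.java.
class HeapReport {
    // Maximum heap the JVM will attempt to use, in MiB, as reported by the runtime.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it as `java -Xms1g -Xmx1g HeapReport.java` and again without the flags should show whether the heap size is pinned or left to the machine-dependent default.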
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126107900

From roland at openjdk.org Wed Jun 4 09:21:23 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Wed, 4 Jun 2025 09:21:23 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v10]
In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

> An `Initialize` node for an `Allocate` node is created with a memory
> `Proj` of adr type raw memory. In order for stores to be captured, the
> memory state out of the allocation is a `MergeMem` with slices for the
> various object fields/array elements set to the raw memory `Proj` of
> the `Initialize` node. If `Phi`s need to be created during later
> transformations from this memory state, the `Phi` for a particular
> slice gets its adr type from the type of the `Proj` which is raw
> memory. If during macro expansion, the `Allocate` is found to have no
> use and so can be removed, the `Proj` out of the `Initialize` is
> replaced by the memory state on input to the `Allocate`. A `Phi` for
> some slice for a field of an object will end up with the raw memory
> state on input to the `Allocate` node. As a result, memory state at
> the `Phi` is incorrect and incorrect execution can happen.
>
> The fix I propose is, rather than have a single `Proj` for the memory
> state out of the `Initialize` with adr type raw memory, to use one
> `Proj` per slice added to the memory state after the `Initialize`. Each
> of the `Proj` should return the right adr type for its slice. For that
> I propose having a new type of `Proj`: `NarrowMemProj` that captures
> the right adr type.
>
> Logic for the construction of the `Allocate`/`Initialize` subgraph is
> tweaked so the right adr type captured in its own `NarrowMemProj` is
> added to the memory subgraph. Code that removes an allocation or moves
> it also has to be changed so it correctly takes the multiple memory
> projections out of the `Initialize` node into account.
>
> One tricky issue is that when EA splits types for a scalar replaceable
> `Allocate` node:
>
> 1- the adr type captured in the `NarrowMemProj` becomes out of sync
> with the type of the slices for the allocation
>
> 2- before EA, the memory state for one particular field out of the
> `Initialize` node can be used for a `Store` to the just allocated
> object or some other. So we can have a chain of `Store`s, some to
> the newly allocated object, some to some other objects, all of them
> using the state of `NarrowMemProj` out of the `Initialize`. After
> split unique types, the `NarrowMemProj` is for the slice of a
> particular allocation. So `Store`s to some other objects shouldn't
> use that memory state but the memory state before the `Allocate`.
>
> For that, I added logic to update the adr type of `NarrowMemProj`
> during split unique types and update the memory input of `Store`s that
> don't depend on the memory state ...

Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 39 additional commits since the last revision:

 - more
 - lambda return
 - lambda clean up
 - Merge branch 'master' into JDK-8327963
 - Update src/hotspot/share/opto/library_call.cpp

   Co-authored-by: Emanuel Peter
 - review
 - new test tweak
 - new test
 - Merge branch 'master' into JDK-8327963
 - Merge branch 'master' into JDK-8327963
 - ...
and 29 more: https://git.openjdk.org/jdk/compare/3f54e74e...69c6e50b

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24570/files
  - new: https://git.openjdk.org/jdk/pull/24570/files/c0a8ad21..69c6e50b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=09
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=08-09

  Stats: 98759 lines in 1463 files changed: 60579 ins; 25167 del; 13013 mod
  Patch: https://git.openjdk.org/jdk/pull/24570.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570

PR: https://git.openjdk.org/jdk/pull/24570

From shade at openjdk.org Wed Jun 4 09:25:17 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 4 Jun 2025 09:25:17 GMT
Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences
In-Reply-To:
References:
Message-ID:

On Wed, 4 Jun 2025 09:15:12 GMT, Manuel Hässig wrote:

>> test/micro/org/openjdk/bench/vm/compiler/x86/RedundantLeaPeephole.java line 33:
>>
>>> 31: @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>>> 32: @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>>> 33: @Fork(value = 3, jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
>>
>> Ha, what did you need these args for? Could be nice to have a little comment in the code.
>
> This is what I gather to be good practice from @shipilev's [blog post about JMH benchmarks](https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_benchmark). It ensures a consistent heap size across machines and runs because the `StoreN` benchmarks are sensitive to different GCs and heap layouts. But these are my first JMH benchmarks, so I appreciate any input.

Yes, exactly. We often do this for allocation-heavy benchmarks to put GC in more consistent conditions. GC often tries to decide whether to do a GC cycle or expand the heap, and this decision might change with minor externalities. So it sometimes contributes to run-to-run and intra-run variance.
This benchmark does allocate pretty hard, so setting a heap size makes sense. (Additionally, this forces a selection of a particular compressed oops mode, 32-bit in this case, which is also nice for reproducibility.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126122739 From roland at openjdk.org Wed Jun 4 09:26:19 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 4 Jun 2025 09:26:19 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <4ShW7VcaJrO0v0cHwUN1vccOH8tNPlJSIh_K0W2RdS0=.14954a26-c962-41a1-9088-2e1a1bc01eb4@github.com> Message-ID: On Tue, 27 May 2025 09:12:18 GMT, Emanuel Peter wrote: > I was a little confused about this apply_to_proj construct. I this something we already use, a familiar concept, the apply_to? Thanks for the pointer to the style guide section. I followed the recommendation there. I also followed your suggestion for the return value of the callback. All of your other comments should be addressed in new commit as well. @eme64 please have another look when you find the time. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2939274746

From roland at openjdk.org Wed Jun 4 09:26:20 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Wed, 4 Jun 2025 09:26:20 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8]
In-Reply-To:
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com>
Message-ID:

On Wed, 28 May 2025 07:34:36 GMT, Roberto Castañeda Lozano wrote:

> In the common case where allocations are not eliminated, matching transforms the introduced `NarrowMemProj` nodes into a sequence of redundant, raw `MemProj` nodes, see e.g. B6 here: [after-gcm.pdf](https://github.com/user-attachments/files/20477560/after-gcm.pdf). Would it be possible to clean them up during matching (or perhaps already during, or right after, macro expansion)?

Thanks for looking at this, @robcasloz. I made the change you requested.

> src/hotspot/share/opto/multnode.cpp line 49:
>
>> 47: assert((Opcode() != Op_If && Opcode() != Op_RangeCheck) || which_proj == (uint)true || which_proj == (uint)false, "must be 1 or 0");
>> 48: assert(number_of_projs(which_proj) <= 1, "only when there's a single projection");
>> 49: auto find_proj = [which_proj, this](ProjNode* proj) {
>
> This does not build on macosx-aarch64:
>
>
> src/hotspot/share/opto/multnode.cpp:49:21: error: lambda capture 'which_proj' is not used [-Werror,-Wunused-lambda-capture]
> auto find_proj = [which_proj, this](ProjNode* proj) {

Thanks for the report. This should be fixed now.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2939277421 PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2126123810 From jbhateja at openjdk.org Wed Jun 4 09:43:44 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 09:43:44 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v5] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Adding cost to memory patterns ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/e332f191..a67a7d0a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=03-04 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From jbhateja at openjdk.org Wed Jun 4 09:51:03 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 09:51:03 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v6] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Adding comments. 
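
For readers unfamiliar with the shape being matched: the description above says the `AndN` selection patterns expect `(Xor SRC, -1)` as one operand. At the Java level that corresponds to an expression such as `(src ^ -1) & other`, which is the same value as `~src & other`. The sketch below only illustrates that source-level shape; the class and method names are hypothetical and are not taken from the listed tests, and whether C2 actually selects an `andn` instruction for it depends on the CPU features and on the matcher rules under discussion.

```java
// Hypothetical illustration; names are not from the JDK tests mentioned above.
class AndnShape {
    // (src ^ -1) flips every bit of src, so the result equals (~src & other).
    static int andnInt(int src, int other) {
        return (src ^ -1) & other;
    }

    // Same shape for long operands.
    static long andnLong(long src, long other) {
        return (src ^ -1L) & other;
    }

    public static void main(String[] args) {
        // ~0b1010 & 0b1100 keeps only the bit set in other but clear in src.
        System.out.println(Integer.toBinaryString(andnInt(0b1010, 0b1100))); // prints 100
        System.out.println(andnLong(0L, 42L)); // prints 42
    }
}
```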
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/a67a7d0a..38bf655e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=04-05 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From jbhateja at openjdk.org Wed Jun 4 09:51:03 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 09:51:03 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> Message-ID: <7fnB8ubV2eSkP88UkrVQ6qmNZcomS5Zby6mAUukJP4Y=.c8048c62-404b-4985-a614-9637f3fd03e9@github.com> On Wed, 4 Jun 2025 06:35:50 GMT, Emanuel Peter wrote: >>> @jatin-bhateja Thanks for looking into this! >>> >>> `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` >>> >>> The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? >>> >>> Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? >>> >>> A code comment would be helpful. >> >> We are tightening the predicate check so that under no circumstances we pick this pattern during the reduction phase of instruction selection on account of having lower cost. There is a generic pattern (xorI_rReg_imm) for all integral immediate values, and then there is a special pattern for Xor with -1 (fxorI_rReg_im1), which is needed for AndN inferencing. > > @jatin-bhateja I'll wait with testing, until someone from Intel gives this the approval. 
Feel free to ping me for that once we are there :) Hi @eme64, please initiate your test runs, we can have a second review from @sviswa7 once she is online. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2939356138 From shade at openjdk.org Wed Jun 4 10:31:26 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 10:31:26 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> <6F6gsbyRSrJ7_XHXoMh8j15Mog2DMec5DaOCVAdcdFQ=.0799c52f-275c-4303-8bd8-05f341c20ae0@github.com> Message-ID: On Wed, 4 Jun 2025 06:55:25 GMT, Tobias Hartmann wrote: >> src/hotspot/share/opto/convertnode.cpp line 86: >> >>> 84: return nullptr; >>> 85: } >>> 86: >> >> @JohnTortugo Would it not have been better to put this check inside the `else` branch? > > I agree, the `return nullptr;` should have been added below the assert in the else branch, no check required. Yes, putting `return nullptr;` into existing branch would have been cleaner. One more reason to wait for reviews! We can do a quick follow-up that sweeps this return to its better place, if you feel strongly about it. Note this would likely get backported, so being extra-clean pays off the process hassle. 
It is about a 15-minute deal for me; I am happy to do it as a penance for not coming up with it myself :)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25627#discussion_r2126256450

From mhaessig at openjdk.org Wed Jun 4 10:46:17 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Wed, 4 Jun 2025 10:46:17 GMT
Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences
In-Reply-To:
References: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com>
Message-ID: <17Sqc6IAzWE8E8JFrFCEgpi15ox-L9_IG8cxVuyc_A8=.709fb4de-5d33-4a50-9188-e3d306ebdb23@github.com>

On Wed, 4 Jun 2025 09:13:34 GMT, Manuel Hässig wrote:

>> src/hotspot/cpu/x86/peephole_x86_64.cpp line 270:
>>
>>> 268: // away, this peephole can als recognize the decode as redundant and also remove the spill copy
>>> 269: // if that is only used by the decode.
>>> 270: bool Peephole::lea_remove_redundant(Block* block, int block_index, PhaseCFG* cfg_, PhaseRegAlloc* ra_,
>>
>> Why do you need `_` suffix?
>
> I don't really need them. I only matched the signature of the other peephole functions.
>
> We could remove the underline for all peepholes.
This probably comes from the signature in `MachNode`: https://github.com/openjdk/jdk/blob/7838321b74276e45b92c54904ea31ef70ed9e33f/src/hotspot/share/opto/machnode.hpp#L368-L369 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126287499 From jbhateja at openjdk.org Wed Jun 4 10:52:10 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 10:52:10 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v6] In-Reply-To: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: > This is a follow-up PR#22755 to improve Float16 operations inferencing. > > The existing scheme to detect Float16 operations for some operations is based on pattern matching which expects to receive inputs through ConvHF2F IR, this patch extends matching to accept constant floating point inputs within the Float16 value range. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with three additional commits since the last revision: - Update test/hotspot/jtreg/compiler/lib/generators/Generators.java Co-authored-by: Emanuel Peter - Update src/hotspot/share/opto/convertnode.cpp Co-authored-by: Emanuel Peter - Update src/hotspot/share/opto/convertnode.cpp Co-authored-by: Emanuel Peter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24179/files - new: https://git.openjdk.org/jdk/pull/24179/files/4a491bef..b95c51cb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=04-05 Stats: 14 lines in 2 files changed: 2 ins; 2 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/24179.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24179/head:pull/24179 PR: https://git.openjdk.org/jdk/pull/24179 From jbhateja at openjdk.org Wed Jun 4 10:52:11 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 10:52:11 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: <_IdYz769mq7-kTO802umUJX7Bmaz3Ds4GWLb75lAW8I=.0394a525-2288-407e-9201-7fb6b5f92353@github.com> On Wed, 4 Jun 2025 06:17:23 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Extending tests and review resolutions > > src/hotspot/share/opto/convertnode.cpp line 294: > >> 292: // Conditions under which floating point constant can be considered for a pattern match. >> 293: // 1. Constant must lie within Float16 value range, this will ensure that >> 294: // we don't unintentially round off float constant to enforce a pattern match. > > What do you mean by `enforce a pattern match`? 
>
> Are you just trying to say that we have to be careful with the pattern matching here, and we cannot just round off the float constant? Do you have an example where that rounding would lead to issues?

    import jdk.incubator.vector.*;

    public class verify_rounding {
        public static void check() {
            for (int i = 0; i < 65550; i++) {
                // Round i to Float16, multiply by the float constant in float precision,
                // then round the product to Float16.
                short post_rounding = Float.floatToFloat16(Float.float16ToFloat(Float.floatToFloat16((float)i)) * 2049.0f);
                // Round both operands to Float16 first, then multiply in Float16.
                short pre_rounding = Float16.float16ToRawShortBits(Float16.multiply(Float16.valueOf((float)i), Float16.valueOf((float)2049.0f)));
                if (pre_rounding != post_rounding) {
                    System.out.println("Mismatch at val = " + (float)i);
                    System.out.println("post_rounding val = " + post_rounding);
                    System.out.println("pre_rounding val = " + pre_rounding);
                    break;
                }
            }
        }

        public static void main(String [] args) {
            check();
        }
    }

    CPROMPT>java --add-modules=jdk.incubator.vector -cp . verify_rounding
    WARNING: Using incubator modules: jdk.incubator.vector
    Mismatch at val = 3.0
    post_rounding val = 28161
    pre_rounding val = 28160

Since we intend to infer Float16 IR using pattern matching, it may be incorrect to transform the post_rounding pattern into the pre_rounding one.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2126295150

From shade at openjdk.org Wed Jun 4 11:01:39 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 4 Jun 2025 11:01:39 GMT
Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 28 May 2025 18:05:12 GMT, Aleksey Shipilev wrote:

>> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2.
>>
>> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other's toes.
The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. >> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix During the final pre-integration checks, I noticed that this PR still has buffer adjustments for `c2_only` case. 
That adjustment is, strictly speaking, outside the scope of this improvement. So I reverted that hunk. Tests still pass. I see [JDK-8354727](https://bugs.openjdk.org/browse/JDK-8354727) was filed to figure out what happens when we are scarce on code cache, so I would feel better to put it on @mhaessig to mix https://github.com/openjdk/jdk/pull/24972/commits/c43b18a6681acb541be1b3bdadbd635070a2d58d into his work :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24972#issuecomment-2939570654 From shade at openjdk.org Wed Jun 4 11:01:38 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 11:01:38 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: References: Message-ID: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> > There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. > > But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
> > > $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ > -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ > -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done > > # Before > intx CICompilerCount = 2 > intx CICompilerCount = 2 > intx CICompilerCount = 2 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 4 > > # After > intx CICompilerCount = 1 > intx CICompilerCount = 1 > intx CICompilerCount = 1 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 4 > > > It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. > > Additional testing: > - [x] New regression test passes with the fix, fails without it > - [x] GHA > - [x] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Revert buffer size change - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count - Better test, patch amendments - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count - Unnecessary arch limitation - Simplify test - Adjust test bound - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24972/files - new: https://git.openjdk.org/jdk/pull/24972/files/f8519b46..c43b18a6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24972&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24972&range=02-03 Stats: 47592 lines in 764 files changed: 24205 ins; 14405 del; 8982 mod Patch: https://git.openjdk.org/jdk/pull/24972.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24972/head:pull/24972 PR: https://git.openjdk.org/jdk/pull/24972 From zzambers at openjdk.org Wed Jun 4 11:03:17 2025 From: zzambers at openjdk.org (Zdenek Zambersky) Date: Wed, 4 Jun 2025 11:03:17 GMT Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v2] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 11:05:22 GMT, Emanuel Peter wrote: >> @eme64 I have rebased my changes on master and fixed conflicts. (caused by integration of [JDK-8350457](https://github.com/openjdk/jdk/pull/24522)) >> >> I have also updated PR description. >> (I have not changed JIRA as there is no info about fix. Should I add it there?) > >> (I have not changed JIRA as there is no info about fix. Should I add it there?) 
>
> Yes please, that is generally what we should do :)

@eme64 thank you for the review

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24262#issuecomment-2939581284

From duke at openjdk.org Wed Jun 4 11:03:18 2025
From: duke at openjdk.org (duke)
Date: Wed, 4 Jun 2025 11:03:18 GMT
Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 28 May 2025 18:39:27 GMT, Zdenek Zambersky wrote:

>> This change adds `-XX:+IgnoreUnrecognizedVMOptions` to problematic tests (or `@requires vm.compiler2.enabled` in one case), to prevent failures `Unrecognized VM option` on client VM.
>
> Zdenek Zambersky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>
> Fix of compiler tests for client VM

@zzambers Your change (at version d6196a9a6c5c9797bb13e4629757aaf6d550a6b1) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24262#issuecomment-2939583470

From mhaessig at openjdk.org Wed Jun 4 11:13:35 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Wed, 4 Jun 2025 11:13:35 GMT
Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2]
In-Reply-To:
References:
Message-ID:

> ## Summary
>
> On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules.
However, the generated code contains an additional `lea` with an unused result: > > ; OptoAssembly > 03d decode_heap_oop_not_null R8,R10 > 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 > > ; x86 > 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused > 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset > > > This PR adds a peephole optimization to remove such redundant `lea`s. > > ## The Issue in Detail > > The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes > > LoadN -> decodeHeapOop_not_null -> leaP* > ______________________________? > > where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode. Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`: > > https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897 > > On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers. > > This leaves us with a handful of possible solutions: > 1. implement narrow bases for derived oops in oop maps, > 2. perform some dead code elimination after we know which oops are part of oop maps, > 3. 
add a peephole optimization to simply remove unused `lea`s. > > Option 1 would have been ideal in the sense that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the register allocator. However, rewriting the oop map machinery to remove a... Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision: - Add comment to benchmark as to why we fix the heap size - Add missing null chec - Fix typos ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25471/files - new: https://git.openjdk.org/jdk/pull/25471/files/2d8110b0..67afb3ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25471&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25471&range=00-01 Stats: 9 lines in 3 files changed: 1 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/25471.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25471/head:pull/25471 PR: https://git.openjdk.org/jdk/pull/25471 From mhaessig at openjdk.org Wed Jun 4 11:18:21 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 11:18:21 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2] In-Reply-To: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> References: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> Message-ID: <_J9vzkD1D2SWccO3uHIa7sdyhA8AEKg7sM0wBXeTegE=.97c69a7b-4d3d-4574-8130-54294f93e5cb@github.com> On Tue, 3 Jun 2025 17:41:57 GMT, Vladimir Kozlov wrote: >> Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision: >> >> - Add comment to benchmark as to why we fix the heap size >> - Add missing null chec >> - Fix typos > > src/hotspot/cpu/x86/peephole_x86_64.cpp line 255: > >> 253: // This peephole recognizes graphs of the shape as shown above,
ensures that the result of the >> 254: // decode is only used by the derived oop and removes that decode if this is the case. Futher, >> 255: // multipe leaP*s can have the same decode as their base. This peephole will remove the decode > > Typo `multipe` Fixed in [fb728f9](https://github.com/openjdk/jdk/pull/25471/commits/fb728f925442729b111fbe2b2c3cd57e3ed659c0) > src/hotspot/cpu/x86/peephole_x86_64.cpp line 267: > >> 265: // | / \ >> 266: // leaP* MachProj (leaf) >> 267: // In this case where te common parent of the leaP* and the decode is one MemToRegSpill Copy > > Typo: `te` Fixed in [fb728f9](https://github.com/openjdk/jdk/pull/25471/commits/fb728f925442729b111fbe2b2c3cd57e3ed659c0) > src/hotspot/cpu/x86/peephole_x86_64.cpp line 268: > >> 266: // leaP* MachProj (leaf) >> 267: // In this case where te common parent of the leaP* and the decode is one MemToRegSpill Copy >> 268: // away, this peephole can als recognize the decode as redundant and also remove the spill copy > > Typo: `als` Fixed in [fb728f9](https://github.com/openjdk/jdk/pull/25471/commits/fb728f925442729b111fbe2b2c3cd57e3ed659c0) > src/hotspot/cpu/x86/peephole_x86_64.cpp line 324: > >> 322: >> 323: // Ensure the MachProj is in the same block as the decode and the lea. >> 324: if (!block->contains(proj)) { > > Should we check `proj == nullptr` ? Indeed, we should. Thank you for pointing it out. I fixed it in [bf75c0d](https://github.com/openjdk/jdk/pull/25471/commits/bf75c0da751def40149b5548fd0c89595318fc11). 
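The condition described in the quoted source comment (the decode may only be removed when its result feeds nothing but the derived-oop `leaP*` nodes, and several `leaP*`s may share one decode as their base) can be modeled with a toy graph. This is only a sketch of the idea, not the code in `peephole_x86_64.cpp`: the `Node` class, the `Kind` enum, and `isRedundantDecode` are hypothetical names, and the real peephole performs further checks (same block, `MachProj`, spill copies) that are left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not HotSpot code) of the "redundant decode" check.
public class RedundantDecodeSketch {
    enum Kind { LOAD_N, DECODE, LEA, OTHER }

    static final class Node {
        final Kind kind;
        final List<Node> inputs = new ArrayList<>();
        final List<Node> users = new ArrayList<>();

        Node(Kind kind, Node... ins) {
            this.kind = kind;
            for (Node in : ins) {
                inputs.add(in);
                in.users.add(this); // keep def-use edges in sync with the inputs
            }
        }
    }

    // A decode counts as redundant in this model when it has at least one user
    // and every user is a lea. The null check mirrors the review remark that a
    // node should be checked before it is inspected.
    static boolean isRedundantDecode(Node n) {
        if (n == null || n.kind != Kind.DECODE || n.users.isEmpty()) {
            return false;
        }
        for (Node user : n.users) {
            if (user.kind != Kind.LEA) {
                return false; // some other node still needs the decoded oop
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node load = new Node(Kind.LOAD_N);
        Node decode = new Node(Kind.DECODE, load);
        // Two leas share the decode as base; their address comes from the narrow oop.
        new Node(Kind.LEA, decode, load);
        new Node(Kind.LEA, decode, load);
        System.out.println("only lea users: " + isRedundantDecode(decode));
        new Node(Kind.OTHER, decode);
        System.out.println("with another user: " + isRedundantDecode(decode));
    }
}
```

In this model a single non-`lea` user is enough to keep the decode alive, which is the conservative choice the quoted comment describes.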
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126340673 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126341088 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126341424 PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126342950 From mhaessig at openjdk.org Wed Jun 4 11:18:23 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 11:18:23 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 09:22:24 GMT, Aleksey Shipilev wrote: >> This is what I gather to be good practice from @shipilev's [blog post about JMH benchmarks](https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_benchmark). It ensures a consistent heap size across machines and runs because the `StoreN` benchmarks are sensitive to different GCs and heap layouts. But these are my first JMH benchmarks, so I appreciate any input. > > Yes, exactly. We often do this for allocation-heavy benchmarks to put GC in more consistent conditions. GC often tries to decide whether to do a GC cycle or expand the heap, and this decision might change with minor externalities. So it sometimes contributes to run-to-run and intra-run variance. This benchmark does allocate pretty hard, so setting a heap size makes sense. > > (Additionally, this forces a selection of a particular compressed oops mode, 32-bit in this case, which is also nice for reproducibility.) Added a comment in [67afb3c](https://github.com/openjdk/jdk/pull/25471/commits/67afb3ca570a5d0edc7b157f3403fde9d8829f2f) to explain this.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126340154 From mhaessig at openjdk.org Wed Jun 4 11:26:21 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 11:26:21 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> References: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> Message-ID: On Wed, 4 Jun 2025 11:01:38 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Revert buffer size change > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix > I see [JDK-8354727](https://bugs.openjdk.org/browse/JDK-8354727) was filed to figure out what happens when we are scarce on code cache, so I would feel better to put it on @mhaessig to mix https://github.com/openjdk/jdk/commit/c43b18a6681acb541be1b3bdadbd635070a2d58d into his work :) Will do. Thanks for the heads up! 
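The before/after `CICompilerCount` values quoted above can be reproduced with a small model. This is a sketch under an assumption: the formula `max(log2(cpus) * log2(log2(cpus)) * 3 / 2, minimum)` is my reading of the heuristic in `CompilationPolicy::initialize` and is only checked here against the eight quoted values. The class and method names are hypothetical, and the fix is modeled simply as lowering the minimum from 2 to 1 in C1-only or C2-only mode.

```java
// Hypothetical model of the compiler thread count heuristic; not HotSpot code.
public class CompilerCountSketch {
    // floor(log2(x)) for x >= 1
    public static int log2i(int x) {
        return 31 - Integer.numberOfLeadingZeros(x);
    }

    // Assumed heuristic: log n * log log n * 3/2, clamped from below by 'minimum'.
    public static int compilerCount(int cpus, int minimum) {
        int logCpu = log2i(cpus);
        int loglogCpu = log2i(Math.max(logCpu, 1));
        return Math.max(logCpu * loglogCpu * 3 / 2, minimum);
    }

    public static void main(String[] args) {
        for (int cpus = 1; cpus <= 8; cpus++) {
            int before = compilerCount(cpus, 2); // old floor: one C1 plus one C2 thread
            int after = compilerCount(cpus, 1);  // floor for a single-compiler mode
            System.out.printf("cpus=%d before=%d after=%d%n", cpus, before, after);
        }
    }
}
```

With a floor of 2 the model yields 2, 2, 2, 3, 3, 3, 3, 4 for 1 to 8 CPUs, and with a floor of 1 it yields 1, 1, 1, 3, 3, 3, 3, 4, which agrees with the quoted `-XX:+PrintFlagsFinal` output.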
------------- PR Comment: https://git.openjdk.org/jdk/pull/24972#issuecomment-2939648782 From epeter at openjdk.org Wed Jun 4 11:45:18 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 11:45:18 GMT Subject: RFR: 8252473: [TESTBUG] compiler tests fail with minimal VM: Unrecognized VM option [v2] In-Reply-To: References: Message-ID: <7A34dFa6iEzgYMzuQgRjl9LP5nbWZOfUe-42_i25wYI=.73856c54-c66e-4a3c-9451-d4f293fa9e61@github.com> On Wed, 4 Jun 2025 10:59:57 GMT, Zdenek Zambersky wrote: >>> (I have not changed JIRA as there is no info about fix. Should I add it there?) >> >> Yes please, that is generally what we should do :) > > @eme64 thank you for the review @zzambers Before we integrate, I'd like to hear if @vnkozlov agrees with the changes too! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24262#issuecomment-2939707225 From epeter at openjdk.org Wed Jun 4 12:27:37 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 12:27:37 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v81] In-Reply-To: References: Message-ID: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. 
This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Roberto Castañeda Lozano ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24217/files - new: https://git.openjdk.org/jdk/pull/24217/files/72923879..0ec0949a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=80 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24217&range=79-80 Stats: 6 lines in 2 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/24217.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24217/head:pull/24217 PR: https://git.openjdk.org/jdk/pull/24217 From rcastanedalo at openjdk.org Wed Jun 4 12:28:50 2025 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Wed, 4 Jun 2025 12:28:50 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v80] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 15:57:32 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String.
And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > fix whitespaces from applied suggestion Looks good, I just have a few language suggestions. Thanks for driving this Emanuel! test/hotspot/jtreg/compiler/lib/template_framework/Template.java line 248: > 246: * // The count is still not different to "c1". 
> 247: * let("c3", dataNames(MUTABLE).exactOf(someType).count()), > 248: * // We nest a Template. This creats a TemplateToken, which is later evaluated. Suggestion: * // We nest a Template. This creates a TemplateToken, which is later evaluated. test/hotspot/jtreg/compiler/lib/template_framework/Template.java line 552: > 550: * > 551: *

> 552: * Here an example with template arguments {@code 'a'} and {@code 'b'}, captured once as string names Suggestion: * Here is an example with template arguments {@code 'a'} and {@code 'b'}, captured once as string names test/hotspot/jtreg/compiler/lib/template_framework/Template.java line 596: > 594: /** > 595: * Creates a {@link TemplateBody} from a list of tokens, which can be {@link String}s, > 596: * boxed primitive types (e.g. {@link Integer} or auto-boxed {@code int}), any {@link Token}, For correct javadoc description generation in `Template.html`: Suggestion: * boxed primitive types (for example {@link Integer} or auto-boxed {@code int}), any {@link Token}, test/hotspot/jtreg/compiler/lib/template_framework/Template.java line 627: > 625: * > 626: *

> 627: * Here an example where a Template creates a local variable {@code 'var'}, Suggestion: * Here is an example where a Template creates a local variable {@code 'var'}, test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 277: > 275: // Fourth Template use with template2, no use of dollar, so > 276: // no "_4" shows up in the generated code. Internally, it > 277: // calls template1, shich is the fifth Template use, with Suggestion: // calls template1, which is the fifth Template use, with test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java line 832: > 830: > 831: // Having defined these helper methods, let us start with the first example. > 832: // You should start reading this example bottum-up, starting at Suggestion: // You should start reading this example bottom-up, starting at ------------- Marked as reviewed by rcastanedalo (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2896428767 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126413316 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126436853 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126427705 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126435300 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126407483 PR Review Comment: https://git.openjdk.org/jdk/pull/24217#discussion_r2126410019 From epeter at openjdk.org Wed Jun 4 12:28:58 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 12:28:58 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v16] In-Reply-To: References: Message-ID: On Tue, 20 May 2025 17:06:34 GMT, Roberto Castañeda Lozano wrote: >> A few more documentation suggestions, will continue reviewing this changeset over the next days. > >> @robcasloz I addressed all your comments :) > > Thanks @eme64!
@robcasloz Thanks for making another pass and for the suggestions! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2939808487 From epeter at openjdk.org Wed Jun 4 12:29:09 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 12:29:09 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v61] In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 12:14:48 GMT, Christian Hagedorn wrote: >> Thanks for all the updates and discussions! I've worked my way through the documentation in `Template` and the examples again in some more detail. It's much better and the new explanations are well done, excellent work! >> >> I left some comments here and there but mostly minor things. I will have another look at the implementation - probably only finished by Monday. The design now looks great. I'm glad we could find a good solution now after some more iterations :-) > >> @chhagedorn Alright, I now have a decent solution for `$$var` and `$1var` etc. I also added tests for it. >> >> These are issues where we could continue the conversation, unless you are satisfied with my answers: [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115388737) [#24217 (comment)](https://github.com/openjdk/jdk/pull/24217#discussion_r2115406391) >> >> This is now ready for another review pass > > Awesome, thanks for spending some more time with these nasty edge-cases and finding a solution! I had a look at your updates for all my comments, they look good, thanks! > > I'm going to make a pass over the implementation classes now and will have a look at the `Renderer` updates as well :-) @chhagedorn @robcasloz @mhaessig Thanks a lot for all the time you invested to see this through, I know it took a lot of effort to review this, so I am very thankful
------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2939819766 From roland at openjdk.org Wed Jun 4 12:37:33 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 4 Jun 2025 12:37:33 GMT Subject: RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344) In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 07:25:54 GMT, Emanuel Peter wrote: > Then what about just the dump of the relevant IR nodes in text form? That is what I meant by `full IR snippets` ;) Are the omitted inputs to `AddP`s that you'd like to see? Anything else? Do you want to see them added to: /-> CastPP#110 Store#195 -> Phi#360 -> AddP#133 -> AddP#134 -> CastPP#110 -> AddP#277 -> AddP#278 -> CastPP#283 -> CastPP#283 ? > Is there any (reasonable) way to push the `CastPP` through the `AddP` here? I guess that may mean duplicating some `AddP` in some cases... But it could also give an opportunity for the `CastPP` to common further up that way. What do you think? It is hard for me to see through it without looking at some examples of the IR. That's not where C2 expects the `CastPP`s to be so I suppose it could be quite disruptive but hard for me to tell how much. Beyond that, wouldn't we need to know if one `CastPP` dominates the other `CastPP` before we can common them and would have the same issue we have here? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/25386#issuecomment-2939858255 From mhaessig at openjdk.org Wed Jun 4 12:37:42 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 12:37:42 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> References: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> Message-ID: On Wed, 4 Jun 2025 11:01:38 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Revert buffer size change > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix test/hotspot/jtreg/compiler/arguments/TestCompilerCounts.java line 2: > 1: /* > 2: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. IANAL, but shouldn't this include the year? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24972#discussion_r2126479501 From mhaessig at openjdk.org Wed Jun 4 12:38:26 2025 From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=) Date: Wed, 4 Jun 2025 12:38:26 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v81] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 12:27:37 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. 
>> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Roberto Castañeda Lozano Marked as reviewed by mhaessig (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2896547360 From epeter at openjdk.org Wed Jun 4 12:38:27 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 12:38:27 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v81] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 12:27:37 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements).
>> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. >> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. 
This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Roberto Castañeda Lozano I want to thank all the contributors here again before integration: @TobiHartmann For the early experiments and getting this project started. @tobiasholenstein For taking on the project, and mentoring Said. They came up with the `$` idea for de-duplication of variable names. @theoweidmannoracle For the fantastic idea of using generics and making it more functional. His prototype is what I then ran with. @chhagedorn We had a lot of enlightening conversations, and he was the one who invested the most effort reviewing this patch. He pushed me to improve a lot of API parts, and I learned a lot from his efforts on the Testing Framework, especially that even such frameworks should be tested thoroughly. @robcasloz Also invested a lot of time, and played with it hands-on. One of his ideas was to allow the curly brackets for `${name}` and `#{name}`. @mhaessig Even though he only just joined us, he already jumped on board quickly and is already reviewing, just fantastic!
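To make the `$var` and `#name` ideas from the quoted summary more concrete, here is a small self-contained Java sketch. It deliberately does not use the Template Framework API: `MiniTemplate`, its `render` method, and the `_<id>` suffix scheme are hypothetical stand-ins that only illustrate why renaming per template use avoids variable collisions. The curly-bracket forms `${name}` and `#{name}` are not handled.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy renderer; it only mimics the idea of '$' and '#' replacements.
public class MiniTemplate {
    private static final Pattern DOLLAR = Pattern.compile("\\$(\\w+)");
    private static final Pattern HASH = Pattern.compile("#(\\w+)");

    private int nextId = 1; // each render call counts as one "template use"

    public String render(String body, Map<String, Object> args) {
        final int id = nextId++;
        // Step 1: $var becomes var_<id>, so two uses never share a variable name.
        String renamed = DOLLAR.matcher(body)
                .replaceAll(m -> Matcher.quoteReplacement(m.group(1) + "_" + id));
        // Step 2: #name is replaced by the argument value, formatted as a String.
        // Doing this second means argument values are never rescanned for '$'.
        return HASH.matcher(renamed).replaceAll(m -> {
            Object value = args.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing argument: " + m.group(1));
            }
            return Matcher.quoteReplacement(String.valueOf(value));
        });
    }

    public static void main(String[] unused) {
        MiniTemplate template = new MiniTemplate();
        String body = "int $x = #init; $x += #step;";
        System.out.println(template.render(body, Map.of("init", 42, "step", "7")));
        System.out.println(template.render(body, Map.of("init", 42, "step", "7")));
    }
}
```

Rendering the same body twice gives `x_1` in the first use and `x_2` in the second, so both snippets could be pasted into one method without a name clash. The real framework additionally supports nesting, `Hook`s, and `Name`s, none of which this toy attempts.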
------------- PR Comment: https://git.openjdk.org/jdk/pull/24217#issuecomment-2939859277 From jbhateja at openjdk.org Wed Jun 4 12:39:59 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 12:39:59 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v7] In-Reply-To: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: > This is a follow-up PR#22755 to improve Float16 operations inferencing. > > The existing scheme to detect Float16 operations for some operations is based on pattern matching which expects to receive inputs through ConvHF2F IR, this patch extends matching to accept constant floating point inputs within the Float16 value range. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24179/files - new: https://git.openjdk.org/jdk/pull/24179/files/b95c51cb..18fb6dcb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24179&range=05-06 Stats: 31 lines in 2 files changed: 22 ins; 1 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/24179.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24179/head:pull/24179 PR: https://git.openjdk.org/jdk/pull/24179 From jbhateja at openjdk.org Wed Jun 4 12:40:00 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 12:40:00 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: <5-8MuNe9k6w-QscevXWn50i-ve5wte7b-QO6Js96ASc=.abc645b4-56cf-49a5-9bd0-afdd43454d99@github.com> 
On Wed, 4 Jun 2025 06:28:44 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/convertnode.cpp line 304: >> >>> 302: // expression, this downcast will still preserve significand bits of binary32 NaN. >>> 303: bool isnan = ((*reinterpret_cast(&con) & 0x7F800000) == 0x7F800000) && >>> 304: ((*reinterpret_cast(&con) & 0x7FFFFF) != 0); >> >> Why are you hand-crafting this check here? Is there not some predefined function to do this check? > > Does `g_isnan` not work here? If not, add a comment why :) Nice suggestion! Fixed. >> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 63: >> >>> 61: >>> 62: private static Generator genF = G.uniformFloats(0.0f, 70000.0f); >>> 63: private static Generator genHF = G.uniformFloat16s(Float.floatToFloat16(-2000.0f), Float.floatToFloat16(2000.0f)); >> >> Is there a good reason to only take the uniform distribution? >> >> https://github.com/openjdk/jdk/blob/4a491bef6636441f14fc8bbdedf65063fce038bd/test/hotspot/jtreg/compiler/lib/generators/Generators.java#L102-L105 > > What about `NaN` and `infty` etc? There are some value transforms which are sensitive to specific value ranges, e.g. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/subnode.cpp#L2020 https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/divnode.cpp#L897 Choosing arbitrary random values would make it tricky to put hard IR checks in place; the uniform float range hits the right sweet spot for us.
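For readers following the `isnan` exchange above, here is a minimal sketch of what the hand-written mask test checks, written in Java rather than the C++ under review. A binary32 value is NaN when all eight exponent bits are set and the significand is non-zero. The class and method names are hypothetical and not part of the patch; the point is only that the bit test agrees with the library predicate (`Float.isNaN` here, playing the role `g_isnan` plays in the review).

```java
final class NaNBitsSketch {
    // Exponent mask (bits 30..23) and significand mask (bits 22..0) of IEEE 754 binary32.
    static final int EXP_MASK = 0x7F800000;
    static final int SIG_MASK = 0x007FFFFF;

    // Hand-rolled check using the same masks as the quoted snippet (hypothetical helper).
    static boolean isNaNByBits(float f) {
        int bits = Float.floatToRawIntBits(f);
        return (bits & EXP_MASK) == EXP_MASK && (bits & SIG_MASK) != 0;
    }

    public static void main(String[] args) {
        float[] samples = {
            0.0f, -0.0f, 1.5f,
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY,
            Float.NaN,
            Float.intBitsToFloat(0x7F800001) // NaN with a small payload
        };
        for (float f : samples) {
            // Print both answers side by side so any disagreement is visible.
            System.out.println(f + " -> bits: " + isNaNByBits(f) + ", library: " + Float.isNaN(f));
        }
    }
}
```

Infinities have a zero significand, so they fail the second half of the test, which is why both halves are needed.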
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2126466898 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2126466086 From jbhateja at openjdk.org Wed Jun 4 12:40:02 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 12:40:02 GMT Subject: RFR: 8352635: Improve inferencing of Float16 operations with constant inputs [v5] In-Reply-To: References: <44nVQBYgzCOB2mAB9xtAPvkUcOMJOITA2VjMdDFgm1g=.48266693-48bf-41db-8871-a7dcafe93509@github.com> Message-ID: On Wed, 4 Jun 2025 05:57:53 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Extending tests and review resolutions > > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 335: > >> 333: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() - INEXACT_FP16); >> 334: res += Float.floatToFloat16(INEXACT_FP16 * POSITIVE_ZERO_VAR.floatValue()); >> 335: res += Float.floatToFloat16(POSITIVE_ZERO_VAR.floatValue() / INEXACT_FP16); > > Why is the mul case flipped here? To check for a constant on either side of an expression. > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 363: > >> 361: @Check(test="testSNaNFP16ConstantPatterns") >> 362: public void checkSNaNFP16ConstantPatterns(short actual) throws Exception { >> 363: TestFramework.deoptimize(TestFloat16ScalarOperations.class.getMethod("testSNaNFP16ConstantPatterns")); > > Oh wow, I have never seen this pattern used. Cool idea! Do you know what impact this has on test runtime? IIUC, since the entire framework is based on WhiteBox APIs, the @Check annotated method is invoked only once after each @Test annotated method execution. I don't see much impact on test execution time here; we are just making sure that the expected value gets computed by the interpreter.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2126466180 PR Review Comment: https://git.openjdk.org/jdk/pull/24179#discussion_r2126466229 From chagedorn at openjdk.org Wed Jun 4 12:45:44 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 4 Jun 2025 12:45:44 GMT Subject: RFR: 8344942: Template-Based Testing Framework [v81] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 12:27:37 GMT, Emanuel Peter wrote: >> **Goal** >> We want to generate Java source code: >> - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. >> - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). >> >> Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). >> >> **How to get started** >> When reviewing, please start by looking at: >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 >> >> We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. >> >> Second, look at this advanced test: >> https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 >> >> And then for a "tutorial", look at: >> `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` >> >> It shows these features: >> - The `body` of a Template is essentially a list of `Token`s that are concatenated. 
>> - Templates can be nested: a `TemplateWithArgs` is also a `Token`. >> - We can use `#name` replacements to directly format values into the String. If we had proper String Templates in Java, we would not need this feature. >> - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. >> - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. >> - The use of recursive templates, and `fuel` to limit the recursion. >> - `Name`s: useful to register field and variable names in code scopes. >> >> Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. >> https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 >> >> For a better experience, you may want... > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Roberto Castañeda Lozano Marked as reviewed by chagedorn (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/24217#pullrequestreview-2896583145 From epeter at openjdk.org Wed Jun 4 12:46:27 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 12:46:27 GMT Subject: RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v4] In-Reply-To: References: <96Ny_BPjRCbNlD14DNDUOuQ0IX-F8hx21gxQKVfim9M=.d502019a-27ed-4a35-81ef-bc2aec5e7557@github.com> <_IhK2U23lIUOtBKOt-WMxQ3L7b2t26RzclJRdqbIgms=.3ef9a630-f99c-4de7-994a-bcabf912230b@github.com> <9ABhENoZtR76wmsgRmzeEceDvCvoflfCcbDbK8H2rso=.e351f63f-1331-4e2e-8a02-763a8c0c4f70@github.com> Message-ID: On Wed, 4 Jun 2025 07:50:24 GMT, Kuai Wei wrote: >> @kuaiwei I'm not in a rush with this one. I'd rather we have a good design and be reasonably sure that it is correct, rather than rush it now and having to do extra cycles fixing things later ;) > > Hi @eme64 , I tried to use match pattern for `MergePrimitiveLoads::has_no_merge_load_combine_below()` . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like: > 1) (((item1 Or item2) Or item3) Or item4) > 2) ((item1 Or item2) Or (item3 Or item4)) > ... > To check the next `Or` operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it. > > I think it's more easy to mark the combine operator checked. It works in this way: > * If the checking combine operator has successor combine operator , which is not checked before, we do not optimize it and let the next one has chance to be optimized. > * If we try to merge but failed, so we mark it as a `checked` and add its input into GVN worklist. So its input operators can be checked. > > I added comments of MergePrimitiveLoads::has_no_merge_load_combine_below() to describe the design. > > To reduce the memory size of `AddNode`. 
I removed the flag from `AddNode` and added 2 virtual functions > ```c++ > // Check if this node is checked by merge_memops phase > virtual bool is_merge_memops_checked() const { return false; }; > virtual void set_merge_memops_checked(bool v) { ShouldNotReachHere(); }; > ``` > > The flag , `_merge_memops_checked`, is only added in OrINode and OrLNode. > > Could you help to check the design and code? > > Thanks. @kuaiwei Thanks for your reply! > I think it's more easy to mark the combine operator checked. It may seem easier now. But over time, if multiple operations had such flags, things would become very messy. And now every node that can be such a `combine operator` has to have an additional flag, and consumes more memory. > I tried to use match pattern for MergePrimitiveLoads::has_no_merge_load_combine_below() . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like: > (((item1 Or item2) Or item3) Or item4) > ((item1 Or item2) Or (item3 Or item4)) > ... Yes, we may have to deal with inputs being permuted. But I think we should be able to deal with the permutations; we do that in other places too. > To check the next Or operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it. I'm not sure I understood what you said here. > We may check its all input recursively You probably mean we could check all outputs? So if you are looking at the `OrINode`, and the pattern above it is already a `MergeLoad` pattern, then we should also look down, and see if we find other `OrINode`s. For each of these output nodes, we should check if their other input could also be merged with what we already have. Do you not think this is possible? What exactly makes it difficult or impossible?
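To make the two association shapes from the quoted message concrete, here is a small Java sketch of the kind of source-level pattern a load-merging optimization targets: four adjacent byte loads, shifted and combined with `|`. Both shapes compute the same value as a single little-endian 32-bit load, which is done here through `ByteBuffer` purely as a reference. The class and method names are hypothetical, and nothing here claims how C2 actually matches the pattern.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class OrChainShapes {
    // Shape 1: (((item1 | item2) | item3) | item4), a left-leaning chain.
    static int linear(byte[] a, int i) {
        return (((a[i] & 0xFF) | ((a[i + 1] & 0xFF) << 8))
                | ((a[i + 2] & 0xFF) << 16))
                | ((a[i + 3] & 0xFF) << 24);
    }

    // Shape 2: ((item1 | item2) | (item3 | item4)), a balanced tree.
    static int balanced(byte[] a, int i) {
        int lo = (a[i] & 0xFF) | ((a[i + 1] & 0xFF) << 8);
        int hi = ((a[i + 2] & 0xFF) << 16) | ((a[i + 3] & 0xFF) << 24);
        return lo | hi;
    }

    // Reference: one 4-byte little-endian load covering the same bytes.
    static int merged(byte[] a, int i) {
        return ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).getInt(i);
    }

    public static void main(String[] args) {
        byte[] data = { 0x11, 0x22, 0x33, 0x44, (byte) 0x80, (byte) 0xFF };
        for (int i = 0; i + 4 <= data.length; i++) {
            // All three columns should agree for every start index.
            System.out.printf("%d: %08x %08x %08x%n",
                    i, linear(data, i), balanced(data, i), merged(data, i));
        }
    }
}
```

Since `|` is associative and commutative, the number of equivalent shapes grows quickly with the number of items, which is the permutation problem both sides of the discussion refer to.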
------------- PR Comment: https://git.openjdk.org/jdk/pull/24023#issuecomment-2939903492 From epeter at openjdk.org Wed Jun 4 13:20:33 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 13:20:33 GMT Subject: Integrated: 8344942: Template-Based Testing Framework In-Reply-To: References: Message-ID: On Tue, 25 Mar 2025 08:31:36 GMT, Emanuel Peter wrote: > **Goal** > We want to generate Java source code: > - Make it easy to generate variants of tests. E.g. for each offset, for each operator, for each type, etc. > - Enable the generation of domain specific fuzzers (e.g. random expressions and statements). > > Note: with the Template Library draft I was already able to find a [list of bugs](https://bugs.openjdk.org/issues/?jql=labels%20%3D%20template-framework%20ORDER%20BY%20created%20DESC%2C%20summary%20DESC). > > **How to get started** > When reviewing, please start by looking at: > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestSimple.java#L60-L76 > > We have a Template with two arguments. They are typed (Integer and String). We then apply the arguments `template.withArgs(42, "7")`, producing a `TemplateWithArgs`. This can then be `render`ed to a String. And then that can be compiled and executed with the CompileFramework. > > Second, look at this advanced test: > https://github.com/openjdk/jdk/blob/77079807042fc5a3af04e0ccccad4ecd89e21cdb/test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestAdvanced.java#L102-L119 > > And then for a "tutorial", look at: > `test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestTutorial.java` > > It shows these features: > - The `body` of a Template is essentially a list of `Token`s that are concatenated. > - Templates can be nested: a `TemplateWithArgs` is also a `Token`. > - We can use `#name` replacements to directly format values into the String. 
If we had proper String Templates in Java, we would not need this feature. > - We can use `$var` to make variable names unique: if we applied the same template twice, we would get variable collisions. `$var` is then replaced with e.g. `var_7` in one template use and `var_42` in the other template use. > - The use of `Hook`s to insert code into outer (earlier) code locations. This is useful, for example, to insert fields on demand. > - The use of recursive templates, and `fuel` to limit the recursion. > - `Name`s: useful to register field and variable names in code scopes. > > Next, look at the documentation in. This file is the heart of the Template Framework, and describes all the important features. > https://github.com/openjdk/jdk/blob/d21a8aabaf3b191e851b6997c11bb30fcd0f942f/test/hotspot/jtreg/compiler/lib/template_framework/Template.java#L31-L76 > > For a better experience, you may want to generate the `javadocs`: > `javadoc -sourcepath test/hotspot/j... This pull request has now been integrated. 
Changeset: 248341d3 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/248341d372ba9c1031729a65eb10d8def52de641 Stats: 6722 lines in 27 files changed: 6722 ins; 0 del; 0 mod 8344942: Template-Based Testing Framework Co-authored-by: Tobias Hartmann Co-authored-by: Tobias Holenstein Co-authored-by: Theo Weidmann Co-authored-by: Roberto Castañeda Lozano Co-authored-by: Christian Hagedorn Co-authored-by: Manuel Hässig Reviewed-by: chagedorn, mhaessig, rcastanedalo ------------- PR: https://git.openjdk.org/jdk/pull/24217 From mhaessig at openjdk.org Wed Jun 4 13:24:58 2025 From: mhaessig at openjdk.org (Manuel Hässig) Date: Wed, 4 Jun 2025 13:24:58 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2] In-Reply-To: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> References: <-PFiiMlUghbFgg2fuU86vuEXKaexylDuk3kBdcBn9N8=.2c272bf1-10a7-4110-8919-f33ee0d491ba@github.com> Message-ID: On Tue, 3 Jun 2025 17:37:41 GMT, Vladimir Kozlov wrote: >> Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision: >> >> - Add comment to benchmark as to why we fix the heap size >> - Add missing null chec >> - Fix typos > > src/hotspot/cpu/x86/peephole_x86_64.cpp line 244: > >> 242: // the DecodeN. However, after matching the DecodeN is added back as the base for the leaP*, >> 243: // which is nessecary if the oop derived by the leaP* gets added to an OopMap, because OopMaps >> 244: // cannot contain derived oops with narrow oops as a base. > > Am I correct to assume that if it is referenced in OopMap (which is side table) it will by referenced by some Safepoint node in graph? Exactly. This is why I can get away with only checking the usages of the decode.
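As a rough illustration of why a leftover decode is redundant from a pure address-computation point of view: decoding a narrow oop yields `base + (narrow << shift)`, and a folded `lea` yields that same sum plus the field offset in one step. The sketch below uses plain `long` arithmetic in Java. The heap base, shift, narrow value, and offset are made-up assumptions, the names are hypothetical, and the sketch says nothing about the OopMap constraint that keeps the decode alive in the matcher.

```java
final class NarrowOopAddressSketch {
    // Decode a compressed (narrow) oop: heap base plus the unsigned narrow value scaled by the shift.
    static long decode(long heapBase, int narrow, int shift) {
        return heapBase + ((narrow & 0xFFFFFFFFL) << shift);
    }

    // Folded form: the same computation plus a constant field offset, done in one step.
    static long decodeWithOffset(long heapBase, int narrow, int shift, int offset) {
        return heapBase + ((narrow & 0xFFFFFFFFL) << shift) + offset;
    }

    public static void main(String[] args) {
        long heapBase = 0x0000_0008_0000_0000L; // hypothetical heap base
        int narrow = 0x1234_5678;               // hypothetical narrow oop value
        int shift = 3;                          // assumes 8-byte object alignment
        int offset = 12;                        // hypothetical field offset

        // Two-step version: decode first, then add the offset.
        long viaDecode = decode(heapBase, narrow, shift) + offset;
        // One-step version: everything folded into a single address expression.
        long folded = decodeWithOffset(heapBase, narrow, shift, offset);

        System.out.println("two-step == folded: " + (viaDecode == folded));
    }
}
```

If nothing else consumes the result of the standalone decode, the folded form already produces everything the dereference needs.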
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2126592531 From yzheng at openjdk.org Wed Jun 4 13:23:58 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 4 Jun 2025 13:23:58 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v5] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 08:07:40 GMT, Tom Shull wrote: >> This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. >> >> In addition, two methods are added to the BootstrapMethodInvocations: >> 1. `void resolve()` >> 2. `JavaConstant lookup()` >> >> The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. > > Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357660 > - Merge remote-tracking branch 'origin/master' into JDK-8357660 > - commit to trigger testing > - commit to trigger testing > - reviewer feedback and update javadoc formatting > - complete changes > - commit review suggestion > > Co-authored-by: Douglas Simon > - commit review suggestion > > Co-authored-by: Douglas Simon > - change to allow both indys and condys to be looked up all at once > - address reviewer feedback > - ... and 2 more: https://git.openjdk.org/jdk/compare/2eb99b1a...c7f5c1a7 LGTM ------------- Marked as reviewed by yzheng (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/25420#pullrequestreview-2896717439 From shade at openjdk.org Wed Jun 4 13:37:55 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 13:37:55 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: References: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> Message-ID: On Wed, 4 Jun 2025 12:28:45 GMT, Manuel Hässig wrote: >> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Revert buffer size change >> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count >> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count >> - Better test, patch amendments >> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count >> - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count >> - Unnecessary arch limitation >> - Simplify test >> - Adjust test bound >> - Fix > > test/hotspot/jtreg/compiler/arguments/TestCompilerCounts.java line 2: > >> 1: /* >> 2: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > IANAL, but shouldn't this include the year? Nope, that's our standard header.
Saves us a hassle of updating the header every 365 days ;) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24972#discussion_r2126623649 From duke at openjdk.org Wed Jun 4 13:49:59 2025 From: duke at openjdk.org (duke) Date: Wed, 4 Jun 2025 13:49:59 GMT Subject: RFR: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool [v5] In-Reply-To: References: Message-ID: <4wnk2wbbgb3jbkjwVVvDH6JZH-EkT55XT01HhtXmVAI=.8d87c14e-770f-4a24-99af-5bff3c3fb89b@github.com> On Wed, 4 Jun 2025 08:07:40 GMT, Tom Shull wrote: >> This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. >> >> In addition, two methods are added to the BootstrapMethodInvocations: >> 1. `void resolve()` >> 2. `JavaConstant lookup()` >> >> The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. > > Tom Shull has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into JDK-8357660 > - Merge remote-tracking branch 'origin/master' into JDK-8357660 > - commit to trigger testing > - commit to trigger testing > - reviewer feedback and update javadoc formatting > - complete changes > - commit review suggestion > > Co-authored-by: Douglas Simon > - commit review suggestion > > Co-authored-by: Douglas Simon > - change to allow both indys and condys to be looked up all at once > - address reviewer feedback > - ... 
and 2 more: https://git.openjdk.org/jdk/compare/865473d8...c7f5c1a7 @teshull Your change (at version c7f5c1a79a8ef8fdc7d50ee03b78ebc62b53fc83) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25420#issuecomment-2940109500 From kvn at openjdk.org Wed Jun 4 13:53:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 13:53:54 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor [v2] In-Reply-To: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> References: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> Message-ID: On Tue, 3 Jun 2025 15:36:22 GMT, Ashutosh Mehra wrote: >> This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. > > Ashutosh Mehra has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8358330 > - Address review comments > > Signed-off-by: Ashutosh Mehra > - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor > > Signed-off-by: Ashutosh Mehra My testing passed. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25598#pullrequestreview-2896826999 From duke at openjdk.org Wed Jun 4 13:54:00 2025 From: duke at openjdk.org (Tom Shull) Date: Wed, 4 Jun 2025 13:54:00 GMT Subject: Integrated: 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool In-Reply-To: References: Message-ID: On Fri, 23 May 2025 17:37:14 GMT, Tom Shull wrote: > This PR adds support for directly retrieving both all invokedynamic and all condy BootstrapMethodInvocations from a ConstantPool via the new method `List lookupBootstrapMethodInvocations(boolean invokeDynamic)`. > > In addition, two methods are added to the BootstrapMethodInvocations: > 1. `void resolve()` > 2. `JavaConstant lookup()` > > The combination of these two features allows one to directly interact with all BSM information of a given ConstantPool without having to iterate through all of the Classfile's methods to find all invokedynamic bytecodes and/or iterate through all Constant Pool entries. This pull request has now been integrated. 
Changeset: 0352477f Author: Tom Shull Committer: Doug Simon URL: https://git.openjdk.org/jdk/commit/0352477ff5977b0010e62000adbde88026a49a7e Stats: 144 lines in 5 files changed: 132 ins; 0 del; 12 mod 8357660: [JVMCI] Add support for retrieving all BootstrapMethodInvocations directly from ConstantPool Reviewed-by: dnsimon, yzheng ------------- PR: https://git.openjdk.org/jdk/pull/25420 From kvn at openjdk.org Wed Jun 4 13:57:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 13:57:54 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> References: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> Message-ID: On Wed, 4 Jun 2025 11:01:38 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Revert buffer size change > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Re-approved ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/24972#pullrequestreview-2896843574 From asmehra at openjdk.org Wed Jun 4 14:01:59 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Wed, 4 Jun 2025 14:01:59 GMT Subject: RFR: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor [v2] In-Reply-To: References: <6yLBtrUKBPgV63susOsKKhAPYCofyOI_Yd0wqbSqrCU=.12d4c0ca-0fa6-4000-a5e1-3ffd0f2ea6cc@github.com> Message-ID: On Wed, 4 Jun 2025 13:50:51 GMT, Vladimir Kozlov wrote: >> Ashutosh Mehra has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8358330 >> - Address review comments >> >> Signed-off-by: Ashutosh Mehra >> - 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor >> >> Signed-off-by: Ashutosh Mehra > > My testing passed. @vnkozlov thanks for testing and reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25598#issuecomment-2940155028 From kvn at openjdk.org Wed Jun 4 14:06:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 4 Jun 2025 14:06:53 GMT Subject: RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 11:13:35 GMT, Manuel Hässig wrote: >> ## Summary >> >> On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules.
However, the generated code contains an additional `lea` with an unused result: >> >> ; OptoAssembly >> 03d decode_heap_oop_not_null R8,R10 >> 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32 >> >> ; x86 >> 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused >> 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset >> >> >> This PR adds a peephole optimization to remove such redundant `lea`s. >> >> ## The Issue in Detail >> >> The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes >> >> LoadN -> decodeHeapOop_not_null -> leaP* >> ______________________________? >> >> where `leaP*` is either of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode. Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`: >> >> https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897 >> >> On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop. Because as of now, derived oops in oop maps cannot have narrow base pointers. >> >> This leaves us with a handful of possible solutions: >> 1. implement narrow bases for derived oops in oop maps, >> 2. perform some dead code elimination after we know which oops are part of oop maps, >> 3. 
add a peephole optimization to simply remove unused `lea`s. >> >> Option 1 would have been ideal in the sense, that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the regi... > > Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision: > > - Add comment to benchmark as to why we fix the heap size > - Add missing null chec > - Fix typos Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25471#pullrequestreview-2896878808 From rcastanedalo at openjdk.org Wed Jun 4 14:09:53 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 4 Jun 2025 14:09:53 GMT Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8] In-Reply-To: References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> Message-ID: <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com> On Wed, 4 Jun 2025 09:23:24 GMT, Roland Westrelin wrote: > > In the common case where allocations are not eliminated, matching transforms the introduced `NarrowMemProj` nodes into a sequence of redundant, raw `MemProj` nodes, see e.g. B6 here: [after-gcm.pdf](https://github.com/user-attachments/files/20477560/after-gcm.pdf). Would it be possible to clean them up during matching (or perhaps already during, or right after, macro expansion)? > > Thanks for looking at this @robcasloz I made the change you requested. Thanks, will run some testing and come back with the results.
------------- PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2940179798 From shade at openjdk.org Wed Jun 4 14:24:10 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 14:24:10 GMT Subject: RFR: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines [v4] In-Reply-To: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> References: <2KMWJk5gHoc3t7lOHQRIyLViPdWxuQJpGsYFgma-Sic=.007d743c-8951-4719-b6b2-33dff63860e4@github.com> Message-ID: On Wed, 4 Jun 2025 11:01:38 GMT, Aleksey Shipilev wrote: >> There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. >> >> But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. 
>> >> >> $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ >> -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ >> -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done >> >> # Before >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 2 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> # After >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 1 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 3 >> intx CICompilerCount = 4 >> >> >> It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. >> >> Additional testing: >> - [x] New regression test passes with the fix, fails without it >> - [x] GHA >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Revert buffer size change > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Better test, patch amendments > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Merge branch 'master' into JDK-8356000-c1-c2-compiler-count > - Unnecessary arch limitation > - Simplify test > - Adjust test bound > - Fix Excellent, thanks for reviews! Here goes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/24972#issuecomment-2940233103 From shade at openjdk.org Wed Jun 4 14:24:14 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 4 Jun 2025 14:24:14 GMT Subject: Integrated: 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines In-Reply-To: References: Message-ID: On Wed, 30 Apr 2025 19:00:23 GMT, Aleksey Shipilev wrote: > There is an unfortunate limitation with default tiered policy that we would have at least 2 threads on 1 CPU machine: 1 thread for C1, and 1 thread for C2. > > But if we select C1-only or C2-only modes, we _also_ get 2 compiler threads, for which we have no good reason. These threads would just step on each other toes. The fix changes the behavior for 1..3 CPU hosts in C1/C2-only configurations, by using 1 thread instead of 2 threads. The change for 1 CPU config is what we really need. The change in 2..3 CPU configs is an additional effect, but I think it is still good not to use 100%/66% of the CPUs in those configurations as well. > > > $ for I in `seq 1 8`; do build/linux-x86_64-server-release/images/jdk/bin/java \ > -XX:-TieredCompilation -XX:ActiveProcessorCount=${I} \ > -XX:+PrintFlagsFinal 2>&1 | grep "CICompilerCount "; done > > # Before > intx CICompilerCount = 2 > intx CICompilerCount = 2 > intx CICompilerCount = 2 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 4 > > # After > intx CICompilerCount = 1 > intx CICompilerCount = 1 > intx CICompilerCount = 1 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 3 > intx CICompilerCount = 4 > > > It is a minor bug in `CompilationPolicy::initialize`, but it gets in the way studying Leyden in tight CPU scenarios. 
> > Additional testing: > - [x] New regression test passes with the fix, fails without it > - [x] GHA > - [x] Linux AArch64 server fastdebug, `all` This pull request has now been integrated. Changeset: 4e314cb9 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/4e314cb9e025672b2f7b68cc021fa516ee219ad8 Stats: 185 lines in 2 files changed: 182 ins; 0 del; 3 mod 8356000: C1/C2-only modes use 2 compiler threads on low CPU count machines Reviewed-by: kvn, dfenacci, galder ------------- PR: https://git.openjdk.org/jdk/pull/24972 From rcastanedalo at openjdk.org Wed Jun 4 15:08:51 2025 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 4 Jun 2025 15:08:51 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations In-Reply-To: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> References: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> Message-ID: On Tue, 3 Jun 2025 07:17:47 GMT, Daniel Skantz wrote: > This PR updates a few tests to reintroduce testing of string concatenation optimizations since a few bugs have recently been identified in this area. > > Selection criteria: performed a text search on the test suite and identified tests for string concatenations or string optimizations that are not currently compiled with `-XDstringConcat=inline` and are not using StringBuilders explicitly. > > Testing: T1-4. > > Extra testing: ran the tests manually with `-XX:+PrintOptimizeStringConcat` and verified that the tests are exercising string optimizations after the fix. test/hotspot/jtreg/compiler/intrinsics/string/TestStringIntrinsics.java line 1: > 1: /* Do we need to add a second run at all to this test case? As far as I can see, all `concat*` test cases use explicit string builders and already exercise C2's string concatenation optimizations.
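For readers following the distinction drawn in this review comment, here is a minimal sketch of the two source shapes under discussion. The class and method names are hypothetical and do not come from the tests named above. The assumption, based on javac's behaviour since JDK 9, is that plain `+` is compiled to an `invokedynamic` call unless `-XDstringConcat=inline` is passed, whereas an explicit `StringBuilder` chain is emitted as written and is the shape C2's string concatenation optimization looks for.

```java
// Hypothetical illustration; not taken from the jtreg tests discussed in the thread.
final class ConcatForms {
    // Plain `+`: with default javac settings (JDK 9+) this becomes an
    // invokedynamic call; with -XDstringConcat=inline javac emits a
    // StringBuilder chain instead.
    static String plus(String a, int b, String c) {
        return a + b + c;
    }

    // Explicit StringBuilder chain: compiled as written regardless of the
    // javac concat strategy.
    static String builder(String a, int b, String c) {
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    public static void main(String[] args) {
        // Both forms build the same string; only the bytecode shape differs.
        System.out.println(plus("x", 42, "y").equals(builder("x", 42, "y")));
    }
}
```

Both methods return the same value, so the difference only matters for which compiler paths a test exercises.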
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25610#discussion_r2126842291 From epeter at openjdk.org Wed Jun 4 16:02:12 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 4 Jun 2025 16:02:12 GMT Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class Message-ID: We might want to write many IR/TestFramework tests, and so I would like to integrate a Template that generates the class, and the user has to only generate a list of tests. This is a first extension for https://github.com/openjdk/jdk/pull/24217. I had already prototyped it earlier and plan to use it in multiple tests https://github.com/openjdk/jdk/pull/23418 (see `IRTestClass.java`). https://github.com/openjdk/jdk/blob/11d55dc7ff2b2137700248e11492e11d2d748cab/test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java#L36-L44 ------------- Commit messages: - JDK-8358600 Changes: https://git.openjdk.org/jdk/pull/25643/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25643&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358600 Stats: 271 lines in 2 files changed: 271 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/25643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25643/head:pull/25643 PR: https://git.openjdk.org/jdk/pull/25643 From sviswanathan at openjdk.org Wed Jun 4 16:12:56 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 4 Jun 2025 16:12:56 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v4] In-Reply-To: References: Message-ID: On Tue, 3 Jun 2025 18:07:34 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. 
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resolutions src/hotspot/cpu/x86/x86_64.ad line 11341: > 11339: %{ > 11340: // Strict predicate check to make selection of xorL_rReg_im1_ndd cost agnostic if immL32 src2 is -1. > 11341: predicate(UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); We need a check here for isa_long() before accessing is_long() otherwise is_long() may assert. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2125105798 From asmehra at openjdk.org Wed Jun 4 16:55:56 2025 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Wed, 4 Jun 2025 16:55:56 GMT Subject: Integrated: 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor In-Reply-To: References: Message-ID: On Mon, 2 Jun 2025 18:32:12 GMT, Ashutosh Mehra wrote: > This patch fixes a possible assert in debug builds if the allocation of memory for a CodeBlob fails when loading it from the AOT Code Cache. See description of [JDK-8358330](https://bugs.openjdk.org/browse/JDK-8358330) for more details. This pull request has now been integrated. 
Changeset: fd0ab043 Author: Ashutosh Mehra URL: https://git.openjdk.org/jdk/commit/fd0ab043677d103628afde628e3e75e23fb518b2 Stats: 51 lines in 5 files changed: 21 ins; 27 del; 3 mod 8358330: AsmRemarks and DbgStrings clear() method may not get called before their destructor Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/25598 From cslucas at openjdk.org Wed Jun 4 17:41:03 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 4 Jun 2025 17:41:03 GMT Subject: RFR: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported In-Reply-To: References: <2kB23xVQDRb7YT6aMt1SbIfPwSG1ummK29A1Hs3FD0Y=.59ea7dd7-c6fd-486c-a996-f839c9a15718@github.com> <6F6gsbyRSrJ7_XHXoMh8j15Mog2DMec5DaOCVAdcdFQ=.0799c52f-275c-4303-8bd8-05f341c20ae0@github.com> Message-ID: On Wed, 4 Jun 2025 10:25:30 GMT, Aleksey Shipilev wrote: >> I agree, the `return nullptr;` should have been added below the assert in the else branch, no check required. > > Yes, putting `return nullptr;` into existing branch would have been cleaner. One more reason to wait for reviews! We can do a quick follow-up that sweeps this return to its better place, if you feel strongly about it. Note this would likely get backported, so being extra-clean pays off the process hassle. It is about 15 minute deal for me, I am happy to do it as a penance for not coming up with it myself :) I agree that just adding the return after the assert would have been a better option. I opted for adding the `if (cmp == nullptr)` because of excess of caution. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25627#discussion_r2127115196 From jbhateja at openjdk.org Wed Jun 4 18:05:52 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 4 Jun 2025 18:05:52 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v4] In-Reply-To: References: Message-ID: <4RfdfDQvc_xT1eNJKUf0AiSBJEU27P8WWmnufSyi6WI=.ee7abfe6-e927-4230-a1f5-ecfedad9a630@github.com> On Tue, 3 Jun 2025 23:21:52 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments resolutions > > src/hotspot/cpu/x86/x86_64.ad line 11341: > >> 11339: %{ >> 11340: // Strict predicate check to make selection of xorL_rReg_im1_ndd cost agnostic if immL32 src2 is -1. >> 11341: predicate(UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); > > We need a check here for isa_long() before accessing is_long() otherwise is_long() may assert. Matcher DFA state checks preceding predicate checks will implicitly ensure this ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2127155498 From sviswanathan at openjdk.org Wed Jun 4 21:14:59 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 4 Jun 2025 21:14:59 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v6] In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 09:51:03 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. 
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Adding comments. test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/BlsiTestI.java line 78: > 76: (byte) 0x00, > 77: (byte) 0xF3, > 78: (byte) 0x3}; The line 78 should be same as line 61. test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/BlsmskTestI.java line 77: > 75: (byte) 0x00, > 76: (byte) 0xF3, > 77: (byte) 0x2}; This line 77 should be same as line 60. test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/BlsrTestI.java line 78: > 76: (byte) 0x00, > 77: (byte) 0xF3, > 78: (byte) 0x1}; The line 78 should be same as line 61. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2127042159 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2127043141 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2127043645 From epeter at openjdk.org Thu Jun 5 06:16:47 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 5 Jun 2025 06:16:47 GMT Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v2] In-Reply-To: References: Message-ID: > We might want to write many IR/TestFramework tests, and so I would like to integrate a Template that generates the class, and the user has to only generate a list of tests. > > This is a first extension for https://github.com/openjdk/jdk/pull/24217. I had already prototyped it earlier and plan to use it in multiple tests https://github.com/openjdk/jdk/pull/23418 (see `IRTestClass.java`). > > https://github.com/openjdk/jdk/blob/dc640cbd8fb8ec76920a7ab52dfe7955ed1d77f2/test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java#L36-L45 Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: streamline API to a single render method ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25643/files - new: https://git.openjdk.org/jdk/pull/25643/files/11d55dc7..dc640cbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25643&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25643&range=00-01 Stats: 70 lines in 2 files changed: 22 ins; 26 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/25643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25643/head:pull/25643 PR: https://git.openjdk.org/jdk/pull/25643 From duke at openjdk.org Thu Jun 5 07:20:07 2025 From: duke at openjdk.org (Yuri Gaevsky) Date: Thu, 5 Jun 2025 07:20:07 GMT Subject: RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v8] In-Reply-To: References: 
<5e1o1xtN0ZdQZGJi2aVmgCEApW625koeE9F53VhDi5E=.2390045d-844e-4800-8d4b-075a2a3a8793@github.com> Message-ID: <8OzDPy_-fHmZXhZ2fvVjqmDWzIibOPNix2SDuwkRQbg=.8b1fb4a0-399f-4f58-af2b-5ce2a0f7bfbc@github.com> On Mon, 5 May 2025 18:10:02 GMT, Yuri Gaevsky wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> change slli+add sequence to shadd > > As you can expect I am trying to implement the following code with RVV: > > for (; i + (N-1) < cnt; i += N) { > h = 31^^N * h > + 31^^(N-1) * val[i + 0] > + 31^^(N-2) * val[i + 1] > ... > + 31^^1 * val[i + (N-2)] > + 31^^0 * val[i + (N-1)]; > } > for (; i < cnt; i++) { > h = 31 * h + val[i]; > } > > where `N` is a number of processing array elements in "chunk". > IIUC, the main issue with your approach is "reverse" order of array elements versus preloaded `31^^X` coeffs WHEN the remaining number of elems is less than `N`, say `M=N-1`. > > h = 31^^M * h > + 31^^(M-1) * val[i + 0] > + 31^^(M-2) * val[i + 1] > ... > + 31^^1 * val[i + (M-2)] > + 31^^0 * val[i + (M-1)]; > > or returning to our `N` for clarity > > h = 31^^(N-1) * h > + 31^^(N-2) * val[i + 0] > + 31^^(N-3) * val[i + 1] > ... > + 31^^1 * val[i + (N-3)] > + 31^^0 * val[i + (N-2)]; > > Now we need to "slide down" preloaded multiplier coeffs in designated vector register by one (as `M=N-1`) to be in "sync" with `val[i + X]` (may be move them into temporary VR in the process), and moreover, DO this operation IFF the remaining `cnt` is less than `N` (==>an additional check on every iteration). That's probably acceptable only at tail phase as one-time operation but NOT inside of main loop... > @ygaevsky @RealFYang how can we proceed? My apologies, just busy at the moment with other things, going to update the patch soon.
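The chunked recurrence quoted above can be checked in plain Java. The following is only a scalar sketch under stated assumptions: the class and method names are hypothetical, the initial value `h = 1` is chosen to match `java.util.Arrays.hashCode(int[])`, and nothing here reflects the actual RVV stub in the PR. It shows that processing `N` elements per step with precomputed powers of 31, followed by a scalar tail loop, gives the same result as the one-element-at-a-time form.

```java
import java.util.Arrays;

// Hypothetical scalar model of the chunked loop from the mail; not the RVV code.
final class ChunkedHash {
    // Processes n elements per main-loop step using precomputed powers of 31,
    // then finishes the remaining elements with the scalar recurrence.
    static int chunkedHash(int[] val, int n) {
        int[] pow = new int[n + 1];          // pow[k] = 31^k, wrapping in int arithmetic
        pow[0] = 1;
        for (int k = 1; k <= n; k++) {
            pow[k] = 31 * pow[k - 1];
        }
        int h = 1;                           // assumption: same seed as Arrays.hashCode
        int i = 0;
        for (; i + (n - 1) < val.length; i += n) {
            int acc = pow[n] * h;            // 31^N * h
            for (int j = 0; j < n; j++) {
                acc += pow[n - 1 - j] * val[i + j];  // 31^(N-1-j) * val[i + j]
            }
            h = acc;
        }
        for (; i < val.length; i++) {        // tail: fewer than n elements left
            h = 31 * h + val[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        System.out.println(chunkedHash(data, 4) == Arrays.hashCode(data));
    }
}
```

In this model the tail is handled by the scalar loop, so the coefficient order never has to be shifted; the "slide down" concern in the mail arises only if the tail is also processed with the preloaded coefficient vector.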
------------- PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-2943036356 From jbhateja at openjdk.org Thu Jun 5 08:08:48 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 5 Jun 2025 08:08:48 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v7] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25501/files - new: https://git.openjdk.org/jdk/pull/25501/files/38bf655e..45db368d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=05-06 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/25501.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501 PR: https://git.openjdk.org/jdk/pull/25501 From roland at openjdk.org Thu Jun 5 08:27:47 2025 From: roland at openjdk.org (Roland Westrelin) Date: Thu, 5 Jun 2025 08:27:47 GMT Subject: RFR: 8342692: C2: long counted loop/long range checks: don't create loop-nest for short running loops [v34] In-Reply-To: References: Message-ID: > To optimize a long counted loop and long range checks in a long or int > counted loop, the loop is turned into a loop nest. When the loop has > few iterations, the overhead of having an outer loop whose backedge is > never taken, has a measurable cost. Furthermore, creating the loop > nest usually causes one iteration of the loop to be peeled so > predicates can be set up. If the loop is short running, then it's an > extra iteration that's run with range checks (compared to an int > counted loop with int range checks). > > This change doesn't create a loop nest when: > > 1- it can be determined statically at loop nest creation time that the > loop runs for a short enough number of iterations > > 2- profiling reports that the loop runs for no more than ShortLoopIter > iterations (1000 by default). > > For 2-, a guard is added which is implemented as yet another predicate. 
> > While this change is in principle simple, I ran into a few > implementation issues: > > - while c2 has a way to compute the number of iterations of an int > counted loop, it doesn't have that for long counted loops. The > existing logic for int counted loops promotes values to long to > avoid overflows. I reworked it so it now works for both long and int > counted loops. > > - I added a new deoptimization reason (Reason_short_running_loop) for > the new predicate. Given the number of iterations is narrowed down > by the predicate, the limit of the loop after transformation is a > cast node that's control dependent on the short running loop > predicate. Because once the counted loop is transformed, it is > likely that range check predicates will be inserted and they will > depend on the limit, the short running loop predicate has to be the > one that's further away from the loop entry. Now it is also possible > that the limit before transformation depends on a predicate > (TestShortRunningLongCountedLoopPredicatesClone is an example), we > can have: new predicates inserted after the transformation that > depend on the casted limit that itself depends on old predicates > added before the transformation. To solve this circular dependency, > parse and assert predicates are cloned between the old predicates > and the loop head. The cloned short running loop parse predicate is > the one that's used to insert the short running loop predicate. > > - In the case of a long counted loop, the loop is transformed into a > regular loop with a new limit and transformed range checks that's > later turned into an int counted loop. The int ... Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 94 commits: - small fix - Merge branch 'master' into JDK-8342692 - review - review - Update test/micro/org/openjdk/bench/java/lang/foreign/HeapMismatchManualLoopTest.java Co-authored-by: Christian Hagedorn - Update test/hotspot/jtreg/compiler/longcountedloops/TestShortRunningLongCountedLoopScaleOverflow.java Co-authored-by: Christian Hagedorn - Update test/hotspot/jtreg/compiler/longcountedloops/TestShortRunningLongCountedLoopPredicatesClone.java Co-authored-by: Christian Hagedorn - Update test/hotspot/jtreg/compiler/longcountedloops/TestShortRunningLongCountedLoop.java Co-authored-by: Christian Hagedorn - Update test/hotspot/jtreg/compiler/longcountedloops/TestShortRunningIntLoopWithLongChecksPredicates.java Co-authored-by: Christian Hagedorn - Update src/hotspot/share/opto/loopnode.cpp Co-authored-by: Christian Hagedorn - ... and 84 more: https://git.openjdk.org/jdk/compare/faf19abd...fd19ee84 ------------- Changes: https://git.openjdk.org/jdk/pull/21630/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21630&range=33 Stats: 1618 lines in 26 files changed: 1539 ins; 22 del; 57 mod Patch: https://git.openjdk.org/jdk/pull/21630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/21630/head:pull/21630 PR: https://git.openjdk.org/jdk/pull/21630 From dskantz at openjdk.org Thu Jun 5 08:51:11 2025 From: dskantz at openjdk.org (Daniel Skantz) Date: Thu, 5 Jun 2025 08:51:11 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations [v2] In-Reply-To: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> References: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> Message-ID: > This PR updates a few tests to reintroduce testing of string concatenation optimizations since a few bugs have recently been identified in this area. 
> > Selection criteria: performed a text search on the test suite and identified tests for string concatenations or string optimizations that are not currently compiled with `-XDstringConcat=inline` and are not using StringBuilders explicitly. > > Testing: T1-4. > > Extra testing: ran the tests manually with `-XX:+PrintOptimizeStringConcat` and verified that the tests are exercising string optimizations after the fix. Daniel Skantz has updated the pull request incrementally with two additional commits since the last revision: - revert change to TestStringIntrinsics.java - Update test/hotspot/jtreg/compiler/c2/Test7046096.java Co-authored-by: Emanuel Peter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25610/files - new: https://git.openjdk.org/jdk/pull/25610/files/731667f4..7fd8568a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25610&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25610&range=00-01 Stats: 13 lines in 2 files changed: 0 ins; 11 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25610.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25610/head:pull/25610 PR: https://git.openjdk.org/jdk/pull/25610 From dskantz at openjdk.org Thu Jun 5 08:51:11 2025 From: dskantz at openjdk.org (Daniel Skantz) Date: Thu, 5 Jun 2025 08:51:11 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations [v2] In-Reply-To: References: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> Message-ID: On Wed, 4 Jun 2025 15:06:19 GMT, Roberto Castañeda Lozano wrote: >> Daniel Skantz has updated the pull request incrementally with two additional commits since the last revision: >> >> - revert change to TestStringIntrinsics.java >> - Update test/hotspot/jtreg/compiler/c2/Test7046096.java >> >> Co-authored-by: Emanuel Peter > > test/hotspot/jtreg/compiler/intrinsics/string/TestStringIntrinsics.java line 1: > >> 1: /* > > Do we need
to add a second run at all to this test case? As far as I can see, all `concat*` test cases use explicit string builders and already exercise C2's string concatenation optimizations. Thanks for checking! I reverted the change to this test as on a second look the benefits of adding the new configuration to it are modest and out of scope. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25610#discussion_r2128300211 From duke at openjdk.org Thu Jun 5 09:12:30 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:12:30 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: References: Message-ID: > This patch optimizes the following patterns: > For integer types: > > (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) > => (VectorMaskCmp src1 src2 ncond) > (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) > => (VectorMaskCmp src1 src2 ncond) > > cond can be eq, ne, le, ge, lt, gt, ule, uge, ult and ugt, ncond is the negative comparison of cond. > > For float and double types: > > (XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) > => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) > (XorVMask (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (MaskAll m1)) > => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) > > cond can be eq or ne. 
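The rewrite rules above rest on the identity "not(compare with cond) equals compare with the negated cond". A scalar sketch of that identity follows; the class and helper names are hypothetical, it models lanes with plain arrays rather than the Vector API, and the NaN explanation for the float restriction is an assumption based on IEEE 754 comparison semantics, not something stated in the PR text.

```java
import java.util.Arrays;

// Hypothetical lane-wise model of XorV(VectorMaskCmp(a, b, cond), all-ones).
final class NegatedCompare {
    // Lane-wise a[i] < b[i], standing in for a mask compare with cond = lt.
    static boolean[] lt(int[] a, int[] b) {
        boolean[] m = new boolean[a.length];
        for (int i = 0; i < a.length; i++) m[i] = a[i] < b[i];
        return m;
    }

    // Lane-wise a[i] >= b[i], the negated condition of lt.
    static boolean[] ge(int[] a, int[] b) {
        boolean[] m = new boolean[a.length];
        for (int i = 0; i < a.length; i++) m[i] = a[i] >= b[i];
        return m;
    }

    // Lane-wise logical not, standing in for the xor with an all-true mask.
    static boolean[] not(boolean[] m) {
        boolean[] r = new boolean[m.length];
        for (int i = 0; i < m.length; i++) r[i] = !m[i];
        return r;
    }

    // For floats, !(a < b) and (a >= b) disagree when either operand is NaN.
    static boolean notLtEqualsGe(float a, float b) {
        return !(a < b) == (a >= b);
    }

    // eq/ne stay exact complements even with NaN operands.
    static boolean notEqEqualsNe(float a, float b) {
        return !(a == b) == (a != b);
    }

    public static void main(String[] args) {
        int[] a = {1, 5, -3, 7};
        int[] b = {2, 5, -4, 7};
        System.out.println(Arrays.equals(not(lt(a, b)), ge(a, b)));
        System.out.println(notLtEqualsGe(Float.NaN, 1f));
        System.out.println(notEqEqualsNe(Float.NaN, 1f));
    }
}
```

For integer lanes every ordered condition has an exact complement, which is consistent with the full list of conditions given for integer types; for float lanes only eq and ne remain complements once NaN is possible.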
> > Benchmarks on Nvidia Grace machine with 128-bit SVE2: With option `-XX:UseSVE=2`: > > Benchmark Unit Before Score Error After Score Error Uplift > testCompareEQMaskNotByte ops/s 7912127.225 2677.289518 10266136.26 8955.008548 1.29 > testCompareEQMaskNotDouble ops/s 884737.6799 446.963779 1179760.772 448.031844 1.33 > testCompareEQMaskNotFloat ops/s 1765045.787 682.332214 2359520.803 896.305743 1.33 > testCompareEQMaskNotInt ops/s 1787221.411 977.743935 2353952.519 960.069976 1.31 > testCompareEQMaskNotLong ops/s 895297.1974 673.44808 1178449.02 323.804205 1.31 > testCompareEQMaskNotShort ops/s 3339987.002 3415.2226 4712761.965 2110.862053 1.41 > testCompareGEMaskNotByte ops/s 7907615.16 4094.243652 10251646.9 9486.699831 1.29 > testCompareGEMaskNotInt ops/s 1683738.958 4233.813092 2352855.205 1251.952546 1.39 > testCompareGEMaskNotLong ops/s 854496.1561 8594.598885 1177811.493 521.1229 1.37 > testCompareGEMaskNotShort ops/s 3341860.309 1578.975338 4714008.434 1681.10365 1.41 > testCompareGTMaskNotByte ops/s 7910823.674 2993.367032 10245063.58 9774.75138 1.29 > testCompareGTMaskNotInt ops/s 1673393.928 3153.099431 2353654.521 1190.848583 1.4 > testCompareGTMaskNotLong ops/s 849405.9159 2432.858159 1177952.041 359.96413 1.38 > testCompareGTMaskNotShort ops/s 3339509.141 3339.976585 4711442.496 2673.364893 1.41 > testCompareLEMaskNotByte ops/s 7911340.004 3114.69191 10231626.5 27134.20035 1.29 > testCompareLEMaskNotInt ops/s 1675812.113 1340.969885 2353255.341 1452.4522 1.4 > testCompareLEMaskNotLong ops/s 848862.8036 6564.841731 1177763.623 539.290106 1.38 > testCompareLEMaskNotShort ops/s 3324951.54 2380.29473 4712116.251 1544.559684 1.41 > testCompareLTMaskNotByte ops/s 7910390.844 2630.861436 10239567.69 6487.441672 1.29 > testCompareLTMaskNotInt ops/s 1672180.09 995.238142 2353757.863 853.774734 1.4 > testCompareLTMaskNotLong ops/s 856502.26... erifan has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision: - Addressed some review comments - Merge branch 'master' into JDK-8354242 - Refactor the JTReg tests for compare.xor(maskAll) Also made a bit change to support pattern `VectorMask.fromLong()`. - Merge branch 'master' into JDK-8354242 - Refactor code Add a new function XorVNode::Ideal_XorV_VectorMaskCmp to do this optimization, making the code more modular. - Merge branch 'master' into JDK-8354242 - Update the jtreg test - Merge branch 'master' into JDK-8354242 - Addressed some review comments 1. Call VectorNode::Ideal() only once in XorVNode::Ideal. 2. Improve code comments. - Merge branch 'master' into JDK-8354242 - ... and 2 more: https://git.openjdk.org/jdk/compare/71938fba...ebbcc405 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24674/files - new: https://git.openjdk.org/jdk/pull/24674/files/f2f71e34..ebbcc405 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24674&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24674&range=05-06 Stats: 146911 lines in 2345 files changed: 87334 ins; 41007 del; 18570 mod Patch: https://git.openjdk.org/jdk/pull/24674.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24674/head:pull/24674 PR: https://git.openjdk.org/jdk/pull/24674 From duke at openjdk.org Thu Jun 5 09:12:33 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:12:33 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 12:03:48 GMT, Emanuel Peter wrote: >> erifan has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: >> >> - Refactor the JTReg tests for compare.xor(maskAll) >> >> Also made a bit change to support pattern `VectorMask.fromLong()`. >> - Merge branch 'master' into JDK-8354242 >> - Refactor code >> >> Add a new function XorVNode::Ideal_XorV_VectorMaskCmp to do this >> optimization, making the code more modular. >> - Merge branch 'master' into JDK-8354242 >> - Update the jtreg test >> - Merge branch 'master' into JDK-8354242 >> - Addressed some review comments >> >> 1. Call VectorNode::Ideal() only once in XorVNode::Ideal. >> 2. Improve code comments. >> - Merge branch 'master' into JDK-8354242 >> - Merge branch 'master' into JDK-8354242 >> - 8354242: VectorAPI: combine vector not operation with compare >> >> This patch optimizes the following patterns: >> For integer types: >> ``` >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMaskCmp src1 src2 ncond) >> ``` >> cond can be eq, ne, le, ge, lt, gt, ule, uge, ult and ugt, ncond is the >> negative comparison of cond. >> >> For float and double types: >> ``` >> (XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> (XorVMask (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (MaskAll m1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> ``` >> cond can be eq or ne. 
>> >> Benchmarks on Nvidia Grace machine with 128-bit SVE2: >> With option `-XX:UseSVE=2`: >> ``` >> Benchmark Unit Before Score Error After Score Error Uplift >> testCompareEQMaskNotByte ops/s 7912127.225 2677.289518 10266136.26 8955.008548 1.29 >> testCompareEQMaskNotDouble ops/s 884737.6799 446.963779 1179760.772 448.031844 1.33 >> testCompareEQMaskNotFloat ops/s 1765045.787 682.332214 2359520.803 896.305743 1.33 >> testCompareEQMaskNotInt ops/s 1787221.411 977.743935 2353952.519 960.069976 1.31 >> testCompareEQMaskNotLong ops/s 895297.1974 673.44808 1178449.02 323.804205 1.31 >> testCompareEQMaskNotShort ops/s 3339987.002 3415.2226 4712761.965 2110.862053 1.41 >> testCompareGEMaskNotByte ops/s 7907615.16 4... > > src/hotspot/share/opto/vectornode.cpp line 2213: > >> 2211: Node* in1 = in(1); >> 2212: Node* in2 = in(2); >> 2213: // Transformations for predicated IRs are not supported for now. > > Suggestion: > > // Transformations for predicated vectors are not supported for now. Done. > src/hotspot/share/opto/vectornode.cpp line 2215: > >> 2213: // Transformations for predicated IRs are not supported for now. >> 2214: if (is_predicated_vector() || in1->is_predicated_vector() || >> 2215: in2->is_predicated_vector()) { > > I would either put all on the same line, or all on separate lines. Done. > src/hotspot/share/opto/vectornode.cpp line 2219: > >> 2217: } >> 2218: >> 2219: // XorV/XorVMask is commutative, swap VectorMaskCmp/Op_VectorMaskCast to in1. > > Suggestion: > > // XorV/XorVMask is commutative, swap VectorMaskCmp/VectorMaskCast to in1. > > Would look a little cleaner, and you did also not write `Op_VectorMaskCmp` either ;) Done, thanks! > src/hotspot/share/opto/vectornode.cpp line 2225: > >> 2223: } >> 2224: >> 2225: const TypeVect* vmcast_vt = nullptr; > > Suggestion: > > const TypeVect* vector_mask_cast_vt = nullptr; > > I think it would not hurt to write it out. Otherwise, the reader always has to reconstruct that in their head. Done. 
> src/hotspot/share/opto/vectornode.cpp line 2230: > >> 2228: vmcast_vt = in1->as_Vector()->vect_type(); >> 2229: in1 = in1->in(1); >> 2230: } > > Add a comment why you check `in1->outcnt() == 1`. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128341063 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128340484 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128341959 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128342468 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128342908 From duke at openjdk.org Thu Jun 5 09:12:33 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:12:33 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6] In-Reply-To: References: Message-ID: On Thu, 29 May 2025 08:00:05 GMT, erifan wrote: >> src/hotspot/share/opto/vectornode.cpp line 2233: >> >>> 2231: if (in2->Opcode() == Op_VectorMaskCast) { >>> 2232: in2 = in2->in(1); >>> 2233: } >> >> Wow, this seems to be an addition that is not covered in the patterns you mention above, right? >> But is that even necessary? >> I suppose here `in2 = VectorMaskCast(all_ones_vector)`. >> Would we not already want to transform this pattern in `VectorMaskCast::Ideal`, is that not possible and more powerful? > > Oh yeah, I forgot to mention it in the above comment and commit message. > > Yes, this is for `in2 = VectorMaskCast(all_ones_vector)`. I agree it's better to do this transformation in `VectorMaskCast::Ideal`. I'll remove this code change and do the `VectorMaskCast` optimization later. Thanks! Done. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128344233

From duke at openjdk.org Thu Jun 5 09:17:54 2025
From: duke at openjdk.org (erifan)
Date: Thu, 5 Jun 2025 09:17:54 GMT
Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6]
In-Reply-To:
References: <9u6hJ-WgnHLMaYBa8ViRdpUZY-bI2wOk-TCRKWJJdqk=.b3303f1f-da3b-4c2e-8f0c-a2e16ba9688e@github.com>
Message-ID:

On Thu, 29 May 2025 07:55:06 GMT, erifan wrote:

>> Also: You now cast `(VectorMaskCmpNode*) in1` twice. Can we not do `as_VectorMaskCmp()`? Or could we at least cast it only once, and then use it as `in1_mask_cmp` instead?
>
>> What is the hard-coded ^ 4 here?
>
> This is to negate the comparison condition. We can't use `BoolTest::negate()` here because the comparison condition may be an **unsigned** comparison. Since there's already a `negate()` function in `BoolTest`, I tend to add a new function `get_negative_predicate` for this to class `VectorMaskCmpNode`.
>
>> Also: You now cast (VectorMaskCmpNode*) in1 twice. Can we not do as_VectorMaskCmp()? Or could we at least cast it only once, and then use it as in1_mask_cmp instead?
>
> For the first cast, I think you mean
>
>     if (in1->Opcode() != Op_VectorMaskCmp ||
>         in1->outcnt() > 1 ||
>         !((VectorMaskCmpNode*) in1)->predicate_can_be_negated() ||
>         !VectorNode::is_all_ones_vector(in2)) {
>       return nullptr;
>     }
>
> To remove one cast, we have to split the above `if`, because `in1` may not be a `VectorMaskCmpNode`.
>
>     if (in1->Opcode() != Op_VectorMaskCmp) {
>       return nullptr;
>     }
>     VectorMaskCmpNode* in1_as_mask_cmp = (VectorMaskCmpNode*) in1;
>     if (in1->outcnt() > 1 ||
>         !in1_as_mask_cmp->predicate_can_be_negated() ||
>         !VectorNode::is_all_ones_vector(in2)) {
>       return nullptr;
>     }
>     BoolTest::mask neg_cond = (BoolTest::mask) (in1_as_mask_cmp->get_predicate() ^ 4);
>
> Does this look better to you?

For now I kept the current approach, as I feel it's a little more compact. Thanks!
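
Editorial sketch of the `^ 4` negation discussed above. This is a standalone illustration, not HotSpot code: the `Mask` enum and the free function `negate_mask` are hypothetical names, and the numeric values are an assumption chosen so that each condition and its negation differ only in bit 2 while the unsigned flag sits in a separate bit. Under that assumption, one XOR negates signed and unsigned conditions alike.

```cpp
#include <cassert>

// Hypothetical stand-in for a comparison-condition enum. The values are an
// assumption of this sketch: a condition and its negation differ only in
// bit 2 (value 4), and bit 4 (value 16) marks an unsigned comparison.
enum Mask : unsigned {
  eq = 0, gt = 1, lt = 3, ne = 4, le = 5, ge = 7,
  unsigned_compare = 16,
  ule = unsigned_compare | le,
  uge = unsigned_compare | ge,
  ult = unsigned_compare | lt,
  ugt = unsigned_compare | gt
};

// Hypothetical static-style helper: flips bit 2 and leaves the unsigned
// flag untouched, so it needs no object and works for both flavours.
constexpr Mask negate_mask(Mask m) {
  return static_cast<Mask>(m ^ 4u);
}

// Compile-time checks of the signed and unsigned pairs.
static_assert(negate_mask(eq) == ne, "eq <-> ne");
static_assert(negate_mask(lt) == ge, "lt <-> ge");
static_assert(negate_mask(gt) == le, "gt <-> le");
static_assert(negate_mask(ult) == uge, "ult <-> uge");
static_assert(negate_mask(ugt) == ule, "ugt <-> ule");

// Runtime check that negating twice returns the original condition.
inline void check_round_trip() {
  const Mask all[] = {eq, ne, lt, ge, gt, le, ult, uge, ugt, ule};
  for (Mask m : all) {
    assert(negate_mask(negate_mask(m)) == m);
  }
}
```

Keeping such a helper next to the enum definition means a future change to the enum values would be caught in one place rather than at each call site that hard-codes `^ 4`.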
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128358563 From epeter at openjdk.org Thu Jun 5 09:26:59 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 5 Jun 2025 09:26:59 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 09:12:30 GMT, erifan wrote: >> This patch optimizes the following patterns: >> For integer types: >> >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMaskCmp src1 src2 ncond) >> >> cond can be eq, ne, le, ge, lt, gt, ule, uge, ult and ugt, ncond is the negative comparison of cond. >> >> For float and double types: >> >> (XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> (XorVMask (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (MaskAll m1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> >> cond can be eq or ne. 
>> >> Benchmarks on Nvidia Grace machine with 128-bit SVE2: With option `-XX:UseSVE=2`: >> >> Benchmark Unit Before Score Error After Score Error Uplift >> testCompareEQMaskNotByte ops/s 7912127.225 2677.289518 10266136.26 8955.008548 1.29 >> testCompareEQMaskNotDouble ops/s 884737.6799 446.963779 1179760.772 448.031844 1.33 >> testCompareEQMaskNotFloat ops/s 1765045.787 682.332214 2359520.803 896.305743 1.33 >> testCompareEQMaskNotInt ops/s 1787221.411 977.743935 2353952.519 960.069976 1.31 >> testCompareEQMaskNotLong ops/s 895297.1974 673.44808 1178449.02 323.804205 1.31 >> testCompareEQMaskNotShort ops/s 3339987.002 3415.2226 4712761.965 2110.862053 1.41 >> testCompareGEMaskNotByte ops/s 7907615.16 4094.243652 10251646.9 9486.699831 1.29 >> testCompareGEMaskNotInt ops/s 1683738.958 4233.813092 2352855.205 1251.952546 1.39 >> testCompareGEMaskNotLong ops/s 854496.1561 8594.598885 1177811.493 521.1229 1.37 >> testCompareGEMaskNotShort ops/s 3341860.309 1578.975338 4714008.434 1681.10365 1.41 >> testCompareGTMaskNotByte ops/s 7910823.674 2993.367032 10245063.58 9774.75138 1.29 >> testCompareGTMaskNotInt ops/s 1673393.928 3153.099431 2353654.521 1190.848583 1.4 >> testCompareGTMaskNotLong ops/s 849405.9159 2432.858159 1177952.041 359.96413 1.38 >> testCompareGTMaskNotShort ops/s 3339509.141 3339.976585 4711442.496 2673.364893 1.41 >> testCompareLEMaskNotByte ops/s 7911340.004 3114.69191 10231626.5 27134.20035 1.29 >> testCompareLEMaskNotInt ops/s 1675812.113 1340.969885 2353255.341 1452.4522 1.4 >> testCompareLEMaskNotLong ops/s 848862.8036 6564.841731 1177763.623 539.290106 1.38 >> testCompareLEMaskNotShort ops/s 3324951.54 2380.29473 4712116.251 1544.559684 1.41 >> testCompareLTMaskNotByte ops/s 7910390.844 2630.861436 10239567.69 6487.441672 1.29 >> testCompareLTMaskNotInt ops/s 16721... > > erifan has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 12 additional commits since the last revision: > > - Addressed some review comments > - Merge branch 'master' into JDK-8354242 > - Refactor the JTReg tests for compare.xor(maskAll) > > Also made a bit change to support pattern `VectorMask.fromLong()`. > - Merge branch 'master' into JDK-8354242 > - Refactor code > > Add a new function XorVNode::Ideal_XorV_VectorMaskCmp to do this > optimization, making the code more modular. > - Merge branch 'master' into JDK-8354242 > - Update the jtreg test > - Merge branch 'master' into JDK-8354242 > - Addressed some review comments > > 1. Call VectorNode::Ideal() only once in XorVNode::Ideal. > 2. Improve code comments. > - Merge branch 'master' into JDK-8354242 > - ... and 2 more: https://git.openjdk.org/jdk/compare/93b141e6...ebbcc405 FYI: `BoolTest::negate` already does what you want: `mask negate( ) const { return mask(_test^4); }` I think you should use that instead :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2943422494 From duke at openjdk.org Thu Jun 5 09:27:03 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:27:03 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6] In-Reply-To: References: Message-ID: On Wed, 28 May 2025 12:18:15 GMT, Emanuel Peter wrote: >> erifan has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Refactor the JTReg tests for compare.xor(maskAll) >> >> Also made a bit change to support pattern `VectorMask.fromLong()`. >> - Merge branch 'master' into JDK-8354242 >> - Refactor code >> >> Add a new function XorVNode::Ideal_XorV_VectorMaskCmp to do this >> optimization, making the code more modular. 
>> - Merge branch 'master' into JDK-8354242 >> - Update the jtreg test >> - Merge branch 'master' into JDK-8354242 >> - Addressed some review comments >> >> 1. Call VectorNode::Ideal() only once in XorVNode::Ideal. >> 2. Improve code comments. >> - Merge branch 'master' into JDK-8354242 >> - Merge branch 'master' into JDK-8354242 >> - 8354242: VectorAPI: combine vector not operation with compare >> >> This patch optimizes the following patterns: >> For integer types: >> ``` >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMaskCmp src1 src2 ncond) >> ``` >> cond can be eq, ne, le, ge, lt, gt, ule, uge, ult and ugt, ncond is the >> negative comparison of cond. >> >> For float and double types: >> ``` >> (XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> (XorVMask (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (MaskAll m1)) >> => (VectorMaskCast (VectorMaskCmp src1 src2 ncond)) >> ``` >> cond can be eq or ne. >> >> Benchmarks on Nvidia Grace machine with 128-bit SVE2: >> With option `-XX:UseSVE=2`: >> ``` >> Benchmark Unit Before Score Error After Score Error Uplift >> testCompareEQMaskNotByte ops/s 7912127.225 2677.289518 10266136.26 8955.008548 1.29 >> testCompareEQMaskNotDouble ops/s 884737.6799 446.963779 1179760.772 448.031844 1.33 >> testCompareEQMaskNotFloat ops/s 1765045.787 682.332214 2359520.803 896.305743 1.33 >> testCompareEQMaskNotInt ops/s 1787221.411 977.743935 2353952.519 960.069976 1.31 >> testCompareEQMaskNotLong ops/s 895297.1974 673.44808 1178449.02 323.804205 1.31 >> testCompareEQMaskNotShort ops/s 3339987.002 3415.2226 4712761.965 2110.862053 1.41 >> testCompareGEMaskNotByte ops/s 7907615.16 4... 
> > src/hotspot/share/opto/vectornode.cpp line 2251: > >> 2249: predicate_node, vt); >> 2250: if (vmcast_vt != nullptr) { >> 2251: // We optimized out an VectorMaskCast, and in order to ensure type > > Suggestion: > > // We optimized out a VectorMaskCast, and in order to ensure type Done. > src/hotspot/share/opto/vectornode.cpp line 2253: > >> 2251: // We optimized out an VectorMaskCast, and in order to ensure type >> 2252: // correctness, we need to regenerate one. VectorMaskCast will be encoded as >> 2253: // empty for types with the same size. > > Suggestion: > > // a no-op (identity function) for types with the same size. > > Or what do you mean by "empty"? `TOP`? All zeros? I mean `no-op`. Done, thanks. > test/hotspot/jtreg/compiler/vectorapi/VectorMaskCompareNotTest.java line 96: > >> 94: Generator lGen = RD.uniformLongs(Long.MIN_VALUE, Long.MAX_VALUE); >> 95: Generator fGen = RD.uniformFloats(Float.MIN_VALUE, Float.MAX_VALUE); >> 96: Generator dGen = RD.uniformDoubles(Double.MIN_VALUE, Double.MAX_VALUE); > > Are you sure you only want to draw from the uniform distribution? > If you don't super care about the distribution, please just take `RD.ints/longs/floats/doubles()`. > That way, you get all sorts of distributions, and also some that include NaN values etc. I think that would be important for your float cmp cases, no? For float and double, we have to use the uniform distribution, because we have to make sure `NAN` is not generated. I added some comments about the reasons. For other types, changed to use `RD.ints/longs`. We have covered the special cases like +/- Inf, NaN. 
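
Editorial sketch on why NaN matters for the float and double cases. The thread does not spell out the reason that only `eq` and `ne` are negated for floating-point compares; a plausible reading (an inference, not a statement from the patch author) is the usual IEEE 754 behaviour that every ordered comparison involving NaN is false. The function names below are hypothetical and the code is plain C++, unrelated to the Vector API test itself.

```cpp
#include <cassert>
#include <limits>

// True when negating `a < b` gives the same answer as `a >= b`.
// This holds only when neither operand is NaN.
inline bool lt_negation_is_ge(double a, double b) {
  return (!(a < b)) == (a >= b);
}

// True when negating `a == b` gives the same answer as `a != b`.
// This holds for every input, NaN included.
inline bool eq_negation_is_ne(double a, double b) {
  return (!(a == b)) == (a != b);
}

inline void check_nan_behaviour() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(lt_negation_is_ge(1.0, 2.0));    // ordinary values: negation is safe
  assert(!lt_negation_is_ge(nan, 2.0));   // NaN: both `<` and `>=` are false
  assert(eq_negation_is_ne(nan, 2.0));    // eq/ne stay exact complements
  assert(eq_negation_is_ne(nan, nan));
}
```

With a NaN operand both `a < b` and `a >= b` are false, so rewriting "not less than" as "greater or equal" would change the result, whereas `==` and `!=` remain exact complements.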
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128376851 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128378888 PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128375032 From duke at openjdk.org Thu Jun 5 09:27:04 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:27:04 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6] In-Reply-To: <9u6hJ-WgnHLMaYBa8ViRdpUZY-bI2wOk-TCRKWJJdqk=.b3303f1f-da3b-4c2e-8f0c-a2e16ba9688e@github.com> References: <9u6hJ-WgnHLMaYBa8ViRdpUZY-bI2wOk-TCRKWJJdqk=.b3303f1f-da3b-4c2e-8f0c-a2e16ba9688e@github.com> Message-ID: On Wed, 28 May 2025 12:28:20 GMT, Emanuel Peter wrote: >> test/hotspot/jtreg/compiler/vectorapi/VectorMaskCompareNotTest.java line 237: >> >>> 235: // Byte tests >>> 236: @Test >>> 237: @IR(counts = { IRNode.XOR_V_MASK, "= 0", IRNode.XOR_VB, "= 0" }, >> >> Could you still assert the presence of some other vectors, just to make sure we are indeed getting vectors here? > > Not testing for any present vectors makes me a little nervous: what if we just don't get any vectors because inlining fails or something else silly happens? Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128379987 From duke at openjdk.org Thu Jun 5 09:27:05 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:27:05 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6] In-Reply-To: References: Message-ID: On Thu, 29 May 2025 01:44:49 GMT, Xiaohong Gong wrote: >> test/micro/org/openjdk/bench/jdk/incubator/vector/MaskCompareNotBenchmark.java line 49: >> >>> 47: private static final VectorSpecies L_SPECIES = LongVector.SPECIES_MAX; >>> 48: private static final VectorSpecies F_SPECIES = FloatVector.SPECIES_MAX; >>> 49: private static final VectorSpecies D_SPECIES = DoubleVector.SPECIES_MAX; >> >> Are you taking `SPECIES_MAX` on purpose here, or could we take `SPECIES_PREFERRED` instead? >> @jatin-bhateja What is the best to do in these tests? I suppose best would be to test with all vector lengths... > > Thanks for pointing out this @eme64 ! Per my understanding, `SPECIES_MAX` is almost the same with `SPECIES_PREFERRED` in this case which are all specified to the max vector size of a hardware. Since the max vector size is different on different architectures, not all vector lengths are supported to be intrinsified on a specified architecture like AArch64, especially the SVE arch with different vector register size. Hence, just testing the max species makes sense to me as this is a mid-end common transformation. Changed to use `ofLargestShape()` because on x64 the max vector length is related to data types. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2128383805 From duke at openjdk.org Thu Jun 5 09:34:59 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:34:59 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 09:24:10 GMT, Emanuel Peter wrote: > FYI: `BoolTest::negate` already does what you want: `mask negate( ) const { return mask(_test^4); }` I think you should use that instead :) Indeed, I hadn't noticed that, thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2943449327 From duke at openjdk.org Thu Jun 5 09:51:59 2025 From: duke at openjdk.org (erifan) Date: Thu, 5 Jun 2025 09:51:59 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 09:32:15 GMT, erifan wrote: > > FYI: `BoolTest::negate` already does what you want: `mask negate( ) const { return mask(_test^4); }` I think you should use that instead :) > > Indeed, I hadn't noticed that, thank you. Oh I think we still cannot use `BoolTest::negate`, because we cannot instantiate a `BoolTest` object with **unsigned** comparison. `BoolTest::negate` is a non-static function. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2943500455 From epeter at openjdk.org Thu Jun 5 11:08:56 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 5 Jun 2025 11:08:56 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: References: Message-ID: <15TW6hiffz65NhHevPefL_6swSC07UD-GwiJ4tPDtFs=.b83081df-8abd-4756-b4e0-1d969678a0d2@github.com> On Thu, 5 Jun 2025 09:48:46 GMT, erifan wrote: > Oh I think we still cannot use `BoolTest::negate`, because we cannot instantiate a `BoolTest` object with **unsigned** comparison. `BoolTest::negate` is a non-static function. I see. Ok. Hmm. 
I still think that the logic should be in `BoolTest`, because that is where the exact implementation of the enum values is. In that context it is easier to see why `^4` does the negation. And imagine we were ever to change the enum values, then it would be harder to find your code and fix it. Maybe it could be called `BoolTest::negate_mask(mask btm)` and explain in a comment that both signed and unsigned are supported.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2943747849

From mhaessig at openjdk.org Thu Jun 5 11:34:51 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Thu, 5 Jun 2025 11:34:51 GMT
Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 06:16:47 GMT, Emanuel Peter wrote:

>> We might want to write many IR/TestFramework tests, and so I would like to integrate a Template that generates the class, and the user has to only generate a list of tests.
>>
>> This is a first extension for https://github.com/openjdk/jdk/pull/24217. I had already prototyped it earlier and plan to use it in multiple tests https://github.com/openjdk/jdk/pull/23418 (see `IRTestClass.java`).
>>
>> https://github.com/openjdk/jdk/blob/dc640cbd8fb8ec76920a7ab52dfe7955ed1d77f2/test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java#L36-L45
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
>   streamline API to a single render method

Thank you for your continued work on the Template Framework. This seems like a good start to the template library. While it already looks good, I have a few small questions :)

test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java line 76:

> 74:     public static String render(final String packageName,
> 75:                                 final String className,
> 76:                                 final List imports,

To eliminate duplicate imports this could also be a `Set`.

test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestWithTestFrameworkClass.java line 135:

> 133:         ));
> 134:
> 135:         // Create a test for each operator..

Suggestion:

        // Create a test for each operator.

Tiny nit :)

test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestWithTestFrameworkClass.java line 145:

> 143:             // List of imports. Duplicates are permitted.
> 144:             List.of("compiler.lib.generators.*",
> 145:                     "compiler.lib.ir_framework.*",

Suggestion:

This should not be needed since it is imported by default in `TestFrameworkClass`. Or is this a deliberate duplication?

-------------

PR Review: https://git.openjdk.org/jdk/pull/25643#pullrequestreview-2899874579
PR Review: https://git.openjdk.org/jdk/pull/25643#pullrequestreview-2899929556
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128593841
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128586200
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128588949

From epeter at openjdk.org Thu Jun 5 12:01:17 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Thu, 5 Jun 2025 12:01:17 GMT
Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v3]
In-Reply-To:
References:
Message-ID:

> We might want to write many IR/TestFramework tests, and so I would like to integrate a Template that generates the class, and the user has to only generate a list of tests.
>
> This is a first extension for https://github.com/openjdk/jdk/pull/24217. I had already prototyped it earlier and plan to use it in multiple tests https://github.com/openjdk/jdk/pull/23418 (see `IRTestClass.java`).
>
> https://github.com/openjdk/jdk/blob/dc640cbd8fb8ec76920a7ab52dfe7955ed1d77f2/test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java#L36-L45

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  review suggestions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25643/files
  - new: https://git.openjdk.org/jdk/pull/25643/files/dc640cbd..256d922c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25643&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25643&range=01-02

  Stats: 10 lines in 2 files changed: 2 ins; 2 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/25643.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25643/head:pull/25643

PR: https://git.openjdk.org/jdk/pull/25643

From epeter at openjdk.org Thu Jun 5 12:01:17 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Thu, 5 Jun 2025 12:01:17 GMT
Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 11:31:50 GMT, Manuel Hässig wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   streamline API to a single render method
>
> Thank you for your continued work on the Template Framework. This seems like good start to the template library. While it already looks good, I have a few small questions :)

@mhaessig Thanks for reviewing! I applied your suggestions :)

> test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java line 76:
>
>> 74:     public static String render(final String packageName,
>> 75:                                 final String className,
>> 76:                                 final List imports,
>
> To eliminate duplicate imports this could also be a `Set`.

Sure, I'll make it a `Set`.
> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestWithTestFrameworkClass.java line 135:
>
>> 133:         ));
>> 134:
>> 135:         // Create a test for each operator..
>
> Suggestion:
>
>         // Create a test for each operator.
>
> Tiny nit :)

Fixed.

> test/hotspot/jtreg/testlibrary_tests/template_framework/examples/TestWithTestFrameworkClass.java line 145:
>
>> 143:             // List of imports. Duplicates are permitted.
>> 144:             List.of("compiler.lib.generators.*",
>> 145:                     "compiler.lib.ir_framework.*",
>
> Suggestion:
>
>
> This should not be needed since its imported by default in `TestFrameworkClass`. Or is this a deliberate duplication?

Yes, it was deliberate. But I'll just remove it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25643#issuecomment-2943901270
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128673497
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128672040
PR Review Comment: https://git.openjdk.org/jdk/pull/25643#discussion_r2128672889

From rcastanedalo at openjdk.org Thu Jun 5 12:10:57 2025
From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano)
Date: Thu, 5 Jun 2025 12:10:57 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8]
In-Reply-To: <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com>
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com>
Message-ID:

On Wed, 4 Jun 2025 14:06:46 GMT, Roberto Castañeda Lozano wrote:

> Thanks, will run some testing and come back with the results.
`compiler/c2/TestVerifyIterativeGVN.java` fails as follows (I tested the PR applied on top of jdk-25+25 but I see the failure also in the [GHA results](https://github.com/rwestrel/jdk/actions/runs/15438508506/job/43452592735)):

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  Internal Error (/home/rocastan/git/views/JDK-8327963-memprojs/open/src/hotspot/share/opto/node.hpp:457), pid=464370, tid=464390
    #  assert(is_not_dead(n)) failed: can not use dead node
    #
    (...)
    Current CompileTask:
    C2:128 9 b 4 java.lang.reflect.ClassFileFormatVersion:: (417 bytes)

    Stack: [0x000070ba24200000,0x000070ba24300000], sp=0x000070ba242fae70, free space=1003k
    Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
    V [libjvm.so+0x519ad1] Node::set_req(unsigned int, Node*)+0x251 (node.hpp:457)
    V [libjvm.so+0x17d2d99] PhaseIterGVN::subsume_node(Node*, Node*)+0x239 (phaseX.cpp:1425)
    V [libjvm.so+0x15feaf4] InitializeNode::replace_mem_projs_by(Node*, PhaseIterGVN*)+0x1d4 (phaseX.hpp:539)
    V [libjvm.so+0x1532cd7] PhaseMacroExpand::expand_allocate_common(AllocateNode*, Node*, TypeFunc const*, unsigned char*, Node*)+0x167 (macro.cpp:1316)
    V [libjvm.so+0x153e92e] PhaseMacroExpand::expand_macro_nodes()+0xc5e (macro.cpp:2687)
    V [libjvm.so+0xb286e7] Compile::Optimize()+0xe37 (compile.cpp:2533)
    (...)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2943934786

From mhaessig at openjdk.org Thu Jun 5 12:11:53 2025
From: mhaessig at openjdk.org (Manuel Hässig)
Date: Thu, 5 Jun 2025 12:11:53 GMT
Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 12:01:17 GMT, Emanuel Peter wrote:

>> We might want to write many IR/TestFramework tests, and so I would like to integrate a Template that generates the class, and the user has to only generate a list of tests.
>>
>> This is a first extension for https://github.com/openjdk/jdk/pull/24217. I had already prototyped it earlier and plan to use it in multiple tests https://github.com/openjdk/jdk/pull/23418 (see `IRTestClass.java`).
>>
>> https://github.com/openjdk/jdk/blob/dc640cbd8fb8ec76920a7ab52dfe7955ed1d77f2/test/hotspot/jtreg/compiler/lib/template_framework/library/TestFrameworkClass.java#L36-L45
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
>   review suggestions

Thank you for addressing my comments. Looks good to me!

-------------

Marked as reviewed by mhaessig (Author).

PR Review: https://git.openjdk.org/jdk/pull/25643#pullrequestreview-2900057451

From epeter at openjdk.org Thu Jun 5 12:11:54 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Thu, 5 Jun 2025 12:11:54 GMT
Subject: RFR: 8358600: Template-Framework Library: Template for TestFramework test class [v3]
In-Reply-To:
References:
Message-ID: <7hVvcQKyxl8xFhoBam3QF036Wz_zxqRwlkv1CI5u6EA=.b0695a3d-ca1c-4cf7-a75e-7a77dd4c982b@github.com>

On Thu, 5 Jun 2025 12:07:44 GMT, Manuel Hässig wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   review suggestions
>
> Thank you for addressing my comments. Looks good to me!

@mhaessig Thank you!
------------- PR Comment: https://git.openjdk.org/jdk/pull/25643#issuecomment-2943946510 From rcastanedalo at openjdk.org Thu Jun 5 12:56:51 2025 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Thu, 5 Jun 2025 12:56:51 GMT Subject: RFR: 8357822: C2: Multiple string optimization tests are no longer testing string concatenation optimizations [v2] In-Reply-To: References: <4GDLAMfeWjgfcGvn4sUSMT2jjG3vsebjcFeJqgHqPQw=.e7dfa9e7-4608-4304-ba00-0b254b6bf2b1@github.com> Message-ID: On Thu, 5 Jun 2025 08:51:11 GMT, Daniel Skantz wrote: >> This PR updates a few tests to reintroduce testing of string concatenation optimizations since a few bugs have recently been identified in this area. >> >> Selection criteria: performed a text search on the test suite and identified tests for string concatenations or string optimizations that are not currently compiled with `-XDstringConcat=inline` and are not using StringBuilders explicitly. >> >> Testing: T1-4. >> >> Extra testing: ran the tests manually with `-XX:+PrintOptimizeStringConcat` and verified that the tests are exercising string optimizations after the fix. > > Daniel Skantz has updated the pull request incrementally with two additional commits since the last revision: > > - revert change to TestStringIntrinsics.java > - Update test/hotspot/jtreg/compiler/c2/Test7046096.java > > Co-authored-by: Emanuel Peter Thanks! ------------- Marked as reviewed by rcastanedalo (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25610#pullrequestreview-2900209239 From sviswanathan at openjdk.org Thu Jun 5 21:33:57 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 5 Jun 2025 21:33:57 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v7] In-Reply-To: References: Message-ID: On Thu, 5 Jun 2025 08:08:48 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. >> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review resolutions src/hotspot/cpu/x86/x86_64.ad line 10621: > 10619: %{ > 10620: // Strict predicate check to make selection of xorI_rReg_im1 cost agnostic if immI src is -1. > 10621: predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); We don't need to add the strict predicate check to xorI_rReg_imm, xorI_rReg_rReg_imm_ndd, xorL_rReg_imm, xorL_rReg_rReg_imm_ndd. The only change required is to add ins_cost(150) to xorI_rReg_mem_imm_ndd and xorL_rReg_mem_imm_ndd. 
src/hotspot/cpu/x86/x86_64.ad line 10636: > 10634: %{ > 10635: // Strict predicate check to make selection of xorI_rReg_im1_ndd cost agnostic if immI src2 is -1. > 10636: predicate(UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1); This strict predicate check could be removed. src/hotspot/cpu/x86/x86_64.ad line 11328: > 11326: %{ > 11327: // Strict predicate check to make selection of xorL_rReg_im1 cost agnostic if immL32 src is -1. > 11328: predicate(!UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); This strict predicate check could be removed. src/hotspot/cpu/x86/x86_64.ad line 11343: > 11341: %{ > 11342: // Strict predicate check to make selection of xorL_rReg_im1_ndd cost agnostic if immL32 src2 is -1. > 11343: predicate(UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); This strict predicate check could be removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2130466887 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2130468837 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2130469792 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2130470425 From duke at openjdk.org Thu Jun 5 23:32:26 2025 From: duke at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard) Date: Thu, 5 Jun 2025 23:32:26 GMT Subject: RFR: 8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi Message-ID: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com> This PR removes unused parameters, namely `IdealLoopTree* loop` from `PhaseIdealLoop::loop_iv_phi` and `PhaseIdealLoop::loop_iv_stride` Best regards! 
-------------

Commit messages:
- 8357951: Fix bad automatic renaming
- 8357951: Remove unused parameters from PhaseIdealLoop::loop_iv_stride and PhaseIdealLoop::loop_iv_phi

Changes: https://git.openjdk.org/jdk/pull/25659/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25659&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8357951
Stats: 10 lines in 3 files changed: 0 ins; 0 del; 10 mod
Patch: https://git.openjdk.org/jdk/pull/25659.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25659/head:pull/25659

PR: https://git.openjdk.org/jdk/pull/25659

From thartmann at openjdk.org Thu Jun 5 23:32:26 2025
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Thu, 5 Jun 2025 23:32:26 GMT
Subject: RFR: 8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi
In-Reply-To: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
References: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
Message-ID:

On Thu, 5 Jun 2025 12:02:10 GMT, Benoît Maillard wrote:

> This PR removes unused parameters, namely `IdealLoopTree* loop` from `PhaseIdealLoop::loop_iv_phi` and `PhaseIdealLoop::loop_iv_stride`
>
> Best regards!

Looks good and trivial to me. Congratulations on your first PR! :slightly_smiling_face:

-------------

Marked as reviewed by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25659#pullrequestreview-2900231530

From sviswanathan at openjdk.org Thu Jun 5 23:50:52 2025
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Thu, 5 Jun 2025 23:50:52 GMT
Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 08:08:48 GMT, Jatin Bhateja wrote:

>> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX.
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review resolutions test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/LZcntTestI.java line 57: > 55: // REX2 variant > 56: instrMaskAPX = new byte[]{(byte) 0xFF, (byte)0x80, (byte) 0xFF}; > 57: instrPatternAPX = new byte[]{(byte) 0xD5, (byte) 0x80, (byte) 0xBD}; I think we should check for 0xF3 as well here for lzcnt to differentiate it from bsr: instrMaskAPX = new byte[]{(byte) 0xFF, (byte) 0xFF, (byte)0x80, (byte) 0xFF}; instrPatternAPX = new byte[]{(byte) 0xF3, (byte) 0xD5, (byte) 0x80, (byte) 0xBD}; test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/TZcntTestI.java line 56: > 54: // REX2 variant > 55: instrMaskAPX = new byte[]{(byte) 0xFF, (byte)0x80, (byte) 0xFF}; > 56: instrPatternAPX = new byte[]{(byte) 0xD5, (byte) 0x80, (byte) 0xBC}; I think we should check for 0xF3 as well here for tzcnt to differentiate it from bsf: instrMaskAPX = new byte[]{(byte) 0xFF, (byte) 0xFF, (byte)0x80, (byte) 0xFF}; instrPatternAPX = new byte[]{(byte) 0xF3, (byte) 0xD5, (byte) 0x80, (byte) 0xBC}; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2130968611 PR Review Comment: 
https://git.openjdk.org/jdk/pull/25501#discussion_r2130962379

From kbarrett at openjdk.org Fri Jun 6 06:13:32 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Fri, 6 Jun 2025 06:13:32 GMT
Subject: RFR: 8342639: Global operator new in adlc has wrong exception spec
Message-ID: <2-5wm21LtxEx0HMSd6gBbLv6tNLSBO4rGQ2W-uHfX6Q=.489a263e-fb34-4ca0-bc3f-5c74027367d6@github.com>

Please review this change to remove the definition of `operator new(size_t, int, const char*, int) throw()`.

It doesn't seem to be needed anymore, if it ever really was. See discussion in JBS for some more details.

Testing: mach5 tier1-5, to cover various different build configurations.

-------------

Commit messages:
- remove operator new

Changes: https://git.openjdk.org/jdk/pull/25668/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25668&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8342639
Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/25668.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25668/head:pull/25668

PR: https://git.openjdk.org/jdk/pull/25668

From jbhateja at openjdk.org Fri Jun 6 06:42:54 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 6 Jun 2025 06:42:54 GMT
Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 5 Jun 2025 21:29:46 GMT, Sandhya Viswanathan wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Review resolutions
>
> src/hotspot/cpu/x86/x86_64.ad line 10636:
>
>> 10634: %{
>> 10635: // Strict predicate check to make selection of xorI_rReg_im1_ndd cost agnostic if immI src2 is -1.
>> 10636: predicate(UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);
>
> This strict predicate check could be removed.

It's over a general immI match and makes selection more robust and agnostic to static pattern cost.
> src/hotspot/cpu/x86/x86_64.ad line 11328: > >> 11326: %{ >> 11327: // Strict predicate check to make selection of xorL_rReg_im1 cost agnostic if immL32 src is -1. >> 11328: predicate(!UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); > > This strict predicate check could be removed. Please see that checks are only on generic immediate rules and constrain selection. > src/hotspot/cpu/x86/x86_64.ad line 11343: > >> 11341: %{ >> 11342: // Strict predicate check to make selection of xorL_rReg_im1_ndd cost agnostic if immL32 src2 is -1. >> 11343: predicate(UseAPX && n->in(2)->bottom_type()->is_long()->get_con() != -1L); > > This strict predicate check could be removed. Same as above. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2131603149 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2131603957 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2131604272 From jbhateja at openjdk.org Fri Jun 6 06:56:52 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 6 Jun 2025 06:56:52 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v7] In-Reply-To: References: Message-ID: <8SMpHtG_wt9kTWUCrMrmUt6ae0ecV63YdDr6yh_YySU=.414b0712-9d40-4685-968c-19ded7db2951@github.com> On Thu, 5 Jun 2025 23:48:03 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review resolutions > > test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/LZcntTestI.java line 57: > >> 55: // REX2 variant >> 56: instrMaskAPX = new byte[]{(byte) 0xFF, (byte)0x80, (byte) 0xFF}; >> 57: instrPatternAPX = new byte[]{(byte) 0xD5, (byte) 0x80, (byte) 0xBD}; > > I think we should check for 0xF3 as well here for lzcnt to differentiate it from bsr: > instrMaskAPX = new byte[]{(byte) 0xFF, (byte) 0xFF, (byte)0x80, (byte) 0xFF}; > instrPatternAPX = new byte[]{(byte) 0xF3, (byte) 0xD5, (byte) 0x80, 
(byte) 0xBD}; Mask 0xFF is against the legacy prefix byte, which should be checked in entirety i.e. 0xF3 > test/hotspot/jtreg/compiler/intrinsics/bmi/verifycode/TZcntTestI.java line 56: > >> 54: // REX2 variant >> 55: instrMaskAPX = new byte[]{(byte) 0xFF, (byte)0x80, (byte) 0xFF}; >> 56: instrPatternAPX = new byte[]{(byte) 0xD5, (byte) 0x80, (byte) 0xBC}; > > I think we should check for 0xF3 as well here for tzcnt to differentiate it from bsf: > instrMaskAPX = new byte[]{(byte) 0xFF, (byte) 0xFF, (byte)0x80, (byte) 0xFF}; > instrPatternAPX = new byte[]{(byte) 0xF3, (byte) 0xD5, (byte) 0x80, (byte) 0xBC}; Mask 0xFF is against the legacy prefix byte, which should be checked in entirety i.e. 0xF3 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2131620708 PR Review Comment: https://git.openjdk.org/jdk/pull/25501#discussion_r2131620781 From duke at openjdk.org Fri Jun 6 07:05:03 2025 From: duke at openjdk.org (erifan) Date: Fri, 6 Jun 2025 07:05:03 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: <15TW6hiffz65NhHevPefL_6swSC07UD-GwiJ4tPDtFs=.b83081df-8abd-4756-b4e0-1d969678a0d2@github.com> References: <15TW6hiffz65NhHevPefL_6swSC07UD-GwiJ4tPDtFs=.b83081df-8abd-4756-b4e0-1d969678a0d2@github.com> Message-ID: <7jhSkkRnLI9jPxnO55qlmkoJa-0By2VbkUnAFsJsFD8=.1eda80fa-424f-4766-9fb6-2cf6eb061c68@github.com> On Thu, 5 Jun 2025 11:05:48 GMT, Emanuel Peter wrote: > > Oh I think we still cannot use `BoolTest::negate`, because we cannot instantiate a `BoolTest` object with **unsigned** comparison. `BoolTest::negate` is a non-static function. > > I see. Ok. Hmm. I still think that the logic should be in `BoolTest`, because that is where the exact implementation of the enum values is. In that context it is easier to see why `^4` does the negation. And imagine we were ever to change the enum values, then it would be harder to find your code and fix it. 
>
> Maybe it could be called `BoolTest::negate_mask(mask btm)` and explain in a comment that both signed and unsigned are supported.

Makes sense, I'll update later, thanks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2948297917

From mhaessig at openjdk.org Fri Jun 6 07:08:51 2025
From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=)
Date: Fri, 6 Jun 2025 07:08:51 GMT
Subject: RFR: 8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi
In-Reply-To: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
References: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
Message-ID:

On Thu, 5 Jun 2025 12:02:10 GMT, Benoît Maillard wrote:

> This PR removes unused parameters, namely `IdealLoopTree* loop` from `PhaseIdealLoop::loop_iv_phi` and `PhaseIdealLoop::loop_iv_stride`
>
> Best regards!

Looks good and trivial to me as well. Congrats on your first PR

-------------

Marked as reviewed by mhaessig (Author).

PR Review: https://git.openjdk.org/jdk/pull/25659#pullrequestreview-2903994593

From duke at openjdk.org Fri Jun 6 07:24:50 2025
From: duke at openjdk.org (duke)
Date: Fri, 6 Jun 2025 07:24:50 GMT
Subject: RFR: 8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi
In-Reply-To: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
References: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
Message-ID:

On Thu, 5 Jun 2025 12:02:10 GMT, Benoît Maillard wrote:

> This PR removes unused parameters, namely `IdealLoopTree* loop` from `PhaseIdealLoop::loop_iv_phi` and `PhaseIdealLoop::loop_iv_stride`
>
> Best regards!

@benoitmaillard Your change (at version b6d2b57d81e2e0aa919a101e30a440359ee5e7f5) is now ready to be sponsored by a Committer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/25659#issuecomment-2948341047 From jbhateja at openjdk.org Fri Jun 6 07:46:32 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 6 Jun 2025 07:46:32 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v8] In-Reply-To: References: Message-ID: > A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX. > > > compiler/intrinsics/bmi/verifycode/AndnTestI.java > compiler/intrinsics/bmi/verifycode/AndnTestL.java > compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java > compiler/intrinsics/bmi/verifycode/LZcntTestL.java > compiler/intrinsics/bmi/verifycode/TZcntTestL.java > > > B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. > > Above tests are now passing, validations were carried out using Intel Software Development emulator. > > Kindly review and share your feedback. 
>
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

Review comments resolutions

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/25501/files
- new: https://git.openjdk.org/jdk/pull/25501/files/45db368d..a89188e1

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=07
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=25501&range=06-07

Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/25501.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25501/head:pull/25501

PR: https://git.openjdk.org/jdk/pull/25501

From duke at openjdk.org Fri Jun 6 07:47:30 2025
From: duke at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard)
Date: Fri, 6 Jun 2025 07:47:30 GMT
Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused
Message-ID: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com>

This PR introduces two cleanup changes to `PhaseMacroExpand`:

- Removes the unused field `PhaseMacroExpand::_has_locks`
- Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop

Previously, `eliminate_macro_nodes` used two separate `while` loops:

- The first loop removed lock nodes
- The second loop removed allocation nodes

Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement.

Thanks!
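
The loop merge described above can be sketched in isolation. This is only an illustration in Java, not the HotSpot C++ code: the `Kind` enum and the `twoPasses`/`onePass` methods are hypothetical stand-ins, and the real `eliminate_macro_nodes` does more per node than dropping it from a list. The point is just that two independent passes over the same nodes can become one pass with an extra `switch` case when the per-kind work does not interfere.

```java
import java.util.ArrayList;
import java.util.List;

public class MergedPassSketch {
    // Hypothetical node kinds standing in for C2 macro nodes.
    enum Kind { LOCK, UNLOCK, ALLOCATE, OTHER }

    // Before: two separate passes over the same list, one per kind of node.
    static List<Kind> twoPasses(List<Kind> nodes) {
        List<Kind> remaining = new ArrayList<>(nodes);
        remaining.removeIf(k -> k == Kind.LOCK || k == Kind.UNLOCK); // first pass: lock nodes
        remaining.removeIf(k -> k == Kind.ALLOCATE);                 // second pass: allocation nodes
        return remaining;
    }

    // After: a single pass, with lock removal as additional switch cases.
    static List<Kind> onePass(List<Kind> nodes) {
        List<Kind> remaining = new ArrayList<>();
        for (Kind k : nodes) {
            switch (k) {
                case LOCK:
                case UNLOCK:
                case ALLOCATE:
                    break; // eliminated
                default:
                    remaining.add(k); // kept
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<Kind> nodes = List.of(Kind.LOCK, Kind.OTHER, Kind.ALLOCATE, Kind.UNLOCK);
        // Both variants leave the same nodes behind.
        System.out.println(twoPasses(nodes).equals(onePass(nodes)));
    }
}
```

The equivalence only holds because removing one kind of node never changes whether another kind is removed; if the passes interacted, merging them would need more care.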
-------------

Commit messages:
- 8356780: Remove useless assert
- 8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes
- 8356780: Remove unused field PhaseMacroExpand::_has_locks

Changes: https://git.openjdk.org/jdk/pull/25669/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25669&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8356780
Stats: 32 lines in 2 files changed: 4 ins; 24 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/25669.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25669/head:pull/25669

PR: https://git.openjdk.org/jdk/pull/25669

From duke at openjdk.org Fri Jun 6 08:18:56 2025
From: duke at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard)
Date: Fri, 6 Jun 2025 08:18:56 GMT
Subject: Integrated: 8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi
In-Reply-To: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
References: <7wwF_valtc-AYihX6oADgPvPNczVqSr8-UHre6S1-4U=.66909b6f-18dd-43b8-a14a-509b4e525b5e@github.com>
Message-ID:

On Thu, 5 Jun 2025 12:02:10 GMT, Benoît Maillard wrote:

> This PR removes unused parameters, namely `IdealLoopTree* loop` from `PhaseIdealLoop::loop_iv_phi` and `PhaseIdealLoop::loop_iv_stride`
>
> Best regards!

This pull request has now been integrated.
Changeset: d1b78800
Author: Benoît Maillard
Committer: Tobias Hartmann
URL: https://git.openjdk.org/jdk/commit/d1b788005bdf11f1426baa8e811c121a956482c9
Stats: 10 lines in 3 files changed: 0 ins; 0 del; 10 mod

8357951: Remove the IdealLoopTree* loop parameter from PhaseIdealLoop::loop_iv_phi

Reviewed-by: thartmann, mhaessig

-------------

PR: https://git.openjdk.org/jdk/pull/25659

From roland at openjdk.org Fri Jun 6 09:01:46 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Fri, 6 Jun 2025 09:01:46 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v11]
In-Reply-To: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com>
Message-ID:

> An `Initialize` node for an `Allocate` node is created with a memory
> `Proj` of adr type raw memory. In order for stores to be captured, the
> memory state out of the allocation is a `MergeMem` with slices for the
> various object fields/array elements set to the raw memory `Proj` of
> the `Initialize` node. If `Phi`s need to be created during later
> transformations from this memory state, the `Phi` for a particular
> slice gets its adr type from the type of the `Proj` which is raw
> memory. If during macro expansion, the `Allocate` is found to have no
> use and so can be removed, the `Proj` out of the `Initialize` is
> replaced by the memory state on input to the `Allocate`. A `Phi` for
> some slice for a field of an object will end up with the raw memory
> state on input to the `Allocate` node. As a result, memory state at
> the `Phi` is incorrect and incorrect execution can happen.
>
> The fix I propose is, rather than have a single `Proj` for the memory
> state out of the `Initialize` with adr type raw memory, to use one
> `Proj` per slice added to the memory state after the `Initialize`. Each
> of the `Proj` should return the right adr type for its slice. For that
> I propose having a new type of `Proj`: `NarrowMemProj` that captures
> the right adr type.
>
> Logic for the construction of the `Allocate`/`Initialize` subgraph is
> tweaked so the right adr type captured in its own `NarrowMemProj` is
> added to the memory subgraph. Code that removes an allocation or moves
> it also has to be changed so it correctly takes the multiple memory
> projections out of the `Initialize` node into account.
>
> One tricky issue is that when EA splits types for a scalar replaceable
> `Allocate` node:
>
> 1- the adr type captured in the `NarrowMemProj` becomes out of sync
> with the type of the slices for the allocation
>
> 2- before EA, the memory state for one particular field out of the
> `Initialize` node can be used for a `Store` to the just allocated
> object or some other. So we can have a chain of `Store`s, some to
> the newly allocated object, some to some other objects, all of them
> using the state of `NarrowMemProj` out of the `Initialize`. After
> split unique types, the `NarrowMemProj` is for the slice of a
> particular allocation. So `Store`s to some other objects shouldn't
> use that memory state but the memory state before the `Allocate`.
>
> For that, I added logic to update the adr type of `NarrowMemProj`
> during split unique types and update the memory input of `Store`s that
> don't depend on the memory state ...
Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:

more

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/24570/files
- new: https://git.openjdk.org/jdk/pull/24570/files/69c6e50b..3b5b54a3

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=10
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=24570&range=09-10

Stats: 39 lines in 3 files changed: 28 ins; 10 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/24570.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24570/head:pull/24570

PR: https://git.openjdk.org/jdk/pull/24570

From roland at openjdk.org Fri Jun 6 09:05:54 2025
From: roland at openjdk.org (Roland Westrelin)
Date: Fri, 6 Jun 2025 09:05:54 GMT
Subject: RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v8]
In-Reply-To:
References: <3jUFOPYDIqmzEywhzf58guwS0qZGBUCMZ3lXeltlS3c=.5c82601f-cf4d-4b2a-a525-1f8f4c7c4a3b@github.com> <1gdeBnZ7YuIf9CgQW2bCXkDDBWPjUgRnickHts-fvzE=.e6e901ba-3e9f-41a2-9c68-167a879e9655@github.com> <2m1_XtiSsW_LaBRrkX4qv7AKtLOjNgnl4mUp3zisasE=.dda62164-7aa0-4c1a-b83f-fa40ba7902e5@github.com>
Message-ID:

On Thu, 5 Jun 2025 12:07:01 GMT, Roberto Castañeda Lozano wrote:

> > Thanks, will run some testing and come back with the results.
>
> `compiler/c2/TestVerifyIterativeGVN.java` fails as follows (I tested the PR applied on top of jdk-25+25 but I see the failure also in the [GHA results](https://github.com/rwestrel/jdk/actions/runs/15438508506/job/43452592735)):
>
> ```
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (/home/rocastan/git/views/JDK-8327963-memprojs/open/src/hotspot/share/opto/node.hpp:457), pid=464370, tid=464390
> # assert(is_not_dead(n)) failed: can not use dead node
> #
> (...)
> Current CompileTask:
> C2:128 9 b 4 java.lang.reflect.ClassFileFormatVersion:: (417 bytes)
>
> Stack: [0x000070ba24200000,0x000070ba24300000], sp=0x000070ba242fae70, free space=1003k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x519ad1] Node::set_req(unsigned int, Node*)+0x251 (node.hpp:457)
> V [libjvm.so+0x17d2d99] PhaseIterGVN::subsume_node(Node*, Node*)+0x239 (phaseX.cpp:1425)
> V [libjvm.so+0x15feaf4] InitializeNode::replace_mem_projs_by(Node*, PhaseIterGVN*)+0x1d4 (phaseX.hpp:539)
> V [libjvm.so+0x1532cd7] PhaseMacroExpand::expand_allocate_common(AllocateNode*, Node*, TypeFunc const*, unsigned char*, Node*)+0x167 (macro.cpp:1316)
> V [libjvm.so+0x153e92e] PhaseMacroExpand::expand_macro_nodes()+0xc5e (macro.cpp:2687)
> V [libjvm.so+0xb286e7] Compile::Optimize()+0xe37 (compile.cpp:2533)
> (...)
> ```

Thanks for the report. Should be fixed in the new commit. The logic I used to remove `NarrowMemProj`s was a bit of a hack. I replaced it with code that's a bit longer but also more robust.

On that topic: removal of `NarrowMemProj`s happens during macro expansion. There are rare cases where C2 can't go from the `Allocate` to the `Initialize` because the IR graph doesn't match the expected pattern (some transformation introduced a region in between). In that case, `NarrowMemProj`s are not removed. That seems reasonable as there's no requirement to remove them and guaranteeing they are always removed would require a bit more code AFAICT.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2948580536

From mhaessig at openjdk.org Fri Jun 6 09:10:54 2025
From: mhaessig at openjdk.org (Manuel =?UTF-8?B?SMOkc3NpZw==?=)
Date: Fri, 6 Jun 2025 09:10:54 GMT
Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused
In-Reply-To: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com>
References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com>
Message-ID: <8E_ltzNOatAEe9cilqhvX2SqNPANuorX6qc-Z4MUp2M=.2cc2e4fd-3c43-4d8a-a24a-f78111d3d922@github.com>

On Fri, 6 Jun 2025 07:41:45 GMT, Benoît Maillard wrote:

> This PR introduces two cleanup changes to `PhaseMacroExpand`:
>
> - Removes the unused field `PhaseMacroExpand::_has_locks`
> - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop
>
> Previously, `eliminate_macro_nodes` used two separate `while` loops:
>
> - The first loop removed lock nodes
> - The second loop removed allocation nodes
>
> Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement.
>
> Thanks!

Thank you for working on this! Your change looks good. However, it would be nice if you could mention in the PR description what testing you ran. Also, you forgot to update the copyright year in `opto/macro.hpp`.

src/hotspot/share/opto/macro.hpp line 199:

> 197:
> 198: public:
> 199: PhaseMacroExpand(PhaseIterGVN &igvn) : Phase(Macro_Expand), _igvn(igvn) {

You forgot to update the copyright year. You can use the helper script [`make/scripts/update_copyright_year.sh`](https://github.com/openjdk/jdk/blob/master/make/scripts/update_copyright_year.sh) to do it automatically, before you send out a PR.

-------------

Changes requested by mhaessig (Author).
PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2904259441 PR Review Comment: https://git.openjdk.org/jdk/pull/25669#discussion_r2131794133 From duke at openjdk.org Fri Jun 6 09:37:30 2025 From: duke at openjdk.org (=?UTF-8?B?QmVub8OudA==?= Maillard) Date: Fri, 6 Jun 2025 09:37:30 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v2] In-Reply-To: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: > This PR introduces two cleanup changes to `PhaseMacroExpand`: > > - Removes the unused field `PhaseMacroExpand::_has_locks` > - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop > > Previously, `eliminate_macro_nodes` used two separate `while` loops: > > - The first loop removed lock nodes > - The second loop removed allocation nodes > > Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement. > > Thanks! 
Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision:

8356780: Update copyright

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/25669/files
- new: https://git.openjdk.org/jdk/pull/25669/files/4e7ee579..90c94a25

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=25669&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=25669&range=00-01

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/25669.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25669/head:pull/25669

PR: https://git.openjdk.org/jdk/pull/25669

From duke at openjdk.org Fri Jun 6 10:38:11 2025
From: duke at openjdk.org (erifan)
Date: Fri, 6 Jun 2025 10:38:11 GMT
Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v8]
In-Reply-To:
References:
Message-ID:

> This patch optimizes the following patterns:
> For integer types:
>
> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1))
>     => (VectorMaskCmp src1 src2 ncond)
> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1))
>     => (VectorMaskCmp src1 src2 ncond)
>
> cond can be eq, ne, le, ge, lt, gt, ule, uge, ult and ugt, ncond is the negative comparison of cond.
>
> For float and double types:
>
> (XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1))
>     => (VectorMaskCast (VectorMaskCmp src1 src2 ncond))
> (XorVMask (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (MaskAll m1))
>     => (VectorMaskCast (VectorMaskCmp src1 src2 ncond))
>
> cond can be eq or ne.
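
The quoted description does not say why only `eq` and `ne` are accepted for float and double. A likely reason, stated here as an assumption rather than taken from the patch, is NaN: for integers `!(a < b)` is always `a >= b`, but for floating point every ordered comparison with NaN is false, so negating the result is not the same as flipping the condition. A minimal scalar sketch in Java (class and method names are hypothetical, and this is not the Vector API):

```java
public class NegatedCompareSketch {
    // Integers are totally ordered, so !(a < b) is exactly (a >= b).
    static boolean intLtNegationHolds(int a, int b) {
        return (!(a < b)) == (a >= b);
    }

    // For floating point this only holds when neither operand is NaN.
    static boolean doubleLtNegationHolds(double a, double b) {
        return (!(a < b)) == (a >= b);
    }

    // eq and ne stay exact complements of each other even with NaN.
    static boolean doubleEqNegationHolds(double a, double b) {
        return (!(a == b)) == (a != b);
    }

    public static void main(String[] args) {
        System.out.println(intLtNegationHolds(1, 2));                   // prints true
        System.out.println(doubleLtNegationHolds(Double.NaN, 1.0));     // prints false
        System.out.println(doubleEqNegationHolds(Double.NaN, 1.0));     // prints true
    }
}
```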
> > Benchmarks on Nvidia Grace machine with 128-bit SVE2: With option `-XX:UseSVE=2`: > > Benchmark Unit Before Score Error After Score Error Uplift > testCompareEQMaskNotByte ops/s 7912127.225 2677.289518 10266136.26 8955.008548 1.29 > testCompareEQMaskNotDouble ops/s 884737.6799 446.963779 1179760.772 448.031844 1.33 > testCompareEQMaskNotFloat ops/s 1765045.787 682.332214 2359520.803 896.305743 1.33 > testCompareEQMaskNotInt ops/s 1787221.411 977.743935 2353952.519 960.069976 1.31 > testCompareEQMaskNotLong ops/s 895297.1974 673.44808 1178449.02 323.804205 1.31 > testCompareEQMaskNotShort ops/s 3339987.002 3415.2226 4712761.965 2110.862053 1.41 > testCompareGEMaskNotByte ops/s 7907615.16 4094.243652 10251646.9 9486.699831 1.29 > testCompareGEMaskNotInt ops/s 1683738.958 4233.813092 2352855.205 1251.952546 1.39 > testCompareGEMaskNotLong ops/s 854496.1561 8594.598885 1177811.493 521.1229 1.37 > testCompareGEMaskNotShort ops/s 3341860.309 1578.975338 4714008.434 1681.10365 1.41 > testCompareGTMaskNotByte ops/s 7910823.674 2993.367032 10245063.58 9774.75138 1.29 > testCompareGTMaskNotInt ops/s 1673393.928 3153.099431 2353654.521 1190.848583 1.4 > testCompareGTMaskNotLong ops/s 849405.9159 2432.858159 1177952.041 359.96413 1.38 > testCompareGTMaskNotShort ops/s 3339509.141 3339.976585 4711442.496 2673.364893 1.41 > testCompareLEMaskNotByte ops/s 7911340.004 3114.69191 10231626.5 27134.20035 1.29 > testCompareLEMaskNotInt ops/s 1675812.113 1340.969885 2353255.341 1452.4522 1.4 > testCompareLEMaskNotLong ops/s 848862.8036 6564.841731 1177763.623 539.290106 1.38 > testCompareLEMaskNotShort ops/s 3324951.54 2380.29473 4712116.251 1544.559684 1.41 > testCompareLTMaskNotByte ops/s 7910390.844 2630.861436 10239567.69 6487.441672 1.29 > testCompareLTMaskNotInt ops/s 1672180.09 995.238142 2353757.863 853.774734 1.4 > testCompareLTMaskNotLong ops/s 856502.26... 
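[Editorial sketch] The rewrite in the description above relies on being able to map a comparison condition to its negation, for both signed and unsigned conditions. A minimal scalar model of that idea follows. The enum values here are an assumption chosen for illustration (a condition and its negation differ in bit 2, and a separate bit marks unsigned variants); they are not copied from HotSpot, and `Mask`, `negate_mask` and `eval` are hypothetical names, so consult the real `BoolTest` definition before relying on any of this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical condition encoding, for illustration only.
// Bit 2 (value 4) separates a condition from its negation;
// a separate bit marks the unsigned variants.
enum Mask : uint32_t {
  eq = 0, gt = 1, lt = 3,
  ne = 4, le = 5, ge = 7,
  unsigned_compare = 16,
  ugt = unsigned_compare | gt,
  ult = unsigned_compare | lt,
  ule = unsigned_compare | le,
  uge = unsigned_compare | ge
};

// Negate a condition by flipping bit 2. The unsigned bit is untouched,
// so one function covers signed and unsigned comparisons.
constexpr Mask negate_mask(Mask m) {
  return static_cast<Mask>(m ^ 4u);
}

// Scalar stand-in for one lane of a vector compare.
inline bool eval(Mask m, int32_t a, int32_t b) {
  const bool is_unsigned = (m & unsigned_compare) != 0;
  const uint32_t ua = static_cast<uint32_t>(a);
  const uint32_t ub = static_cast<uint32_t>(b);
  switch (m & ~static_cast<uint32_t>(unsigned_compare)) {
    case eq: return a == b;
    case ne: return a != b;
    case gt: return is_unsigned ? ua >  ub : a >  b;
    case lt: return is_unsigned ? ua <  ub : a <  b;
    case le: return is_unsigned ? ua <= ub : a <= b;
    case ge: return is_unsigned ? ua >= ub : a >= b;
    default: return false;  // not a condition in this toy encoding
  }
}
```

In this model, inverting the result of `eval(cond, a, b)` (the scalar analogue of XOR with an all-ones mask) gives the same answer as `eval(negate_mask(cond), a, b)`, which is the property the pattern rewrite depends on for integer lanes.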
erifan has updated the pull request incrementally with one additional commit since the last revision: Support negating unsigned comparison for BoolTest::mask Added a static method `negate_mask(mask btm)` into BoolTest class to negate both signed and unsigned comparison. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24674/files - new: https://git.openjdk.org/jdk/pull/24674/files/ebbcc405..f51bf722 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24674&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24674&range=06-07 Stats: 6 lines in 3 files changed: 2 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/24674.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24674/head:pull/24674 PR: https://git.openjdk.org/jdk/pull/24674 From duke at openjdk.org Fri Jun 6 10:38:11 2025 From: duke at openjdk.org (erifan) Date: Fri, 6 Jun 2025 10:38:11 GMT Subject: RFR: 8354242: VectorAPI: combine vector not operation with compare [v7] In-Reply-To: <7jhSkkRnLI9jPxnO55qlmkoJa-0By2VbkUnAFsJsFD8=.1eda80fa-424f-4766-9fb6-2cf6eb061c68@github.com> References: <15TW6hiffz65NhHevPefL_6swSC07UD-GwiJ4tPDtFs=.b83081df-8abd-4756-b4e0-1d969678a0d2@github.com> <7jhSkkRnLI9jPxnO55qlmkoJa-0By2VbkUnAFsJsFD8=.1eda80fa-424f-4766-9fb6-2cf6eb061c68@github.com> Message-ID: <_YWpW68O6gzU99lre_qtNA5zkH_fCU50rcs8Va4E5eQ=.aaa944f8-5150-405a-8534-50de7c74d762@github.com> On Fri, 6 Jun 2025 07:01:58 GMT, erifan wrote: > > > Oh I think we still cannot use `BoolTest::negate`, because we cannot instantiate a `BoolTest` object with **unsigned** comparison. `BoolTest::negate` is a non-static function. > > > > > > I see. Ok. Hmm. I still think that the logic should be in `BoolTest`, because that is where the exact implementation of the enum values is. In that context it is easier to see why `^4` does the negation. And imagine we were ever to change the enum values, then it would be harder to find your code and fix it. 
> > Maybe it could be called `BoolTest::negate_mask(mask btm)` and explain in a comment that both signed and unsigned are supported. > > Makes sense, I'll update later, thanks. @eme64 your comment is addressed, thanks for your suggestion! ------------- PR Comment: https://git.openjdk.org/jdk/pull/24674#issuecomment-2948832452 From chagedorn at openjdk.org Fri Jun 6 11:05:49 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Fri, 6 Jun 2025 11:05:49 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v2] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: On Fri, 6 Jun 2025 09:37:30 GMT, Benoît Maillard wrote: >> This PR introduces two cleanup changes to `PhaseMacroExpand`: >> >> - Removes the unused field `PhaseMacroExpand::_has_locks` >> - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop >> >> Previously, `eliminate_macro_nodes` used two separate `while` loops: >> >> - The first loop removed lock nodes >> - The second loop removed allocation nodes >> >> Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement. >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: > > 8356780: Update copyright Looks good! Since you also fuse the two loops, I suggest updating the PR/JBS title accordingly. ------------- Marked as reviewed by chagedorn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2904593581 From roland at openjdk.org Fri Jun 6 13:51:57 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 6 Jun 2025 13:51:57 GMT Subject: RFR: 8275202: C2: optimize out more redundant conditions [v2] In-Reply-To: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> References: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> Message-ID: > This change adds a new loop opts pass to optimize redundant conditions > such as the second one in: > > > if (i < 10) { > if (i < 42) { > > > In the branch of the first if, the type of i can be narrowed down to > [min_jint, 9] which can then be used to constant fold the second > condition. > > The compiler already keeps track of type[n] for every node in the > current compilation unit. That's not sufficient to optimize the > snippet above though because the type of i can only be narrowed in > some sections of the control flow (that is a subset of all > controls). The solution is to build a new table that tracks the type > of n at every control c > > > type'[n, root] = type[n] // initialized from igvn's type table > type'[n, c] = type[n, idom(c)] > > > This pass iterates over the CFG looking for conditions such as: > > > if (i < 10) { > > > that allows narrowing the type of i and updates the type' table > accordingly. > > At a region r: > > > type'[n, r] = meet(type'[n, r->in(1)], type'[n, r->in(2)]...) > > > For a Phi phi at a region r: > > > type'[phi, r] = meet(type'[phi->in(1), r->in(1)], type'[phi->in(2), r->in(2)]...) > > > Once a type is narrowed, uses are enqueued and their types are > computed by calling the Value() methods. If a use's type is narrowed, > it's recorded at c in the type' table. Value() methods retrieve types > from the type table, not the type' table. 
To address that issue while > leaving Value() methods unchanged, before calling Value() at c, the > type table is updated so: > > > type[n] = type'[n, c] > > > An exception is for Phi::Value which needs to retrieve the type of > nodes are various controls: there, a new type(Node* n, Node* c) > method is used. > > For most n and c, type'[n, c] is likely the same as type[n], the type > recorded in the global igvn table (that is there shouldn't be many > nodes at only a few control for which we can narrow the type down). As > a consequence, the types'[n, c] table is implemented with: > > - At c, narrowed down types are stored in a GrowableArray. Each entry > records the previous type at idom(c) and the narrowed down type at > c. > > - The GrowableArray of type updates is recorded in a hash table > indexed by c. If there's no update at c, there's no entry in the > hash table. > > This pass operates in 2 steps: > > - it first iterates over the graph looking for conditions that narrow > the types of some nodes and propagate type updates to uses until a > fix point. > > - it transforms the graph so newly found constant nodes are folded. > > > The new pass is run on every loop opts. There are a couple rea... Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - updated conditional propagation - Merge branch 'master' into JDK-8275202 - conditional propagation ------------- Changes: https://git.openjdk.org/jdk/pull/14586/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14586&range=01 Stats: 4509 lines in 31 files changed: 4398 ins; 40 del; 71 mod Patch: https://git.openjdk.org/jdk/pull/14586.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14586/head:pull/14586 PR: https://git.openjdk.org/jdk/pull/14586 From kvn at openjdk.org Fri Jun 6 14:12:51 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 6 Jun 2025 14:12:51 GMT Subject: RFR: 8342639: Global operator new in adlc has wrong exception spec In-Reply-To: <2-5wm21LtxEx0HMSd6gBbLv6tNLSBO4rGQ2W-uHfX6Q=.489a263e-fb34-4ca0-bc3f-5c74027367d6@github.com> References: <2-5wm21LtxEx0HMSd6gBbLv6tNLSBO4rGQ2W-uHfX6Q=.489a263e-fb34-4ca0-bc3f-5c74027367d6@github.com> Message-ID: On Fri, 6 Jun 2025 06:07:43 GMT, Kim Barrett wrote: > Please review this change to remove the definition of > `operator new(size_t, int, const char*, int) throw()` > > It doesn't seem to be needed anymore, if it ever really was. See discussion in > JBS for some more details. > > Testing: mach5 tier1-5, to cover various different build configurations. Good. @kimbarrett can you enable GHA testing for this branch? It will test more builds. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25668#pullrequestreview-2905050629 PR Comment: https://git.openjdk.org/jdk/pull/25668#issuecomment-2949388049 From shade at openjdk.org Fri Jun 6 14:15:05 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 6 Jun 2025 14:15:05 GMT Subject: RFR: 8358749: Fix input checks in Vector API intrinsics Message-ID: We have been carrying this patch in Leyden/premain for a while: https://github.com/openjdk/leyden/commit/7faed7fc5c8e1bbd9a16ab22673a77099396179c. I believe it deserves to be in mainline. 
I polished it a little further. It is _mostly_ a cleanup, but there are also new checks, on the paths where we do take constants off the arguments. In those cases, I believe the alternative is compiler SEGV-ing. Additional testing: - [x] Linux x86_64 server fastdebug, `hotspot_vector_1 hotspot_vector_2` - [x] Linux x86_64 server fastdebug, `jdk_vector` ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/25673/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25673&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8358749 Stats: 50 lines in 1 file changed: 21 ins; 0 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/25673.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25673/head:pull/25673 PR: https://git.openjdk.org/jdk/pull/25673 From roland at openjdk.org Fri Jun 6 14:18:43 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 6 Jun 2025 14:18:43 GMT Subject: RFR: 8275202: C2: optimize out more redundant conditions [v3] In-Reply-To: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> References: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> Message-ID: > This change adds a new loop opts pass to optimize redundant conditions > such as the second one in: > > > if (i < 10) { > if (i < 42) { > > > In the branch of the first if, the type of i can be narrowed down to > [min_jint, 9] which can then be used to constant fold the second > condition. > > The compiler already keeps track of type[n] for every node in the > current compilation unit. That's not sufficient to optimize the > snippet above though because the type of i can only be narrowed in > some sections of the control flow (that is a subset of all > controls). 
The solution is to build a new table that tracks the type > of n at every control c > > > type'[n, root] = type[n] // initialized from igvn's type table > type'[n, c] = type[n, idom(c)] > > > This pass iterates over the CFG looking for conditions such as: > > > if (i < 10) { > > > that allows narrowing the type of i and updates the type' table > accordingly. > > At a region r: > > > type'[n, r] = meet(type'[n, r->in(1)], type'[n, r->in(2)]...) > > > For a Phi phi at a region r: > > > type'[phi, r] = meet(type'[phi->in(1), r->in(1)], type'[phi->in(2), r->in(2)]...) > > > Once a type is narrowed, uses are enqueued and their types are > computed by calling the Value() methods. If a use's type is narrowed, > it's recorded at c in the type' table. Value() methods retrieve types > from the type table, not the type' table. To address that issue while > leaving Value() methods unchanged, before calling Value() at c, the > type table is updated so: > > > type[n] = type'[n, c] > > > An exception is for Phi::Value which needs to retrieve the type of > nodes are various controls: there, a new type(Node* n, Node* c) > method is used. > > For most n and c, type'[n, c] is likely the same as type[n], the type > recorded in the global igvn table (that is there shouldn't be many > nodes at only a few control for which we can narrow the type down). As > a consequence, the types'[n, c] table is implemented with: > > - At c, narrowed down types are stored in a GrowableArray. Each entry > records the previous type at idom(c) and the narrowed down type at > c. > > - The GrowableArray of type updates is recorded in a hash table > indexed by c. If there's no update at c, there's no entry in the > hash table. > > This pass operates in 2 steps: > > - it first iterates over the graph looking for conditions that narrow > the types of some nodes and propagate type updates to uses until a > fix point. > > - it transforms the graph so newly found constant nodes are folded. 
> > > The new pass is run on every loop opts. There are a couple rea... Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision: more ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14586/files - new: https://git.openjdk.org/jdk/pull/14586/files/b1396f1e..22091449 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14586&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14586&range=01-02 Stats: 59 lines in 15 files changed: 1 ins; 33 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/14586.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14586/head:pull/14586 PR: https://git.openjdk.org/jdk/pull/14586 From roland at openjdk.org Fri Jun 6 14:18:56 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 6 Jun 2025 14:18:56 GMT Subject: RFR: 8275202: C2: optimize out more redundant conditions [v2] In-Reply-To: References: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> Message-ID: On Fri, 6 Jun 2025 13:51:57 GMT, Roland Westrelin wrote: >> This change adds a new loop opts pass to optimize redundant conditions >> such as the second one in: >> >> >> if (i < 10) { >> if (i < 42) { >> >> >> In the branch of the first if, the type of i can be narrowed down to >> [min_jint, 9] which can then be used to constant fold the second >> condition. >> >> The compiler already keeps track of type[n] for every node in the >> current compilation unit. That's not sufficient to optimize the >> snippet above though because the type of i can only be narrowed in >> some sections of the control flow (that is a subset of all >> controls). 
The solution is to build a new table that tracks the type >> of n at every control c >> >> >> type'[n, root] = type[n] // initialized from igvn's type table >> type'[n, c] = type[n, idom(c)] >> >> >> This pass iterates over the CFG looking for conditions such as: >> >> >> if (i < 10) { >> >> >> that allows narrowing the type of i and updates the type' table >> accordingly. >> >> At a region r: >> >> >> type'[n, r] = meet(type'[n, r->in(1)], type'[n, r->in(2)]...) >> >> >> For a Phi phi at a region r: >> >> >> type'[phi, r] = meet(type'[phi->in(1), r->in(1)], type'[phi->in(2), r->in(2)]...) >> >> >> Once a type is narrowed, uses are enqueued and their types are >> computed by calling the Value() methods. If a use's type is narrowed, >> it's recorded at c in the type' table. Value() methods retrieve types >> from the type table, not the type' table. To address that issue while >> leaving Value() methods unchanged, before calling Value() at c, the >> type table is updated so: >> >> >> type[n] = type'[n, c] >> >> >> An exception is for Phi::Value which needs to retrieve the type of >> nodes are various controls: there, a new type(Node* n, Node* c) >> method is used. >> >> For most n and c, type'[n, c] is likely the same as type[n], the type >> recorded in the global igvn table (that is there shouldn't be many >> nodes at only a few control for which we can narrow the type down). As >> a consequence, the types'[n, c] table is implemented with: >> >> - At c, narrowed down types are stored in a GrowableArray. Each entry >> records the previous type at idom(c) and the narrowed down type at >> c. >> >> - The GrowableArray of type updates is recorded in a hash table >> indexed by c. If there's no update at c, there's no entry in the >> hash table. >> >> This pass operates in 2 steps: >> >> - it first iterates over the graph looking for conditions that narrow >> the types of some nodes and propagate type updates to uses ... 
> > Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - updated conditional propagation > - Merge branch 'master' into JDK-8275202 > - conditional propagation I finally updated this PR. The main pushback on the previous version of this change was excessive compile time. The updated change addresses this issue. In the previous version of the change, the new optimization pass was run on every pass of loop optimizations. The reason for that was issues similar to the one explained in https://github.com/openjdk/jdk/pull/23468 . Running the new pass often was a way to mitigate the issue. Since the first version of this PR, I actually found code patterns where running the pass often was not even sufficient to prevent a crash. The change from 8349479 solves all those issues and running the pass only once or a few times doesn't cause any problem. This helps compilation time quite a bit. According to my rough measurement (running `CompileTheWorld` on java.base and looking at times reported by `CITime`), the overhead of the new pass is now around 20% for `IdealLoop` (from 4.7s to 5.7s) and around 2.5% on total compilation time (1 extra second on around 40s of total compilation time). I also refactored the code quite a bit, worked on integrating changes that are not specific to the new pass and are in other parts of the compiler so there are a lot fewer unrelated changes now, and added more tests and comments.
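[Editorial sketch] The sparse per-control type table from the quoted PR description (narrowed types stored per control, looked up through the immediate dominator, with the global type as fallback) can be illustrated with a small stand-alone model. Everything here is an assumption made for illustration: types are reduced to integer ranges, nodes and controls are plain integers, and `Range`, `TypeUpdate`, `CondTypeTable`, `make_example` and `less_than_folds_true` are hypothetical names, not the PR's classes.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <unordered_map>
#include <vector>

// Integer interval standing in for a compiler type (a deliberate simplification).
struct Range { int lo; int hi; };

// One entry of a per-control update list: the type seen at idom(c)
// and the narrowed type that holds at c.
struct TypeUpdate { int node; Range prev; Range narrowed; };

struct CondTypeTable {
  std::vector<int> idom;                                     // idom[c]; the root is its own idom
  std::unordered_map<int, Range> global_type;                // type[n]
  std::unordered_map<int, std::vector<TypeUpdate>> updates;  // sparse: only controls with updates

  // type'[n, c]: nearest update on the dominator chain, else the global type.
  Range type_at(int n, int c) const {
    for (;;) {
      auto it = updates.find(c);
      if (it != updates.end()) {
        for (const TypeUpdate& u : it->second) {
          if (u.node == n) return u.narrowed;
        }
      }
      if (idom[c] == c) return global_type.at(n);
      c = idom[c];
    }
  }

  // Record that node n is known to lie within r at control c.
  void narrow(int n, int c, Range r) {
    const Range prev = type_at(n, idom[c]);
    const Range cur = type_at(n, c);
    const Range next{std::max(cur.lo, r.lo), std::min(cur.hi, r.hi)};
    for (TypeUpdate& u : updates[c]) {
      if (u.node == n) { u.narrowed = next; return; }
    }
    updates[c].push_back({n, prev, next});
  }
};

// Controls: 0 = root, 1 = true branch of `i < 10`,
// 2 = true branch of the nested `i < 42`, 3 = false branch of `i < 10`.
// Node 7 stands for i.
CondTypeTable make_example() {
  CondTypeTable t;
  t.idom = {0, 0, 1, 0};
  t.global_type[7] = Range{INT_MIN, INT_MAX};
  t.narrow(7, 1, Range{INT_MIN, 9});
  return t;
}

// `n < k` is always true at control c when the upper bound of n is below k.
bool less_than_folds_true(const CondTypeTable& t, int n, int c, int k) {
  return t.type_at(n, c).hi < k;
}
```

With the table from `make_example()`, the nested `i < 42` test folds at control 2 because the range narrowed at control 1 is inherited through the dominator chain, while nothing is known on the false branch (control 3), which has no entry in the hash table at all. Region and Phi merging, and the propagation to uses via `Value()`, are left out of this sketch.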
------------- PR Comment: https://git.openjdk.org/jdk/pull/14586#issuecomment-2949396772 From roland at openjdk.org Fri Jun 6 14:24:55 2025 From: roland at openjdk.org (Roland Westrelin) Date: Fri, 6 Jun 2025 14:24:55 GMT Subject: RFR: 8275202: C2: optimize out more redundant conditions [v3] In-Reply-To: <01eSe02XoUbSWslVQxaHiumE8gZhXm2jTetkHQmB91c=.2a2ec5dd-943c-4e43-a02d-9800eccd790b@github.com> References: <978cgwy3Nb_x7yU6jZz0f6zhTBZfphstisAkBf1Vktc=.283d06eb-4f79-40cf-b8dd-a9c230e59902@github.com> <01eSe02XoUbSWslVQxaHiumE8gZhXm2jTetkHQmB91c=.2a2ec5dd-943c-4e43-a02d-9800eccd790b@github.com> Message-ID: On Sat, 21 Dec 2024 16:12:49 GMT, Quan Anh Mai wrote: >> Yes, I'm still working on this. I'll update this PR soon, hopefully. I reworked it quite a bit. > > Some more observations: When removing an `IfNode`, not only for `LoadNode`, you will need to pin all nodes that `depends_only_on_test` at that point (e.g. `ConstraintCast`), since the node does not only depend on the immediate dominating test (if any) but on the whole sequence of control nodes leading to that position. This can be done by e.g upgrading a `RegularDependency` `ConstraintCast` to one with `StrongDependency`. However, this action can lead to the node not able to float as freely. So I think before completing all loop opts, you should refrain from removing any `IfNode` of which the taken path has at least 1 node that `depends_only_on_test`. Pruning the untaken branch should still be done to simplify the graph. I think you have probably thought about this but just in case it can help. Thanks for the observations. The updated patch does pin a `LoadNode` when a range check it depends on is optimized out. It doesn't pin it at the control that immediately dominates the eliminated range check. Instead, it goes over the dominating controls until it finds the earliest control at which the type of the range check test constant folds and sets the `LoadNode` control to that control. 
So it pins it at the earliest possible control. The new constant propagation pass is also run after all loop opts by default, so nodes get pinned late in the compilation process. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r2132296301 From kvn at openjdk.org Fri Jun 6 14:44:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 6 Jun 2025 14:44:50 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v2] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: <66E-LoLtJxrFSDo75VdNoYKRjsgZPTDPeTpXUEaXLk0=.cc98d1b5-ccf7-4dfe-b328-84971cc27828@github.com> On Fri, 6 Jun 2025 09:37:30 GMT, Benoît Maillard wrote: >> This PR introduces two cleanup changes to `PhaseMacroExpand`: >> >> - Removes the unused field `PhaseMacroExpand::_has_locks` >> - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop >> >> Previously, `eliminate_macro_nodes` used two separate `while` loops: >> >> - The first loop removed lock nodes >> - The second loop removed allocation nodes >> >> Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement. >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision: > > 8356780: Update copyright They are dependent. Allocation may be referenced in locks and will not be eliminated if locks are still in graph.
That is why locks are eliminated first. ------------- PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2905146374 From duke at openjdk.org Fri Jun 6 15:24:11 2025 From: duke at openjdk.org (Benoît Maillard) Date: Fri, 6 Jun 2025 15:24:11 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v3] In-Reply-To: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: > This PR introduces two cleanup changes to `PhaseMacroExpand`: > > - Removes the unused field `PhaseMacroExpand::_has_locks` > - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop > > Previously, `eliminate_macro_nodes` used two separate `while` loops: > > - The first loop removed lock nodes > - The second loop removed allocation nodes > > Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement. > > Thanks! Benoît Maillard has updated the pull request incrementally with two additional commits since the last revision: - Revert "8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes" This reverts commit 13deb61de3e2a07d51e3692bb408971f6c18cecf. - Revert "8356780: Remove useless assert" This reverts commit 4e7ee57981b9019d030ac59dc697a2470d8eb5eb.
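[Editorial sketch] The dependency raised in review (an allocation referenced by a lock is not eliminated while that lock is still in the graph) can be shown with a toy model. This is only an illustration of the ordering argument under simplified assumptions: `MacroNode`, `try_eliminate`, `two_sweeps` and `one_sweep` are hypothetical names, each function makes a single sweep over the list, and none of it reflects how `PhaseMacroExpand::eliminate_macro_nodes` actually traverses or iterates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Lock, Allocate };

struct MacroNode {
  Kind kind;
  int  alloc_ref;  // for a Lock: index of the allocation it references, -1 otherwise
  bool alive;
};

// Toy rule taken from the review comment: an allocation cannot be removed
// while a live lock still references it. Locks are always removable here.
bool try_eliminate(std::vector<MacroNode>& nodes, std::size_t i) {
  if (!nodes[i].alive) return false;
  if (nodes[i].kind == Kind::Allocate) {
    for (const MacroNode& m : nodes) {
      if (m.alive && m.kind == Kind::Lock && m.alloc_ref == static_cast<int>(i)) {
        return false;
      }
    }
  }
  nodes[i].alive = false;
  return true;
}

std::size_t count_alive(const std::vector<MacroNode>& nodes) {
  std::size_t n = 0;
  for (const MacroNode& m : nodes) {
    if (m.alive) ++n;
  }
  return n;
}

// Locks first, then allocations: returns how many nodes survive.
std::size_t two_sweeps(std::vector<MacroNode> nodes) {
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].kind == Kind::Lock) try_eliminate(nodes, i);
  }
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].kind == Kind::Allocate) try_eliminate(nodes, i);
  }
  return count_alive(nodes);
}

// One sweep in list order, handling both kinds as they come.
std::size_t one_sweep(std::vector<MacroNode> nodes) {
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    try_eliminate(nodes, i);
  }
  return count_alive(nodes);
}
```

For a list holding an allocation followed by a lock that references it, `two_sweeps` removes both nodes, while `one_sweep` reaches the allocation before its lock is gone and leaves it in place. Whether a merged loop in the real code would eventually catch up on a later iteration is exactly the kind of question deferred to the follow-up issue.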
------------- Changes: - all: https://git.openjdk.org/jdk/pull/25669/files - new: https://git.openjdk.org/jdk/pull/25669/files/90c94a25..0158481e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25669&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25669&range=01-02 Stats: 27 lines in 1 file changed: 20 ins; 5 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/25669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25669/head:pull/25669 PR: https://git.openjdk.org/jdk/pull/25669 From mdoerr at openjdk.org Fri Jun 6 15:32:50 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 6 Jun 2025 15:32:50 GMT Subject: RFR: 8342639: Global operator new in adlc has wrong exception spec In-Reply-To: <2-5wm21LtxEx0HMSd6gBbLv6tNLSBO4rGQ2W-uHfX6Q=.489a263e-fb34-4ca0-bc3f-5c74027367d6@github.com> References: <2-5wm21LtxEx0HMSd6gBbLv6tNLSBO4rGQ2W-uHfX6Q=.489a263e-fb34-4ca0-bc3f-5c74027367d6@github.com> Message-ID: On Fri, 6 Jun 2025 06:07:43 GMT, Kim Barrett wrote: > Please review this change to remove the definition of > `operator new(size_t, int, const char*, int) throw()` > > It doesn't seem to be needed anymore, if it ever really was. See discussion in > JBS for some more details. > > Testing: mach5 tier1-5, to cover various different build configurations. LGTM. Thanks for cleaning this up! ------------- Marked as reviewed by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25668#pullrequestreview-2905281270 From duke at openjdk.org Fri Jun 6 15:48:56 2025 From: duke at openjdk.org (Benoît Maillard) Date: Fri, 6 Jun 2025 15:48:56 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v3] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: <1V_0c8N2UAYXQegl_ztEQnFizlC3rnhg9weY6d2TAag=.ec918b2f-b250-4b98-b0e4-8a8c5b7de62c@github.com> On Fri, 6 Jun 2025 15:24:11 GMT, Benoît Maillard wrote: >> This PR introduces two cleanup changes to `PhaseMacroExpand`: >> >> - Removes the unused field `PhaseMacroExpand::_has_locks` >> - Merges two `while` loops in `PhaseMacroExpand::eliminate_macro_nodes` into a single loop >> >> Previously, `eliminate_macro_nodes` used two separate `while` loops: >> >> - The first loop removed lock nodes >> - The second loop removed allocation nodes >> >> Both loops had the same structure and independently traversed the same set of nodes. Since their operations do not interfere, the lock node removal logic was moved into the second loop as an additional case in the `switch` statement. >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with two additional commits since the last revision: > > - Revert "8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes" > > This reverts commit 13deb61de3e2a07d51e3692bb408971f6c18cecf. > - Revert "8356780: Remove useless assert" > > This reverts commit 4e7ee57981b9019d030ac59dc697a2470d8eb5eb.
I have reverted the changes related to the fusing of the two loops, and have created a new issue to investigate this separately: [JDK-8358788](https://bugs.openjdk.org/browse/JDK-8358788) ------------- PR Comment: https://git.openjdk.org/jdk/pull/25669#issuecomment-2949688261 From mchevalier at openjdk.org Fri Jun 6 16:56:52 2025 From: mchevalier at openjdk.org (Marc Chevalier) Date: Fri, 6 Jun 2025 16:56:52 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v3] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: On Fri, 6 Jun 2025 15:24:11 GMT, Benoît Maillard wrote: >> This PR removes the unused field `PhaseMacroExpand::_has_locks` >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with two additional commits since the last revision: > > - Revert "8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes" > > This reverts commit 13deb61de3e2a07d51e3692bb408971f6c18cecf. > - Revert "8356780: Remove useless assert" > > This reverts commit 4e7ee57981b9019d030ac59dc697a2470d8eb5eb. Seems safe and it does what it says. Interestingly, the cases are now here only for an assert. I guess that's still good to have and it's pretty harmless: in product, the assert won't exist and it will just be an immediate break. ------------- Marked as reviewed by mchevalier (Committer). PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2905502236 From sviswanathan at openjdk.org Fri Jun 6 16:57:52 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 6 Jun 2025 16:57:52 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v8] In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 07:46:32 GMT, Jatin Bhateja wrote: >> A) Patch extends the following tests with hard-coded encoding checks for various BMI instructions to cover REX2 or extended EVEX encodings supported by APX.
>> >> >> compiler/intrinsics/bmi/verifycode/AndnTestI.java >> compiler/intrinsics/bmi/verifycode/AndnTestL.java >> compiler/intrinsics/bmi/verifycode/BzhiTestI2L.java >> compiler/intrinsics/bmi/verifycode/LZcntTestL.java >> compiler/intrinsics/bmi/verifycode/TZcntTestL.java >> >> >> B) After integration of JDK-8349582, which added APX NDD support, AndN instruction selection patterns that expect (Xor SRC, -1) as one of its operands were not getting selected because of a lower-cost generic immediate pattern match; patch fixes this issue through strict predicate checks. >> >> Above tests are now passing, validations were carried out using Intel Software Development emulator. >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review comments resoltions Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25501#pullrequestreview-2905507817 From kvn at openjdk.org Fri Jun 6 17:54:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 6 Jun 2025 17:54:53 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v3] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: On Fri, 6 Jun 2025 15:24:11 GMT, Benoît Maillard wrote: >> This PR removes the unused field `PhaseMacroExpand::_has_locks` >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with two additional commits since the last revision: > > - Revert "8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes" > > This reverts commit 13deb61de3e2a07d51e3692bb408971f6c18cecf. > - Revert "8356780: Remove useless assert" > > This reverts commit 4e7ee57981b9019d030ac59dc697a2470d8eb5eb. Good. ------------- Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2905635855 From chagedorn at openjdk.org Fri Jun 6 18:02:51 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Fri, 6 Jun 2025 18:02:51 GMT Subject: RFR: 8356780: PhaseMacroExpand::_has_locks is unused [v3] In-Reply-To: References: <4Y_qCkNICY97KdColxyShQuBy9zdEVaZjJjkDtJD9do=.f09a465c-13d9-4e15-8b86-d99cac09b807@github.com> Message-ID: On Fri, 6 Jun 2025 15:24:11 GMT, Benoît Maillard wrote: >> This PR removes the unused field `PhaseMacroExpand::_has_locks` >> >> Thanks! > > Benoît Maillard has updated the pull request incrementally with two additional commits since the last revision: > > - Revert "8356780: Merge both while loops in PhaseMacroExpand::eliminate_macro_nodes" > > This reverts commit 13deb61de3e2a07d51e3692bb408971f6c18cecf. > - Revert "8356780: Remove useless assert" > > This reverts commit 4e7ee57981b9019d030ac59dc697a2470d8eb5eb. Marked as reviewed by chagedorn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/25669#pullrequestreview-2905650667 From sviswanathan at openjdk.org Fri Jun 6 18:31:50 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 6 Jun 2025 18:31:50 GMT Subject: RFR: 8357982: Fix several failing BMI tests with -XX:+UseAPX [v2] In-Reply-To: References: <8mE0O0QjyMJMK7UWtfMiFc5ZjIxFYqVNUeu0qYbzaz8=.75e13abf-a2c9-407b-898d-1174a85a06cf@github.com> Message-ID: On Wed, 4 Jun 2025 06:35:50 GMT, Emanuel Peter wrote: >>> @jatin-bhateja Thanks for looking into this! >>> >>> `predicate(!UseAPX && n->in(2)->bottom_type()->is_int()->get_con() != -1);` >>> >>> The PR title seems to suggest the bug is only about -XX:+UseAPX. Why are you changing things for the case !UseAPX? >>> >>> Are these not cases like a ^ -1, which basically flips all bits. What alternative does this end up using now? >>> >>> A code comment would be helpful.
>> >> We are tightening the predicate check so that under no circumstances we pick this pattern during the reduction phase of instruction selection on account of having lower cost. There is a generic pattern (xorI_rReg_imm) for all integral immediate values, and then there is a special pattern for Xor with -1 (fxorI_rReg_im1), which is needed for AndN inferencing. > > @jatin-bhateja I'll wait with testing, until someone from Intel gives this the approval. Feel free to ping me for that once we are there :) @eme64 This PR is now ready for your testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25501#issuecomment-2950110042 From vlivanov at openjdk.org Fri Jun 6 21:41:49 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 6 Jun 2025 21:41:49 GMT Subject: RFR: 8358749: Fix input checks in Vector API intrinsics In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 14:09:11 GMT, Aleksey Shipilev wrote: > We have been carrying this patch in Leyden/premain for a while: https://github.com/openjdk/leyden/commit/7faed7fc5c8e1bbd9a16ab22673a77099396179c. I believe it deserves to be in mainline. I polished it a little further. > > It is _mostly_ a cleanup, but there are also new checks, on the paths where we do take constants off the arguments. In those cases, I believe the alternative is compiler SEGV-ing. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `hotspot_vector_1 hotspot_vector_2` > - [x] Linux x86_64 server fastdebug, `jdk_vector` Looks good. Thanks for taking care of upstreaming it. ------------- Marked as reviewed by vlivanov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25673#pullrequestreview-2906212581 From sviswanathan at openjdk.org Fri Jun 6 22:49:51 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 6 Jun 2025 22:49:51 GMT Subject: RFR: 8358749: Fix input checks in Vector API intrinsics In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 14:09:11 GMT, Aleksey Shipilev wrote: > We have been carrying this patch in Leyden/premain for a while: https://github.com/openjdk/leyden/commit/7faed7fc5c8e1bbd9a16ab22673a77099396179c. I believe it deserves to be in mainline. I polished it a little further. > > It is _mostly_ a cleanup, but there are also new checks, on the paths where we do take constants off the arguments. In those cases, I believe the alternative is compiler SEGV-ing. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `hotspot_vector_1 hotspot_vector_2` > - [x] Linux x86_64 server fastdebug, `jdk_vector` Looks good to me as well. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25673#pullrequestreview-2906334077 From duke at openjdk.org Fri Jun 6 23:14:50 2025 From: duke at openjdk.org (ExE Boss) Date: Fri, 6 Jun 2025 23:14:50 GMT Subject: RFR: 8358749: Fix input checks in Vector API intrinsics In-Reply-To: References: Message-ID: On Fri, 6 Jun 2025 14:09:11 GMT, Aleksey Shipilev wrote: > We have been carrying this patch in Leyden/premain for a while: https://github.com/openjdk/leyden/commit/7faed7fc5c8e1bbd9a16ab22673a77099396179c. I believe it deserves to be in mainline. I polished it a little further. > > It is _mostly_ a cleanup, but there are also new checks, on the paths where we do take constants off the arguments. In those cases, I believe the alternative is compiler SEGV-ing. 
> > Additional testing: > - [x] Linux x86_64 server fastdebug, `hotspot_vector_1 hotspot_vector_2` > - [x] Linux x86_64 server fastdebug, `jdk_vector` Also note that the implementation of `Utils.isNonCapturingLambda(...)` is wrong when the `jdk.internal.lambda.disableEagerInitialization` system property is set to `"true"`, as that causes lambda classes to have one `static final` field: https://github.com/openjdk/jdk/blob/d7352559195b9e052c3eb24d773c0d6c10dc23ad/src/java.base/share/classes/jdk/internal/vm/vector/Utils.java#L36-L38 https://github.com/openjdk/jdk/blob/d7352559195b9e052c3eb24d773c0d6c10dc23ad/src/java.base/share/classes/java/lang/invoke/InnerClassLambdaMetafactory.java#L365-L372 ------------- PR Comment: https://git.openjdk.org/jdk/pull/25673#issuecomment-2951174116
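The concern in the comment above can be probed with a small standalone program. This is only a sketch: it assumes the check in question decides "non-capturing" by counting the fields declared on the lambda's runtime class, which the remark about an extra `static final` field suggests but which should be confirmed against the linked `Utils.java` lines. The class `LambdaFieldCheck` and the helper `declaredFieldCount` are hypothetical names introduced for illustration and are not part of the JDK.

```java
import java.util.function.Supplier;

public class LambdaFieldCheck {
    // Hypothetical helper (not a JDK method): number of fields declared by the
    // runtime class that the JVM generated for the given lambda object.
    public static int declaredFieldCount(Object lambda) {
        return lambda.getClass().getDeclaredFields().length;
    }

    public static void main(String[] args) {
        // Captured local; it is effectively final, so the lambda below may use it.
        String captured = String.valueOf(System.nanoTime());

        Supplier<String> nonCapturing = () -> "constant"; // captures nothing
        Supplier<String> capturing = () -> captured;      // captures one local

        // A field-count heuristic treats "zero declared fields" as "non-capturing".
        System.out.println("non-capturing lambda fields: " + declaredFieldCount(nonCapturing));
        System.out.println("capturing lambda fields:     " + declaredFieldCount(capturing));
    }
}
```

Running it once with default settings and once with `-Djdk.internal.lambda.disableEagerInitialization=true` would show whether the count reported for the non-capturing lambda changes between the two runs, which is the situation in which a pure field-count heuristic would misclassify it.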