From dnsimon at openjdk.org Mon Dec 2 07:14:15 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Dec 2024 07:14:15 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor Message-ID: The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. ------------- Commit messages: - fix use of ALLOW_C_FUNCTION Changes: https://git.openjdk.org/jdk/pull/22471/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22471&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345267 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22471.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22471/head:pull/22471 PR: https://git.openjdk.org/jdk/pull/22471 From simonis at openjdk.org Mon Dec 2 08:02:36 2024 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 2 Dec 2024 08:02:36 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor In-Reply-To: References: Message-ID: On Mon, 2 Dec 2024 07:08:22 GMT, Doug Simon wrote: > The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. Looks good. Have you checked (maybe with a simple grep command) if we don't have other instances of this issue? ------------- Marked as reviewed by simonis (Reviewer). 
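As a rough illustration of the "simple grep command" idea raised above, the following Python sketch flags uses of `ALLOW_C_FUNCTION` whose first argument is not a bare function identifier followed by a comma. The helper name `suspicious_uses` and the sample call sites are hypothetical, and the check is line-based only, so uses that span several lines or share a line with a well-formed use would be missed.

```python
import re

# Any use of the macro on a line.
USE = re.compile(r"\bALLOW_C_FUNCTION\s*\(")
# A use whose first argument is a plain (optionally ::-qualified) identifier
# followed by a comma, which is the shape the PR description calls for.
WELL_FORMED = re.compile(r"\bALLOW_C_FUNCTION\s*\(\s*(?:::)?[A-Za-z_]\w*\s*,")


def suspicious_uses(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for uses lacking the leading identifier.

    Hypothetical helper for illustration; not part of the JDK build or the PR.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            # Skip preprocessor lines such as the macro definition itself.
            continue
        if USE.search(line) and not WELL_FORMED.search(line):
            hits.append((lineno, stripped))
    return hits


if __name__ == "__main__":
    # Made-up call sites: the first has the identifier, the second omits it.
    sample = (
        "ALLOW_C_FUNCTION(::free, ::free(p);)\n"
        "ALLOW_C_FUNCTION(::free(p);)\n"
    )
    for lineno, text in suspicious_uses(sample):
        print(f"{lineno}: {text}")
```

A plain `grep` for the macro name followed by manual inspection would do the same job; the regular expression only automates the "is the first argument an identifier" check.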
PR Review: https://git.openjdk.org/jdk/pull/22471#pullrequestreview-2471983004 From kbarrett at openjdk.org Mon Dec 2 08:11:37 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Dec 2024 08:11:37 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor In-Reply-To: References: Message-ID: On Mon, 2 Dec 2024 07:08:22 GMT, Doug Simon wrote: > The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22471#pullrequestreview-2471996944 From kbarrett at openjdk.org Mon Dec 2 08:11:38 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 2 Dec 2024 08:11:38 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor In-Reply-To: References:

Message-ID: On Mon, 2 Dec 2024 07:59:53 GMT, Volker Simonis wrote: > Looks good. Have you checked (maybe with a simple grep command) if we don't have other instances of this issue? I've just recently looked at all of the uses of that macro for other reasons, and this was the only one I found like this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22471#issuecomment-2510838072 From dnsimon at openjdk.org Mon Dec 2 08:39:19 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Dec 2024 08:39:19 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor [v2] In-Reply-To: References: Message-ID: > The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: fix use of ALLOW_C_FUNCTION ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22471/files - new: https://git.openjdk.org/jdk/pull/22471/files/c789f049..35c0c3de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22471&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22471&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22471.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22471/head:pull/22471 PR: https://git.openjdk.org/jdk/pull/22471 From simonis at openjdk.org Mon Dec 2 08:43:41 2024 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 2 Dec 2024 08:43:41 GMT Subject: RFR: 8345267: Fix memory leak in JVMCIEnv dtor [v2] In-Reply-To: References:

Message-ID: On Mon, 2 Dec 2024 08:39:19 GMT, Doug Simon wrote: >> The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > fix use of ALLOW_C_FUNCTION Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22471#issuecomment-2511612001 From dnsimon at openjdk.org Mon Dec 2 13:59:49 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 2 Dec 2024 13:59:49 GMT Subject: Integrated: 8345267: Fix memory leak in JVMCIEnv dtor In-Reply-To: References: Message-ID: On Mon, 2 Dec 2024 07:08:22 GMT, Doug Simon wrote: > The `ALLOW_C_FUNCTION` macro takes the identifier for the relevant C function, followed by a statement containing the use as additional (variadic) macro args. This PR fixes a use of this macro where the leading identifier arg was being omitted. This pull request has now been integrated. Changeset: b8233989 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/b8233989e7605268dda908e6b639ca373789792b Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8345267: Fix memory leak in JVMCIEnv dtor Reviewed-by: simonis, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/22471 From pchilanomate at openjdk.org Tue Dec 3 20:11:53 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 3 Dec 2024 20:11:53 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id Message-ID: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Please review this small renaming patch. 
During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/22524/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22524&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8343957 Stats: 83 lines in 19 files changed: 0 ins; 2 del; 81 mod Patch: https://git.openjdk.org/jdk/pull/22524.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22524/head:pull/22524 PR: https://git.openjdk.org/jdk/pull/22524 From coleenp at openjdk.org Tue Dec 3 23:00:37 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 3 Dec 2024 23:00:37 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id In-Reply-To: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: On Tue, 3 Dec 2024 19:10:55 GMT, Patricio Chilano Mateo wrote: > Please review this small renaming patch. During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. > > Thanks, > Patricio Renaming looks good and makes it clearer what the id is. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22524#pullrequestreview-2476917190 From dholmes at openjdk.org Wed Dec 4 01:47:39 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 4 Dec 2024 01:47:39 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id In-Reply-To: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: On Tue, 3 Dec 2024 19:10:55 GMT, Patricio Chilano Mateo wrote: > Please review this small renaming patch. During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. > > Thanks, > Patricio Thanks for making the changes. One minor nit, but looks good. src/hotspot/share/runtime/javaThread.hpp line 174: > 172: void set_monitor_owner_id(int64_t val) { > 173: assert(val >= ThreadIdentifier::initial() && val < ThreadIdentifier::current(), ""); > 174: _monitor_owner_id = val; Nit: Using `id` rather than `val` would be more consistent with other changes (`ObjectMonitor::owner_id_from`) src/hotspot/share/runtime/threads.cpp line 1363: > 1361: p->print_stack_on(st); > 1362: if (p->is_vthread_mounted()) { > 1363: st->print_cr(" Mounted virtual thread #" INT64_FORMAT, java_lang_Thread::thread_id(p->vthread())); Was initially unsure why `p->lock_id()` didn't change to `p->monitor_owner_id()`, but here you want the thread-id not something that happens to match the thread-id. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22524#pullrequestreview-2477079004 PR Review Comment: https://git.openjdk.org/jdk/pull/22524#discussion_r1868577497 PR Review Comment: https://git.openjdk.org/jdk/pull/22524#discussion_r1868580851 From pchilanomate at openjdk.org Wed Dec 4 15:47:26 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Dec 2024 15:47:26 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id [v2] In-Reply-To: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: > Please review this small renaming patch. During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: Fix parameter name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22524/files - new: https://git.openjdk.org/jdk/pull/22524/files/108abc2b..8cfc9a10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22524&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22524&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/22524.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22524/head:pull/22524 PR: https://git.openjdk.org/jdk/pull/22524 From pchilanomate at openjdk.org Wed Dec 4 15:47:26 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Dec 2024 15:47:26 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id [v2] In-Reply-To: References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: <-c_LKYc0sEMbVtxPOCYWt8Fv2L-XoBfY9kmyDbNbhfg=.7b9bf7cf-1f46-471e-b184-5253add1f846@github.com> On Wed, 4 Dec 2024 01:36:42 GMT, David Holmes wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix parameter name > > src/hotspot/share/runtime/javaThread.hpp line 174: > >> 172: void set_monitor_owner_id(int64_t val) { >> 173: assert(val >= ThreadIdentifier::initial() && val < ThreadIdentifier::current(), ""); >> 174: _monitor_owner_id = val; > > Nit: Using `id` rather than `val` would be more consistent with other changes (`ObjectMonitor::owner_id_from`) Fixed. 
> src/hotspot/share/runtime/threads.cpp line 1363: > >> 1361: p->print_stack_on(st); >> 1362: if (p->is_vthread_mounted()) { >> 1363: st->print_cr(" Mounted virtual thread #" INT64_FORMAT, java_lang_Thread::thread_id(p->vthread())); > > Was initially unsure why `p->lock_id()` didn't change to `p->monitor_owner_id()`, but here you want the thread-id not something that happens to match the thread-id. Exactly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22524#discussion_r1869818716 PR Review Comment: https://git.openjdk.org/jdk/pull/22524#discussion_r1869819018 From coleenp at openjdk.org Wed Dec 4 19:08:13 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 4 Dec 2024 19:08:13 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id [v2] In-Reply-To: References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: On Wed, 4 Dec 2024 15:47:26 GMT, Patricio Chilano Mateo wrote: >> Please review this small renaming patch. During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Fix parameter name Update looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22524#pullrequestreview-2479668829 From pchilanomate at openjdk.org Wed Dec 4 21:28:46 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Dec 2024 21:28:46 GMT Subject: RFR: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id [v2] In-Reply-To: References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: On Wed, 4 Dec 2024 15:47:26 GMT, Patricio Chilano Mateo wrote: >> Please review this small renaming patch. During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > Fix parameter name Thanks for the reviews Coleen and David! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22524#issuecomment-2518592483 From pchilanomate at openjdk.org Wed Dec 4 21:28:47 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 4 Dec 2024 21:28:47 GMT Subject: Integrated: 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id In-Reply-To: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> References: <9PPc-HCpTmRz0ouqvFaawkmB510eCOWmwmzv4CeilW8=.a894cabb-45bf-40be-a516-ff7f8488c435@github.com> Message-ID: On Tue, 3 Dec 2024 19:10:55 GMT, Patricio Chilano Mateo wrote: > Please review this small renaming patch. 
During the review of JDK-8338383 there were some comments about improving the naming for `ObjectMonitor::owner_from()` and `JavaThread::_lock_id`. These originate from the changes introduced to inflated monitors, where we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a `JavaThread*`. I renamed `_lock_id` as `_monitor_owner_id` and `owner_from()` as `owner_id_from()`. > > Thanks, > Patricio This pull request has now been integrated. Changeset: c113f82f Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/c113f82f78c7d9be1ac297aebfeb6051f0f904fb Stats: 83 lines in 19 files changed: 0 ins; 2 del; 81 mod 8343957: Rename ObjectMonitor::owner_from() and JavaThread::_lock_id Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/22524 From ihse at openjdk.org Mon Dec 9 12:17:44 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 12:17:44 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed Message-ID: Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. I have located these modified files using: git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list and then run a script to update the copyright year to 2024 on these files. I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. 
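The two steps described above (collect the files touched this year, then bump the year in their headers) could look roughly like the Python sketch below. It is not the script that was actually used: the function names are made up for illustration, and the header shape `Copyright (c) YYYY[, YYYY], ...` is an assumption about how the copyright line is written.

```python
import re


def unique_paths(git_log_output: str) -> list[str]:
    """Deduplicate and sort the output of
    `git log --since="Jan 1" --name-only --pretty=format:`.

    Comparable to piping through `sort -u`, except that the blank
    separator lines git prints between commits are dropped entirely.
    """
    return sorted({line.strip() for line in git_log_output.splitlines() if line.strip()})


# Assumed header shape: "Copyright (c) 2019, 2023, Owner" or "Copyright (c) 2023, Owner".
COPYRIGHT = re.compile(r"(Copyright \(c\) )(\d{4})(?:, (\d{4}))?(,)")


def bump_copyright_year(text: str, year: int = 2024) -> str:
    """Set the last-modified year of the first copyright line to `year`.

    The first (creation) year is kept; a header that already mentions
    `year` is returned unchanged.
    """
    def repl(m: re.Match) -> str:
        first, last = m.group(2), m.group(3)
        if str(year) in (first, last):
            return m.group(0)
        return f"{m.group(1)}{first}, {year}{m.group(4)}"

    return COPYRIGHT.sub(repl, text, count=1)


if __name__ == "__main__":
    print(unique_paths("a/b.cpp\n\nc/d.hpp\na/b.cpp\n"))
    print(bump_copyright_year(" * Copyright (c) 2019, 2023, Example Corp."))
```

Keeping the year update a pure string function makes it easy to spot-check on a handful of headers before running it across a whole file list.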
------------- Commit messages: - 8345795: Update copyright year to 2024 for hotspot in files where it was missed Changes: https://git.openjdk.org/jdk/pull/22637/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22637&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345795 Stats: 844 lines in 844 files changed: 0 ins; 0 del; 844 mod Patch: https://git.openjdk.org/jdk/pull/22637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22637/head:pull/22637 PR: https://git.openjdk.org/jdk/pull/22637 From ihse at openjdk.org Mon Dec 9 12:41:45 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 12:41:45 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v2] In-Reply-To: References: Message-ID: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. 
Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: Add more hotspot files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22637/files - new: https://git.openjdk.org/jdk/pull/22637/files/0605277d..0166c68e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22637&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22637&range=00-01 Stats: 77 lines in 82 files changed: 0 ins; 0 del; 77 mod Patch: https://git.openjdk.org/jdk/pull/22637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22637/head:pull/22637 PR: https://git.openjdk.org/jdk/pull/22637 From ihse at openjdk.org Mon Dec 9 12:48:41 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 12:48:41 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v2] In-Reply-To: References:

Message-ID: On Mon, 9 Dec 2024 12:41:45 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Add more hotspot files I've looked at the Serviceability related copyright updates (prims, SA, debugger and tests). They are good. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22637#pullrequestreview-2489952883 From mullan at openjdk.org Mon Dec 9 20:22:37 2024 From: mullan at openjdk.org (Sean Mullan) Date: Mon, 9 Dec 2024 20:22:37 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed In-Reply-To: References: Message-ID: On Mon, 9 Dec 2024 13:02:15 GMT, Magnus Ihse Bursie wrote: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. 
When you review, please state clearly what part of the code you are reviewing. The security related files look fine. ------------- Marked as reviewed by mullan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22645#pullrequestreview-2489994087 From eirbjo at openjdk.org Mon Dec 9 20:40:37 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Mon, 9 Dec 2024 20:40:37 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed In-Reply-To: References: Message-ID: On Mon, 9 Dec 2024 13:02:15 GMT, Magnus Ihse Bursie wrote: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. When you review, please state clearly what part of the code you are reviewing. src/jdk.httpserver/share/classes/sun/net/httpserver/simpleserver/resources/favicon.ico line 1: This file should probably not be included? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22645#discussion_r1876722968 From ihse at openjdk.org Mon Dec 9 21:02:03 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 21:02:03 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed [v2] In-Reply-To: References: Message-ID: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. When you review, please state clearly what part of the code you are reviewing. 
Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: Revert mistaken changes to binary file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22645/files - new: https://git.openjdk.org/jdk/pull/22645/files/01702dae..edbc3fbb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22645&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22645&range=00-01 Stats: 0 lines in 1 file changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22645.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22645/head:pull/22645 PR: https://git.openjdk.org/jdk/pull/22645 From ihse at openjdk.org Mon Dec 9 21:02:03 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 21:02:03 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed [v2] In-Reply-To: References:

Message-ID: On Mon, 9 Dec 2024 20:38:18 GMT, Eirik Bjørsnøs wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert mistaken changes to binary file > > src/jdk.httpserver/share/classes/sun/net/httpserver/simpleserver/resources/favicon.ico line 1: > > > This file should probably not be included? Correct, that is a mistake. Reverted the change now. My script had a bug which made some binary files be "converted" from CRLF to LF format, making unintended changes. I've verified that there are no more such issues in this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22645#discussion_r1876748379 From ihse at openjdk.org Mon Dec 9 21:09:41 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 9 Dec 2024 21:09:41 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v3] In-Reply-To: References: Message-ID: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. 
Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: Revert mistaken changes to binary files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22637/files - new: https://git.openjdk.org/jdk/pull/22637/files/0166c68e..5610a605 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22637&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22637&range=01-02 Stats: 0 lines in 5 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22637/head:pull/22637 PR: https://git.openjdk.org/jdk/pull/22637 From dholmes at openjdk.org Tue Dec 10 04:01:38 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 10 Dec 2024 04:01:38 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v3] In-Reply-To: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> References: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> Message-ID: On Mon, 9 Dec 2024 21:09:41 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert mistaken changes to binary files I scanned the full diff and also did some random checks. Looks good. Thanks for fixing. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22637#pullrequestreview-2490894140 From tschatzl at openjdk.org Tue Dec 10 07:20:38 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 10 Dec 2024 07:20:38 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v3] In-Reply-To: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> References: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> Message-ID: <46K2W9LUIv9jOu7TEtm4Js4TfgzBRyfwhntnqdaQTUY=.f4c7b046-942b-4fc9-9bb5-8690b1f33391@github.com> On Mon, 9 Dec 2024 21:09:41 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert mistaken changes to binary files lgtm, looked at gc files in both shared and cpu and os specific directories. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22637#pullrequestreview-2491237029 From ihse at openjdk.org Tue Dec 10 08:51:46 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 10 Dec 2024 08:51:46 GMT Subject: Integrated: 8345795: Update copyright year to 2024 for hotspot in files where it was missed In-Reply-To: References: Message-ID: On Mon, 9 Dec 2024 12:11:11 GMT, Magnus Ihse Bursie wrote: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. 
> > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. > > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. This pull request has now been integrated. Changeset: 2979806c Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/2979806c72561cb4d4e8ac3d44dbcea347ace966 Stats: 921 lines in 921 files changed: 0 ins; 0 del; 921 mod 8345795: Update copyright year to 2024 for hotspot in files where it was missed Reviewed-by: dholmes, tschatzl, dnsimon, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/22637 From ihse at openjdk.org Tue Dec 10 08:51:45 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 10 Dec 2024 08:51:45 GMT Subject: RFR: 8345795: Update copyright year to 2024 for hotspot in files where it was missed [v3] In-Reply-To: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> References: <0TATGf2SyXxL9BAJlx3Xky0kpjQPFNtYOJDOwjEHJzo=.1d2af5d1-308a-4403-82c7-13ab2a614ee8@github.com> Message-ID: On Mon, 9 Dec 2024 21:09:41 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert mistaken changes to binary files Thanks for all reviews! 
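The "Revert mistaken changes to binary files" commit in this thread came from binary files having their CRLF sequences rewritten by the update script. One way to guard against that kind of side effect, sketched here under the assumption that a NUL byte near the start of a file is a good enough sign of binary content, is to work on raw bytes and leave anything that looks binary untouched. Both function names are hypothetical and are not taken from the actual script.

```python
def looks_binary(data: bytes, probe: int = 8192) -> bool:
    """Heuristic only: treat data as binary if a NUL byte appears in the first `probe` bytes."""
    return b"\x00" in data[:probe]


def replace_in_text_only(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the first occurrence of `old` with `new`, but only in text-like data.

    Operating on bytes (rather than decoded text with newline translation)
    means CRLF and LF line endings pass through exactly as they were.
    """
    if looks_binary(data):
        return data
    return data.replace(old, new, 1)


if __name__ == "__main__":
    icon_like = b"\x00\x00\x01\x00\r\n2023\r\n"      # made-up binary payload
    header = b"// Copyright 2023\r\nint x;\r\n"      # made-up CRLF text file
    print(replace_in_text_only(icon_like, b"2023", b"2024") == icon_like)
    print(replace_in_text_only(header, b"2023", b"2024"))
```

In a real run the same idea would apply when reading and writing files: open them in binary mode (`"rb"` / `"wb"`) so that no newline conversion takes place.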
------------- PR Comment: https://git.openjdk.org/jdk/pull/22637#issuecomment-2530834262 From galder at openjdk.org Tue Dec 10 09:13:41 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Tue, 10 Dec 2024 09:13:41 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: On Thu, 7 Nov 2024 10:18:14 GMT, Galder Zamarreño wrote: >> Overall, looks fine. >> >> So, there will be `inline_min_max`, `inline_fp_min_max`, and `inline_long_min_max` which slightly vary. I'd prefer to see them unified. (Or, at least, enhance `inline_min_max` to cover `minL`/`maxL` cases). >> >> Also, it's a bit confusing to see int variants names w/o basic type (`_min`/`_minL` vs `_minI`/`_minL`). Please, clean it up along the way. (FTR I'm also fine handling the renaming as a separate change.) > > @iwanowww I applied the changes you suggested. Could you review them? > @galderz Thanks for taking this task on! Had a quick look at it. So auto-vectorization in SuperWord should now be working, right? If yes: > > It would be nice if you tested both for `IRNode.MIN_VL` and `IRNode.MIN_REDUCTION_V`, the same for max. > > You may want to look at these existing tests, to see what other tests there are for the `int` version: `test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Int.java` `test/hotspot/jtreg/compiler/c2/irTests/TestIfMinMax.java` `test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java` `test/hotspot/jtreg/compiler/c2/TestMinMaxSubword.java` There may be some duplicates already here... not sure. +1 to adding such tests. I'm looking into it right now. It's a bit confusing how the tests are spread around (and duplication?) but I'm currently leaning towards adding a `compiler/loopopts/superword/MinMaxRed_Long.java`. > And maybe you need to check what to do about probabilities as well. 
I will add probability logic (50, 80, 100) to control the data, but you can already see from https://github.com/openjdk/jdk/pull/20098#issuecomment-2379386872 that, with the patch, min/max reduction nodes appear at all probabilities on an AVX512 system. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2530905728 From galder at openjdk.org Tue Dec 10 09:13:47 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Tue, 10 Dec 2024 09:13:47 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v4] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: On Fri, 29 Nov 2024 11:27:01 GMT, Emanuel Peter wrote: >> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 30 additional commits since the last revision: >> >> - Use same default size as in other vector reduction benchmarks >> - Renamed benchmark class >> - Double/Float tests only when avx enabled >> - Make state class non-final >> - Restore previous benchmark iterations and default param size >> - Add clipping range benchmark that uses min/max >> - Encapsulate benchmark state within an inner class >> - Avoid creating result array in benchmark method >> - Merge branch 'master' into topic.intrinsify-max-min-long >> - Revert "Implement cmovL as a jump+mov branch" >> >> This reverts commit 1522e26bf66c47b780ebd0d0d0c4f78a4c564e44. >> - ... and 20 more: https://git.openjdk.org/jdk/compare/b713ae85...0a8718e1 > > test/hotspot/jtreg/compiler/intrinsics/math/TestMinMaxInlining.java line 108: > >> 106: @Test >> 107: @Arguments(values = { Argument.NUMBER_MINUS_42, Argument.NUMBER_42 }) >> 108: @IR(counts = { IRNode.MIN_F, "1" }, applyIfCPUFeatureOr = {"avx", "true"}) > > Is this not supported by `asimd`? Same question for the other cases. Good point. I'll look into aarch64 environments to see how things behave. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1877657011 From chagedorn at openjdk.org Tue Dec 10 09:39:48 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 10 Dec 2024 09:39:48 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp Message-ID: I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. 
But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. Of the active includes, the one with the most impact was the include of `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: ... more C1 files #include "c1/c1_Compilation.hpp" #include "compiler/compilerDefinitions.inline.hpp" #include "opto/c2compiler.hpp" #include "opto/output.hpp" <------------ Problematic include #include "opto/c2_CodeStubs.hpp" #include "opto/compile.hpp" ... more opto files and eventually type.hpp This means that a lot of C1 files also need to be recompiled, as well as some more source files that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moving the constant `initial_const_capacity` from `output.hpp` to `c2compiler.hpp`, which seemed to be the only reason the include was in place. After this change, I was required to add some missing includes that were included transitively before. The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: - Mainline: ~1min -> ~40s (1.5 times faster) - Valhalla: ~7min -> ~40s (10.5 times faster) I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. 
Testing: - Oracle CI - GHA Maybe some Thanks, Christian ------------- Commit messages: - C2: Clean up include statements to speed up compilation when touching type.hpp Changes: https://git.openjdk.org/jdk/pull/22658/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22658&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345801 Stats: 86 lines in 31 files changed: 15 ins; 69 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22658.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22658/head:pull/22658 PR: https://git.openjdk.org/jdk/pull/22658 From kvn at openjdk.org Tue Dec 10 19:56:37 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 10 Dec 2024 19:56:37 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References: Message-ID: On Tue, 10 Dec 2024 09:31:27 GMT, Christian Hagedorn wrote: > I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. > > In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: > > ... more C1 files > #include "c1/c1_Compilation.hpp" > #include "compiler/compilerDefinitions.inline.hpp" > #include "opto/c2compiler.hpp" > #include "opto/output.hpp" <------------ Problematic include > #include "opto/c2_CodeStubs.hpp" > #include "opto/compile.hpp" > ... 
more opto files and eventually type.hpp > > This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. > > The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: > > - Mainline: ~1min -> ~40s (1.5 times faster) > - Valhalla: ~7min -> ~40s (10.5 times faster) > > I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. > > Testing: > - Oracle CI > - GHA > > Thanks, > Christian Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22658#pullrequestreview-2493434167 From dlong at openjdk.org Wed Dec 11 02:13:37 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 11 Dec 2024 02:13:37 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References: Message-ID: On Tue, 10 Dec 2024 09:31:27 GMT, Christian Hagedorn wrote: > I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. 
> > In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: > > ... more C1 files > #include "c1/c1_Compilation.hpp" > #include "compiler/compilerDefinitions.inline.hpp" > #include "opto/c2compiler.hpp" > #include "opto/output.hpp" <------------ Problematic include > #include "opto/c2_CodeStubs.hpp" > #include "opto/compile.hpp" > ... more opto files and eventually type.hpp > > This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. > > The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: > > - Mainline: ~1min -> ~40s (1.5 times faster) > - Valhalla: ~7min -> ~40s (10.5 times faster) > > I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. > > Testing: > - Oracle CI > - GHA > > Thanks, > Christian Looks good, as long as you tested w/ and w/o precompiled headers. 
src/hotspot/share/opto/node.hpp line 2003: > 2001: } > 2002: > 2003: inline Node_Notes* Compile::node_notes_at(int idx) { It seems weird to have these inlined Compile methods in node.hpp. Could we clean this up separately, maybe by moving them to compile.inline.hpp? ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22658#pullrequestreview-2494128348 PR Review Comment: https://git.openjdk.org/jdk/pull/22658#discussion_r1879215004 From dholmes at openjdk.org Wed Dec 11 05:03:40 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 11 Dec 2024 05:03:40 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed [v2] In-Reply-To: References:

Message-ID: On Mon, 9 Dec 2024 21:02:03 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. >> >> This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. When you review, please state clearly what part of the code you are reviewing. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert mistaken changes to binary file Looks fine. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22645#pullrequestreview-2494322540 From mli at openjdk.org Wed Dec 11 09:34:39 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 11 Dec 2024 09:34:39 GMT Subject: RFR: 8345805: Update copyright year to 2024 for other files where it was missed [v2] In-Reply-To: References:

Message-ID: <7Bia437IpuiRbqBsYiwM2WQIAAxn0bTbUuRNNw1LNTw=.d8ce64de-67ad-4f6c-ba28-81571b5b2f68@github.com> On Mon, 9 Dec 2024 21:02:03 GMT, Magnus Ihse Bursie wrote: >> Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. >> >> I have located these modified files using: >> >> git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list >> >> and then run a script to update the copyright year to 2024 on these files. >> >> I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. >> >> This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. When you review, please state clearly what part of the code you are reviewing. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert mistaken changes to binary file Nice batch cleanup! ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22645#pullrequestreview-2494911355 From ihse at openjdk.org Wed Dec 11 10:41:43 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 11 Dec 2024 10:41:43 GMT Subject: Integrated: 8345805: Update copyright year to 2024 for other files where it was missed In-Reply-To: References: Message-ID: On Mon, 9 Dec 2024 13:02:15 GMT, Magnus Ihse Bursie wrote: > Some files have been modified in 2024, but the copyright year has not been properly updated. This should be fixed. > > I have located these modified files using: > > git log --since="Jan 1" --name-only --pretty=format: | sort -u > file.list > > and then run a script to update the copyright year to 2024 on these files. 
> > I have made a manual sampling of files in the list to verify that they have indeed been modified in 2024. > > This is a "misc" bucket bug report, covering for all those files that has not been clearly assigned in some other issue. My strategy was to update the copyright year on all files in the JDK repo, and then try to the best of my ability to partition that huge chunk of files between groups. These are the remainder after I've done the large chunks. When you review, please state clearly what part of the code you are reviewing. This pull request has now been integrated. Changeset: 8e0f929e Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/8e0f929ecfc1d8de1c2a78e608bcabc45ff6b6af Stats: 107 lines in 107 files changed: 0 ins; 0 del; 107 mod 8345805: Update copyright year to 2024 for other files where it was missed Reviewed-by: dholmes, mli, mullan ------------- PR: https://git.openjdk.org/jdk/pull/22645 From chagedorn at openjdk.org Wed Dec 11 12:35:41 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 11 Dec 2024 12:35:41 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References:

Message-ID: On Tue, 10 Dec 2024 19:53:52 GMT, Vladimir Kozlov wrote: >> I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. >> >> In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: >> >> ... more C1 files >> #include "c1/c1_Compilation.hpp" >> #include "compiler/compilerDefinitions.inline.hpp" >> #include "opto/c2compiler.hpp" >> #include "opto/output.hpp" <------------ Problematic include >> #include "opto/c2_CodeStubs.hpp" >> #include "opto/compile.hpp" >> ... more opto files and eventually type.hpp >> >> This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. >> >> The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. 
I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: >> >> - Mainline: ~1min -> ~40s (1.5 times faster) >> - Valhalla: ~7min -> ~40s (10.5 times faster) >> >> I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. >> >> Testing: >> - Oracle CI >> - GHA >> >> Thanks, >> Christian > > Good. Thanks @vnkozlov and @dean-long for your reviews! > as long as you tested w/ and w/o precompiled headers. That's a good point. I thought it was covered by our CI but I've run now a separate testing with `--disable-precompiled-headers` which looked good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22658#issuecomment-2535869467 From chagedorn at openjdk.org Wed Dec 11 12:35:43 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 11 Dec 2024 12:35:43 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References:

Message-ID: On Wed, 11 Dec 2024 02:10:13 GMT, Dean Long wrote: >> I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. >> >> In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: >> >> ... more C1 files >> #include "c1/c1_Compilation.hpp" >> #include "compiler/compilerDefinitions.inline.hpp" >> #include "opto/c2compiler.hpp" >> #include "opto/output.hpp" <------------ Problematic include >> #include "opto/c2_CodeStubs.hpp" >> #include "opto/compile.hpp" >> ... more opto files and eventually type.hpp >> >> This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. >> >> The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. 
I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: >> >> - Mainline: ~1min -> ~40s (1.5 times faster) >> - Valhalla: ~7min -> ~40s (10.5 times faster) >> >> I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. >> >> Testing: >> - Oracle CI >> - GHA >> >> Thanks, >> Christian > > src/hotspot/share/opto/node.hpp line 2003: > >> 2001: } >> 2002: >> 2003: inline Node_Notes* Compile::node_notes_at(int idx) { > > It seems weird to have these inlined Compile methods in node.hpp. Could we clean this up separately, maybe by moving them to compile.inline.hpp? I was confused about that as well. There is currently no `compile.inline.hpp` but maybe we should introduce one for such cases. I can file an RFE to clean this up separately. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22658#discussion_r1880115791 From jwaters at openjdk.org Wed Dec 11 13:10:38 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 11 Dec 2024 13:10:38 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References: Message-ID: <7qfOP2-q-3rAgSv1HopwqcUwxGpvzTHkdR6mePsE7IU=.6ca1e945-1aea-4087-bc0a-0f7a235c1874@github.com> On Tue, 10 Dec 2024 09:31:27 GMT, Christian Hagedorn wrote: > I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. > > In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. 
I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: > > ... more C1 files > #include "c1/c1_Compilation.hpp" > #include "compiler/compilerDefinitions.inline.hpp" > #include "opto/c2compiler.hpp" > #include "opto/output.hpp" <------------ Problematic include > #include "opto/c2_CodeStubs.hpp" > #include "opto/compile.hpp" > ... more opto files and eventually type.hpp > > This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. > > The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: > > - Mainline: ~1min -> ~40s (1.5 times faster) > - Valhalla: ~7min -> ~40s (10.5 times faster) > > I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. > > Testing: > - Oracle CI > - GHA > > Thanks, > Christian It takes just 7 minutes to compile and link HotSpot for you? If only I had that kind of luxury... ------------- Marked as reviewed by jwaters (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/22658#pullrequestreview-2495622649 From chagedorn at openjdk.org Wed Dec 11 13:21:37 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 11 Dec 2024 13:21:37 GMT Subject: RFR: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: <7qfOP2-q-3rAgSv1HopwqcUwxGpvzTHkdR6mePsE7IU=.6ca1e945-1aea-4087-bc0a-0f7a235c1874@github.com> References: <7qfOP2-q-3rAgSv1HopwqcUwxGpvzTHkdR6mePsE7IU=.6ca1e945-1aea-4087-bc0a-0f7a235c1874@github.com> Message-ID: On Wed, 11 Dec 2024 13:08:27 GMT, Julian Waters wrote: >> I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. >> >> In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: >> >> ... more C1 files >> #include "c1/c1_Compilation.hpp" >> #include "compiler/compilerDefinitions.inline.hpp" >> #include "opto/c2compiler.hpp" >> #include "opto/output.hpp" <------------ Problematic include >> #include "opto/c2_CodeStubs.hpp" >> #include "opto/compile.hpp" >> ... more opto files and eventually type.hpp >> >> This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. 
I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. >> >> The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: >> >> - Mainline: ~1min -> ~40s (1.5 times faster) >> - Valhalla: ~7min -> ~40s (10.5 times faster) >> >> I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. >> >> Testing: >> - Oracle CI >> - GHA >> >> Thanks, >> Christian > > It takes just 7 minutes to compile and link HotSpot for you? If only I had that kind of luxury... Thanks @TheShermanTanker for your review! The measured numbers are only when starting a hotspot build again after touching `type.hpp`. It takes slightly longer for a full `make hotspot` in Valhalla. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22658#issuecomment-2535972320 From galder at openjdk.org Thu Dec 12 10:21:34 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Thu, 12 Dec 2024 10:21:34 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v5] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the Java implementation of these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls, replacing the CmpL + Bool nodes with MaxL/MinL nodes, respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g. > > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. 
Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add any platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ... Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision: - Merge branch 'master' into topic.intrinsify-max-min-long - Use same default size as in other vector reduction benchmarks - Renamed benchmark class - Double/Float tests only when avx enabled - Make state class non-final - Restore previous benchmark iterations and default param size - Add clipping range benchmark that uses min/max - Encapsulate benchmark state within an inner class - Avoid creating result array in benchmark method - Merge branch 'master' into topic.intrinsify-max-min-long - ... 
and 21 more: https://git.openjdk.org/jdk/compare/3c126865...aca09222 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/0a8718e1..aca09222 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=03-04 Stats: 592212 lines in 9243 files changed: 324831 ins; 214673 del; 52708 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From epeter at openjdk.org Thu Dec 12 12:05:42 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 12 Dec 2024 12:05:42 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: On Tue, 10 Dec 2024 09:10:04 GMT, Galder Zamarreño wrote: >>> Overall, looks fine. >>> >>> So, there will be `inline_min_max`, `inline_fp_min_max`, and `inline_long_min_max` which slightly vary. I'd prefer to see them unified. (Or, at least, enhance `inline_min_max` to cover `minL`/`maxL` cases). >>> >>> Also, it's a bit confusing to see int variants names w/o basic type (`_min`/`_minL` vs `_minI`/`_minL`). Please, clean it up along the way. (FTR I'm also fine handling the renaming as a separate change.) >> >> @iwanowww I applied the changes you suggested. Could you review them? > >> @galderz Thanks for taking this task on! Had a quick look at it. So auto-vectorization in SuperWord should now be working, right? If yes: >> >> It would be nice if you tested both for `IRNode.MIN_VL` and `IRNode.MIN_REDUCTION_V`, the same for max. >> >> You may want to look at these existing tests, to see what other tests there are for the `int` version: `test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Int.java` `test/hotspot/jtreg/compiler/c2/irTests/TestIfMinMax.java` `test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java` `test/hotspot/jtreg/compiler/c2/TestMinMaxSubword.java` There may be some duplicates already here... not sure. > > +1 to adding such tests. I'm looking into it right now. It's a bit confusing how the tests are spread around (and duplication?) but I'm currently leaning towards adding a `compiler/loopopts/superword/MinMaxRed_Long.java`. > >> And maybe you need to check what to do about probabilities as well. > > I will add probabilities logic (50, 80, 100) to control data, but you can already see that from https://github.com/openjdk/jdk/pull/20098#issuecomment-2379386872 that with the patch in an AVX512 system min/max reduction nodes will appear in all probabilities. @galderz Yes, there is significant duplication, sadly.
Often there were old tests there, but then one comes along and sees that one wants to have more comprehensive tests. So one adds it, but does not feel 100% comfortable removing old tests. A little bit of duplication is probably ok. Often, there are still subtle differences, and sometimes those end up mattering. `compiler/loopopts/superword/MinMaxRed_Long.java` sounds like a good idea. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2538713997 From jbhateja at openjdk.org Sun Dec 15 17:59:51 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 15 Dec 2024 17:59:51 GMT Subject: Withdrawn: 8342103: C2 compiler support for Float16 type and associated operations In-Reply-To: References: Message-ID: On Mon, 14 Oct 2024 11:40:01 GMT, Jatin Bhateja wrote: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/21490#issuecomment-2482867818)for more details. > 7. 
Since Float16 uses short as its storage type, raw FP16 values are always loaded into a general purpose register, but FP16 ISA instructions generally operate over floating point registers; the compiler therefore injects reinterpretation IR before and after Float16 operation nodes to move the short value to a floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 6. Auto-vectorization of newly supported scalar operations. > 7. X86 and AARCH64 backend implementation for all supported intrinsics. > 9. Functional and Performance validation tests. > > Kindly review and share your feedback. > > Best Regards, > Jatin This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/21490 From jbhateja at openjdk.org Sun Dec 15 18:19:35 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 15 Dec 2024 18:19:35 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations Message-ID: Hi All, This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) Following is the summary of changes included with this patch:- 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577) for more details. 7.
Since Float16 uses short as its storage type, raw FP16 values are always loaded into a general purpose register, but FP16 ISA instructions generally operate over floating point registers; the compiler therefore injects reinterpretation IR before and after Float16 operation nodes to move the short value to a floating point register and vice versa. 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF 6. Auto-vectorization of newly supported scalar operations. 7. X86 and AARCH64 backend implementation for all supported intrinsics. 9. Functional and Performance validation tests. Kindly review and share your feedback. Best Regards, Jatin ------------- Commit messages: - C2 compiler support for float16 scalar operations. Changes: https://git.openjdk.org/jdk/pull/22754/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8342103 Stats: 2633 lines in 54 files changed: 2589 ins; 0 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Sun Dec 15 18:19:35 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 15 Dec 2024 18:19:35 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations In-Reply-To: References: Message-ID: <5s62x13e3X2XmGxSwxY6zrtlSLVS7Y_uDdTHJpxNz1U=.2d5cf677-7af2-4483-8ff1-1f91fb26a5da@github.com> On Sun, 15 Dec 2024 18:05:02 GMT, Jatin Bhateja wrote: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2.
Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577) for more details. > 7. Since Float16 uses short as its storage type, raw FP16 values are always loaded into a general purpose register, but FP16 ISA instructions generally operate over floating point registers; the compiler therefore injects reinterpretation IR before and after Float16 operation nodes to move the short value to a floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 6. Auto-vectorization of newly supported scalar operations. > 7. X86 and AARCH64 backend implementation for all supported intrinsics. > 9. Functional and Performance validation tests. > > Kindly review and share your feedback. > > Best Regards, > Jatin Some FAQs on the newly added ideal type for half-float IR nodes:- Q. Why do we not use existing TypeInt::SHORT instead of creating a new TypeH type? A. Newly defined half float type named TypeH is special as its basic type is T_SHORT while its ideal type is RegF. Thus, the C2 type system views its associated IR node as a 16-bit short value while the register allocator assigns it a floating point register. Q. Problem with ConF? A. During Auto-Vectorization, ConF replication constrains the operational vector lane count to half of what can otherwise be used for regular Float16 operation i.e.
only 16 floats can be accommodated into a 512-bit vector thereby limiting the lane count of vectors in its use-def chain, one possible way to address it is through a kludge in auto-vectorizer to cast them to a 16 bits constant by analyzing its context. Newly defined Float16 constant nodes 'ConH' are inherently 16-bit encoded IEEE 754 FP16 values and can be efficiently packed to leverage full target vector width. All Float16 IR nodes now carry newly defined Type::HALF_FLOAT type instead of Type::FLOAT, thus we no longer need special handling in auto-vectorizer to prune their container type to short. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2543982577 From epeter at openjdk.org Mon Dec 16 07:24:36 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 16 Dec 2024 07:24:36 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations In-Reply-To: References: Message-ID: On Sun, 15 Dec 2024 18:05:02 GMT, Jatin Bhateja wrote: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Can you quickly summarize what tests you have, and what they test? test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java line 49: > 47: counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"}) > 48: @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "zvfh", "true"}, > 49: counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"}) Looks like this is having vector changes? And this is pre-existing: but why are we using `VECTOR_SIZE_ANY` here? Can we not know the vector size? Maybe we can introduce a new tag `max_float16` or `max_hf`. And do something like this: `IRNode.VECTOR_SIZE + "min(max_float, max_hf)", "> 0"` The downside with using `ANY` is that the exact size is not tested, and that might mean that the size is much smaller than ideal. 
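To make the size concern concrete, here is a minimal sketch of the lane arithmetic behind a bound such as `min(max_float, max_hf)`. It assumes a 512-bit maximum vector width and assumes the tags denote the maximum lane count per element type; `VectorLaneSketch` and `lanes` are hypothetical names for illustration and are not part of the IR framework or of this patch.

```java
// Hypothetical illustration only: not part of the IR framework or the patch.
final class VectorLaneSketch {
    // Number of elements of the given bit width that fit in one vector register.
    static int lanes(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    public static void main(String[] args) {
        int vectorBits = 512;                          // assumed maximum vector width
        int maxFloat = lanes(vectorBits, Float.SIZE);  // 32-bit float elements
        int maxHf = lanes(vectorBits, Short.SIZE);     // 16-bit half-float elements
        // A HF <-> F conversion is limited by the wider element type,
        // so the lane count one would expect is the smaller of the two.
        int expected = Math.min(maxFloat, maxHf);
        System.out.println("float lanes:      " + maxFloat);
        System.out.println("half-float lanes: " + maxHf);
        System.out.println("expected lanes:   " + expected);
    }
}
```

Under these assumptions an exact-size rule would expect 16 lanes for the conversion nodes, whereas a rule using `VECTOR_SIZE_ANY` would also pass with, say, 4 lanes.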
------------- PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2505332519 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1886290546 From jbhateja at openjdk.org Mon Dec 16 08:35:31 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 16 Dec 2024 08:35:31 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v2] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. 
> > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Adding missed check in container type detection. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/c215eac7..7cb694fa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Mon Dec 16 08:35:33 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 16 Dec 2024 08:35:33 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v2] In-Reply-To: References:

Message-ID: On Mon, 16 Dec 2024 07:22:04 GMT, Emanuel Peter wrote: > Can you quickly summarize what tests you have, and what they test? Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. > test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java line 49: > >> 47: counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"}) >> 48: @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "zvfh", "true"}, >> 49: counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"}) > > Looks like this is having vector changes? > And this is pre-existing: but why are we using `VECTOR_SIZE_ANY` here? Can we not know the vector size? Maybe we can introduce a new tag `max_float16` or `max_hf`. And do something like this: > `IRNode.VECTOR_SIZE + "min(max_float, max_hf)", "> 0"` > > The downside with using `ANY` is that the exact size is not tested, and that might mean that the size is much smaller than ideal. Hi @eme64 , Test modification looks ok to me, we intend to trigger these IR rules on non AVX512-FP16 targets. On AVX512-FP16 target compiler will infer scalar float16 add operation which will not get auto-vectorized. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2544914959 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1886373922 From epeter at openjdk.org Mon Dec 16 09:06:36 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 16 Dec 2024 09:06:36 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v2] In-Reply-To: References:

Message-ID: On Mon, 16 Dec 2024 08:32:32 GMT, Jatin Bhateja wrote: > > Can you quickly summarize what tests you have, and what they test? > > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2544992852 From chagedorn at openjdk.org Mon Dec 16 09:56:42 2024 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 16 Dec 2024 09:56:42 GMT Subject: Integrated: 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp In-Reply-To: References: Message-ID: On Tue, 10 Dec 2024 09:31:27 GMT, Christian Hagedorn wrote: > I've noticed that after touching `type.hpp` in Valhalla, it requires more than 7 minutes to build hotspot again on my machine. I would have suspected that we mostly recompile C2/opto source files. But that is far from the truth: A lot of unrelated source files must be recompiled, including, for example, C1, JFR, or runtime files. > > In mainline, the impact is not that severe. But it still requires around 1 minute to build hotspot again on my machine after touching `type.hpp`. I've had a look at the include chains and removed quite a lot of unused includes. For the active includes, the most impact had including `output.hpp` inside `c2compiler.hpp`. This set up the following dependency chain: > > ... 
more C1 files > #include "c1/c1_Compilation.hpp" > #include "compiler/compilerDefinitions.inline.hpp" > #include "opto/c2compiler.hpp" > #include "opto/output.hpp" <------------ Problematic include > #include "opto/c2_CodeStubs.hpp" > #include "opto/compile.hpp" > ... more opto files and eventually type.hpp > > This means that a lot of C1 files also need to be re-compiled as well as some more source file that include `compiler/compilerDefinitions.inline.hpp`. I cut this dependency chain by removing the include of `opto/output.hpp` in `opto/c2compiler.hpp` and moved the constant `initial_const_capacity` from `output.hpp` to `c2Compiler.hpp` which seemed to be the only reason why we have the include in place. After this change, I was required to add some missing includes that were included transitively before. > > The final mainline patch could also be applied to the current Valhalla repository (with some small tweaks). The results were quite promising. I could bring the compilation time on my machine significantly down in mainline and especially in Valhalla after touching `type.hpp`: > > - Mainline: ~1min -> ~40s (1.5 times faster) > - Valhalla: ~7min -> ~40s (10.5 times faster) > > I've only focused on `type.hpp` here but I guess other includes in the JIT compiler area or other parts of hotspot could also be revisited to possibly speed up the compilation after touching some header files. > > Testing: > - Oracle CI > - GHA > > Thanks, > Christian This pull request has now been integrated. 
Changeset: 32c8195c Author: Christian Hagedorn URL: https://git.openjdk.org/jdk/commit/32c8195c3acce2d220829bf5b81e3cef907fff3c Stats: 86 lines in 31 files changed: 15 ins; 69 del; 2 mod 8345801: C2: Clean up include statements to speed up compilation when touching type.hpp Reviewed-by: kvn, dlong, jwaters ------------- PR: https://git.openjdk.org/jdk/pull/22658 From jbhateja at openjdk.org Mon Dec 16 14:23:16 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 16 Dec 2024 14:23:16 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. 
New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Adding more test points ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/7cb694fa..3a6697e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=01-02 Stats: 56 lines in 3 files changed: 54 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Mon Dec 16 14:23:16 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 16 Dec 2024 14:23:16 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> On Mon, 16 Dec 2024 09:03:38 GMT, Emanuel Peter wrote: > > > Can you quickly summarize what tests you have, and what they test? > > > > > > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. > > I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. Validation details:- A) x86 backend changes - new assembler instruction - macro assembly routines. Test point:- test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java - This test is based on the TestNG framework and includes new DataProviders to generate test vectors. - Test vectors cover the entire float16 value range and also special floating point values (NaN, +Inf, -Inf, 0.0 and -0.0) B) GVN transformations:- - Value Transforms Test point:- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java - Covers all the constant folding scenarios for add, sub, mul, div, sqrt, fma, min, and max operations addressed by this patch. - It also tests special case scenarios for each operation as specified by the Java Language Specification. - identity Transforms Test point:- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java - Covers identity transformation for ReinterpretS2HFNode, DivHFNode - idealization Transforms Test points:- test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java :- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java - Contains test point for the following transform MulHF idealization i.e.
MulHF * 2 => AddHF - Contains test point for the following transform DivHF SRC , PoT(constant) => MulHF SRC * reciprocal (constant) - Contains idealization test points for the following transform ConvF2HF(FP32BinOp(ConvHF2F(x), ConvHF2F(y))) => ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y))) ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2545754021 From dnsimon at openjdk.org Mon Dec 16 16:51:12 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 16 Dec 2024 16:51:12 GMT Subject: RFR: 8346282: [JVMCI] Add failure reason support to UnresolvedJava/Type/Method/Field Message-ID: The JVMCI UnresolvedJava/Type/Method/Field types can be used to represent resolution failures. It would be useful if an exception describing the resolution failure could be attached to these objects. ------------- Commit messages: - allow exception to be attached to UnresolvedJava/Type/Method/Field Changes: https://git.openjdk.org/jdk/pull/22767/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22767&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346282 Stats: 66 lines in 4 files changed: 54 ins; 1 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/22767.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22767/head:pull/22767 PR: https://git.openjdk.org/jdk/pull/22767 From darcy at openjdk.org Mon Dec 16 18:45:39 2024 From: darcy at openjdk.org (Joe Darcy) Date: Mon, 16 Dec 2024 18:45:39 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: <62v0TMtl7q-EzfKx_fuIa9zTkv4uG_wzOKWr5hYGzy8=.4615f84f-db63-4a3e-b65f-3ff520e2e86d@github.com> On Mon, 16 Dec 2024 20:19:02 GMT, Doug Simon wrote: >> The JVMCI UnresolvedJava/Type/Method/Field types can be used to represent resolution failures. It would be useful if an exception describing the resolution failure could be attached to these objects. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fixed comment Marked as reviewed by never (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22767#pullrequestreview-2507276043 From yzheng at openjdk.org Mon Dec 16 21:01:41 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 16 Dec 2024 21:01:41 GMT Subject: RFR: 8346282: [JVMCI] Add failure reason support to UnresolvedJava/Type/Method/Field [v2] In-Reply-To: References:

Message-ID: On Mon, 16 Dec 2024 18:47:50 GMT, Joe Darcy wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Adding more test points > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1415: > >> 1413: // double; not necessary to widen to double before the >> 1414: // multiply. >> 1415: short fa = float16ToRawShortBits(a); > > The new implementations in fma and sqrt are comparatively long and obscure compared to the current versions. That might be the price of intrinsification, but it would be helpful to at least have a comment to the reader explaining why the more obvious code was not being used. @jatin-bhateja could we change the intrinsic to declare the three Float16 values as additional parameters which are only ever passed to the lambda? I believe that when the method is intrinsified we will just drop those extra parameters. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1887733028 From darcy at openjdk.org Tue Dec 17 00:26:41 2024 From: darcy at openjdk.org (Joe Darcy) Date: Tue, 17 Dec 2024 00:26:41 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: <22UQZNt9TGIWmQ4rS7CAMSZg5zmxBeV71UiBIRd0t5E=.4db6389a-48aa-4f0c-b4fd-dd4e9a5238bd@github.com> On Mon, 16 Dec 2024 14:23:16 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Adding more test points src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 328: > 326: @ForceInline > 327: public static Float16 valueOf(float f) { > 328: short hf = floatToFloat16(f); Does the VM need the explicit short variable here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1887738297 From epeter at openjdk.org Tue Dec 17 07:50:04 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 07:50:04 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Mon, 16 Dec 2024 14:23:16 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Adding more test points @jatin-bhateja I took 1h to go over this change. I left 15 comments, probably some of them you can just answer by a quick explanation / pointing to the relevant test. 
src/hotspot/share/opto/convertnode.cpp line 282: > 280: return new ReinterpretHF2SNode(binop); > 281: } > 282: } Where are the constant folding tests for this? src/hotspot/share/opto/convertnode.cpp line 960: > 958: } > 959: return TypeInt::SHORT; > 960: } Do we have tests for these constant folding operations? src/hotspot/share/opto/divnode.cpp line 815: > 813: !g_isnan(t1->getf()) && g_isfinite(t1->getf()) && t1->getf() != 0.0) { // could be negative ZERO or NaN > 814: return TypeH::ONE; > 815: } Do we cover all cases here? src/hotspot/share/opto/divnode.cpp line 821: > 819: } > 820: > 821: // If divisor is a constant and not zero, divide them numbers Suggestion: // If divisor is a constant and not zero, divide the numbers src/hotspot/share/opto/divnode.cpp line 826: > 824: t2->getf() != 0.0) { > 825: // could be negative zero > 826: return TypeH::make(t1->getf()/t2->getf()); Suggestion: return TypeH::make(t1->getf() / t2->getf()); src/hotspot/share/opto/divnode.cpp line 840: > 838: if (g_isnan(t1->getf()) || g_isnan(t2->getf())) { > 839: return TypeH::make(NAN); > 840: } I'm a little confused here. We are working with nodes that have type Float16, but we are asking for Float constants here. Why is that, how does it work? src/hotspot/share/opto/subnode.cpp line 566: > 564: return t1; > 565: } > 566: else if(g_isnan(t2->getf())) { General question: why are you using `getf` and not `geth` all over the code? src/hotspot/share/opto/type.cpp line 1465: > 1463: //------------------------------meet------------------------------------------- > 1464: // Compute the MEET of two types. It returns a new Type object. > 1465: const Type *TypeH::xmeet( const Type *t ) const { Please write `TypeH*` and not `TypeH *` src/hotspot/share/opto/type.cpp line 1530: > 1528: uint TypeH::hash(void) const { > 1529: return *(uint*)(&_f); > 1530: } I just saw that `_f` is a `short`, which I think is 16 bits, right? And the cast to `uint` would mean we take 32 bits. 
That looks a bit off, but maybe it is not. Can you explain, and maybe also put a comment in the code for that? test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 275: > 273: @IR(counts = {IRNode.ADD_HF, " 0 ", IRNode.REINTERPRET_S2HF, " 0 ", IRNode.REINTERPRET_HF2S, " 0 "}, > 274: applyIfCPUFeature = {"avx512_fp16", "true"}) > 275: public void testAddConstantFolding() { Ok, this is great. I'm missing some cases that check correct rounding. For that, it might be good to have one example with random constants, so 2 random Float16 values. You can generate them in static context, and also compute the result in static context, so it should be evaluated in the interpreter. That way, we can compare the result of interpreter to compiled code. Do that for all operations. test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 421: > 419: > 420: assertResult(divide(valueOf(2.0f), valueOf(2.0f)).floatValue(), 1.0f, "testDivConstantFolding"); > 421: } What about cases like `x/x`, where `x` is a variable, and then feed in all sorts of values, including NaN. I think there we must ensure that it does not fold to `1`. Could be a separate IR test. But also `x/x` with all sorts of constants is relevant. It would test this section in the `Ideal` code: // x/x == 1, we ignore 0/0. // Note: if t1 and t2 are zero then result is NaN (JVMS page 213) // Does not work for variables because of NaN's if (in(1) == in(2) && t1->base() == Type::HalfFloatCon && !g_isnan(t1->getf()) && g_isfinite(t1->getf()) && t1->getf() != 0.0) { // could be negative ZERO or NaN return TypeH::ONE; } test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 494: > 492: assertResult(fma(valueOf(1.0f), valueOf(2.0f), valueOf(3.0f)).floatValue(), 1.0f * 2.0f + 3.0f, "testFMAConstantFolding"); > 493: } > 494: } I am missing constant folding tests with `shortBitsToFloat16` etc. ------------- Changes requested by epeter (Reviewer). 
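A minimal sketch of the `x/x` concern raised in this review, assuming binary16 values are emulated through `Float.floatToFloat16` and `Float.float16ToFloat` (JDK 20+) rather than the incubating `Float16` class. The class `HalfSelfDivide` and its methods are hypothetical names used only for illustration. It shows why `x/x` may only fold to `1` for finite, non-zero, non-NaN constants.

```java
// Hypothetical illustration; not part of the patch under review.
final class HalfSelfDivide {

    // Round x to binary16, divide the value by itself at float precision,
    // and round the quotient back to binary16 (returned widened to float).
    static float selfDivide(float x) {
        float v = Float.float16ToFloat(Float.floatToFloat16(x));
        return Float.float16ToFloat(Float.floatToFloat16(v / v));
    }

    // Mirrors the shape of the guard quoted in the review:
    // folding to 1 is only valid for finite, non-zero, non-NaN inputs.
    static boolean mayFoldToOne(float x) {
        return !Float.isNaN(x) && Float.isFinite(x) && x != 0.0f;
    }

    public static void main(String[] args) {
        float[] samples = {2.0f, -3.5f, 0.0f, -0.0f,
                           Float.POSITIVE_INFINITY, Float.NaN};
        for (float s : samples) {
            System.out.println(s + " / " + s + " = " + selfDivide(s)
                    + ", foldable: " + mayFoldToOne(s));
        }
    }
}
```

For zero, infinity, and NaN the quotient is NaN, so a test that feeds such values through a variable `x` would catch an over-eager fold.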
PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2508020252 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888008209 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888009160 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888012154 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888027070 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888027339 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888038360 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888030240 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888013140 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888017396 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888005513 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888026278 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888021315 From epeter at openjdk.org Tue Dec 17 07:50:04 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 07:50:04 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 07:16:37 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Adding more test points > > src/hotspot/share/opto/convertnode.cpp line 960: > >> 958: } >> 959: return TypeInt::SHORT; >> 960: } > > Do we have tests for these constant folding operations? We would need all sorts of conversions between Float16 and short, with Float16 constant and variable values, and also with short constant and variable values. > src/hotspot/share/opto/divnode.cpp line 826: > >> 824: t2->getf() != 0.0) { >> 825: // could be negative zero >> 826: return TypeH::make(t1->getf()/t2->getf()); > > Suggestion: > > return TypeH::make(t1->getf() / t2->getf()); Are we sure that the rounding behaviour of float is the correct behaviour for Float16? I would like to see some examples where rounding matters. > src/hotspot/share/opto/type.cpp line 1465: > >> 1463: //------------------------------meet------------------------------------------- >> 1464: // Compute the MEET of two types. It returns a new Type object. >> 1465: const Type *TypeH::xmeet( const Type *t ) const { > > Please write `TypeH*` and not `TypeH *` Do that everywhere in the code that you touch, unless it clashes strongly with the immediately surrounding code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888015077 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888033031 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888018364 From jbhateja at openjdk.org Tue Dec 17 08:03:07 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 17 Dec 2024 08:03:07 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: <3ICo3LtSL6Lhx3AdPYpeUccUCYcMi25lqGb9pNsyEt0=.0c53bb8b-5387-46be-8790-f49023fd9230@github.com> On Tue, 17 Dec 2024 00:14:39 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1415: >> >>> 1413: // double; not necessary to widen to double before the >>> 1414: // multiply. >>> 1415: short fa = float16ToRawShortBits(a); >> >> The new implementations in fma and sqrt are comparatively long and obscure compared to the current versions. That might be the price of intrinsification, but it would be helpful to at least have a comment to the reader explaining why the more obvious code was not being used. > > @jatin-bhateja could we change the intrinsic to declare the three Float16 values as additional parameters which are only ever passed to the lambda? I believe when intrinsic we will just drop those extra parameters. Our intent here is to unbox the Float16 values explicitly in Java code so that C2 does not have to perform the unboxing itself. Otherwise, the VM would need to be aware of the Float16 class and record its field offsets, as it does for the [VectorPayload class](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/classfile/javaClasses.cpp#L5065). I have restructured the code to remove the unnecessary obfuscation introduced as a side effect of intrinsification. Let me know if we need to refine it further.
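The explicit-unboxing approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `HalfBox` is a hypothetical stand-in for `Float16`, `sqrtBits` is a hypothetical name for a primitive entry point, and the conversions use `Float.floatToFloat16` and `Float.float16ToFloat` (JDK 20+). It is not the code in the patch.

```java
// Hypothetical stand-in for Float16; not the JDK class.
final class HalfBox {
    private final short bits; // IEEE 754 binary16 encoding

    private HalfBox(short bits) {
        this.bits = bits;
    }

    static HalfBox valueOf(float f) {
        return new HalfBox(Float.floatToFloat16(f));
    }

    float floatValue() {
        return Float.float16ToFloat(bits);
    }

    // Hypothetical intrinsic candidate: only raw binary16 bits cross this
    // boundary, so a compiler would not need to know the wrapper's layout.
    static short sqrtBits(short radicand) {
        float widened = Float.float16ToFloat(radicand);
        return Float.floatToFloat16((float) Math.sqrt(widened));
    }

    // Public API: unbox in Java, call the primitive entry point, re-box.
    static HalfBox sqrt(HalfBox radicand) {
        return new HalfBox(sqrtBits(radicand.bits));
    }

    public static void main(String[] args) {
        System.out.println(sqrt(valueOf(9.0f)).floatValue());
    }
}
```

The trade-off is that the wrapper methods become slightly longer, which is the obscurity the earlier review comment pointed out.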
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888053509 From jbhateja at openjdk.org Tue Dec 17 08:03:07 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 17 Dec 2024 08:03:07 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v4] In-Reply-To: References: Message-ID: <_Xvwu4MCoc_tveXX-iDMB5nW9UNpEj3mZdOvlGDMVd0=.b4d97187-de30-4e95-a036-75f69ad5db3f@github.com> > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. 
> > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Fixing obfuscation due to intrinsic entries ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/3a6697e3..246cb270 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=02-03 Stats: 30 lines in 2 files changed: 13 ins; 1 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Tue Dec 17 11:09:09 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 17 Dec 2024 11:09:09 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v5] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Addressing review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/246cb270..ec0834a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=03-04 Stats: 28 lines in 4 files changed: 6 ins; 0 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Tue Dec 17 11:09:10 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 17 Dec 2024 11:09:10 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 07:15:35 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Adding more test points > > src/hotspot/share/opto/convertnode.cpp line 282: > >> 280: return new ReinterpretHF2SNode(binop); >> 281: } >> 282: } > > Where are the constant folding tests for this? This is the core idealization logic which infers FP16 IR. Every test point added in test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java verifies this. > src/hotspot/share/opto/divnode.cpp line 815: > >> 813: !g_isnan(t1->getf()) && g_isfinite(t1->getf()) && t1->getf() != 0.0) { // could be negative ZERO or NaN >> 814: return TypeH::ONE; >> 815: } > > Do we cover all cases here? Please refer to the test point at https://github.com/openjdk/jdk/pull/22754/files#diff-3f8786f9f62662eda4b4a5c76c01fa04534c94d870d496501bfc20434ad45579R419 > src/hotspot/share/opto/divnode.cpp line 840: > >> 838: if (g_isnan(t1->getf()) || g_isnan(t2->getf())) { >> 839: return TypeH::make(NAN); >> 840: } > > I'm a little confused here. We are working with nodes that have type Float16, but we are asking for Float constants here. Why is that, how does it work? Please refer to PhaseIGVN::transform, where we create constant IR for singleton types. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/phaseX.cpp#L721 > src/hotspot/share/opto/subnode.cpp line 566: > >> 564: return t1; >> 565: } >> 566: else if(g_isnan(t2->getf())) { > > General question: why are you using `getf` and not `geth` all over the code? getf is a routine that performs half float to float conversion for TypeH. > src/hotspot/share/opto/type.cpp line 1530: > >> 1528: uint TypeH::hash(void) const { >> 1529: return *(uint*)(&_f); >> 1530: } > > I just saw that `_f` is a `short`, which I think is 16 bits, right? And the cast to `uint` would mean we take 32 bits. That looks a bit off, but maybe it is not.
Can you explain, and maybe also put a comment in the code for that? This is to comply with the Node::hash signature, which returns a uint value. > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 275: > >> 273: @IR(counts = {IRNode.ADD_HF, " 0 ", IRNode.REINTERPRET_S2HF, " 0 ", IRNode.REINTERPRET_HF2S, " 0 "}, >> 274: applyIfCPUFeature = {"avx512_fp16", "true"}) >> 275: public void testAddConstantFolding() { > > Ok, this is great. I'm missing some cases that check correct rounding. For that, it might be good to have one example with random constants, so 2 random Float16 values. You can generate them in static context, and also compute the result in static context, so it should be evaluated in the interpreter. That way, we can compare the result of interpreter to compiled code. > > Do that for all operations. Hey @eme64, constant folding is done at FP32 granularity: we first upcast the FP16 values to FP32 using the hf2f runtime helper, operate on the FP32 values, and then downcast the result back to FP16 using the f2hf helper. Thus both the compiler's value transformations and the interpreter use the same runtime helpers underneath. The fallback implementation of each Float16 API uses the Float.floatToFloat16 and Float.float16ToFloat routines, which are intrinsified at the [interpreter level](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/templateInterpreterGenerator.cpp#L488). These interpreter intrinsics invoke the same leaf-level macro assembly routine, flt16_to_flt, which is also called through the runtime helpers. So it may not add much value to do an interpreter vs compiler comparison in these cases. > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 421: > >> 419: >> 420: assertResult(divide(valueOf(2.0f), valueOf(2.0f)).floatValue(), 1.0f, "testDivConstantFolding"); >> 421: } > > What about cases like `x/x`, where `x` is a variable, and then feed in all sorts of values, including NaN.
I think there we must ensure that it does not fold to `1`. Could be a separate IR test. > > But also `x/x` with all sorts of constants is relevant. It would test this section in the `Ideal` code: > > // x/x == 1, we ignore 0/0. > // Note: if t1 and t2 are zero then result is NaN (JVMS page 213) > // Does not work for variables because of NaN's > if (in(1) == in(2) && t1->base() == Type::HalfFloatCon && > !g_isnan(t1->getf()) && g_isfinite(t1->getf()) && t1->getf() != 0.0) { // could be negative ZERO or NaN > return TypeH::ONE; > } The above predicate filters out the NaN dividend case. For the variable argument, we rely on the x86 floating point ISA specification, which complies with the IEEE 754 floating point specification. Please refer to section 4.8.3.5, Operating on SNaNs and QNaNs, of the Intel Software Developer's Manual for more details. Note: Float16 to float conversion helpers preserve the NaN significand bits, but Java only deals in QNaN values. I am adding a few test points for signaling NaN. > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 494: > >> 492: assertResult(fma(valueOf(1.0f), valueOf(2.0f), valueOf(3.0f)).floatValue(), 1.0f * 2.0f + 3.0f, "testFMAConstantFolding"); >> 493: } >> 494: } > > I am missing constant folding tests with `shortBitsToFloat16` etc.
Added a few test points for the same ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888331413 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888331196 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888330121 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888330274 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888331089 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888331480 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888330506 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888330775 From jbhateja at openjdk.org Tue Dec 17 11:09:11 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 17 Dec 2024 11:09:11 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 07:22:45 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/convertnode.cpp line 960: >> >>> 958: } >>> 959: return TypeInt::SHORT; >>> 960: } >> >> Do we have tests for these constant folding operations? > > We would need all sorts of conversion with Float16 <-> short. With Float16 constant and variable values. And also with short constant and variable values. Yes, there are multiple test points in newly added test which receive floating-point constant which goes through following IR logic before being constant folded ConF -> ConvF2HF -> ReinterpretS2HF >> src/hotspot/share/opto/divnode.cpp line 826: >> >>> 824: t2->getf() != 0.0) { >>> 825: // could be negative zero >>> 826: return TypeH::make(t1->getf()/t2->getf()); >> >> Suggestion: >> >> return TypeH::make(t1->getf() / t2->getf()); > > Are we sure that the rounding behaviour of float is the correct behaviour for Float16? I would like to see some examples where rounding matters. FP16 has 11 bits precision and FP32 has 24 bit precision, thus as per [2P rule ](https://dl.acm.org/doi/pdf/10.1145/221332.221334) the operation is innocuous to double rounding effects. In addition, fall back implementation of Float16.divide also takes the same route of performing the operation at FP32 granularity. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888331298 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888330385 From dnsimon at openjdk.org Tue Dec 17 12:14:40 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 17 Dec 2024 12:14:40 GMT Subject: RFR: 8346282: [JVMCI] Add failure reason support to UnresolvedJava/Type/Method/Field [v2] In-Reply-To: References:

Message-ID: On Mon, 16 Dec 2024 20:22:12 GMT, Doug Simon wrote: >> The JVMCI UnresolvedJava/Type/Method/Field types can be used to represent resolution failures. It would be useful if an exception describing the resolution failure could be attached to these objects. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fixed comment Thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22767#issuecomment-2548290339 From dnsimon at openjdk.org Tue Dec 17 12:14:41 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 17 Dec 2024 12:14:41 GMT Subject: Integrated: 8346282: [JVMCI] Add failure reason support to UnresolvedJava/Type/Method/Field In-Reply-To: References: Message-ID: On Mon, 16 Dec 2024 15:54:53 GMT, Doug Simon wrote: > The JVMCI UnresolvedJava/Type/Method/Field types can be used to represent resolution failures. It would be useful if an exception describing the resolution failure could be attached to these objects. This pull request has now been integrated. Changeset: 8a645954 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/8a6459544855e3c0561678769b9123f7df959cb4 Stats: 66 lines in 4 files changed: 54 ins; 1 del; 11 mod 8346282: [JVMCI] Add failure reason support to UnresolvedJava/Type/Method/Field Reviewed-by: never, yzheng ------------- PR: https://git.openjdk.org/jdk/pull/22767 From epeter at openjdk.org Tue Dec 17 16:30:48 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 16:30:48 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:06:17 GMT, Jatin Bhateja wrote: >> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 275: >> >>> 273: @IR(counts = {IRNode.ADD_HF, " 0 ", IRNode.REINTERPRET_S2HF, " 0 ", IRNode.REINTERPRET_HF2S, " 0 "}, >>> 274: applyIfCPUFeature = {"avx512_fp16", "true"}) >>> 275: public void testAddConstantFolding() { >> >> Ok, this is great. I'm missing some cases that check correct rounding. For that, it might be good to have one example with random constants, so 2 random Float16 values. You can generate them in static context, and also compute the result in static context, so it should be evaluated in the interpreter. That way, we can compare the result of interpreter to compiled code. >> >> Do that for all operations. > > Hey @eme64 , constant folding is done at FP32 granularity, so we first upcast FP16 to FP32 values using hf2f runtime helper, operate over FP32 values and then down cast it back to FP16 value using f2hf helper. Thus both compiler value transformations and interpreter use the same runtime helper underneath. > > Fallback implementation of each Float16 API is using Float.floatToFloat16 and Float.float16ToFloat routines which are intrinsified at [interpreter level.](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/templateInterpreterGenerator.cpp#L488), these interpreter intrinsic invokes same leaf level macro assembly routine flt16_to_flt which is also called through runtime helpers. > > So it may not add much value to do interpreter vs compiler comparison in these cases. Ah, yes, you are right, compiler vs interpreter comparison does not help as much as I thought, though we should still do it. What we need to do is compare interpreter and C2-constant-folded results with the results of the backend instructions, but we can also do that with variable values.
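A rough sketch of such a comparison with variable values, written without the IR test framework. Everything here is an assumption for illustration: the class `HalfAddCheck`, the emulation of a Float16 add through `Float.floatToFloat16` and `Float.float16ToFloat` (JDK 20+), and the idea that results computed once in the static initializer are most likely produced by the interpreter while the repeated calls in `mismatches` eventually run compiled. A real test would use `Float16.add` and would check the generated IR.

```java
import java.util.Random;

// Hypothetical harness; not part of the test suite under review.
final class HalfAddCheck {
    static final int N = 1024;
    static final short[] A = new short[N];
    static final short[] B = new short[N];
    static final short[] GOLD = new short[N];

    static {
        Random rnd = new Random(42); // fixed seed for reproducibility
        for (int i = 0; i < N; i++) {
            A[i] = (short) rnd.nextInt(); // arbitrary binary16 bit patterns
            B[i] = (short) rnd.nextInt();
            GOLD[i] = add(A[i], B[i]);    // evaluated once, before warm-up
        }
    }

    // Emulated binary16 add: widen to float, add, round back to binary16.
    static short add(short a, short b) {
        return Float.floatToFloat16(Float.float16ToFloat(a) + Float.float16ToFloat(b));
    }

    // Treat any two NaNs as equal; otherwise compare bit patterns.
    static boolean sameValue(short x, short y) {
        boolean bothNaN = Float.isNaN(Float.float16ToFloat(x))
                && Float.isNaN(Float.float16ToFloat(y));
        return bothNaN || x == y;
    }

    // Re-run the adds many times so the JIT gets a chance to compile them.
    static int mismatches(int rounds) {
        int bad = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < N; i++) {
                if (!sameValue(add(A[i], B[i]), GOLD[i])) {
                    bad++;
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + mismatches(200));
    }
}
```

Rounding-sensitive inputs are worth adding by hand: 2048 + 1 is a tie in binary16 and rounds to the even neighbour 2048, while 2048 + 3 rounds to 2052.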
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888838421 From galder at openjdk.org Tue Dec 17 16:42:48 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Tue, 17 Dec 2024 16:42:48 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v4] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: <9ReqLUCZ6XDaSQxgYw3NyZZdMv3SOHkCkzJ0DLAksas=.8cb29982-8cb8-4068-a251-59a189c83b93@github.com> On Fri, 29 Nov 2024 11:26:10 GMT, Emanuel Peter wrote: >> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 30 additional commits since the last revision: >> >> - Use same default size as in other vector reduction benchmarks >> - Renamed benchmark class >> - Double/Float tests only when avx enabled >> - Make state class non-final >> - Restore previous benchmark iterations and default param size >> - Add clipping range benchmark that uses min/max >> - Encapsulate benchmark state within an inner class >> - Avoid creating result array in benchmark method >> - Merge branch 'master' into topic.intrinsify-max-min-long >> - Revert "Implement cmovL as a jump+mov branch" >> >> This reverts commit 1522e26bf66c47b780ebd0d0d0c4f78a4c564e44. >> - ... and 20 more: https://git.openjdk.org/jdk/compare/b290c6e3...0a8718e1 > > test/hotspot/jtreg/compiler/intrinsics/math/TestMinMaxInlining.java line 80: > >> 78: @IR(phase = { CompilePhase.BEFORE_MACRO_EXPANSION }, counts = { IRNode.MIN_L, "1" }) >> 79: @IR(phase = { CompilePhase.AFTER_MACRO_EXPANSION }, counts = { IRNode.MIN_L, "0" }) >> 80: private static long testLongMin(long a, long b) { > > Can you add a comment why it disappears after macro expansion? Good question. On non-avx512 machines the min/max nodes become cmov nodes after macro expansion, but that's not the full story, because on avx512 machines they become minV/maxV nodes. Would you tweak the `@IR` annotations to capture this? Or would you leave it just as a comment?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1888858177 From epeter at openjdk.org Tue Dec 17 16:43:39 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 16:43:39 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:06:13 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/convertnode.cpp line 282: >> >>> 280: return new ReinterpretHF2SNode(binop); >>> 281: } >>> 282: } >> >> Where are the constant folding tests for this? > > This is the core idealization logic which infers FP16 IR. Every test point added in test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java verifies this. Picking a random line from `testAddConstantFolding()`: `assertResult(add(Float16.POSITIVE_INFINITY, Float16.POSITIVE_INFINITY).floatValue(), Float.POSITIVE_INFINITY, "testAddConstantFolding");` So this seems to do a FP16 -> FP16 add, then convert to float, so I don't immediately see the FP16 -> Float -> FP16 conversion. Ah, how do we intrinsify this? public static Float16 add(Float16 addend, Float16 augend) { return valueOf(addend.floatValue() + augend.floatValue()); } So it is not the `add` that is intrinsified, but rather the `valueOf`, `floatValue` and Float `+`? Why not intrinsify the `Float16.add` directly? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888858048 From epeter at openjdk.org Tue Dec 17 16:43:40 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 16:43:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:06:07 GMT, Jatin Bhateja wrote: >> We would need all sorts of conversion with Float16 <-> short. With Float16 constant and variable values. And also with short constant and variable values. > > Yes, there are multiple test points in newly added test which receive floating-point constant which goes through following IR logic before being constant folded > ConF -> ConvF2HF -> ReinterpretS2HF Please show me an example. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888859046 From epeter at openjdk.org Tue Dec 17 16:58:40 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 16:58:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:05:27 GMT, Jatin Bhateja wrote: >> Are we sure that the rounding behaviour of float is the correct behaviour for Float16? I would like to see some examples where rounding matters. > > FP16 has 11 bits precision and FP32 has 24 bit precision, thus as per [2P rule ](https://dl.acm.org/doi/pdf/10.1145/221332.221334) the operation is innocuous to double rounding effects. In addition, fall back implementation of Float16.divide also takes the same route of performing the operation at FP32 granularity. I understand. Thanks for the explanation. But the backend `Float16` instructions do not convert to float, right? So we need to be sure, and test that rounding works correctly in all cases. Can you point me to a test for that? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888882880 From epeter at openjdk.org Tue Dec 17 16:58:43 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 16:58:43 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:05:18 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/divnode.cpp line 840: >> >>> 838: if (g_isnan(t1->getf()) || g_isnan(t2->getf())) { >>> 839: return TypeH::make(NAN); >>> 840: } >> >> I'm a little confused here. We are working with nodes that have type Float16, but we are asking for Float constants here. Why is that, how does it work? > > Please refer to PhaseIGVN::transform, we create constant IR for singleton types. > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/phaseX.cpp#L721 That misses my question, I was again confused about float vs Float16. But I see from your previous answer that `getf` does the conversion from `Float16` -> `float`. So all good. >> src/hotspot/share/opto/subnode.cpp line 566: >> >>> 564: return t1; >>> 565: } >>> 566: else if(g_isnan(t2->getf())) { >> >> General question: why are you using `getf` and not `geth` all over the code? > > getf is a routine that performs half float to float conversion for TypeH. Ah right, I see now. Thanks. >> src/hotspot/share/opto/type.cpp line 1530: >> >>> 1528: uint TypeH::hash(void) const { >>> 1529: return *(uint*)(&_f); >>> 1530: } >> >> I just saw that `_f` is a `short`, which I think is 16 bits, right? And the cast to `uint` would mean we take 32 bits. That looks a bit off, but maybe it is not. Can you explain, and maybe also put a comment in the code for that? > > This is to comply with Node::hash signature which returns uint value But does that not mean that we have a 4-byte load, but the end of the object already happens after 2 bytes? If so, what are those 2 extra bytes? Is that safe and correct? >> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 494: >> >>> 492: assertResult(fma(valueOf(1.0f), valueOf(2.0f), valueOf(3.0f)).floatValue(), 1.0f * 2.0f + 3.0f, "testFMAConstantFolding"); >>> 493: } >>> 494: } >> >> I am missing constant folding tests with `shortBitsToFloat16` etc. 
> > Added a few test points for the same Oh, I was more hoping for a separate test with `shortBitsToFloat16`, not mixed in with `fma`. And what about constant folding tests for `float16ToRawShortBits`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888887926 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888885400 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888869947 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1888875984 From epeter at openjdk.org Tue Dec 17 17:01:40 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 17 Dec 2024 17:01:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v5] In-Reply-To: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> References:

<03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> Message-ID: On Mon, 16 Dec 2024 14:19:49 GMT, Jatin Bhateja wrote: >>> > Can you quickly summarize what tests you have, and what they test? >>> >>> Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > >> > > Can you quickly summarize what tests you have, and what they test? >> > >> > >> > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > > > Validations details:- > > A) x86 backend changes > - new assembler instruction > - macro assembly routines. > Test point:- test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java > - This test is based on a testng framework and includes new DataProviders to generate test vectors. > - Test vectors cover the entire float16 value range and also special floating point values (NaN, +Int, -Inf, 0.0 and -0.0) > B) GVN transformations:- > - Value Transforms > Test point:- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers all the constant folding scenarios for add, sub, mul, div, sqrt, fma, min, and max operations addressed by this patch. 
> - It also tests special case scenarios for each operation as specified by Java language specification. > - identity Transforms > Test point:- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers identity transformation for ReinterpretS2HFNode, DivHFNode > - idealization Transforms > Test points:- test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java > :- test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Contains test point for the following transform > MulHF idealization i.e. MulHF * 2 => AddHF > - Contains test point for the following transform > DivHF SRC , PoT(constant) => MulHF SRC * reciprocal (constant) > - Contains idealization test points for the following transform > ConvF2HF(FP32BinOp(ConvHF2F(x), ConvHF2F(y))) => > ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y))) @jatin-bhateja thanks for all the updates and explanations. I'm still missing some tests for rounding effects. It would be important to verify that the interpreter, constant folding, and backend instructions all lead to the same rounding for add/sub/div/mul/fma/... I propose that you pick random `short` values, reinterpret them as `Float16`, then do the calculation via interpreter, let it constant fold with C2, and put it in variables that cannot be constant folded, so that backend instructions are emitted for it. Does that make sense? 
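The random-input check proposed above can be sketched in plain Java. This is only an illustration of the harness shape, not the jtreg test from the PR: it uses `Float.float16ToFloat` and `Float.floatToFloat16` (in `java.lang.Float` since JDK 20) instead of the incubating `Float16` class, and the names `Fp16RoundingSketch`, `addViaFloat`, `addViaDouble` and `countMismatches` are made up for this example. It compares an addition done at FP32 granularity against one done in `double`; it does not exercise C2 constant folding or backend instructions, which would need the IR test framework.

```java
import java.util.Random;

// Hypothetical harness, not part of the PR under review.
final class Fp16RoundingSketch {
    // Path 1: widen both binary16 inputs to float, add, round back to binary16.
    static short addViaFloat(short a, short b) {
        return Float.floatToFloat16(Float.float16ToFloat(a) + Float.float16ToFloat(b));
    }

    // Path 2: add in double (exact for binary16 inputs), then narrow to float and binary16.
    static short addViaDouble(short a, short b) {
        double sum = (double) Float.float16ToFloat(a) + (double) Float.float16ToFloat(b);
        return Float.floatToFloat16((float) sum);
    }

    // Bitwise comparison, except that any NaN is treated as equal to any other NaN.
    static boolean sameResult(short x, short y) {
        boolean xNaN = Float.isNaN(Float.float16ToFloat(x));
        boolean yNaN = Float.isNaN(Float.float16ToFloat(y));
        if (xNaN || yNaN) {
            return xNaN && yNaN;
        }
        return x == y;
    }

    // Feeds random 16-bit patterns through both paths and counts disagreements.
    static int countMismatches(long seed, int iterations) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int i = 0; i < iterations; i++) {
            short a = (short) random.nextInt();
            short b = (short) random.nextInt();
            if (!sameResult(addViaFloat(a, b), addViaDouble(a, b))) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

In a real test the second path would be replaced by the code shape that forces constant folding or backend code generation; the random generation and NaN-tolerant comparison would stay the same.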
------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2549040374 From galder at openjdk.org Tue Dec 17 18:12:24 2024 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Tue, 17 Dec 2024 18:12:24 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g. > > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. 
Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ... 
Galder Zamarreño has updated the pull request incrementally with five additional commits since the last revision: - Added comment around the assertions - Adjust min/max identity IR test expectations after changes - Fix style - Add max reduction test - Add empty line ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/aca09222..130b4755 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=04-05 Stats: 169 lines in 5 files changed: 162 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From galder at openjdk.org Tue Dec 17 18:23:39 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Tue, 17 Dec 2024 18:23:39 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: On Thu, 12 Dec 2024 12:03:11 GMT, Emanuel Peter wrote: >>> @galderz Thanks for taking this task on! Had a quick look at it. So auto-vectorization in SuperWord should now be working, right? If yes: >>> >>> It would be nice if you tested both for `IRNode.MIN_VL` and `IRNode.MIN_REDUCTION_V`, the same for max. >>> >>> You may want to look at these existing tests, to see what other tests there are for the `int` version: `test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Int.java` `test/hotspot/jtreg/compiler/c2/irTests/TestIfMinMax.java` `test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java` `test/hotspot/jtreg/compiler/c2/TestMinMaxSubword.java` There may be some duplicates already here... not sure. >> >> +1 to adding such tests. I'm looking into it right now. It's a bit confusing how the tests are spread around (and duplication?) but I'm currently leaning towards adding a `compiler/loopopts/superword/MinMaxRed_Long.java`. >> >>> And maybe you need to check what to do about probabilities as well. >> >> I will add probabilities logic (50, 80, 100) to control data, but you can already see that from https://github.com/openjdk/jdk/pull/20098#issuecomment-2379386872 that with the patch in an AVX512 system min/max reduction nodes will appear in all probabilities. > > @galderz Yes, there is significant duplication, sadly. Often there were old tests there, but then one comes along and sees that one wants to have more comprehensive tests. So one adds it, but does not feel 100% comfortable removing old tests. A little bit of duplication is probably ok. Often, there are still subtle differences, and sometimes those end up mattering. > > `compiler/loopopts/superword/MinMaxRed_Long.java` sounds like a good idea. @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. 
@jaskarth FYI I've adjusted the expectations in `TestMinMaxIdentities` after this change (thx for adding the test!). Check if there's any comments/changes you'd like. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2549259008 From coleenp at openjdk.org Tue Dec 17 21:28:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 17 Dec 2024 21:28:47 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata Message-ID: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. before: /* size: 216, cachelines: 4, members: 25, static members: 17 */ /* sum members: 194, holes: 3, sum holes: 18 */ after: /* size: 200, cachelines: 4, members: 25, static members: 17 */ /* sum members: 188, holes: 4, sum holes: 12 */ We may eventually move the modifiers to java.lang.Class but that's WIP. Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. ------------- Commit messages: - Remove JVM_ACC_WRITTEN_FLAGS because they are all written in the classfile now. 
- 8339113: AccessFlags can be u2 in metadata Changes: https://git.openjdk.org/jdk/pull/22246/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8339113 Stats: 167 lines in 41 files changed: 16 ins; 40 del; 111 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From epeter at openjdk.org Wed Dec 18 06:22:42 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 18 Dec 2024 06:22:42 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Tue, 17 Dec 2024 18:12:24 GMT, Galder Zamarre?o wrote: >> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. >> >> Currently vectorization does not kick in for loops containing either of these calls because of the following error: >> >> >> VLoop::check_preconditions: failed: control flow in loop not allowed >> >> >> The control flow is due to the java implementation for these methods, e.g. >> >> >> public static long max(long a, long b) { >> return (a >= b) ? a : b; >> } >> >> >> This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. >> By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. >> E.g. 
>> >> >> SuperWord::transform_loop: >> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined >> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) >> >> >> Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1155 >> long max 1173 >> >> >> After the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1042 >> long max 1042 >> >> >> This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. >> Therefore, it still relies on the macro expansion to transform those into CMoveL. >> >> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PA... > > Galder Zamarre?o has updated the pull request incrementally with five additional commits since the last revision: > > - Added comment around the assertions > - Adjust min/max identity IR test expectations after changes > - Fix style > - Add max reduction test > - Add empty line test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 2: > 1: /* > 2: * Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. 
Suggestion: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1889678576 From epeter at openjdk.org Wed Dec 18 06:25:37 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 18 Dec 2024 06:25:37 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>

Message-ID: On Tue, 17 Dec 2024 18:20:36 GMT, Galder Zamarreño wrote: >> @galderz Yes, there is significant duplication, sadly. Often there were old tests there, but then one comes along and sees that one wants to have more comprehensive tests. So one adds it, but does not feel 100% comfortable removing old tests. A little bit of duplication is probably ok. Often, there are still subtle differences, and sometimes those end up mattering. >> >> `compiler/loopopts/superword/MinMaxRed_Long.java` sounds like a good idea. > > @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. > > @jaskarth FYI I've adjusted the expectations in `TestMinMaxIdentities` after this change (thx for adding the test!). Check if there's any comments/changes you'd like. @galderz Nice, thanks for the updates. I gave the patch a quick scan and I think it looks really good. Just ping me again when you are done with your aarch64 investigations, and you think I should review again :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2550456266 From coleenp at openjdk.org Wed Dec 18 19:23:39 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 18 Dec 2024 19:23:39 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <4z2P9wv0UyYE7Ct-X0W6VYLVOqpocol7MblNDxlpy6U=.5a09be4a-b10a-4a55-91c2-99abeafef3db@github.com> On Tue, 19 Nov 2024 16:18:48 GMT, Coleen Phillimore wrote: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags.
This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. No, bot, this is a new PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2552096922 From aturbanov at openjdk.org Thu Dec 19 09:18:40 2024 From: aturbanov at openjdk.org (Andrey Turbanov) Date: Thu, 19 Dec 2024 09:18:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v5] In-Reply-To: References:

Message-ID: On Tue, 17 Dec 2024 11:09:09 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Addressing review comments test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java line 222: > 220: } > 221: } > 222: assertArraysEquals(res, farr, (fp16) -> Float.isInfinite(fp16.floatValue())); Suggestion: assertArraysEquals(res, farr, (fp16) -> Float.isInfinite(fp16.floatValue())); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1891411255 From coleenp at openjdk.org Thu Dec 19 12:52:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 12:52:34 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. 
Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Update src/hotspot/share/opto/library_call.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/3106bdd3..cc69a3f2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Dec 19 12:52:35 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 12:52:35 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 19 Nov 2024 16:18:48 GMT, Coleen Phillimore wrote: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. 
> > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. The as_int() was because we are using the AccessFlags as an _integral_ value. I was trying to minimize the effects of the change and the code uses AccessFlags as an integral value. as_int() returns u2 so I guess that's confusing. I don't want AccessFlags::get_flags() because that implies the return is AccessFlags. I could change the name to as_unsigned_short(). Would that be less confusing? Thank you David for looking through this change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2553766012 From coleenp at openjdk.org Thu Dec 19 12:52:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 12:52:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Thu, 19 Dec 2024 01:48:49 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> >> - Update src/hotspot/share/opto/library_call.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/share/opto/library_call.cpp line 3874: > >> 3872: Node* LibraryCallKit::generate_interface_guard(Node* kls, RegionNode* region) { >> 3873: return generate_klass_flags_guard(kls, JVM_ACC_INTERFACE, 0, region, >> 3874: Klass::access_flags_offset(), TypeInt::CHAR, T_CHAR); > > Is this CHAR/T_CHAR because you want unsigned? Yes. T_SHORT generates the wrong code. 
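The thread does not say exactly how `T_SHORT` goes wrong. The general difference between a signed and an unsigned 16-bit load can be illustrated as below, as an assumption about what "unsigned" buys here rather than a description of the C2 code. `U2LoadSketch`, `loadSigned` and `loadUnsigned` are hypothetical names; Java's `short` stands in for a sign-extending load and a masked value (equivalent to `char`) for a zero-extending one.

```java
// Hypothetical illustration; not HotSpot code.
final class U2LoadSketch {
    // A 16-bit load that sign-extends into a 32-bit register (T_SHORT-like).
    static int loadSigned(short raw) {
        return raw;
    }

    // A 16-bit load that zero-extends into a 32-bit register (T_CHAR-like).
    static int loadUnsigned(short raw) {
        return raw & 0xFFFF;
    }
}
```

For a flags word with the top bit set, such as `0x8200`, the signed load yields a negative `int` while the unsigned load keeps the value in the `u2` range. A single-bit mask test gives the same answer either way, so this sketch shows only that the loaded values differ, not which check in the PR was affected.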
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1891931433 From coleenp at openjdk.org Thu Dec 19 12:56:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 12:56:38 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Thu, 19 Dec 2024 01:33:57 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> >> - Update src/hotspot/share/opto/library_call.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp line 690: > >> 688: push(state); >> 689: >> 690: // Skip if we don't have to unlock. (???is this right???) > > The logic seems consistent with other platforms. Not sure what you are querying. It wasn't the logic. When I went through I didn't know if this instruction needed fixing because we loaded an unsigned short instead of an int. So I left myself a note to look at it again that you noticed and I didn't in my final walk through. It seems right but maybe someone with ppc knowledge can answer this. rldicl_(R0, Raccess_flags, 64-JVM_ACC_SYNCHRONIZED_BIT, 63); // Extract bit and compare to 0. 
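As a reading aid for the `rldicl_` line, here is a small Java model of what the instruction computes, under the assumption that `JVM_ACC_SYNCHRONIZED_BIT` is 5 (the bit index of the class-file flag `ACC_SYNCHRONIZED = 0x0020`). `RldiclSketch` and `extractBit` are hypothetical names. Rotating left by `64 - bit` is the same as rotating right by `bit`, and a mask begin of 63 in PowerPC's MSB-first numbering keeps only the least significant bit.

```java
// Hypothetical model of `rldicl R0, Rflags, 64-bit, 63`; not HotSpot code.
final class RldiclSketch {
    static final int JVM_ACC_SYNCHRONIZED = 0x0020;
    // Assumption: bit index of JVM_ACC_SYNCHRONIZED, i.e. log2(0x0020).
    static final int JVM_ACC_SYNCHRONIZED_BIT = 5;

    // Rotate the 64-bit register left by (64 - bit), then clear all but the lowest bit.
    static long extractBit(long register, int bit) {
        return Long.rotateLeft(register, 64 - bit) & 1L;
    }
}
```

Under this model the result depends only on bit 5 of the register, so a zero-extended 16-bit flags value gives the same answer as the previous 32-bit value would. The record form (trailing underscore) additionally sets the condition register, which the sketch does not model.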
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1891953704 From aph at openjdk.org Thu Dec 19 13:50:43 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 19 Dec 2024 13:50:43 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Thu, 19 Dec 2024 12:52:34 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 823: > 821: { > 822: Label done; > 823: __ load_unsigned_short(r0, access_flags); Could you please use `ldrh` rather than `load_unsigned_short` here? 
`load_unsigned_short` is only used in the template interpreter, and is a hangover from the hand-translation from x86. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1892155007 From dlong at openjdk.org Thu Dec 19 19:10:36 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 19 Dec 2024 19:10:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <7u51ceqCUvoClO7UvFk57s_pLil_wncOZAVRRuQexTE=.772835f1-1811-44e9-a969-3fe3f05c9de5@github.com> On Thu, 19 Dec 2024 12:52:34 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp line 691: > 689: > 690: // Skip if we don't have to unlock.
(???is this right???) > 691: rldicl_(R0, Raccess_flags, 64-JVM_ACC_SYNCHRONIZED_BIT, 63); // Extract bit and compare to 0. Using `testbitdi` might make it more readable to non-experts. It took me a while reading aix docs to realize that this platform numbers LSB as 63 and MSB/sign as 0. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1893009550 From coleenp at openjdk.org Thu Dec 19 20:15:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 20:15:14 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v3] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <3L1E24-MgN60lpElt0drkUi3nBz3n078fcx9iwab-U8=.0054a22f-90cc-4e07-ac2b-f52558b9775c@github.com> > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Fixed bug. - Fix merge and compilation errors. 
- Merge branch 'master' into access-flags - Renamed AccessFlags.as_int() to as_unsigned_short(), moved masks for method, field and class modifiers into AccessFlags. Change ciFlags to use AccessFlags. - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Update src/hotspot/share/opto/library_call.cpp Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> - Remove JVM_ACC_WRITTEN_FLAGS because they are all written in the classfile now. - 8339113: AccessFlags can be u2 in metadata ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/cc69a3f2..522ade8c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=01-02 Stats: 5573 lines in 201 files changed: 3545 ins; 1398 del; 630 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Dec 19 20:15:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 20:15:14 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>

Message-ID: <1hwxD50hCezqeSWdKVroWZhJkP4MiTH0votpw4YBOGs=.3b77bab6-0479-4b7b-aa1f-92b33efe28cd@github.com> On Thu, 19 Dec 2024 13:48:16 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> >> - Update src/hotspot/share/opto/library_call.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 823: > >> 821: { >> 822: Label done; >> 823: __ load_unsigned_short(r0, access_flags); > > Could you please use `ldrh` rather than `load_unsigned_short` here? `load_unsigned_short` is only used in the template interpreter, and is a hangover from the hand-translation from x86. Oh, I thought it was quite nice that I didn't have to know the `ldrh` instruction, since a platform-independent `load_unsigned_short` was available. I can change it in the aarch64 code.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1893073141 From coleenp at openjdk.org Thu Dec 19 20:29:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 20:29:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v2] In-Reply-To: <7u51ceqCUvoClO7UvFk57s_pLil_wncOZAVRRuQexTE=.772835f1-1811-44e9-a969-3fe3f05c9de5@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <7u51ceqCUvoClO7UvFk57s_pLil_wncOZAVRRuQexTE=.772835f1-1811-44e9-a969-3fe3f05c9de5@github.com> Message-ID: On Thu, 19 Dec 2024 19:08:06 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> >> - Update src/hotspot/share/opto/library_call.cpp >> >> Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > > src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp line 691: > >> 689: >> 690: // Skip if we don't have to unlock. (???is this right???) >> 691: rldicl_(R0, Raccess_flags, 64-JVM_ACC_SYNCHRONIZED_BIT, 63); // Extract bit and compare to 0. > > Using `testbitdi` might make it more readable to non-experts. It took me a while reading aix docs to realize that this platform numbers LSB as 63 and MSB/sign as 0. yes I like testbitdi better. I found a sample in the templateInterpreterGenerator code. 
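For readers without a PowerPC manual to hand, the bit extraction discussed above can be modelled in portable C++. This is only a sketch under stated assumptions: the names `rotl64`, `rldicl`, `test_bit`, `kSynchronizedBit` and `check_all_u2_values` are invented here, and the bit position 5 is derived from ACC_SYNCHRONIZED being 0x0020 in the JVMS rather than copied from HotSpot's sources. It models `rldicl` as a rotate left followed by keeping IBM-numbered bits MB..63 (bit 63 being the LSB), and checks that for every zero-extended 16-bit value the result matches a plain shift-and-mask. It needs C++14 or later.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ACC_SYNCHRONIZED is 0x0020, i.e. bit 5 counted from the LSB.
constexpr unsigned kSynchronizedBit = 5;

// Rotate a 64-bit value left by sh bits (sh is taken modulo 64).
constexpr uint64_t rotl64(uint64_t x, unsigned sh) {
  sh &= 63;
  return sh == 0 ? x : (x << sh) | (x >> (64 - sh));
}

// Model of rldicl: rotate left by sh, then clear everything to the left of
// IBM-numbered bit mb. With IBM numbering, bit 0 is the MSB and bit 63 the
// LSB, so the surviving mask covers the low (64 - mb) bits.
constexpr uint64_t rldicl(uint64_t rs, unsigned sh, unsigned mb) {
  const uint64_t mask =
      (mb == 0) ? ~uint64_t{0} : ((uint64_t{1} << (64 - mb)) - 1);
  return rotl64(rs, sh) & mask;
}

// The "readable to non-experts" formulation: shift the bit down and mask it.
constexpr uint64_t test_bit(uint64_t flags, unsigned bit) {
  return (flags >> bit) & 1;
}

// Rotating left by 64-n is rotating right by n; MB=63 keeps only the LSB.
static_assert(rldicl(0x0020, 64 - kSynchronizedBit, 63) == 1, "flag set");
static_assert(rldicl(0x0001, 64 - kSynchronizedBit, 63) == 0, "flag clear");

// Exhaustively compare both formulations over all zero-extended u2 values.
inline void check_all_u2_values() {
  for (uint32_t v = 0; v <= 0xFFFF; ++v) {
    assert(rldicl(v, 64 - kSynchronizedBit, 63) ==
           test_bit(v, kSynchronizedBit));
  }
}
```

If this model is right, narrowing the load from an int to an unsigned short should not change the outcome: bit 5 sits inside the low 16 bits, and the upper bits are zero after a zero-extending load.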
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1893099373 From dholmes at openjdk.org Thu Dec 19 21:11:36 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 19 Dec 2024 21:11:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <0AxZxBjocVYDGFLq6nYhjXOd_TwAJ4qseWsBlYurNnE=.25da67e6-9a51-4229-9118-540d3f775601@github.com> On Thu, 19 Dec 2024 12:47:19 GMT, Coleen Phillimore wrote: > I could change the name to as_unsigned_short(). Would that be less confusing? How about `as_u2()` as that is what it is? (less typing :) ). ------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2555777513 From coleenp at openjdk.org Thu Dec 19 22:22:13 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 22:22:13 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v4] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <7xCemQV3MFSKQh81__K-VbLDj8lZYPwfy6a-s0_MAz0=.a01e2ef0-b2ef-4bdb-be2c-2671707a8f90@github.com> > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. 
Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Use ldrh rather than load_unsigned_short for aarch64, use testbitdi for ppc. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/522ade8c..828e4835 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=02-03 Stats: 12 lines in 4 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Dec 19 22:22:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 19 Dec 2024 22:22:14 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v3] In-Reply-To: <3L1E24-MgN60lpElt0drkUi3nBz3n078fcx9iwab-U8=.0054a22f-90cc-4e07-ac2b-f52558b9775c@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <3L1E24-MgN60lpElt0drkUi3nBz3n078fcx9iwab-U8=.0054a22f-90cc-4e07-ac2b-f52558b9775c@github.com> Message-ID: On Thu, 19 Dec 2024 20:15:14 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
>> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Fixed bug. > - Fix merge and compilation errors. > - Merge branch 'master' into access-flags > - Renamed AccessFlags.as_int() to as_unsigned_short(), moved masks for method, field and class modifiers into AccessFlags. Change ciFlags to use AccessFlags. > - Update src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Update src/hotspot/share/opto/library_call.cpp > > Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com> > - Remove JVM_ACC_WRITTEN_FLAGS because they are all written in the classfile now. > - 8339113: AccessFlags can be u2 in metadata I didn't really like that name since it's the name of the type rather than more of a description of the type returned. I was able to reduce the number of these by adding some helper functions in AccessFlags. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2555868918 From dholmes at openjdk.org Fri Dec 20 05:05:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 20 Dec 2024 05:05:37 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v4] In-Reply-To: <7xCemQV3MFSKQh81__K-VbLDj8lZYPwfy6a-s0_MAz0=.a01e2ef0-b2ef-4bdb-be2c-2671707a8f90@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <7xCemQV3MFSKQh81__K-VbLDj8lZYPwfy6a-s0_MAz0=.a01e2ef0-b2ef-4bdb-be2c-2671707a8f90@github.com> Message-ID: On Thu, 19 Dec 2024 22:22:13 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Use ldrh rather than load_unsigned_short for aarch64, use testbitdi for ppc. Things become somewhat clearer once one realizes/recalls that `AccessFlags` do not specifically pertain to access, but to all class/method modifiers as per the various JVMS ACC_XXX constants. We need to restore the leading comment in accessFlags.hpp to read // AccessFlags is an abstraction over Java ACC flags. as it originally did. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2556293078 From coleenp at openjdk.org Fri Dec 20 13:17:17 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Dec 2024 13:17:17 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Restore ACC in comment. 
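The size saving quoted from pahole comes from narrowing two fields and moving them into alignment gaps. The following is a toy illustration of that effect, not the real `Klass` layout: the structs `WideFlags` and `NarrowFlags` and all their members are hypothetical stand-ins, and the exact byte counts depend on the platform ABI, so only relative sizes are checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical "before" layout: 32-bit flag fields interleaved with pointers.
// On a typical LP64 ABI each int32_t is followed by 4 bytes of padding so the
// next pointer stays 8-byte aligned.
struct WideFlags {
  void*    a;
  int32_t  access_flags;
  void*    b;
  int32_t  modifier_flags;
  uint8_t  kind;
};

// Hypothetical "after" layout: both flag fields narrowed to 16 bits and
// grouped with the other small member, so they share one alignment slot.
struct NarrowFlags {
  void*    a;
  void*    b;
  uint16_t access_flags;
  uint16_t modifier_flags;
  uint8_t  kind;
};

// Bytes lost to padding, given the sum of the member sizes (what pahole
// reports as "sum holes" plus any tail padding).
template <typename T>
constexpr std::size_t padding_bytes(std::size_t sum_of_members) {
  return sizeof(T) - sum_of_members;
}

constexpr std::size_t kWideMembers =
    2 * sizeof(void*) + 2 * sizeof(int32_t) + sizeof(uint8_t);
constexpr std::size_t kNarrowMembers =
    2 * sizeof(void*) + 2 * sizeof(uint16_t) + sizeof(uint8_t);

static_assert(sizeof(NarrowFlags) < sizeof(WideFlags),
              "narrowed and regrouped fields should shrink the struct");
```

On a common x86-64 Linux toolchain one would expect 32 bytes before and 24 after for these toy structs, but that figure is an expectation about the ABI rather than something the code asserts.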
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/828e4835..4faf19ba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Fri Dec 20 13:30:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 20 Dec 2024 13:30:38 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Thu, 19 Dec 2024 01:40:41 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore ACC in comment. > > src/hotspot/share/jfr/leakprofiler/chains/edgeUtils.cpp line 75: > >> 73: while (!jfs.done()) { >> 74: if (offset == jfs.offset()) { >> 75: *modifiers = jfs.access_flags().as_int(); > > This looks wrong - we want a short and you extracted as an int when it was already a short. ?? I changed as_int() to as_unsigned_short() which hopefully is less confusing, so resolving these conversations/questions. > src/hotspot/share/oops/method.cpp line 1655: > >> 1653: return; >> 1654: } >> 1655: jshort flags = access_flags().as_int(); > > Again why the short -> int -> short? And why isn't this unsigned? The call below takes jshort, so added a checked_cast<> The top sign bit won't be set because we filter that out (it was ACC_MODULE). 
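The reasoning about the sign bit can be sketched as follows. This is not HotSpot's `checked_cast` and makes no claim about how that template is implemented; `fits_in_jshort`, `to_jshort_checked` and `strip_module_bit` are hypothetical helpers, and the `u2` and `jshort` aliases are assumed to correspond to 16-bit unsigned and signed integers. The only fact taken from the JVMS is that ACC_MODULE is 0x8000, which is the bit that would become the sign bit of a 16-bit signed value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using u2 = uint16_t;      // assumption: stands in for HotSpot's u2
using jshort = int16_t;   // assumption: stands in for JNI's jshort

// ACC_MODULE per the JVMS; the only class flag occupying bit 15.
constexpr u2 kAccModule = 0x8000;

// True when the unsigned flags can be represented as a non-negative jshort.
constexpr bool fits_in_jshort(u2 flags) {
  return flags <= static_cast<u2>(std::numeric_limits<jshort>::max());
}

// Hypothetical range-checked narrowing: asserts in debug builds if bit 15
// is set, then converts.
inline jshort to_jshort_checked(u2 flags) {
  assert(fits_in_jshort(flags) && "sign bit must be clear");
  return static_cast<jshort>(flags);
}

// Filters out the module bit, mirroring the "we filter that out" remark.
constexpr u2 strip_module_bit(u2 flags) {
  return static_cast<u2>(flags & ~kAccModule);
}

static_assert(fits_in_jshort(0x7FFF), "largest non-negative value fits");
static_assert(!fits_in_jshort(kAccModule), "bit 15 would flip the sign");
```

With the module bit filtered out beforehand, every remaining flag combination is at most 0x7FFF, so a conversion to a signed 16-bit value cannot produce a negative number.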
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1893940015 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1893941294 From fgao at openjdk.org Fri Dec 20 15:44:40 2024 From: fgao at openjdk.org (Fei Gao) Date: Fri, 20 Dec 2024 15:44:40 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: <9uGYNmVdvCXvyYSOAfwmvD70nWkimOFIlQJolQWa_Z4=.c6ffbfa0-5eb1-40a4-83a4-b657f57c9836@github.com> On Tue, 17 Dec 2024 18:12:24 GMT, Galder Zamarreño wrote: >> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. >> >> Currently vectorization does not kick in for loops containing either of these calls because of the following error: >> >> >> VLoop::check_preconditions: failed: control flow in loop not allowed >> >> >> The control flow is due to the Java implementation of these methods, e.g. >> >> >> public static long max(long a, long b) { >> return (a >= b) ? a : b; >> } >> >> >> This patch intrinsifies the calls to replace the CmpL + Bool nodes with MaxL/MinL nodes respectively. >> By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. >> E.g. >> >> >> SuperWord::transform_loop: >> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined >> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) >> >> >> Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after.
Before the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1155 >> long max 1173 >> >> >> After the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1042 >> long max 1042 >> >> >> This patch does not add any platform-specific backend implementations for the MaxL/MinL nodes. >> Therefore, it still relies on the macro expansion to transform those into CMoveL. >> >> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PA... > > Galder Zamarreño has updated the pull request incrementally with five additional commits since the last revision: > > - Added comment around the assertions > - Adjust min/max identity IR test expectations after changes > - Fix style > - Add max reduction test > - Add empty line test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 135: > 133: @IR(applyIf = {"SuperWordReductions", "true"}, > 134: applyIfCPUFeatureOr = { "avx512", "true" }, > 135: counts = {IRNode.MIN_REDUCTION_V, " > 0"}) > @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. Hi @galderz , may I ask if these long-reduction cases can't work even with `sve`?
It might be related to the limitation [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). Some `sve` machines have only 128 bits. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1894089883 From aph at openjdk.org Fri Dec 20 16:11:38 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 20 Dec 2024 16:11:38 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: <9uGYNmVdvCXvyYSOAfwmvD70nWkimOFIlQJolQWa_Z4=.c6ffbfa0-5eb1-40a4-83a4-b657f57c9836@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <9uGYNmVdvCXvyYSOAfwmvD70nWkimOFIlQJolQWa_Z4=.c6ffbfa0-5eb1-40a4-83a4-b657f57c9836@github.com> Message-ID: On Fri, 20 Dec 2024 15:42:14 GMT, Fei Gao wrote: >> Galder Zamarreño has updated the pull request incrementally with five additional commits since the last revision: >> >> - Added comment around the assertions >> - Adjust min/max identity IR test expectations after changes >> - Fix style >> - Add max reduction test >> - Add empty line > > test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 135: > >> 133: @IR(applyIf = {"SuperWordReductions", "true"}, >> 134: applyIfCPUFeatureOr = { "avx512", "true" }, >> 135: counts = {IRNode.MIN_REDUCTION_V, " > 0"}) > >> @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. > > Hi @galderz , may I ask if these long-reduction cases can't work even with `sve`? It might be related to the limitation [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). Some `sve` machines have only 128 bits. That's right.
Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits. That comment is "interesting". Maybe it should be tunable by the back end. Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might still be a win. Galder, how about you disable that line and give it another try? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1894118531 From dlong at openjdk.org Fri Dec 20 21:37:38 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 20 Dec 2024 21:37:38 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Fri, 20 Dec 2024 13:17:17 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore ACC in comment. src/hotspot/share/oops/method.cpp line 1655: > 1653: return; > 1654: } > 1655: jshort flags = checked_cast