From coleenp at openjdk.org Thu Jan 2 12:37:39 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 2 Jan 2025 12:37:39 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID:

On Wed, 25 Dec 2024 16:34:27 GMT, Martin Doerr wrote:

>> It wasn't the logic. When I went through, I didn't know whether this instruction needed fixing because we loaded an unsigned short instead of an int. So I left myself a note to look at it again, which you noticed and I missed in my final walk-through. It seems right, but maybe someone with ppc knowledge can answer this.
>>
>>
>> rldicl_(R0, Raccess_flags, 64-JVM_ACC_SYNCHRONIZED_BIT, 63); // Extract bit and compare to 0.
>
> The instruction still looks correct. We are checking the same bit of the 64 bit register as before. (Using `testbitdi` would also work.)

I changed this to testbitdi.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1900836026

From mdoerr at openjdk.org Thu Jan 2 12:55:37 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Thu, 2 Jan 2025 12:55:37 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID:

On Thu, 2 Jan 2025 12:34:59 GMT, Coleen Phillimore wrote:

>> The instruction still looks correct. We are checking the same bit of the 64 bit register as before. (Using `testbitdi` would also work.)
>
> I changed this to testbitdi.

Ok.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1900849068 From coleenp at openjdk.org Thu Jan 2 12:59:37 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 12:59:37 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Fri, 20 Dec 2024 21:34:58 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore ACC in comment. > > src/hotspot/share/oops/method.cpp line 1655: > >> 1653: return; >> 1654: } >> 1655: jshort flags = checked_cast(access_flags().as_unsigned_short()); > > Can we cleanup the intrinsics code next, to stop using jshort for flags? I was going to file a new issue but this is really easy to fix, so I added this fix to this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1900852859 From coleenp at openjdk.org Thu Jan 2 13:04:18 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:04:18 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v6] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
> > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Make vmIntrinsics::find_id() take u2 not jshort. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/4faf19ba..df95d447 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=04-05 Stats: 13 lines in 3 files changed: 1 ins; 1 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Jan 2 13:08:12 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:08:12 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v7] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
> > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix lhz in ppc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/df95d447..5a84b139 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Jan 2 13:08:12 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:08:12 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Wed, 25 Dec 2024 16:40:19 GMT, Martin Doerr wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore ACC in comment. > > src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp line 149: > >> 147: // _access_flags must be a 16 bit value. >> 148: assert(sizeof(AccessFlags) == 2, "wrong size"); >> 149: __ lha(R11_scratch1/*access_flags*/, method_(access_flags)); > > Using `lhz` would be more consistent. 
`lha` uses sign extend instead of zero extend. Feel free to clean this up if you want. Both instructions should work. I fixed this - it was the only one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1900858739 From coleenp at openjdk.org Thu Jan 2 13:14:14 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:14:14 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v8] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <67hlSAvg5aEfWNhl2tj2Fe5FKNNXsJ8TsOCUe10KRQg=.54a14494-af1f-4bd0-a7dc-b5ece84ecd97@github.com> > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Import @offamitkumar patch for s390. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/5a84b139..8cafd452 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=06-07 Stats: 32 lines in 5 files changed: 15 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Thu Jan 2 13:14:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:14:15 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v7] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Thu, 2 Jan 2025 13:08:12 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix lhz in ppc Thank you for the patch Amit. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2567753064 From coleenp at openjdk.org Thu Jan 2 13:20:20 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Jan 2025 13:20:20 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v9] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. 
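The alignment-gap saving described above can be illustrated with a minimal sketch. The two structs below are hypothetical layouts invented for this example, not the real `Klass`; they only show why narrowing flag fields to 16 bits and grouping them lets them share space that was previously padding.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical "before" layout: 4-byte flag fields sit between pointers,
// so padding is needed to realign each following pointer on 64-bit targets.
struct LayoutBefore {
  void*        vtable_like;     // pointer-sized
  std::int32_t access_flags;    // followed by a hole on 64-bit targets
  void*        name;
  std::uint8_t kind;            // followed by a hole before the next pointer
  void*        super;
  std::int32_t modifier_flags;  // followed by tail padding on 64-bit targets
};

// Hypothetical "after" layout: both flag fields are 16 bits wide and are
// grouped with the other small field, so they share one pointer-sized slot.
struct LayoutAfter {
  void*         vtable_like;
  std::uint16_t access_flags;
  std::uint16_t modifier_flags;
  std::uint8_t  kind;
  void*         name;
  void*         super;
};

// The exact sizes depend on the ABI; only the direction of the change is asserted.
static_assert(sizeof(LayoutAfter) < sizeof(LayoutBefore),
              "grouping the narrow fields should shrink the struct");
```

Running `pahole` on an object file containing these two structs should show the same kind of before/after hole report as the one quoted in the PR description, with different numbers.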
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Happy New Year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/8cafd452..c60e96b4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=07-08 Stats: 39 lines in 39 files changed: 0 ins; 0 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From mdoerr at openjdk.org Thu Jan 2 13:55:36 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 2 Jan 2025 13:55:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v5] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <9uPHxBroRJc6lcPjNOMALruNbfomUw-zG3oxYJg_N10=.9b3e439b-d3ff-4092-a205-699e5f9921b9@github.com> On Thu, 2 Jan 2025 13:05:10 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp line 149: >> >>> 147: // _access_flags must be a 16 bit value. >>> 148: assert(sizeof(AccessFlags) == 2, "wrong size"); >>> 149: __ lha(R11_scratch1/*access_flags*/, method_(access_flags)); >> >> Using `lhz` would be more consistent. `lha` uses sign extend instead of zero extend. Feel free to clean this up if you want. Both instructions should work. > > I fixed this - it was the only one. Thanks! 
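For readers less familiar with the two PPC load instructions, the point that both `lha` and `lhz` work here can be sketched in portable C++. This is only a model of the register contents, not HotSpot code: the helper names are made up for this example, and the bit position assumes `JVM_ACC_SYNCHRONIZED` is 0x0020 as in the class file format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit index of JVM_ACC_SYNCHRONIZED (0x0020); assumed from the class file format.
constexpr unsigned kSynchronizedBit = 5;

// Models a sign-extending 16-bit load (what lha does) into a 64-bit register.
inline std::uint64_t widen_like_lha(std::uint16_t raw) {
  std::int16_t as_signed;
  std::memcpy(&as_signed, &raw, sizeof as_signed);  // reinterpret the same 16 bits
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_signed));
}

// Models a zero-extending 16-bit load (what lhz does) into a 64-bit register.
inline std::uint64_t widen_like_lhz(std::uint16_t raw) {
  return raw;
}

// Tests a single bit of the 64-bit register image.
inline bool test_bit(std::uint64_t reg, unsigned bit) {
  return ((reg >> bit) & 1u) != 0;
}
```

With a hypothetical flags value of 0x8020, the two helpers produce different upper bits (all ones versus all zeros above bit 15), but `test_bit(..., kSynchronizedBit)` gives the same answer for both, which is why either load is functionally fine and `lhz` is simply the more natural match for an unsigned `u2` field.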
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1900899700 From mdoerr at openjdk.org Thu Jan 2 22:21:51 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 2 Jan 2025 22:21:51 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> On Tue, 31 Dec 2024 06:42:43 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. 
>> (This is also part of why gcc LTO has problems with the
>> old mechanism, but not the new.)
>>
>> Adding these redeclarations or forward declarations isn't as simple as
>> expected, due to differences between the various compilers. We hide the
>> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See
>> the compiler-specific parts of those macros for details.
>>
>> In some situations we need to allow references to these poisoned functions.
>>
>> One common case is where our poisoning is visible to some 3rd party code we
>> don't want to modify. This is typically 3rd party headers included in HotSpot
>> code, such as from Google Test or the C++ Standard Library. For these the
>> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demark the context
>> where such references are permitted.
>>
>> Some of the poisoned functions are needed to implement associated HotSpot os::
>> functions, or in other similarly restricted contexts. For these, a wrapper
>> function is provided that calls the poison...
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
> forbid windows _snprintf

Unfortunately, this doesn't compile on AIX:

globalDefinitions_gcc.hpp:42:
In file included from /opt/IBM/openxlC/17.1.1/bin/../include/c++/v1/stdlib.h:93:
/usr/include/stdlib.h:304:18: error: 'noreturn' attribute does not appear on the first declaration
extern _NOTHROW(_NORETURN(void, exit), (int));
                ^
/usr/include/comp_macros.h:30:35: note: expanded from macro '_NORETURN'
#define _NORETURN(_T, _F) _T _F [[noreturn]]
                                  ^
...
(rest of output omitted)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2568451299

From sspitsyn at openjdk.org Thu Jan 2 23:36:41 2025
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Thu, 2 Jan 2025 23:36:41 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v9]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID:

On Thu, 2 Jan 2025 13:20:20 GMT, Coleen Phillimore wrote:

>> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better.
>>
>> before:
>>
>> /* size: 216, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 194, holes: 3, sum holes: 18 */
>>
>>
>> after:
>>
>> /* size: 200, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 188, holes: 4, sum holes: 12 */
>>
>>
>> We may eventually move the modifiers to java.lang.Class but that's WIP.
>>
>> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Happy New Year

src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1151:

> 1149: // methods match, be sure modifiers do too
> 1150: old_flags = k_old_method->access_flags().as_unsigned_short();
> 1151: new_flags = k_new_method->access_flags().as_unsigned_short();

Nit: I'd suggest using `as_method_flags()` and `as_class_flags()` at lines 1008-1009 to make it consistent with lines 1043-1044. A good example is `jvmtiClassFileReconstituter.cpp`.
Also, it would make sense to extend this rule to some other files, e.g.: `methodHandles.cpp`, `jvmtiEnv.cpp`, `jvm.cpp`, `instanceKlass.cpp`, `fieldInfo.inline.hpp`, `fieldInfo.cpp`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1901371957

From kbarrett at openjdk.org Thu Jan 2 23:37:39 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Thu, 2 Jan 2025 23:37:39 GMT
Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2]
In-Reply-To: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com>
References: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com>
Message-ID: <8M7tvmDz6F63ehnC7efuV98IwRwOYUEq5K2wXPiepCI=.a187f96b-6a2a-4044-bd18-4152584dcf9a@github.com>

On Thu, 2 Jan 2025 22:19:00 GMT, Martin Doerr wrote:

> Unfortunately, this doesn't compile on AIX: [...]

Ugh! The clang workaround for noreturn handling is going to need to be more extensive.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2568520652

From amitkumar at openjdk.org Fri Jan 3 07:49:43 2025
From: amitkumar at openjdk.org (Amit Kumar)
Date: Fri, 3 Jan 2025 07:49:43 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v9]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID:

On Thu, 2 Jan 2025 13:20:20 GMT, Coleen Phillimore wrote:

>> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better.
>> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Happy New Year Thanks for adding the patch Coleen :-) I did another test-run and s390x looks fine now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2568808253 From epeter at openjdk.org Fri Jan 3 08:51:37 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 3 Jan 2025 08:51:37 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <9uGYNmVdvCXvyYSOAfwmvD70nWkimOFIlQJolQWa_Z4=.c6ffbfa0-5eb1-40a4-83a4-b657f57c9836@github.com> Message-ID: On Fri, 20 Dec 2024 16:08:51 GMT, Andrew Haley wrote: >> test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 135: >> >>> 133: @IR(applyIf = {"SuperWordReductions", "true"}, >>> 134: applyIfCPUFeatureOr = { "avx512", "true" }, >>> 135: counts = {IRNode.MIN_REDUCTION_V, " > 0"}) >> >>> @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. >> >> Hi @galderz , may I ask if these long-reduction cases can't work even with `sve`? 
It might be related with the limitation [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). Some `sve` machines have only 128 bits. > > That's right. Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits. > That comment is "interesting". Maybe it should be tunable by the back end. Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might still be a win. > > Galder, how about you disable that line and give it another try? FYI: I'm working on removing the line [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). The issue is that on some platforms 2-element vectors are somehow really slower, and we need a cost-model to give us a better heuristic, rather than the hard "no". See my draft https://github.com/openjdk/jdk/pull/20964. But yes: why don't you remove the line, and see if that makes it work. If so, then don't worry about this case for now, and maybe leave a comment in the test. We can then fix that later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1901576209 From epeter at openjdk.org Fri Jan 3 09:01:57 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 3 Jan 2025 09:01:57 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v7] In-Reply-To: References: Message-ID: On Mon, 23 Dec 2024 06:54:34 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. 
Float16 SQRT and FMA operations are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class.
>> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values.
>> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines.
>> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577) for more details.
>> 7. Since Float16 uses short as its storage type, raw FP16 values are always loaded into general purpose registers, but the FP16 ISA generally operates over floating point registers, so the compiler injects reinterpretation IR before and after Float16 operation nodes to move the short value to a floating point register and vice versa.
>> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF
>> 9. X86 backend implementation for all supported intrinsics.
>> 10. Functional and Performance validation tests.
>>
>> Kindly review the patch and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
> Review comments resolutions

src/hotspot/share/opto/type.cpp line 1465:

> 1463: //------------------------------meet-------------------------------------------
> 1464: // Compute the MEET of two types. It returns a new Type object.
> 1465: const Type *TypeH::xmeet( const Type *t ) const {

Suggestion:

const Type* TypeH::xmeet( const Type* t ) const {

Please check all other occurrences.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1901578736

From jsjolen at openjdk.org Fri Jan 3 13:27:50 2025
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 3 Jan 2025 13:27:50 GMT
Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2]
In-Reply-To:
References:
Message-ID: <5JjsCtkAmOXUdGJjqvcgsek8ft_XlaIw2iLM_mgqFj8=.16d96a5f-a005-4340-81a9-653b52572a8a@github.com>

On Tue, 31 Dec 2024 06:42:43 GMT, Kim Barrett wrote:

>> Please review this change to how HotSpot prevents the use of certain C library
>> functions (e.g. poisons references to those functions), while permitting a
>> subset to be used in restricted circumstances. Reasons for poisoning a
>> function include it being considered obsolete, or a security concern, or there
>> is a HotSpot function (typically in the os:: namespace) providing similar
>> functionality that should be used instead.
>>
>> The old mechanism, based on -Wattribute-warning and the associated attribute,
>> only worked for gcc. (Clang's implementation differs in an important way from
>> gcc, which is the subject of a clang bug that has been open for years. MSVC
>> doesn't provide a similar mechanism.) It also had problems with LTO, due to a
>> gcc bug.
>>
>> The new mechanism is based on deprecation warnings, using [[deprecated]]
>> attributes. We redeclare or forward declare the functions we want to prevent
>> use of as being deprecated. This relies on deprecation warnings being
>> enabled, which they already are in our build configuration. All of our
>> supported compilers support the [[deprecated]] attribute.
>>
>> Another benefit of using deprecation warnings rather than warning attributes
>> is the time when the check is performed. Warning attributes are checked only
>> if the function is referenced after all optimizations have been performed.
>> Deprecation is checked during initial semantic analysis.
>> That's better for our purposes here. (This is also part of why gcc LTO has problems with the
>> old mechanism, but not the new.)
>>
>> Adding these redeclarations or forward declarations isn't as simple as
>> expected, due to differences between the various compilers. We hide the
>> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See
>> the compiler-specific parts of those macros for details.
>>
>> In some situations we need to allow references to these poisoned functions.
>>
>> One common case is where our poisoning is visible to some 3rd party code we
>> don't want to modify. This is typically 3rd party headers included in HotSpot
>> code, such as from Google Test or the C++ Standard Library. For these the
>> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demark the context
>> where such references are permitted.
>>
>> Some of the poisoned functions are needed to implement associated HotSpot os::
>> functions, or in other similarly restricted contexts. For these, a wrapper
>> function is provided that calls the poison...
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
> forbid windows _snprintf

Hi,

I really like the syntactic change of this feature, and it's very nice that we get to have working auto-complete (makes it easier to find out that a specific function isn't forbidden). The syntactic change isn't shown in the PR description; wouldn't that be useful to add?

I have a bikeshedding request: Could we skip the S at the end and have it be `permit_forbidden_function::free(my_ptr);`? We only allow one forbidden function in that call, so this reads neater.
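For readers who have not opened the PR, the general shape of the mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual macros: `legacy_dup` is a hypothetical stand-in for a forbidden C library function, the namespace name follows the spelling suggested in this message, and the warning-suppression pragmas are one plausible way to write the wrapper rather than what `FORBID_C_FUNCTION` necessarily expands to.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a C library function that should not be called directly.
char* legacy_dup(const char* s) {
  char* copy = static_cast<char*>(std::malloc(std::strlen(s) + 1));
  if (copy != nullptr) {
    std::strcpy(copy, s);
  }
  return copy;
}

// "Forbid" it: redeclaring the function with [[deprecated]] makes every later
// direct reference produce a deprecation warning during semantic analysis.
[[deprecated("use the os-level replacement instead")]]
char* legacy_dup(const char* s);

// "Permit" it in a restricted context: a wrapper with the same name in a
// dedicated namespace, with the deprecation warning suppressed only here.
namespace permit_forbidden_function {
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif
inline char* legacy_dup(const char* s) {
  return ::legacy_dup(s);  // the only place allowed to reference the forbidden name
}
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif
}  // namespace permit_forbidden_function
```

A call such as `permit_forbidden_function::legacy_dup("abc")` then compiles without a warning, while a bare `legacy_dup("abc")` is flagged. Because the wrappers are ordinary functions in a namespace, an editor can list them via auto-complete, which is the property mentioned above.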
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2569218247 From coleenp at openjdk.org Fri Jan 3 14:38:22 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 3 Jan 2025 14:38:22 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros Message-ID: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. ------------- Commit messages: - 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros Changes: https://git.openjdk.org/jdk/pull/22916/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8346990 Stats: 339 lines in 83 files changed: 0 ins; 13 del; 326 mod Patch: https://git.openjdk.org/jdk/pull/22916.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916 PR: https://git.openjdk.org/jdk/pull/22916 From jwaters at openjdk.org Fri Jan 3 15:43:41 2025 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 3 Jan 2025 15:43:41 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Fri, 3 Jan 2025 14:32:39 GMT, Coleen Phillimore wrote: > There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. 
It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. Speaking of %z, there is a non Standard %Ix in os_windows.cpp tty->print_cr("reserve_memory of %Ix bytes took " JLONG_FORMAT " ms (" JLONG_FORMAT " ticks)", bytes, reserveTimer.milliseconds(), reserveTimer.ticks()); Could changing that to %zu be trivial enough to fit into this change? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2569435948 From jwaters at openjdk.org Fri Jan 3 15:45:43 2025 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 3 Jan 2025 15:45:43 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: <-Yq74ZD6bE9umMSXPGo0Ko9ZkKAQC5K5dpnYMjTxXvQ=.5e657584-6734-4cca-956e-e2a99449861a@github.com> On Tue, 31 Dec 2024 06:42:43 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. 
We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > forbid windows _snprintf Hi Johan, which syntactic change are you referring to? 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2569437739

From coleenp at openjdk.org Fri Jan 3 16:23:31 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 3 Jan 2025 16:23:31 GMT
Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2]
In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com>
References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com>
Message-ID:

> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups.
>
> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Fix %Ix to %zx.
-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/22916/files
  - new: https://git.openjdk.org/jdk/pull/22916/files/6d6fbfa7..1748797a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/22916.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916

PR: https://git.openjdk.org/jdk/pull/22916

From coleenp at openjdk.org Fri Jan 3 16:23:32 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 3 Jan 2025 16:23:32 GMT
Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros
In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com>
References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com>
Message-ID:

On Fri, 3 Jan 2025 14:32:39 GMT, Coleen Phillimore wrote:

> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups.
>
> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.

I was going to take on the other FORMAT ones in separate PRs.
Sorry, I see what you're saying. Yes, I'll fix that too.
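As a small illustration of what the replacement amounts to, the sketch below formats size_t-width values with the standard `z` length modifier instead of a width-specific macro. It is not HotSpot code: `format_one` and `signed_size` are hypothetical names, and plain `size_t` stands in for HotSpot's `intx`/`uintx` on the assumption that they have the same width on the platforms concerned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <type_traits>

// The signed counterpart of size_t, which is what "%zd" expects.
using signed_size = std::make_signed_t<std::size_t>;

// Format a single value with a printf-style format string.
// (Illustrative helper, not a HotSpot function.)
template <typename T>
std::string format_one(const char* fmt, T value) {
  char buf[64];
  int n = std::snprintf(buf, sizeof(buf), fmt, value);
  assert(n >= 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

With this helper, `format_one("%zd", static_cast<signed_size>(-123))`, `format_one("%zu", static_cast<std::size_t>(123))` and `format_one("0x%zx", static_cast<std::size_t>(0x123))` cover the signed, unsigned and hex cases. The `%Ix` spelling mentioned earlier in the thread uses a Microsoft-specific length prefix, which is why it was folded into the same cleanup as `%zx`.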
-------------

PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2569484010

From coleenp at openjdk.org Fri Jan 3 19:22:37 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 3 Jan 2025 19:22:37 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v9]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID: <5-GUDe1mvBa86obIcZdPrzFkXtrXU7Epn9ol6pxfHbE=.a90acfed-bece-4d9d-a56b-6981a52cbbb0@github.com>

On Thu, 2 Jan 2025 23:33:31 GMT, Serguei Spitsyn wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Happy New Year
>
> src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1151:
>
>> 1149: // methods match, be sure modifiers do too
>> 1150: old_flags = k_old_method->access_flags().as_unsigned_short();
>> 1151: new_flags = k_new_method->access_flags().as_unsigned_short();
>
> Nit: I'd suggest using `as_method_flags()` and `as_class_flags()` at lines 1008-1009 to make it consistent with the lines 1043-1044. A good example is `jvmtiClassFileReconstituter.cpp`. Also, it would make sense to extend this rule to some other files, e.g.: `method.cpp`, `methodHandles.cpp`, `jvmtiEnv.cpp`, `jvm.cpp`, `instanceClass.cpp`, `fieldInfo.inline.hpp`, `fieldInfo.cpp`

This is a good suggestion. I strengthened the as_{field|method|class}_flags functions because they should be stored with only their recognized modifiers in the appropriate place. Retesting.
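One possible reading of "strengthened", sketched below, is that a typed accessor checks that only modifiers recognized for that kind of entity are set before handing back the raw `u2`. This is a guess at the idea, not the PR's implementation: `AccessFlagsSketch`, `has_only_method_modifiers` and `RECOGNIZED_METHOD_MODIFIERS` are hypothetical names, and only the flag values are taken from the method `access_flags` table in the JVM specification.

```cpp
#include <cassert>
#include <cstdint>

namespace demo {

// Method access flags as listed in the JVM specification.
constexpr std::uint16_t ACC_PUBLIC       = 0x0001;
constexpr std::uint16_t ACC_PRIVATE      = 0x0002;
constexpr std::uint16_t ACC_PROTECTED    = 0x0004;
constexpr std::uint16_t ACC_STATIC       = 0x0008;
constexpr std::uint16_t ACC_FINAL        = 0x0010;
constexpr std::uint16_t ACC_SYNCHRONIZED = 0x0020;
constexpr std::uint16_t ACC_BRIDGE       = 0x0040;
constexpr std::uint16_t ACC_VARARGS      = 0x0080;
constexpr std::uint16_t ACC_NATIVE       = 0x0100;
constexpr std::uint16_t ACC_ABSTRACT     = 0x0400;
constexpr std::uint16_t ACC_STRICT       = 0x0800;
constexpr std::uint16_t ACC_SYNTHETIC    = 0x1000;

// Union of the flags above (hypothetical name for this sketch).
constexpr std::uint16_t RECOGNIZED_METHOD_MODIFIERS = static_cast<std::uint16_t>(
    ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED | ACC_STATIC | ACC_FINAL |
    ACC_SYNCHRONIZED | ACC_BRIDGE | ACC_VARARGS | ACC_NATIVE | ACC_ABSTRACT |
    ACC_STRICT | ACC_SYNTHETIC);

// A two-byte flags holder, mirroring the "u2" in the PR title.
class AccessFlagsSketch {
  std::uint16_t _flags;

 public:
  explicit constexpr AccessFlagsSketch(std::uint16_t flags) : _flags(flags) {}

  // Untyped view of the stored bits.
  std::uint16_t as_unsigned_short() const { return _flags; }

  // True if no bit outside the recognized method modifiers is set.
  bool has_only_method_modifiers() const {
    return (_flags & ~RECOGNIZED_METHOD_MODIFIERS) == 0;
  }

  // Typed view: asserts the stored bits are valid method modifiers.
  std::uint16_t as_method_flags() const {
    assert(has_only_method_modifiers());
    return _flags;
  }
};

static_assert(sizeof(AccessFlagsSketch) == 2, "flags fit in a u2");

}  // namespace demo
```

The point of a typed accessor over `as_unsigned_short()` is that a stray internal bit shows up as an assertion failure at the call site instead of leaking into a comparison such as the `old_flags`/`new_flags` check quoted above.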
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1902091752 From jbhateja at openjdk.org Fri Jan 3 20:36:21 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 3 Jan 2025 20:36:21 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v8] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision: - Updating copyright year of modified files. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 - Review suggestions incorporated. - Review comments resolutions - Addressing review comments - Fixing obfuscation due to intrinsic entries - Adding more test points - Adding missed check in container type detection. - C2 compiler support for float16 scalar operations. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/dd444c44..d3cbf2c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=06-07 Stats: 17820 lines in 567 files changed: 13308 ins; 2583 del; 1929 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Fri Jan 3 20:42:15 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 3 Jan 2025 20:42:15 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. 
Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Updating copyright year of modified files. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/d3cbf2c4..175f4ed2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jsjolen at openjdk.org Fri Jan 3 20:47:34 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Jan 2025 20:47:34 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <-Yq74ZD6bE9umMSXPGo0Ko9ZkKAQC5K5dpnYMjTxXvQ=.5e657584-6734-4cca-956e-e2a99449861a@github.com> References: <-Yq74ZD6bE9umMSXPGo0Ko9ZkKAQC5K5dpnYMjTxXvQ=.5e657584-6734-4cca-956e-e2a99449861a@github.com> Message-ID: On Fri, 3 Jan 2025 15:42:52 GMT, Julian Waters wrote: > Hi Johan, which syntactic change are you referring to? Oh, just that we don't have to repeat the name of the function we want to use. No more macro, just a namespace and a function! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2569793807 From jsjolen at openjdk.org Fri Jan 3 21:31:45 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Jan 2025 21:31:45 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: On Tue, 31 Dec 2024 06:42:43 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. 
Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. 
For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > forbid windows _snprintf I've read through it all, and it seems like good code to me. I like the change in syntax a lot. I've one nit regarding comment style. src/hotspot/share/utilities/compilerWarnings.hpp line 89: > 87: #endif > 88: > 89: ////////////////////////////////////////////////////////////////////////////// This doesn't adhere to the style of the file AFAICS and is unnecessary, can we remove it? ------------- PR Review: https://git.openjdk.org/jdk/pull/22890#pullrequestreview-2529737692 PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1902178578 From coleenp at openjdk.org Fri Jan 3 21:54:26 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 3 Jan 2025 21:54:26 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v10] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
> > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Adopt @sspitsyn suggestion and strengthen field/method/class flag retrieval. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/c60e96b4..8682a778 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=08-09 Stats: 36 lines in 9 files changed: 11 ins; 5 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Fri Jan 3 22:00:34 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 3 Jan 2025 22:00:34 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
> > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix more copyrights. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/8682a778..82bd1a24 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=09-10 Stats: 7 lines in 7 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Fri Jan 3 22:45:40 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 3 Jan 2025 22:45:40 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: On Tue, 31 Dec 2024 06:42:43 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. 
>> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. 
>> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > forbid windows _snprintf This is very nice, now that I've navigated to all the pieces and see how it works. I would rather see #ifdef include the posix file rather than the dispatch header files which would help finding all the bits again. src/hotspot/share/utilities/forbiddenFunctions.hpp line 33: > 31: #include // for size_t > 32: > 33: #include OS_HEADER(forbiddenFunctions) I thought there was an OS_VARIANT_HEADER to dispatch directly to the posix variant, but I guess it doesn't exist. Using the semaphore example, this could save your dispatch header files, at the cost of an #ifdef. Not sure which is worse: #if defined(LINUX) || defined(AIX) # include "semaphore_posix.hpp" #else # include OS_HEADER(semaphore) #endif ------------- PR Review: https://git.openjdk.org/jdk/pull/22890#pullrequestreview-2529794425 PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1902213295 From coleenp at openjdk.org Fri Jan 3 22:45:41 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 3 Jan 2025 22:45:41 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: On Fri, 3 Jan 2025 22:31:18 GMT, Coleen Phillimore wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> forbid windows _snprintf > > src/hotspot/share/utilities/forbiddenFunctions.hpp line 33: > >> 31: #include // for size_t >> 32: >> 33: #include OS_HEADER(forbiddenFunctions) > > I thought there was an OS_VARIANT_HEADER to dispatch directly to the posix variant, but I guess it doesn't exist. 
Using the semaphore example, this could save your dispatch header files, at the cost of an #ifdef. Not sure which is worse:
>
>
> #if defined(LINUX) || defined(AIX)
> # include "semaphore_posix.hpp"
> #else
> # include OS_HEADER(semaphore)
> #endif

park.hpp has the same:

#if defined(LINUX) || defined(AIX) || defined(BSD)
# include "park_posix.hpp"
#else
# include OS_HEADER(park)
#endif

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1902218159

From kbarrett at openjdk.org Sat Jan 4 10:04:46 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Sat, 4 Jan 2025 10:04:46 GMT
Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2]
In-Reply-To:
References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com>
Message-ID:

On Fri, 3 Jan 2025 16:23:31 GMT, Coleen Phillimore wrote:

>> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups.
>>
>> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix %Ix to %zx.

Uses of `[U]INTX_FORMAT_X` have been replaced with `0x%zx`. I mentioned the
possibility of instead using `%#zx`. I don't know if we really want to use
some of the (to me) more obscure flag options though.

src/hotspot/cpu/x86/vm_version_x86.cpp line 1725:

> 1723: ArrayOperationPartialInlineSize = MaxVectorSize >= 16 ?
MaxVectorSize : 0;
> 1724: if (ArrayOperationPartialInlineSize) {
> 1725: warning("Setting ArrayOperationPartialInlineSize as MaxVectorSize%zd)", MaxVectorSize);

pre-existing: seems like there should be a separator of some kind between
"MaxVectorSize" and the value, either a space or an "=" would be okay.

src/hotspot/os/linux/os_linux.cpp line 1370:

> 1368:
> 1369: #define _UFM "%zu"
> 1370: #define _DFM "%zd"

Why not get rid of these?

src/hotspot/share/ci/ciMethodData.cpp line 788:

> 786: // which makes comparing it with the SA version of this output
> 787: // harder. data()'s element type is intptr_t.
> 788: out->print(" 0x%zx", data()[i]);

Could instead use " %#zx".

src/hotspot/share/compiler/disassembler.cpp line 600:

> 598: st->print("Stub::%s", desc->name());
> 599: if (desc->begin() != adr) {
> 600: st->print("%+zd " PTR_FORMAT, adr - desc->begin(), p2i(adr));

Oh, that's an interesting "abuse" of the `_W` variant.

src/hotspot/share/gc/shared/ageTable.cpp line 38:

> 36: #include "logging/logStream.hpp"
> 37:
> 38: /* Copyright (c) 1992, 2025, Oracle and/or its affiliates, and Stanford University.

Well this is weird. An atypical copyright down inside the file?

src/hotspot/share/oops/instanceKlass.cpp line 3695:

> 3693:
> 3694: st->print(BULLET"hash_slot: %d", hash_slot()); st->cr();
> 3695: st->print(BULLET"secondary bitmap: " LP64_ONLY("0x%016zu") NOT_LP64("0x%08zu"), _secondary_supers_bitmap); st->cr();

Should be using "zx" rather than "zu".
I think this could be written as
`"%#0*zx", (2 * BytesPerWord + 2), _secondary_supers_bitmap`
That's looking a lot like line noise though. I think this and ones like it
probably ought not be changed at all.
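To make the two spellings concrete, the sketch below compares a literal `0x` prefix with the `#` flag at full word width. It is only an illustration: the function names are made up, and `sizeof(std::size_t)` stands in for HotSpot's `BytesPerWord`. One detail worth knowing before switching is that standard printf adds the `0x` prefix for `#` only when the value is nonzero, so the two forms differ for a zero bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Literal "0x" prefix followed by zero-padded hex digits.
std::string hex_literal_prefix(std::size_t v) {
  char buf[64];
  const int digits = static_cast<int>(2 * sizeof(std::size_t));
  int n = std::snprintf(buf, sizeof(buf), "0x%0*zx", digits, v);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}

// '#' asks printf to add the prefix itself, so the field width has to
// include the two prefix characters (hence the "+ 2").
std::string hex_alternate_form(std::size_t v) {
  char buf[64];
  const int width = static_cast<int>(2 * sizeof(std::size_t)) + 2;
  int n = std::snprintf(buf, sizeof(buf), "%#0*zx", width, v);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

For a nonzero value such as `0x123` the two helpers return the same string. For `0`, `hex_literal_prefix` still starts with `0x`, while `hex_alternate_form` returns a run of zeros with no prefix, which may matter for output that is later compared or parsed.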
src/hotspot/share/oops/klass.cpp line 1308:

> 1306: if (secondary_supers() != nullptr) {
> 1307: st->print(" - "); st->print("%d elements;", _secondary_supers->length());
> 1308: st->print_cr(" bitmap: " LP64_ONLY("0x%016zu") NOT_LP64("0x%08zu"), _secondary_supers_bitmap);

Same as in instanceKlass - maybe this shouldn't be changed at all.

src/hotspot/share/utilities/globalDefinitions.hpp line 156:

> 154: #define UINTX_FORMAT_X_0 "0x%016" PRIxPTR
> 155: #else
> 156: #define UINTX_FORMAT_X_0 "0x%08" PRIxPTR

As noted in places where it's used, I'm not sure we should remove and replace
UINTX_FORMAT_X_0.

test/hotspot/gtest/utilities/test_globalDefinitions.cpp line 281:

> 279:
> 280: check_format("%zd", (intx)123, "123");
> 281: check_format("0x%zx", (intx)0x123, "0x123");

Could be "%#zx".

test/hotspot/gtest/utilities/test_globalDefinitions.cpp line 286:

> 284:
> 285: check_format("%zu", (uintx)123u, "123");
> 286: check_format("0x%zx", (uintx)0x123u, "0x123");

Could be "%#zx".

-------------

Changes requested by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2530503795 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902879593 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902886743 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902972028 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902912020 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902916165 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902944144 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902945394 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902960940 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902965078 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1902966477 From kbarrett at openjdk.org Sat Jan 4 10:17:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Jan 2025 10:17:35 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v3] In-Reply-To: References: Message-ID: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. 
> > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo... Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: - Merge branch 'master' into new-poison - update copyrights - tidy comment - drop plural in permit_forbidden_functions namespace - new workarounds for platform issues - forbid windows _snprintf - new poisoning ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/e2a63247..77a80170 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=01-02 Stats: 450 lines in 48 files changed: 135 ins; 175 del; 140 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From kbarrett at openjdk.org Sat Jan 4 10:22:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Jan 2025 10:22:35 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> References: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> Message-ID: On Thu, 2 Jan 2025 22:19:00 GMT, Martin Doerr wrote: > Unfortunately, this doesn't compile on AIX: > > ``` > globalDefinitions_gcc.hpp:42: > In file included from /opt/IBM/openxlC/17.1.1/bin/../include/c++/v1/stdlib.h:93: > /usr/include/stdlib.h:304:18: error: 'noreturn' attribute does not appear on the first declaration > extern _NOTHROW(_NORETURN(void, exit), (int)); > ^ > /usr/include/comp_macros.h:30:35: note: expanded from macro '_NORETURN' > #define _NORETURN(_T, _F) _T _F [[noreturn]] > ^ > ... (rest of output omitted) > ``` Thanks for testing that port. I restructured the implementation of FORBID_C_FUNCTION, and added more commentary about the clang issue. 
I think that didn't change the resulting expansion, but I think it made it easier to describe how I tried to address this problem. The actual "solution" (I hope) is to ensure that is included before the forbidding declarations for the noreturn functions. Please try an AIX build again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2571149222 From kbarrett at openjdk.org Sat Jan 4 10:28:36 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Jan 2025 10:28:36 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <5JjsCtkAmOXUdGJjqvcgsek8ft_XlaIw2iLM_mgqFj8=.16d96a5f-a005-4340-81a9-653b52572a8a@github.com> References: <5JjsCtkAmOXUdGJjqvcgsek8ft_XlaIw2iLM_mgqFj8=.16d96a5f-a005-4340-81a9-653b52572a8a@github.com> Message-ID: On Fri, 3 Jan 2025 13:24:53 GMT, Johan Sjölen wrote: > I really like the syntactic change of this feature, and it's very nice that we get to have working auto-complete (makes it easier to find out that a specific function isn't forbidden). The syntactic change isn't shown in the PR description, isn't that useful to add? Isn't the last paragraph of the PR description (beginning with "Some of the poisoned functions...") what you are looking for? Admittedly I didn't mention working with auto-complete as one of the benefits. > I have a bike shedding request: Could we skip the S at the end and have it be `permit_forbidden_function::free(my_ptr);`? We only allow one forbidden function in that call, so this reads neater. I named it as a collection of forbidden functions. But I agree it reads better without the plural, so I've removed the S.
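
The poisoning scheme discussed in this thread (deprecate the function, then offer a wrapper that calls it with the warning suppressed) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HotSpot macros: the `demo` and `demo::permit` namespaces, `raw_alloc`, `raw_free`, and the message strings are hypothetical, and the pragma spellings are the usual gcc/clang and MSVC ones rather than anything taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace demo {

// Hypothetical "forbidden" functions. Marking them [[deprecated]] makes any
// direct reference produce a deprecation warning during semantic analysis.
[[deprecated("demo: go through demo::permit in vetted code only")]]
inline void* raw_alloc(std::size_t size) { return std::malloc(size); }

[[deprecated("demo: go through demo::permit in vetted code only")]]
inline void raw_free(void* p) { std::free(p); }

namespace permit {

// Suppress the deprecation warning only around the wrappers (assumed pragmas).
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif

// Wrappers that vetted code calls instead of the deprecated names.
inline void* raw_alloc(std::size_t size) {
  assert(size > 0 && "demo wrapper expects a nonzero size");
  return demo::raw_alloc(size);
}

inline void raw_free(void* p) { demo::raw_free(p); }

#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif

}  // namespace permit
}  // namespace demo
```

A call such as `demo::permit::raw_free(p)` reads the same way as the `permit_forbidden_function::free(my_ptr)` spelling agreed on above, while a direct `demo::raw_free(p)` elsewhere would be flagged by the compiler.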
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2571175973 From kbarrett at openjdk.org Sat Jan 4 10:36:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Jan 2025 10:36:35 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: On Fri, 3 Jan 2025 22:42:44 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/forbiddenFunctions.hpp line 33: >> >>> 31: #include // for size_t >>> 32: >>> 33: #include OS_HEADER(forbiddenFunctions) >> >> I thought there was an OS_VARIANT_HEADER to dispatch directly to the posix variant, but I guess it doesn't exist. Using the semaphore example, this could save your dispatch header files, at the cost of an #ifdef. Not sure which is worse: >> >> >> #if defined(LINUX) || defined(AIX) >> # include "semaphore_posix.hpp" >> #else >> # include OS_HEADER(semaphore) >> #endif > > park.hpp has the same: > > > #if defined(LINUX) || defined(AIX) || defined(BSD) > # include "park_posix.hpp" > #else > # include OS_HEADER(park) > #endif If I was going to go that route, I'd be more inclined toward #if defined(_WINDOWS) #include "forbiddenFunctions_windows.hpp" #else #include "forbiddenFunctions_posix.hpp" #endif rather than list all (or maybe just a subset) of the posix-like ports. My inclination is to leave it as-is with OS_HEADER, since I think that's the "intended" idiom.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1903054016 From mdoerr at openjdk.org Sun Jan 5 14:55:38 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Sun, 5 Jan 2025 14:55:38 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> Message-ID: On Sat, 4 Jan 2025 10:19:29 GMT, Kim Barrett wrote: > > Unfortunately, this doesn't compile on AIX: > > ``` > > globalDefinitions_gcc.hpp:42: > > In file included from /opt/IBM/openxlC/17.1.1/bin/../include/c++/v1/stdlib.h:93: > > /usr/include/stdlib.h:304:18: error: 'noreturn' attribute does not appear on the first declaration > > extern _NOTHROW(_NORETURN(void, exit), (int)); > > ^ > > /usr/include/comp_macros.h:30:35: note: expanded from macro '_NORETURN' > > #define _NORETURN(_T, _F) _T _F [[noreturn]] > > ^ > > ... (rest of output omitted) > > ``` > > Thanks for testing that port. > > I restructured the implementation of FORBID_C_FUNCTION, and added more commentary about the clang issue. I think that didn't change the resulting expansion, but I think made it easier to describe how I tried to address this problem. The actual "solution" (I hope) is to ensure that is included before the forbidding declarations for the noreturn functions. Please try an aix build again. Thanks! This has solved one of the problems, but not all. 
The next one is: globalDefinitions_gcc.hpp:55: In file included from /usr/include/fcntl.h:242: /usr/include/unistd.h:181:8: error: 'noreturn' attribute does not appear on the first declaration extern _NORETURN(void, _exit)(int); ^ /usr/include/comp_macros.h:30:35: note: expanded from macro '_NORETURN' #define _NORETURN(_T, _F) _T _F [[noreturn]] ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2571652508 From kbarrett at openjdk.org Sun Jan 5 22:16:35 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 5 Jan 2025 22:16:35 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> Message-ID: On Sun, 5 Jan 2025 14:53:01 GMT, Martin Doerr wrote: > Thanks! This has solved one of the problems, but not all. The next one is: > > ``` > globalDefinitions_gcc.hpp:55: > In file included from /usr/include/fcntl.h:242: > /usr/include/unistd.h:181:8: error: 'noreturn' attribute does not appear on the first declaration > extern _NORETURN(void, _exit)(int); > ^ Drat. I forgot that posix _exit comes from unistd.h. Back to work... 
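
The ordering constraint behind the AIX error can be illustrated in isolation. In C++, if any declaration of a function carries `[[noreturn]]`, the first declaration must carry it, whereas `[[deprecated]]` may be added on a later redeclaration. The sketch below assumes a hypothetical function `demo_quit` standing in for `exit`/`_exit`; it throws instead of terminating so that it can be exercised, and the warning-suppression pragmas are the usual gcc/clang and MSVC spellings, not taken from the patch.

```cpp
#include <stdexcept>
#include <string>

// Plays the role of the system header: the FIRST declaration has [[noreturn]].
[[noreturn]] void demo_quit(int code);

// Plays the role of the forbidding redeclaration, seen afterwards. Adding
// [[deprecated]] on a later declaration is allowed.
[[deprecated("demo: use a vetted exit path instead")]] void demo_quit(int code);

// The reverse order is ill-formed and matches the reported diagnostic:
//   [[deprecated("...")]] void demo_quit(int code);  // first declaration
//   [[noreturn]] void demo_quit(int code);           // noreturn not on first declaration

// Definition: never returns normally (it leaves by throwing).
void demo_quit(int code) {
  throw std::runtime_error("demo_quit(" + std::to_string(code) + ")");
}

#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif

// Vetted caller: reports whether demo_quit left by throwing.
inline bool demo_quit_throws(int code) {
  bool threw = false;
  try {
    demo_quit(code);
  } catch (const std::runtime_error&) {
    threw = true;
  }
  return threw;
}

#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif
```

This is why including the header that declares the noreturn function before the forbidding declarations is a plausible remedy: the system declaration then becomes the first one the compiler sees.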
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2571767901 From dholmes at openjdk.org Mon Jan 6 01:38:47 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Jan 2025 01:38:47 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: Message-ID: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> On Sat, 4 Jan 2025 10:33:51 GMT, Kim Barrett wrote: >> park.hpp has the same: >> >> >> #if defined(LINUX) || defined(AIX) || defined(BSD) >> # include "park_posix.hpp" >> #else >> # include OS_HEADER(park) >> #endif > > If I was going to go that route, I'd be more inclined toward > > #if defined(_WINDOWS) > #include "forbiddenFunctions_windows.hpp" > #else > #include "forbiddenFunctions_posix.hpp" > #endif > > rather than list all (or maybe just a subset) of the posix-like ports. My > inclination is to leave it as-is with OS_HEADER, since I think that's the > "intended" idiom. Overall I like this change. I appreciate the effort that has been put in to try and find an elegant solution to this problem, but having OS specific files created just to include the posix version runs counter to why we have the posix variants in the first place IMO. Please select one of the above approaches so that the new aix/bsd/linux specific files can be removed in favour of the posix one. Thanks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1903399996 From dholmes at openjdk.org Mon Jan 6 04:00:39 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Jan 2025 04:00:39 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> On Fri, 3 Jan 2025 22:00:34 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix more copyrights. src/hotspot/share/ci/ciFlags.cpp line 95: > 93: // ciFlags::print > 94: void ciFlags::print(outputStream* st) { > 95: st->print(" flags=%x", _flags.as_unsigned_short()); Here, and elsewhere, are we relying on an implicit widening of the u2 result to int so that the format specifier is correct? 
src/hotspot/share/ci/ciFlags.hpp line 71: > 69: > 70: // Conversion > 71: jint as_int() { return _flags.as_unsigned_short(); } It is unclear to me whether the fact we are dealing with u2 should be exposed in this API as well. src/hotspot/share/ci/ciKlass.cpp line 225: > 223: assert(is_loaded(), "not loaded"); > 224: GUARDED_VM_ENTRY( > 225: return get_Klass()->access_flags().as_unsigned_short(); Again it is unclear to me whether this API should also now return u2. src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 1003: > 1001: JVMCI_ERROR_NULL("info must not be null and have a length of 4"); > 1002: } > 1003: JVMCIENV->put_int_at(info, 0, fd.access_flags().as_unsigned_short()); Again are we relying on implicit widening from u2 to int? It really isn't clear to me whether the only thing we should have changed here is the actual type of the `_flags` field and let everything else continue to represent flags as int, so we don't get these transitions from u2 to int in these higher level APIs. src/hotspot/share/jvmci/jvmciEnv.cpp line 1595: > 1593: HotSpotJVMCI::FieldInfo::set_signatureIndex(JVMCIENV, obj_h(), (jint)fieldinfo->signature_index()); > 1594: HotSpotJVMCI::FieldInfo::set_offset(JVMCIENV, obj_h(), (jint)fieldinfo->offset()); > 1595: HotSpotJVMCI::FieldInfo::set_classfileFlags(JVMCIENV, obj_h(), (jint)fieldinfo->access_flags().as_unsigned_short()); I'm curious why we need the explicit cast here - is it because we are going from unsigned to signed? src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/AccessFlags.java line 82: > 80: public int getStandardFlags() { > 81: return (int)flags; > 82: } This function seems unused. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903619988 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903621196 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903621733 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903624061 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903624365 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1903619429 From shade at openjdk.org Mon Jan 6 12:05:12 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 6 Jan 2025 12:05:12 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port Message-ID: **NOTE: This is a work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** My plan is to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allow us to submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what was done and why. Easy stuff first: all files named `*_x86_32` are gone. Those are only built when the build system knows we are compiling for x86_32. There is therefore no impact on x86_64. The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of that code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64.
I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we can gradually morph the code base to use `*q`-flavored methods in new code. x86_32 is the only platform that has special cases for the x87 FPU. C1 even implements a whole separate mechanism to deal with the x87 FPU: parts of regalloc treat it specially, there is `FpuStackSim`, there is the `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use the FPU, which is why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of the x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes and the rounding instructions in C1. x86_64 is baselined on SSE2+; the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce an uncomfortable number of conflicts with pending work in mainline, so I would like to do it separately as a follow-up. ------------- Commit messages: - More cleanups/reversals - More FPU cleanups in C1 regalloc - More touchups - Fix more backsliding LP64 in Assembler - Revert accidental removal in C1 regalloc - C1: Cleanup dead lir_f stack ops - Cleanup more FPU-related stuff - Remove rounding code from C1 and template interpreter - Purge 32-bit specific rounding mode - OS cleanup - ...
and 9 more: https://git.openjdk.org/jdk/compare/f1d85ab3...b55fc750 Changes: https://git.openjdk.org/jdk/pull/22567/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22567&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8345169 Stats: 40692 lines in 213 files changed: 33 ins; 39906 del; 753 mod Patch: https://git.openjdk.org/jdk/pull/22567.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22567/head:pull/22567 PR: https://git.openjdk.org/jdk/pull/22567 From coleenp at openjdk.org Mon Jan 6 13:46:40 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 13:46:40 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> Message-ID: On Mon, 6 Jan 2025 03:44:09 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix more copyrights. > > src/hotspot/share/ci/ciFlags.cpp line 95: > >> 93: // ciFlags::print >> 94: void ciFlags::print(outputStream* st) { >> 95: st->print(" flags=%x", _flags.as_unsigned_short()); > > Here, and elsewhere, are we relying on an implicit widening of the u2 result to int so that the format specifier is correct? Yes, we are. 
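
The widening being relied on here can be shown with a small standalone sketch. It assumes `u2` is a 16-bit unsigned type and that `int` is at least 32 bits wide; the helper names `demo_format_flags` and `demo_as_int` are hypothetical and only mirror the shape of the `st->print(" flags=%x", ...)` and `jint as_int()` cases under review.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

using u2 = std::uint16_t;  // assumption: stands in for HotSpot's u2

// Passing a u2 through "..." applies the default argument promotions, so the
// callee receives an int holding the same non-negative value.
inline std::string demo_format_flags(u2 flags) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "flags=%x", flags);
  return std::string(buf);
}

// Returning a u2 from a function declared to return int widens by value;
// there is no sign extension because the source type is unsigned.
inline int demo_as_int(u2 flags) { return flags; }
```

Under those assumptions, a flags value with the top bit set, such as 0x8000, comes out as 32768 rather than a negative number, and formats as `flags=8000`.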
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904168828 From coleenp at openjdk.org Mon Jan 6 14:12:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 14:12:56 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> Message-ID: On Mon, 6 Jan 2025 13:44:27 GMT, Coleen Phillimore wrote: >> src/hotspot/share/ci/ciFlags.cpp line 95: >> >>> 93: // ciFlags::print >>> 94: void ciFlags::print(outputStream* st) { >>> 95: st->print(" flags=%x", _flags.as_unsigned_short()); >> >> Here, and elsewhere, are we relying on an implicit widening of the u2 result to int so that the format specifier is correct? > > Yes, we are. I had a little experiment. extern "C" int printf(const char *,...); int main() { unsigned short ss = 0x8000; printf("does unsigned sign extend to int? %d\n", int(ss)); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904172191 From coleenp at openjdk.org Mon Jan 6 14:12:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 14:12:57 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> Message-ID: On Mon, 6 Jan 2025 03:47:05 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix more copyrights. 
> > src/hotspot/share/ci/ciFlags.hpp line 71: > >> 69: >> 70: // Conversion >> 71: jint as_int() { return _flags.as_unsigned_short(); } > > It is unclear to me whether the fact we are dealing with u2 should be exposed in this API as well. I don't think it should be. > src/hotspot/share/ci/ciKlass.cpp line 225: > >> 223: assert(is_loaded(), "not loaded"); >> 224: GUARDED_VM_ENTRY( >> 225: return get_Klass()->access_flags().as_unsigned_short(); > > Again it is unclear to me whether this API should also now return u2. I don't think it should. I think the boundary of where we promote the u2 to int should be at this API. If those working on the compiler code would like to propagate the size of the storage (u2) around, they can decide to do that. > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 1003: > >> 1001: JVMCI_ERROR_NULL("info must not be null and have a length of 4"); >> 1002: } >> 1003: JVMCIENV->put_int_at(info, 0, fd.access_flags().as_unsigned_short()); > > Again are we relying on implicit widening from u2 to int? > > It really isn't clear to me whether the only thing we should have changed here is the actual type of the `_flags` field and let everything else continue to represent flags as int, so we don't get these transitions from u2 to int in these higher level APIs. Yes, we can widen u2 to int. See above. The ci code represents the integral value of access flags as jint. I am leaving that API in place. For this, the widening happens when fetching the u2 field. The conversion is implicit. If this field were to be stored back to a u2 somewhere, all the code in the compiler should change but the current code doesn't do that. 
> src/hotspot/share/jvmci/jvmciEnv.cpp line 1595: > >> 1593: HotSpotJVMCI::FieldInfo::set_signatureIndex(JVMCIENV, obj_h(), (jint)fieldinfo->signature_index()); >> 1594: HotSpotJVMCI::FieldInfo::set_offset(JVMCIENV, obj_h(), (jint)fieldinfo->offset()); >> 1595: HotSpotJVMCI::FieldInfo::set_classfileFlags(JVMCIENV, obj_h(), (jint)fieldinfo->access_flags().as_unsigned_short()); > > I'm curious why we need the explicit cast here - is it because we are going from unsigned to signed? The casts were all there for all these fields, so I left it. It is unnecessary but matches the style of the preceding lines. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/AccessFlags.java line 82: > >> 80: public int getStandardFlags() { >> 81: return (int)flags; >> 82: } > > This function seems unused. Ah thanks. Another SA function to remove. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904197071 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904190829 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904193482 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904194643 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904195078 From coleenp at openjdk.org Mon Jan 6 14:12:56 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 14:12:56 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v12] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
> > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove unused SA function. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/82bd1a24..d43b2fac Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=10-11 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Mon Jan 6 15:09:18 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 15:09:18 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v3] In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: > There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fixed some code review comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22916/files - new: https://git.openjdk.org/jdk/pull/22916/files/1748797a..15b1052a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=01-02 Stats: 16 lines in 5 files changed: 0 ins; 9 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/22916.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916 PR: https://git.openjdk.org/jdk/pull/22916 From coleenp at openjdk.org Mon Jan 6 15:09:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 15:09:19 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: <3vQ-kxRahCEhGLRshu6KE_0ZkWCnrgtnyx8cbXsPIeE=.24a34a54-28b0-4202-8ea3-6bd2b7325ce3@github.com> On Fri, 3 Jan 2025 16:23:31 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix %Ix to %zx. Kim, thanks for slogging through this change. I've updated the patch with your suggested changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2573301941 From coleenp at openjdk.org Mon Jan 6 15:09:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 15:09:19 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Sat, 4 Jan 2025 09:02:34 GMT, Kim Barrett wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix %Ix to %zx. > > src/hotspot/os/linux/os_linux.cpp line 1370: > >> 1368: >> 1369: #define _UFM "%zu" >> 1370: #define _DFM "%zd" > > Why not get rid of these? Fixed. > src/hotspot/share/gc/shared/ageTable.cpp line 38: > >> 36: #include "logging/logStream.hpp" >> 37: >> 38: /* Copyright (c) 1992, 2025, Oracle and/or its affiliates, and Stanford University. > > Well this is weird. An atypical copyright down inside the file? This is a relic and not the legal copyright that got updated since nobody noticed. Until you did. Removed. > src/hotspot/share/oops/instanceKlass.cpp line 3695: > >> 3693: >> 3694: st->print(BULLET"hash_slot: %d", hash_slot()); st->cr(); >> 3695: st->print(BULLET"secondary bitmap: " LP64_ONLY("0x%016zu") NOT_LP64("0x%08zu"), _secondary_supers_bitmap); st->cr(); > > Should be using "zx" rather than "zu". I think this could be written as > `"%#0*zx", (2 * BytesPerWord + 2), _secondary_supers_bitmap` > That's looking a lot like line noise though. I think this and ones like it probably ought not be > changed at all. I have to confess that I have no idea what this is trying to show. I'd rather have all the UINTX_FORMAT purged and not leave a remnant for these two special cases. A function whose name describes what this is trying to show would be better. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1904264225 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1904264062 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1904263162 From coleenp at openjdk.org Mon Jan 6 15:24:18 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 15:24:18 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v4] In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: > There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Use INTPTR_FORMAT instead of zu for secondary_supers_bitmap. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22916/files - new: https://git.openjdk.org/jdk/pull/22916/files/15b1052a..6e8b2702 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=02-03 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22916.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916 PR: https://git.openjdk.org/jdk/pull/22916 From coleenp at openjdk.org Mon Jan 6 15:24:19 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 15:24:19 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 6 Jan 2025 15:03:34 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 3695: >> >>> 3693: >>> 3694: st->print(BULLET"hash_slot: %d", hash_slot()); st->cr(); >>> 3695: st->print(BULLET"secondary bitmap: " LP64_ONLY("0x%016zu") NOT_LP64("0x%08zu"), _secondary_supers_bitmap); st->cr(); >> >> Should be using "zx" rather than "zu". I think this could be written as >> `"%#0*zx", (2 * BytesPerWord + 2), _secondary_supers_bitmap` >> That's looking a lot like line noise though. I think this and ones like it probably ought not be >> changed at all. > > I have to confess that I have no idea what this is trying to show. I'd rather have all the UINTX_FORMAT purged and not leave a remnant for these two special cases. A function whose name describes what this is trying to show would be better. @theRealAph added this with the secondary super cache work, but I think it may have also been meant to be zx because of the leading 0x. So INTPTR_FORMAT would also work. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1904284828 From coleenp at openjdk.org Mon Jan 6 16:02:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 6 Jan 2025 16:02:36 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Sat, 4 Jan 2025 09:52:00 GMT, Kim Barrett wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix %Ix to %zx. > > test/hotspot/gtest/utilities/test_globalDefinitions.cpp line 281: > >> 279: >> 280: check_format("%zd", (intx)123, "123"); >> 281: check_format("0x%zx", (intx)0x123, "0x123"); > > Could be "%#zx". I fixed this. This seems ok. I didn't know about this format option tbh but if it's standard, why not? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1904331779 From sspitsyn at openjdk.org Mon Jan 6 17:10:39 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 6 Jan 2025 17:10:39 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v12] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Mon, 6 Jan 2025 14:12:56 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. 
>>
>> before:
>>
>> /* size: 216, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 194, holes: 3, sum holes: 18 */
>>
>>
>> after:
>>
>> /* size: 200, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 188, holes: 4, sum holes: 12 */
>>
>>
>> We may eventually move the modifiers to java.lang.Class but that's WIP.
>>
>> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove unused SA function.

Thank you for the update with this unification!
I've posted a couple of comments with similar nits.

src/hotspot/share/interpreter/linkResolver.cpp line 586:

> 584: // We need to change "protected" to "public".
> 585: assert(flags.is_protected(), "clone not protected?");
> 586: u2 new_flags = flags.as_unsigned_short();

Nit: Should this also be replaced with `as_method_flags()`?

src/hotspot/share/opto/memnode.cpp line 1985:

> 1983: // The field is Klass::_access_flags. Return its (constant) value.
> 1984: // (Folds up the 2nd indirection in Reflection.getClassAccessFlags(aClassConstant).)
> 1985: assert(this->Opcode() == Op_LoadUS, "must load an unsigned short from _access_flags");

Nit: This can be unified with line 1979, and `this->` can be dropped as well.

src/hotspot/share/prims/jvm.cpp line 2472:

> 2470: u2 field_access_flags = InstanceKlass::cast(k)->field_access_flags(field_index);
> 2471: // This & should be unnecessary.
> 2472: assert((field_access_flags & JVM_RECOGNIZED_FIELD_MODIFIERS) == field_access_flags, "already masked");

Nit: Yes, it is better to remove lines 2471-2472.
------------- PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2532540668 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904386978 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904405970 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904404626 From kvn at openjdk.org Mon Jan 6 17:49:41 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 6 Jan 2025 17:49:41 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. 
I think we gradually morph the code base to use `*q`-flavored methods in new code.
>
> x86_32 is the only platform that has special cases for x87 FPU.
>
> C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well.
>
> Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1.
>
> x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot.
>
> The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli...

> The one thing I deliberately avoided doing is merging x86.ad and x86_64.ad.

I think we can keep them separate (big .ad files are difficult to navigate). `x86.ad` is mostly used for vector instructions. We can rename it to `x86_vect.ad`, and `x86_64.ad` to `x86.ad`, as follow-up changes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2573606824

From kvn at openjdk.org Mon Jan 6 17:53:35 2025
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 6 Jan 2025 17:53:35 GMT
Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port
In-Reply-To:
References:
Message-ID:

On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote:

> **NOTE: This is work-in-progress draft for interested parties.
The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. 
This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... I don't see make files changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2573613772 From kvn at openjdk.org Mon Jan 6 18:01:50 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 6 Jan 2025 18:01:50 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. 
The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... It would be nice to split this into separate PRs for easy review. Removing "rounding of x87 FPU" could be definitely done separately. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2573626448

From dholmes at openjdk.org Tue Jan 7 02:20:40 2025
From: dholmes at openjdk.org (David Holmes)
Date: Tue, 7 Jan 2025 02:20:40 GMT
Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v12]
In-Reply-To:
References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com>
Message-ID:

On Mon, 6 Jan 2025 14:12:56 GMT, Coleen Phillimore wrote:

>> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better.
>>
>> before:
>>
>> /* size: 216, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 194, holes: 3, sum holes: 18 */
>>
>>
>> after:
>>
>> /* size: 200, cachelines: 4, members: 25, static members: 17 */
>> /* sum members: 188, holes: 4, sum holes: 12 */
>>
>>
>> We may eventually move the modifiers to java.lang.Class but that's WIP.
>>
>> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove unused SA function.

I can't say that I really understand which APIs are fine with flags-as-int and which need to care about the actual flag storage size, but I'll leave it at that.

I will tick approve for the shared code portion of the change.

Thanks

-------------

Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2533262347 From coleenp at openjdk.org Tue Jan 7 02:30:49 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 02:30:49 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v12] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Mon, 6 Jan 2025 16:46:25 GMT, Serguei Spitsyn wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unused SA function. > > src/hotspot/share/interpreter/linkResolver.cpp line 586: > >> 584: // We need to change "protected" to "public". >> 585: assert(flags.is_protected(), "clone not protected?"); >> 586: u2 new_flags = flags.as_unsigned_short(); > > Nit: Should this also be replaced with `as_method_flags()`? Thanks Serguei, I replaced this one and a couple of as_field_flags() so that as_unsigned_short() is more limited to the cases where we don't want masking. > src/hotspot/share/opto/memnode.cpp line 1985: > >> 1983: // The field is Klass::_access_flags. Return its (constant) value. >> 1984: // (Folds up the 2nd indirection in Reflection.getClassAccessFlags(aClassConstant).) >> 1985: assert(this->Opcode() == Op_LoadUS, "must load an unsigned short from _access_flags"); > > Nit: This can be unified with line 1979 and also get rid of `this->`. 1979 and 1985 are in different branches of an if statement (address of modifier flags vs access flags) so needs to be repeated. But I did remove the this-> > src/hotspot/share/prims/jvm.cpp line 2472: > >> 2470: u2 field_access_flags = InstanceKlass::cast(k)->field_access_flags(field_index); >> 2471: // This & should be unnecessary. >> 2472: assert((field_access_flags & JVM_RECOGNIZED_FIELD_MODIFIERS) == field_access_flags, "already masked"); > > Nit: Yes, it is better to remove the lines: 2471-2472. fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904829376 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904830369 PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904829864 From coleenp at openjdk.org Tue Jan 7 02:36:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 02:36:36 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Make more AccessFlags fetches more specific and remove an assert and remove this->s. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22246/files - new: https://git.openjdk.org/jdk/pull/22246/files/d43b2fac..35784c70 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22246&range=11-12 Stats: 16 lines in 10 files changed: 0 ins; 3 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/22246.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22246/head:pull/22246 PR: https://git.openjdk.org/jdk/pull/22246 From coleenp at openjdk.org Tue Jan 7 02:47:37 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 02:47:37 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. Thanks David. I'm hoping @iwanowww or @dean-long can review the compiler parts. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2574287163 From dlong at openjdk.org Tue Jan 7 03:20:38 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 7 Jan 2025 03:20:38 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <7RpN7gvi4yXmEGG03o8JTg68OzymPhL76j6ndM4JPe8=.f5bcf133-ae7f-4efb-8df4-f8f9d6423b2d@github.com> On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. Sure, I'll try to take a look at it tomorrow. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2574313889 From vlivanov at openjdk.org Tue Jan 7 06:13:48 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 7 Jan 2025 06:13:48 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2533455739 From vlivanov at openjdk.org Tue Jan 7 06:13:48 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 7 Jan 2025 06:13:48 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v11] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <8UJ0xv2YRkXDMkD05IxpRA-_CmGa2K_14YB3ZMawwAE=.274408f1-f47b-4d45-bf90-d66915291f2f@github.com> Message-ID: On Mon, 6 Jan 2025 13:47:35 GMT, Coleen Phillimore wrote: >> Yes, we are. > > I had a little experiment. > > extern "C" int printf(const char *,...); > int main() { > unsigned short ss = 0x8000; > printf("does unsigned sign extend to int? %d\n", int(ss)); > } IMO you could just call `as_int()` here. All other usages of `ciFlags::as_int()` are in printing code. Ideally, `ciField::print()` could use `ciFlags::print()`, but such cleanup can be done separately. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1904950560 From dholmes at openjdk.org Tue Jan 7 06:24:41 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Jan 2025 06:24:41 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. 
The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... 
src/hotspot/share/interpreter/abstractInterpreter.cpp line 137: > 135: case vmIntrinsics::_floatToRawIntBits: return java_lang_Float_floatToRawIntBits; > 136: case vmIntrinsics::_longBitsToDouble: return java_lang_Double_longBitsToDouble; > 137: case vmIntrinsics::_doubleToRawLongBits: return java_lang_Double_doubleToRawLongBits; Why are these intrinsics for the Java methods disappearing? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1904957718 From kbarrett at openjdk.org Tue Jan 7 08:34:41 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 08:34:41 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 6 Jan 2025 15:04:19 GMT, Coleen Phillimore wrote: >> src/hotspot/share/gc/shared/ageTable.cpp line 38: >> >>> 36: #include "logging/logStream.hpp" >>> 37: >>> 38: /* Copyright (c) 1992, 2025, Oracle and/or its affiliates, and Stanford University. >> >> Well this is weird. An atypical copyright down inside the file? > > This is a relic and not the legal copyright that got updated since nobody noticed. Until you did. Removed. Not sure we're allowed to remove a copyright statement, even if not in the usual place. >> test/hotspot/gtest/utilities/test_globalDefinitions.cpp line 281: >> >>> 279: >>> 280: check_format("%zd", (intx)123, "123"); >>> 281: check_format("0x%zx", (intx)0x123, "0x123"); >> >> Could be "%#zx". > > I fixed this. This seems ok. I didn't know about this format option tbh but if it's standard, why not? I'd forgotten about that format option too, which is why I'm not enamored of it. Also, written that way the prefix gets included in the width when dealing with field width, which might not be great either. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905081061 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905079637 From kbarrett at openjdk.org Tue Jan 7 08:34:42 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 08:34:42 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 6 Jan 2025 15:21:14 GMT, Coleen Phillimore wrote: >> I have to confess that I have no idea what this is trying to show. I'd rather have all the UINTX_FORMAT purged and not leave a remnant for these two special cases. A function whose name describes what this is trying to show would be better. > > @theRealAph added this with the secondary super cache work, but I think it may have also been meant to be zx because of the leading 0x. So INTPTR_FORMAT would also work. I don't think we should be mixing uintx types and UINTPTR_FORMAT like that. As I said earlier, this is one that I think probably ought not be changed at all. I think some of the FORMAT macros are useful to avoid inline format directives that resemble line noise, or ugly conditionals like that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905076840 From kbarrett at openjdk.org Tue Jan 7 08:52:43 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 08:52:43 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 08:31:13 GMT, Kim Barrett wrote: >> I fixed this. This seems ok. I didn't know about this format option tbh but if it's standard, why not? > > I'd forgotten about that format option too, which is why I'm not enamored of it. 
Also, written that way the > prefix gets included in the width when dealing with field width, which might not be great either. The problem of accounting for the prefix in the field width calculation can be dealt with by using precision rather than field width. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905104428 From kbarrett at openjdk.org Tue Jan 7 08:52:42 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 08:52:42 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 08:28:32 GMT, Kim Barrett wrote: >> @theRealAph added this with the secondary super cache work, but I think it may have also been meant to be zx because of the leading 0x. So INTPTR_FORMAT would also work. > > I don't think we should be mixing uintx types and UINTPTR_FORMAT like that. As I said earlier, this is one that > I think probably ought not be changed at all. I think some of the FORMAT macros are useful to avoid inline > format directives that resemble line noise, or ugly conditionals like that. Improving on my prior suggestion `"%#.*zx", (2 * BytesPerWord), _secondary_supers_bitmap` Using precision rather than field width, to avoid needing to account for the prefix in the width calculation. But still looking a lot like line noise, and still think it shouldn't be changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905101674 From kbarrett at openjdk.org Tue Jan 7 09:09:16 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 09:09:16 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v4] In-Reply-To: References: Message-ID: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. 
poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. 
This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo... Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: more fixes for clang noreturn issues ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/77a80170..c478bda1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=02-03 Stats: 7 lines in 2 files changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From shade at openjdk.org Tue Jan 7 09:15:41 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 7 Jan 2025 09:15:41 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 06:21:50 GMT, David Holmes wrote: >> **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** >> >> My plan is to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allow us to submit and work out more RFEs ahead of this removal.
I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. >> >> This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. >> >> Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. >> >> The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. >> >> x86_32 is the only platform that has special cases for x87 FPU. >> >> C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. >> >> Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. >> >> x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. 
>> >> The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of... > > src/hotspot/share/interpreter/abstractInterpreter.cpp line 137: > >> 135: case vmIntrinsics::_floatToRawIntBits: return java_lang_Float_floatToRawIntBits; >> 136: case vmIntrinsics::_longBitsToDouble: return java_lang_Double_longBitsToDouble; >> 137: case vmIntrinsics::_doubleToRawLongBits: return java_lang_Double_doubleToRawLongBits; > > Why are these intrinsics for the Java methods disappearing? These are interpreter "intrinsics" that are only implemented on x86_32 to handle x87 FPU peculiarities. Look around for `TemplateInterpreterGenerator::generate_Float_intBitsToFloat_entry`, for example. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1905134973 From kbarrett at openjdk.org Tue Jan 7 09:34:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Jan 2025 09:34:38 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <12-_FKsXl3OOujKaHQewgKdDUi95q9M0fSmuLeo6_Wg=.1ec72306-698f-42f8-86b7-e53bc711fde6@github.com> Message-ID: On Sun, 5 Jan 2025 22:14:15 GMT, Kim Barrett wrote: > > Thanks! This has solved one of the problems, but not all. The next one is: > > ``` > > globalDefinitions_gcc.hpp:55: > > In file included from /usr/include/fcntl.h:242: > > /usr/include/unistd.h:181:8: error: 'noreturn' attribute does not appear on the first declaration > > extern _NORETURN(void, _exit)(int); > > ^ > > ``` > > Drat. I forgot that posix _exit comes from unistd.h. Back to work... Hopefully that's fixed now too, and there won't be any more. I also added _Exit to the forbidden set. I looked into the more robust / less kludgy approaches, but they are more work than I want to do as part of this PR. Maybe in a followup. And I wonder if someone might file a bug about the clang weirdness?
This might be related, but seems kind of different: https://github.com/llvm/llvm-project/issues/113511 Test case: __attribute__((__noreturn__)) void frob(int); [[noreturn]] void frob(int); => 'noreturn' attribute does not appear on the first declaration gcc compiles this without error. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2574803325 From coleenp at openjdk.org Tue Jan 7 12:36:46 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 12:36:46 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 08:50:04 GMT, Kim Barrett wrote: >> I'd forgotten about that format option too, which is why I'm not enamored of it. Also, written that way the >> prefix gets included in the width when dealing with field width, which might not be great either. > > The problem of accounting for the prefix in the field width calculation can be dealt with by using precision > rather than field width. Well then that leaves the fun of dealing with these format specifiers when you're trying to do your own formatting. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905390045 From coleenp at openjdk.org Tue Jan 7 12:51:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 12:51:33 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: > There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. 
It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Restore copyright and macro. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22916/files - new: https://git.openjdk.org/jdk/pull/22916/files/6e8b2702..ae9d9f6f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=03-04 Stats: 8 lines in 4 files changed: 5 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/22916.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916 PR: https://git.openjdk.org/jdk/pull/22916 From coleenp at openjdk.org Tue Jan 7 12:51:33 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 12:51:33 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 08:48:08 GMT, Kim Barrett wrote: >> I don't think we should be mixing uintx types and UINTPTR_FORMAT like that. As I said earlier, this is one that >> I think probably ought not be changed at all. I think some of the FORMAT macros are useful to avoid inline >> format directives that resemble line noise, or ugly conditionals like that. > > Improving on my prior suggestion > `"%#.*zx", (2 * BytesPerWord), _secondary_supers_bitmap` > Using precision rather than field width, to avoid needing to account for the prefix in the width calculation. > But still looking a lot like line noise, and still think it shouldn't be changed. Yes, this looks horrible. 
The macro that I was trying to remove is better. I restored but moved it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1905405696 From mdoerr at openjdk.org Tue Jan 7 15:21:39 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 7 Jan 2025 15:21:39 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v4] In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 09:09:16 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. 
(This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > more fixes for clang noreturn issues Thanks! That solved the issue.
Now, we need to adapt all usages in the low level AIX code and support `strdup`:

diff --git a/src/hotspot/os/aix/libodm_aix.cpp b/src/hotspot/os/aix/libodm_aix.cpp
index 9fe0fb7abd8..854fd5e2b79 100644
--- a/src/hotspot/os/aix/libodm_aix.cpp
+++ b/src/hotspot/os/aix/libodm_aix.cpp
@@ -30,6 +30,7 @@
 #include
 #include "runtime/arguments.hpp"
 #include "runtime/os.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
 dynamicOdm::dynamicOdm() {
@@ -59,7 +60,7 @@ dynamicOdm::~dynamicOdm() {
 }
-void odmWrapper::clean_data() { if (_data) { free(_data); _data = nullptr; } }
+void odmWrapper::clean_data() { if (_data) { permit_forbidden_function::free(_data); _data = nullptr; } }
 int odmWrapper::class_offset(const char *field, bool is_aix_5)
diff --git a/src/hotspot/os/aix/loadlib_aix.cpp b/src/hotspot/os/aix/loadlib_aix.cpp
index bc21aef3836..88f660ad46f 100644
--- a/src/hotspot/os/aix/loadlib_aix.cpp
+++ b/src/hotspot/os/aix/loadlib_aix.cpp
@@ -38,6 +38,7 @@
 #include "logging/log.hpp"
 #include "utilities/debug.hpp"
 #include "utilities/ostream.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
 // For loadquery()
 #include
@@ -55,7 +56,7 @@ class StringList {
   // Enlarge list. If oom, leave old list intact and return false.
   bool enlarge() {
     int cap2 = _cap + 64;
-    char** l2 = (char**) ::realloc(_list, sizeof(char*) * cap2);
+    char** l2 = (char**) permit_forbidden_function::realloc(_list, sizeof(char*) * cap2);
     if (!l2) {
       return false;
     }
@@ -73,7 +74,7 @@ class StringList {
       }
     }
     assert0(_cap > _num);
-    char* s2 = ::strdup(s);
+    char* s2 = permit_forbidden_function::strdup(s);
     if (!s2) {
       return nullptr;
     }
@@ -167,7 +168,7 @@ static void free_entry_list(loaded_module_t** start) {
   loaded_module_t* lm = *start;
   while (lm) {
     loaded_module_t* const lm2 = lm->next;
-    ::free(lm);
+    permit_forbidden_function::free(lm);
     lm = lm2;
   }
   *start = nullptr;
@@ -190,7 +191,7 @@ static bool reload_table() {
   uint8_t* buffer = nullptr;
   size_t buflen = 1024;
   for (;;) {
-    buffer = (uint8_t*) ::realloc(buffer, buflen);
+    buffer = (uint8_t*) permit_forbidden_function::realloc(buffer, buflen);
     if (loadquery(L_GETINFO, buffer, buflen) == -1) {
       if (errno == ENOMEM) {
         buflen *= 2;
@@ -210,7 +211,7 @@ static bool reload_table() {
   for (;;) {
-    loaded_module_t* lm = (loaded_module_t*) ::malloc(sizeof(loaded_module_t));
+    loaded_module_t* lm = (loaded_module_t*) permit_forbidden_function::malloc(sizeof(loaded_module_t));
     if (!lm) {
       log_warning(os)("OOM.");
       goto cleanup;
@@ -226,7 +227,7 @@ static bool reload_table() {
     lm->path = g_stringlist.add(ldi->ldinfo_filename);
     if (!lm->path) {
       log_warning(os)("OOM.");
-      free(lm);
+      permit_forbidden_function::free(lm);
       goto cleanup;
     }
@@ -248,7 +249,7 @@ static bool reload_table() {
       lm->member = g_stringlist.add(p_mbr_name);
       if (!lm->member) {
         log_warning(os)("OOM.");
-        free(lm);
+        permit_forbidden_function::free(lm);
         goto cleanup;
       }
     } else {
@@ -296,7 +297,7 @@ static bool reload_table() {
     free_entry_list(&new_list);
   }
-  ::free(buffer);
+  permit_forbidden_function::free(buffer);
   return rc;
diff --git a/src/hotspot/os/aix/os_aix.cpp b/src/hotspot/os/aix/os_aix.cpp
index 26627c2f8fb..2d0859a4d5e 100644
--- a/src/hotspot/os/aix/os_aix.cpp
+++ b/src/hotspot/os/aix/os_aix.cpp
@@ -74,6 +74,7 @@
 #include "utilities/defaultStream.hpp"
 #include "utilities/events.hpp"
 #include "utilities/growableArray.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
 #include "utilities/vmError.hpp"
 #if INCLUDE_JFR
 #include "jfr/support/jfrNativeLibraryLoadEvent.hpp"
@@ -364,9 +365,9 @@ static void query_multipage_support() {
   // or by environment variable LDR_CNTRL (suboption DATAPSIZE). If none is given,
   // default should be 4K.
   {
-    void* p = ::malloc(16*M);
+    void* p = permit_forbidden_function::malloc(16*M);
     g_multipage_support.datapsize = os::Aix::query_pagesize(p);
-    ::free(p);
+    permit_forbidden_function::free(p);
   }
   // Query default shm page size (LDR_CNTRL SHMPSIZE).
@@ -1406,7 +1407,7 @@ static struct {
 } vmem;
 static void vmembk_add(char* addr, size_t size, size_t pagesize, int type) {
-  vmembk_t* p = (vmembk_t*) ::malloc(sizeof(vmembk_t));
+  vmembk_t* p = (vmembk_t*) permit_forbidden_function::malloc(sizeof(vmembk_t));
   assert0(p);
   if (p) {
     MiscUtils::AutoCritSect lck(&vmem.cs);
@@ -1435,7 +1436,7 @@ static void vmembk_remove(vmembk_t* p0) {
   for (vmembk_t** pp = &(vmem.first); *pp; pp = &((*pp)->next)) {
     if (*pp == p0) {
       *pp = p0->next;
-      ::free(p0);
+      permit_forbidden_function::free(p0);
       return;
     }
   }
diff --git a/src/hotspot/os/aix/porting_aix.cpp b/src/hotspot/os/aix/porting_aix.cpp
index 9d91c91bf8a..2235d3da686 100644
--- a/src/hotspot/os/aix/porting_aix.cpp
+++ b/src/hotspot/os/aix/porting_aix.cpp
@@ -1082,7 +1082,7 @@ void* Aix_dlopen(const char* filename, int Flags, int *eno, const char** error_r
   if (g_handletable_used == max_handletable) {
     // No place in array anymore; increase array.
     unsigned new_max = MAX2(max_handletable * 2, init_num_handles);
-    struct handletableentry* new_tab = (struct handletableentry*)::realloc(p_handletable, new_max * sizeof(struct handletableentry));
+    struct handletableentry* new_tab = (struct handletableentry*) permit_forbidden_function::realloc(p_handletable, new_max * sizeof(struct handletableentry));
     assert(new_tab != nullptr, "no more memory for handletable");
     if (new_tab == nullptr) {
       *error_report = "dlopen: no more memory for handletable";
diff --git a/src/hotspot/share/utilities/permitForbiddenFunctions.hpp b/src/hotspot/share/utilities/permitForbiddenFunctions.hpp
index 0611d7e1996..a751158832c 100644
--- a/src/hotspot/share/utilities/permitForbiddenFunctions.hpp
+++ b/src/hotspot/share/utilities/permitForbiddenFunctions.hpp
@@ -60,6 +60,7 @@ inline void* malloc(size_t size) { return ::malloc(size); }
 inline void free(void* ptr) { return ::free(ptr); }
 inline void* calloc(size_t nmemb, size_t size) { return ::calloc(nmemb, size); }
 inline void* realloc(void* ptr, size_t size) { return ::realloc(ptr, size); }
+inline char* strdup(const char* str) { return ::strdup(str); }
 END_ALLOW_FORBIDDEN_FUNCTIONS
 } // namespace permit_forbidden_function

It may be possible to use some `os::` versions, but that would need a careful review. I think it's safer to keep it this way for now.

@JoKern65: You may want to take a look, too.
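For readers unfamiliar with the pattern the patch relies on, the following is a simplified sketch of the idea: a function is redeclared with `[[deprecated]]`, and a wrapper in a dedicated namespace calls it with the deprecation warning suppressed. It is an assumption-laden illustration, not HotSpot's FORBID_C_FUNCTION implementation. `legacy_alloc`, `legacy_free`, and `permit_sketch` are hypothetical names, and a private function is used instead of redeclaring a libc function because the latter needs the compiler-specific handling discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for functions a project wants to forbid.
void* legacy_alloc(std::size_t size) { return std::malloc(size); }
void legacy_free(void* ptr) { std::free(ptr); }

// Redeclare them as deprecated: any later direct use triggers a
// deprecation warning during semantic analysis.
[[deprecated("use the project allocator instead")]] void* legacy_alloc(std::size_t size);
[[deprecated("use the project allocator instead")]] void legacy_free(void* ptr);

// Suppress the deprecation warning only around the permitted wrappers.
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif

namespace permit_sketch {
// Wrappers that low-level code may call when the forbidden function
// really is required.
inline void* legacy_alloc(std::size_t size) { return ::legacy_alloc(size); }
inline void legacy_free(void* ptr) { ::legacy_free(ptr); }
} // namespace permit_sketch

#if defined(__GNUC__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif

// Callers go through the wrappers and compile without warnings.
inline bool roundtrip_ok() {
  void* p = permit_sketch::legacy_alloc(16);
  const bool ok = (p != nullptr);
  permit_sketch::legacy_free(p);
  return ok;
}
```

A direct call such as `legacy_alloc(16)` outside the suppressed region would produce a deprecation warning, which a build that treats warnings as errors turns into a failure. That is why the AIX call sites above switch to the `permit_forbidden_function::` wrappers.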
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2575556410 From yzheng at openjdk.org Tue Jan 7 16:17:45 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 7 Jan 2025 16:17:45 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <8DdLbSa2qunYTvNQxss7Dh26pYhGzZepTCgE1252_VI=.0150a509-d248-43a5-a01b-f47a96589727@github.com> On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. JVMCI changes look good to me! ------------- Marked as reviewed by yzheng (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2534792137 From sspitsyn at openjdk.org Tue Jan 7 20:26:41 2025 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 7 Jan 2025 20:26:41 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. Thank you for one more unification update! Looks good. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2535297355 From dlong at openjdk.org Tue Jan 7 21:02:19 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 7 Jan 2025 21:02:19 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: <4DemtiMYMfFk02L_CpRut0MMGssK3jBcKfSTMLH36Y4=.24ee458f-5893-4249-b166-496f04a35d70@github.com> On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. src/hotspot/share/classfile/vmIntrinsics.cpp line 39: > 37: > 38: // These are flag-matching functions: > 39: inline bool match_F_R(u2 flags) { I wish more code could be size-agnostic. So instead of using `u2` here, there could be a typedef in accessFlags.hpp that we could use that hides the size. However, it's not a big deal, because it seems unlikely this type will change much in the future without a JVM spec change. 
I'm tempted to suggest using AccessFlags here, but it's a class. Since this is an end-point "consumer" of the type that doesn't store it or pass it along, we could even use something like `uint` here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1906033152 From dlong at openjdk.org Tue Jan 7 21:23:37 2025 From: dlong at openjdk.org (Dean Long) Date: Tue, 7 Jan 2025 21:23:37 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. Marked as reviewed by dlong (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/22246#pullrequestreview-2535389245 From coleenp at openjdk.org Tue Jan 7 22:06:42 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 22:06:42 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 7 Jan 2025 02:36:36 GMT, Coleen Phillimore wrote: >> Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. >> >> before: >> >> /* size: 216, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 194, holes: 3, sum holes: 18 */ >> >> >> after: >> >> /* size: 200, cachelines: 4, members: 25, static members: 17 */ >> /* sum members: 188, holes: 4, sum holes: 12 */ >> >> >> We may eventually move the modifiers to java.lang.Class but that's WIP. >> >> Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make more AccessFlags fetches more specific and remove an assert and remove this->s. > IMO you could just call as_int() here. All other usages of ciFlags::as_int() are in printing code. > Ideally, ciField::print() could use ciFlags::print(), but such cleanup can be done separately. This does seem like a good future cleanup. Thanks for reviewing Vladimir. Thanks for reviewing David, Yudi and Serguei. I reran tiers 1-7 on this change. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2576308012 PR Comment: https://git.openjdk.org/jdk/pull/22246#issuecomment-2576309500 From coleenp at openjdk.org Tue Jan 7 22:06:43 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 22:06:43 GMT Subject: RFR: 8339113: AccessFlags can be u2 in metadata [v13] In-Reply-To: <4DemtiMYMfFk02L_CpRut0MMGssK3jBcKfSTMLH36Y4=.24ee458f-5893-4249-b166-496f04a35d70@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> <4DemtiMYMfFk02L_CpRut0MMGssK3jBcKfSTMLH36Y4=.24ee458f-5893-4249-b166-496f04a35d70@github.com> Message-ID: On Tue, 7 Jan 2025 20:59:18 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Make more AccessFlags fetches more specific and remove an assert and remove this->s. > > src/hotspot/share/classfile/vmIntrinsics.cpp line 39: > >> 37: >> 38: // These are flag-matching functions: >> 39: inline bool match_F_R(u2 flags) { > > I wish more code could be size-agnostic. So instead of using `u2` here, there could be a typedef in accessFlags.hpp that we could use that hides the size. However, it's not a big deal, because it seems unlikely this type will change much in the future without a JVM spec change. I'm tempted to suggest using AccessFlags here, but it's a class. Since this is an end-point "consumer" of the type that doesn't store it or pass it along, we could even use something like `uint` here. That seems like a good idea if this weren't so limited. I did chat with DanH. to see if access flags will ever grow in size from u2 and he was doubtful that would ever happen. uint would work here too, but this small use might prevent some future -Wconversion warnings and doesn't hurt anything. Thank you for reviewing.
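As an illustration of the size-agnostic alternative discussed here, a sketch under stated assumptions might look like the following. The alias `access_flags_t` and the function `match_regular` are hypothetical and are only loosely modeled on the flag-matching helpers in vmIntrinsics.cpp; the bit values are the class-file access flags from the JVM specification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical alias that hides the storage size of the flags from consumers.
using access_flags_t = std::uint16_t;

// Method access flag bits as defined by the JVM specification.
constexpr access_flags_t ACC_STATIC       = 0x0008;
constexpr access_flags_t ACC_SYNCHRONIZED = 0x0020;
constexpr access_flags_t ACC_NATIVE       = 0x0100;

// Hypothetical matcher: true when none of the static, synchronized, or
// native bits are set.
inline bool match_regular(access_flags_t flags) {
  const unsigned forbidden = ACC_STATIC | ACC_SYNCHRONIZED | ACC_NATIVE;
  return (flags & forbidden) == 0u;
}

// Small self-check using ACC_PUBLIC (0x0001) as an unrelated bit.
inline void check_match_regular() {
  assert(match_regular(0x0001));
  assert(!match_regular(0x0001 | ACC_STATIC));
  assert(!match_regular(ACC_NATIVE));
}
```

If the flags ever changed width, only the alias would need updating. The trade-off raised in the reply is that spelling out `u2` at the use site keeps any narrowing visible to `-Wconversion`.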
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22246#discussion_r1906090983 From coleenp at openjdk.org Tue Jan 7 22:06:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Jan 2025 22:06:44 GMT Subject: Integrated: 8339113: AccessFlags can be u2 in metadata In-Reply-To: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> References: <0esPcg-bCT6iGHTebf8WsmbokSuIYUUUe5okCARAX9k=.a86a14d3-8cef-46d5-9887-095ac02a1b6d@github.com> Message-ID: On Tue, 19 Nov 2024 16:18:48 GMT, Coleen Phillimore wrote: > Please review this change that makes AccessFlags and modifier_flags u2 types and removes the last remnants of Hotspot adding internal access flags. This change moves AccessFlags and modifier_flags in Klass to alignment gaps saving 16 bytes. From pahole: so it's a bit better. > > before: > > /* size: 216, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 194, holes: 3, sum holes: 18 */ > > > after: > > /* size: 200, cachelines: 4, members: 25, static members: 17 */ > /* sum members: 188, holes: 4, sum holes: 12 */ > > > We may eventually move the modifiers to java.lang.Class but that's WIP. > > Tested with tier1-7 on oracle platforms. Did test builds on other platforms (please try these changes ppc/arm32 and s390). Also requires minor Graal changes. This pull request has now been integrated. 
Changeset: 098afc8b Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 Stats: 328 lines in 58 files changed: 45 ins; 50 del; 233 mod 8339113: AccessFlags can be u2 in metadata Co-authored-by: Amit Kumar Reviewed-by: sspitsyn, vlivanov, yzheng, dlong, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/22246 From dholmes at openjdk.org Wed Jan 8 00:10:20 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Jan 2025 00:10:20 GMT Subject: RFR: 8347148: [BACKOUT] AccessFlags can be u2 in metadata Message-ID: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> Revert "8339113: AccessFlags can be u2 in metadata" This reverts commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1. The integration of the original change needs to be coordinated with a change to Graal so backing out till that is ready. Thanks ------------- Commit messages: - Revert "8339113: AccessFlags can be u2 in metadata" Changes: https://git.openjdk.org/jdk/pull/22959/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22959&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347148 Stats: 328 lines in 58 files changed: 50 ins; 45 del; 233 mod Patch: https://git.openjdk.org/jdk/pull/22959.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22959/head:pull/22959 PR: https://git.openjdk.org/jdk/pull/22959 From coleenp at openjdk.org Wed Jan 8 00:12:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 00:12:44 GMT Subject: RFR: 8347148: [BACKOUT] AccessFlags can be u2 in metadata In-Reply-To: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> References: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> Message-ID: On Wed, 8 Jan 2025 00:04:31 GMT, David Holmes wrote: > Revert "8339113: AccessFlags can be u2 in metadata" > > This reverts commit 
098afc8b7d0e7caa82999fb9d4e319ea8aed09a1. > > The integration of the original change needs to be coordinated with a change to Graal so backing out till that is ready. > > Thanks Looks good plus urgent. Thanks for doing this, David. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22959#pullrequestreview-2535594292 From dholmes at openjdk.org Wed Jan 8 00:18:34 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Jan 2025 00:18:34 GMT Subject: RFR: 8347148: [BACKOUT] AccessFlags can be u2 in metadata In-Reply-To: References: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> Message-ID: On Wed, 8 Jan 2025 00:10:25 GMT, Coleen Phillimore wrote: >> Revert "8339113: AccessFlags can be u2 in metadata" >> >> This reverts commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1. >> >> The integration of the original change needs to be coordinated with a change to Graal so backing out till that is ready. >> >> Thanks > > Looks good plus urgent. Thanks for doing this, David. Thanks for the review @coleenp ! Just waiting tier1 sanity to complete ------------- PR Comment: https://git.openjdk.org/jdk/pull/22959#issuecomment-2576463303 From dholmes at openjdk.org Wed Jan 8 00:42:38 2025 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Jan 2025 00:42:38 GMT Subject: Integrated: 8347148: [BACKOUT] AccessFlags can be u2 in metadata In-Reply-To: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> References: <8nNvTk0YI5WeE9zzHt5NkoIdW0J5OfxFabGVNVN2saM=.fadfb990-0d5d-4e14-9989-10c422a03ac1@github.com> Message-ID: <9o7YHNAv4mkotoVPqwxDgL_-s1CN9UCrrIihsBVe89o=.b6337941-8549-433d-bda2-fb42c6e5b112@github.com> On Wed, 8 Jan 2025 00:04:31 GMT, David Holmes wrote: > Revert "8339113: AccessFlags can be u2 in metadata" > > This reverts commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1. 
> > The integration of the original change needs to be coordinated with a change to Graal so backing out till that is ready. > > Thanks This pull request has now been integrated. Changeset: 021c4764 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/021c476409c52c65cc7b40516d81dedef040fe83 Stats: 328 lines in 58 files changed: 50 ins; 45 del; 233 mod 8347148: [BACKOUT] AccessFlags can be u2 in metadata Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/22959 From kbarrett at openjdk.org Wed Jan 8 04:25:18 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 8 Jan 2025 04:25:18 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v5] In-Reply-To: References: Message-ID: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. 
> > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo... 
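The mechanism described above can be sketched roughly as follows. This is not the actual FORBID_C_FUNCTION implementation: the macro names `SKETCH_BEGIN_ALLOW` / `SKETCH_END_ALLOW`, the namespace `permit_sketch`, and the function `legacy_length` are all hypothetical. The sketch also poisons a function of its own rather than a real C library function, to sidestep the compiler-specific redeclaration issues the description mentions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library function that should no longer be
// called directly.
std::size_t legacy_length(const char* s) {
  std::size_t n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Step 1: "poison" the function by redeclaring it with [[deprecated]].
// Direct uses after this point produce a deprecation warning during
// semantic analysis (an error if warnings are treated as errors).
[[deprecated("use permit_sketch::legacy_length instead")]]
std::size_t legacy_length(const char* s);

// Step 2: compiler-specific macros that suppress the deprecation warning
// in a delimited region.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_BEGIN_ALLOW \
  _Pragma("GCC diagnostic push") \
  _Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
#define SKETCH_END_ALLOW _Pragma("GCC diagnostic pop")
#elif defined(_MSC_VER)
#define SKETCH_BEGIN_ALLOW __pragma(warning(push)) __pragma(warning(disable : 4996))
#define SKETCH_END_ALLOW __pragma(warning(pop))
#else
#define SKETCH_BEGIN_ALLOW
#define SKETCH_END_ALLOW
#endif

// Step 3: a wrapper that is allowed to call the poisoned function.
namespace permit_sketch {
SKETCH_BEGIN_ALLOW
inline std::size_t legacy_length(const char* s) { return ::legacy_length(s); }
SKETCH_END_ALLOW
}  // namespace permit_sketch

// Callers go through the wrapper, so no warning is emitted here.
inline void sketch_demo() {
  assert(permit_sketch::legacy_length("jdk") == 3);
}
```

In this sketch a direct call such as `legacy_length("x")` placed after the redeclaration would be diagnosed, while calls through `permit_sketch::legacy_length` are not. The same begin/end pair could in principle bracket an `#include` of third-party headers, which is the use described for the BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS macros.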
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: aix permit patches ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/c478bda1..b774f14c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=03-04 Stats: 18 lines in 4 files changed: 5 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From kbarrett at openjdk.org Wed Jan 8 04:27:37 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 8 Jan 2025 04:27:37 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v4] In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 15:18:47 GMT, Martin Doerr wrote: > Thanks! That solved the issue. Now, we need to adapt all usages in the low level AIX code and support `strdup`: Not surprising that there are a few of these, since the aix-specific code wasn't previously under the forbidden-function regime. I've applied most of the provided patches. (And thanks for providing them.) The exception is the patches related to strdup. There's only one use, in loadlib_aix.cpp. Rather than the patch to add a permit wrapper for strdup (affecting all other code too) and use that here, I'd rather just suppress the warning explicitly here. I filed an enhancement to remove the use of ::strdup. https://bugs.openjdk.org/browse/JDK-8347157 I also filed this after looking for strdup uses in aix code. https://bugs.openjdk.org/browse/JDK-8347143 It's about a strange use of os::strdup, so doesn't affect this PR. > It may be possible to use some `os::` versions, but that would need a careful review. I think it's safer to keep it this way for now. @JoKern65: You may want to take a look, too. 
I'll leave it to you folks to file any additional issues related to using permit bypass functions instead of the relevant os:: functions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2576721514 From kbarrett at openjdk.org Wed Jan 8 06:20:38 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 8 Jan 2025 06:20:38 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> Message-ID: <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> On Mon, 6 Jan 2025 01:35:54 GMT, David Holmes wrote: >> If I was going to go that route, I'd be more inclined toward >> >> #if defined(_WINDOWS) >> #include "forbiddenFunctions_windows.hpp" >> #else >> #include "forbiddenFunctions_posix.hpp" >> #endif >> >> rather than list all (or maybe just a subset) of the posix-like ports. My >> inclination is to leave it as-is with OS_HEADERS, since I think that's the >> "intended" idiom. > > Overall I like this change. I appreciate the effort that has been put in to try and find an elegant solution to this problem. > > but having OS specific files created just to include the posix version runs counter to why we have the posix variants in the first place IMO. Please select one of the above approaches so that the new aix/bsd/linux specific files can be removed in favour of the posix one. Thanks. I disagree. It seems to me that breaking the abstraction like that is just asking for trouble.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1906533984 From mdoerr at openjdk.org Wed Jan 8 11:17:37 2025 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 8 Jan 2025 11:17:37 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v5] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 04:25:18 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) 
>> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > aix permit patches Thanks! The AIX build was successful with the recent changes. Regarding [JDK-8347157](https://bugs.openjdk.org/browse/JDK-8347157), I don't think we should use `os::strdup` in `StringList` because the class is designed to use raw malloc. I guess @JoKern65 can take a look at the 2 RFEs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2577419121 From jsjolen at openjdk.org Wed Jan 8 12:11:49 2025 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 8 Jan 2025 12:11:49 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v5] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 04:25:18 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. 
poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. 
This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > aix permit patches If this passes testing and David's comments are fixed, then I'm very happy with this PR as it is. Thank you for this. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22890#pullrequestreview-2536977481 From thartmann at openjdk.org Wed Jan 8 12:12:32 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 12:12:32 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic Message-ID: C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 If these guards pass, the array length is loaded: https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 But since the `LoadRangeNode` is not pinned, it might float above the array guard: https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. Thanks to @shipilev for identifying the root cause! Best regards, Tobias ------------- Commit messages: - 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic Changes: https://git.openjdk.org/jdk/pull/22967/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347006 Stats: 10 lines in 2 files changed: 8 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22967.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22967/head:pull/22967 PR: https://git.openjdk.org/jdk/pull/22967 From roland at openjdk.org Wed Jan 8 12:19:41 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 8 Jan 2025 12:19:41 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
> > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers` on Linux AArch64 and verified that the fix solves the problem. > > Best regards, > Tobias Looks good to me. src/hotspot/share/opto/library_call.cpp line 5920: > 5918: // Keep track of the information that src/dest are arrays to prevent below array specific accesses from floating above. > 5919: generate_non_array_guard(load_object_klass(src), slow_region); > 5920: const Type* tary = TypeAryPtr::make(TypePtr::BotPTR, TypeAry::make(Type::BOTTOM, TypeInt::POS), nullptr, false, Type::OffsetBot); Is this never used elsewhere? Should it be a static field in `TypeAryPtr` same as `TypeAryPtr::BYTES` and friends? ------------- Marked as reviewed by roland (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22967#pullrequestreview-2536996400 PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907095997 From shade at openjdk.org Wed Jan 8 12:29:49 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 8 Jan 2025 12:29:49 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. > > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers` on Linux AArch64 and verified that the fix solves the problem. 
> > Best regards, > Tobias I have a question about the test :) test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyNoInit.java line 32: > 30: * compiler.arraycopy.TestArrayCopyNoInit > 31: * @run main/othervm -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:TypeProfileLevel=020 > 32: * -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders -XX:-UseTLAB You have been able to reproduce this with `-UseCompressedClassPointers`, right? If so, I'd suggest we do a run config with `-UseCCP` instead of `+UseCOH`, because this gives us a cleaner way for backports, if we need one later. ------------- PR Review: https://git.openjdk.org/jdk/pull/22967#pullrequestreview-2537002862 PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907099990 From thartmann at openjdk.org Wed Jan 8 12:29:49 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 12:29:49 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
> > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers` on Linux AArch64 and verified that the fix solves the problem. > > Best regards, > Tobias Thanks for the review, Roland! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2577545127 From roland at openjdk.org Wed Jan 8 12:29:50 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 8 Jan 2025 12:29:50 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:23:07 GMT, Tobias Hartmann wrote: >> src/hotspot/share/opto/library_call.cpp line 5920: >> >>> 5918: // Keep track of the information that src/dest are arrays to prevent below array specific accesses from floating above. >>> 5919: generate_non_array_guard(load_object_klass(src), slow_region); >>> 5920: const Type* tary = TypeAryPtr::make(TypePtr::BotPTR, TypeAry::make(Type::BOTTOM, TypeInt::POS), nullptr, false, Type::OffsetBot); >> >> Is this never used elsewhere? Should it be a static field in `TypeAryPtr` same as `TypeAryPtr::BYTES` and friends? > > I wondered as well and no, we don't use this type anywhere else (the closest would be `TypeAryPtr::RANGE`). We only create it when meeting arrays of primitive and non-primitive element type. Do you think this should still go to `TypeAryPtr::*`? I would add it to `TypeAryPtr` (maybe as `TypeAryPtr::BOTTOM`). The main benefit I see is that the new code would be more readable if it referred to `TypeAryPtr::BOTTOM` rather than the long type creation expression.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907107453 From thartmann at openjdk.org Wed Jan 8 12:29:50 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 12:29:50 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:17:05 GMT, Roland Westrelin wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > src/hotspot/share/opto/library_call.cpp line 5920: > >> 5918: // Keep track of the information that src/dest are arrays to prevent below array specific accesses from floating above. 
>> 5919: generate_non_array_guard(load_object_klass(src), slow_region); >> 5920: const Type* tary = TypeAryPtr::make(TypePtr::BotPTR, TypeAry::make(Type::BOTTOM, TypeInt::POS), nullptr, false, Type::OffsetBot); > > Is this never used elsewhere? Should it a static field in `TypeAryPtr` same as `TypeAryPtr::BYTES` and friends? I wondered as well and no, we don't use this type anywhere else (the closest would be `TypeAryPtr::RANGE`). We only create it when meeting arrays of primitive and non-primitive element type. Do you think this should still go to `TypeAryPtr::*`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907103145 From thartmann at openjdk.org Wed Jan 8 12:29:51 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 12:29:51 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:20:24 GMT, Aleksey Shipilev wrote: > You have been able to reproduce this with -UseCompressedClassPointers, right? No, I was never able to reproduce this with `-XX:-UseCompressedClassPointers`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907103924 From shade at openjdk.org Wed Jan 8 12:29:51 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 8 Jan 2025 12:29:51 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: <3d7ezYP3-By8ArAoM6IkKGH8XHZ4VNVcviiMzMy2EPQ=.d82874c8-40e0-42ce-808c-527e44aac2dc@github.com> On Wed, 8 Jan 2025 12:23:50 GMT, Tobias Hartmann wrote: >> test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyNoInit.java line 32: >> >>> 30: * compiler.arraycopy.TestArrayCopyNoInit >>> 31: * @run main/othervm -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:TypeProfileLevel=020 >>> 32: * -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders -XX:-UseTLAB >> >> You have been able to reproduce this with `-UseCompressedClassPointers`, right? If so, I'd suggest we do a run config with `-UseCCP` instead of `+UseCOH`, because this gives us a cleaner way for backports, if we need one later. > >> You have been able to reproduce this with -UseCompressedClassPointers, right? > > No, I was never able to reproduce this with `-XX:-UseCompressedClassPointers`. OK, I was confused by this in PR body then: > I was able to reliably reproduce the issue with compiler/arraycopy/TestArrayCopyNoInit.java and -XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers on Linux AArch64 and verified that the fix solves the problem. But fine, if it reproduces with +UCOH, let it be there. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907107122 From thartmann at openjdk.org Wed Jan 8 12:48:43 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 12:48:43 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: <3d7ezYP3-By8ArAoM6IkKGH8XHZ4VNVcviiMzMy2EPQ=.d82874c8-40e0-42ce-808c-527e44aac2dc@github.com> References: <3d7ezYP3-By8ArAoM6IkKGH8XHZ4VNVcviiMzMy2EPQ=.d82874c8-40e0-42ce-808c-527e44aac2dc@github.com> Message-ID: On Wed, 8 Jan 2025 12:26:28 GMT, Aleksey Shipilev wrote: >>> You have been able to reproduce this with -UseCompressedClassPointers, right? >> >> No, I was never able to reproduce this with `-XX:-UseCompressedClassPointers`. > > OK, I was confused by this in PR body then: > >> I was able to reliably reproduce the issue with compiler/arraycopy/TestArrayCopyNoInit.java and -XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers on Linux AArch64 and verified that the fix solves the problem. > > But fine, if it reproduces with +UCOH, let it be there. Ah, that's actually a typo, good catch. Should be `-XX:+UseCompactObjectHeaders`. I'll fix it in the description. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907130502

From shade at openjdk.org Wed Jan 8 12:52:35 2025
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 8 Jan 2025 12:52:35 GMT
Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic
In-Reply-To:
References:
Message-ID:

On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote:

> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays:
> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919
>
> If these guards pass, the array length is loaded:
> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933
>
> But since the `LoadRangeNode` is not pinned, it might float above the array guard:
> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214
>
> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash.
>
> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating.
>
> Thanks to @shipilev for identifying the root cause!
>
> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem.
>
> Best regards,
> Tobias

I think the bug should target 25, and then we backport it to 24.
------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2577597242 From thartmann at openjdk.org Wed Jan 8 13:08:52 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 13:08:52 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. > > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. 
> > Best regards, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Added missing stopped checks, refactoring and updated copyright dates ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22967/files - new: https://git.openjdk.org/jdk/pull/22967/files/425bbb6a..5c0292a8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=00-01 Stats: 12 lines in 3 files changed: 6 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/22967.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22967/head:pull/22967 PR: https://git.openjdk.org/jdk/pull/22967 From thartmann at openjdk.org Wed Jan 8 13:08:54 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 13:08:54 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
>
> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating.
>
> Thanks to @shipilev for identifying the root cause!
>
> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem.
>
> Best regards,
> Tobias

I updated the fix according to Roland's suggestions and also added the missing stopped checks (otherwise, the Cast becomes TOP and the code below does not handle that).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2577627210

From thartmann at openjdk.org Wed Jan 8 13:08:53 2025
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Wed, 8 Jan 2025 13:08:53 GMT
Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic
In-Reply-To:
References:
Message-ID:

On Wed, 8 Jan 2025 12:50:14 GMT, Aleksey Shipilev wrote:

> I think the bug should target 25, and then we backport it to 24.

Since we should fix the issue in JDK 24, the bug should remain targeted to JDK 24. The Skara bot will then take care of updating the fix version to JDK 25 and creating a backport to JDK 24 once we push this into master.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2577622654

From thartmann at openjdk.org Wed Jan 8 13:08:54 2025
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Wed, 8 Jan 2025 13:08:54 GMT
Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 8 Jan 2025 12:26:44 GMT, Roland Westrelin wrote:

>> I wondered as well and no, we don't use this type anywhere else (the closest would be `TypeAryPtr::RANGE`). We only create it when meeting arrays of primitive and non-primitive element type. Do you think this should still go to `TypeAryPtr::*`?
> > I would add it to `TypeAryPtr` (maybe as `TypeAryPtr::BOTTOM`) . The main benefit I see is that the new code would be more readable if it referred to `TypeAryPtr::BOTTOM` rather than the long type creation expression. Sounds good, I'll update the patch accordingly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907151707 From roland at openjdk.org Wed Jan 8 13:15:43 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 8 Jan 2025 13:15:43 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:08:52 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! 
>> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Added missing stopped checks, refactoring and updated copyright dates src/hotspot/share/opto/library_call.cpp line 5920: > 5918: // Keep track of the information that src/dest are arrays to prevent below array specific accesses from floating above. > 5919: generate_non_array_guard(load_object_klass(src), slow_region); > 5920: if (!stopped()) { Shouldn't we simply return then? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907160417 From thartmann at openjdk.org Wed Jan 8 13:15:43 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 13:15:43 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:10:51 GMT, Roland Westrelin wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Added missing stopped checks, refactoring and updated copyright dates > > src/hotspot/share/opto/library_call.cpp line 5920: > >> 5918: // Keep track of the information that src/dest are arrays to prevent below array specific accesses from floating above. >> 5919: generate_non_array_guard(load_object_klass(src), slow_region); >> 5920: if (!stopped()) { > > Shouldn't we simply return then? But we need to set up the `slow_region` path, right? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907162736 From coleenp at openjdk.org Wed Jan 8 13:26:49 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 13:26:49 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> Message-ID: On Wed, 8 Jan 2025 06:18:01 GMT, Kim Barrett wrote: >> Overall I like this change. I appreciate the effort that has been put in to try and find an elegant solution to this problem. >> >> but having OS specific files created just to include the posix version runs counter to why we have the posix variants in the first place IMO. Please select one of the above approaches so that the new aix/bsd/linux specific files can be removed in favour of the posix one. Thanks. > > I disagree. It seems to me that breaking the abstraction like that is just asking for trouble. Yes, I think it's fine to say !WINDOWS instead of listing all the posix ports. We've been avoiding dispatch files and once it reaches a threshold of too many #ifdef !WINDOWS #include posix.hpp one, then we could add another macro like the OS_CPU one. Also the copyright script added 2025 for these because they started with 2024 so it's sort of a bug in the script but not really solvable because it doesn't know that you didn't check this in in 2024. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1907176766 From roland at openjdk.org Wed Jan 8 13:39:39 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 8 Jan 2025 13:39:39 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:12:37 GMT, Tobias Hartmann wrote: > But we need to set up the `slow_region` path, right? By that you mean have the `slow_region` feed into the uncommon trap that's only created later. It does feel weird that we know we have reached a dead end and we keep trying to add stuff, but ok then. The other thing is shouldn't the cast be added in `generate_non_array_guard()`? I see it's used elsewhere (`LibraryCallKit::inline_native_getLength()`): couldn't the same bug occur there? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907195835 From coleenp at openjdk.org Wed Jan 8 13:43:52 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 13:43:52 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata Message-ID: This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. A tier1 sanity test in progress. 
------------- Commit messages: - 8347147: [REDO] AccessFlags can be u2 in metadata Changes: https://git.openjdk.org/jdk/pull/22968/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22968&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347147 Stats: 328 lines in 58 files changed: 45 ins; 50 del; 233 mod Patch: https://git.openjdk.org/jdk/pull/22968.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22968/head:pull/22968 PR: https://git.openjdk.org/jdk/pull/22968 From thartmann at openjdk.org Wed Jan 8 13:56:46 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 13:56:46 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:37:31 GMT, Roland Westrelin wrote: > By that you mean have the slow_region feed into the uncommon trap that's only created later. Right. It's a bit weird but probably still the best solution in terms of complexity. The trap will lead to recompilation and then the `too_many_traps` check will trigger. > The other thing is shouldn't the cast be added in generate_non_array_guard()? I see it's used elsewhere (LibraryCallKit::inline_native_getLength()): couldn't the same bug occur there? Right, good catch. I think the use in `LibraryCallKit::inline_native_getLength` has the same problem. We can't easily put the cast into `generate_non_array_guard` though because it operates on the Klass and not on the object. The other `generate*array*guard` methods potentially have the same issue but current uses look good. I guess it's best to fix the `LibraryCallKit::inline_native_getLength` as well, i.e., make it the caller's responsibility to add a cast. What do you think? 
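
As an illustration only, the design being weighed here (the guard hands back the narrowed object, so that every array-specific read depends on it) can be modelled outside of C2. The following is a toy C++17 sketch, not HotSpot code: `Obj`, `ArrayView`, `ArrayView::guard` and `checked_length` are hypothetical names, and the `is_array` flag merely stands in for the klass check.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Toy object model (an assumption for this sketch, not HotSpot's layout):
// only arrays carry a meaningful length field.
struct Obj {
    bool is_array;       // stands in for the klass-based array check
    std::size_t length;  // valid only when is_array is true
};

// Narrowed view of an object that is known to be an array. It can only be
// obtained through guard(), so any length read depends on the check having
// passed. This mirrors, loosely, a cast node that the load must use as input.
class ArrayView {
public:
    static std::optional<ArrayView> guard(const Obj& obj) {
        if (!obj.is_array) {
            return std::nullopt;  // caller must take the slow path
        }
        return ArrayView(&obj);
    }

    std::size_t length() const { return obj_->length; }

private:
    explicit ArrayView(const Obj* obj) : obj_(obj) {}
    const Obj* obj_;
};

// Returns the array length, or -1 to signal the slow path for non-arrays.
long checked_length(const Obj& obj) {
    std::optional<ArrayView> arr = ArrayView::guard(obj);
    if (!arr) {
        return -1;
    }
    // The read goes through the narrowed view, never through obj directly.
    return static_cast<long>(arr->length());
}
```

In this model a caller cannot forget the narrowing step, because `length()` is not reachable from a plain `Obj`. Whether the same shape fits `generate_non_array_guard`, which operates on the klass rather than the object, is exactly the open question in the thread.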
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907220074 From thartmann at openjdk.org Wed Jan 8 14:03:37 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 14:03:37 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:53:20 GMT, Tobias Hartmann wrote: >>> But we need to set up the `slow_region` path, right? >> >> By that you mean have the `slow_region` feed into the uncommon trap that's only created later. It does feel weird that we know we have reached a dead end and we keep trying to add stuff, but ok then. >> The other thing is shouldn't the cast be added in `generate_non_array_guard()`? I see it's used elsewhere (`LibraryCallKit::inline_native_getLength()`): couldn't the same bug occur there? > >> By that you mean have the slow_region feed into the uncommon trap that's only created later. > > Right. It's a bit weird but probably still the best solution in terms of complexity. The trap will lead to recompilation and then the `too_many_traps` check will trigger. > >> The other thing is shouldn't the cast be added in generate_non_array_guard()? I see it's used elsewhere (LibraryCallKit::inline_native_getLength()): couldn't the same bug occur there? > > Right, good catch. I think the use in `LibraryCallKit::inline_native_getLength` has the same problem. We can't easily put the cast into `generate_non_array_guard` though because it operates on the Klass and not on the object. The other `generate*array*guard` methods potentially have the same issue but current uses look good. I guess it's best to fix the `LibraryCallKit::inline_native_getLength` as well, i.e., make it the caller's responsibility to add a cast. What do you think? 
Hmm, maybe `inline_getObjectSize` is affected as well: https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L8535-L8543 And `LibraryCallKit::inline_native_clone` as well: https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5257-L5262 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907228186 From epeter at openjdk.org Wed Jan 8 14:13:40 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 8 Jan 2025 14:13:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> References: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> Message-ID: On Mon, 16 Dec 2024 14:19:49 GMT, Jatin Bhateja wrote: >>> > Can you quickly summarize what tests you have, and what they test? >>> >>> Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > >> > > Can you quickly summarize what tests you have, and what they test? >> > >> > >> > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. 
Otherwise I have to correlate everything myself, and that will take me hours.
>
> Validation details:
>
> A) x86 backend changes
>    - new assembler instruction
>    - macro assembly routines
>    Test point: test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java
>    - This test is based on the TestNG framework and includes new DataProviders to generate test vectors.
>    - Test vectors cover the entire float16 value range and also special floating-point values (NaN, +Inf, -Inf, 0.0 and -0.0).
> B) GVN transformations
>    - Value transforms
>      Test point: test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java
>      - Covers all the constant folding scenarios for add, sub, mul, div, sqrt, fma, min, and max operations addressed by this patch.
>      - It also tests special-case scenarios for each operation as specified by the Java Language Specification.
>    - Identity transforms
>      Test point: test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java
>      - Covers identity transformation for ReinterpretS2HFNode, DivHFNode
>    - Idealization transforms
>      Test points: test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java
>                   test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java
>      - Contains test point for the following transform
>          MulHF idealization i.e. MulHF * 2 => AddHF
>      - Contains test point for the following transform
>          DivHF SRC , PoT(constant) => MulHF SRC * reciprocal (constant)
>      - Contains idealization test points for the following transform
>          ConvF2HF(FP32BinOp(ConvHF2F(x), ConvHF2F(y))) =>
>          ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y)))

@jatin-bhateja Is this ready for another review pass?
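
For readers unfamiliar with the two arithmetic rewrites named in the quoted summary (multiplication by 2 becoming an addition, and division by a power-of-two constant becoming multiplication by its reciprocal), the sketch below checks the underlying identities. It is an assumption-laden illustration: it uses C++ `float` as a stand-in for Float16, relies on IEEE 754 arithmetic, and the helper names are hypothetical. It is not how C2 or the listed jtreg tests implement or verify the transforms.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Raw bit pattern of a float, so that +0.0f and -0.0f compare as different.
static std::uint32_t bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Two results count as the same if both are NaN, or if they are bitwise equal.
bool same_value(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::isnan(a) && std::isnan(b);
    }
    return bits(a) == bits(b);
}

// x * 2 and x + x are both the correctly rounded value of 2x.
bool mul2_matches_add(float x) {
    return same_value(x * 2.0f, x + x);
}

// Dividing by 8 and multiplying by 0.125 round the same real number,
// because 0.125 (the reciprocal of a power of two) is exactly representable.
bool div_pot_matches_mul(float x) {
    return same_value(x / 8.0f, x * 0.125f);
}
```

The reciprocal rewrite is limited to power-of-two constants for a reason this sketch makes visible: for a divisor such as 3, the reciprocal is not exactly representable, so the two expressions can round differently.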
------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2577768041 From roland at openjdk.org Wed Jan 8 14:16:37 2025 From: roland at openjdk.org (Roland Westrelin) Date: Wed, 8 Jan 2025 14:16:37 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:59:00 GMT, Tobias Hartmann wrote: >>> By that you mean have the slow_region feed into the uncommon trap that's only created later. >> >> Right. It's a bit weird but probably still the best solution in terms of complexity. The trap will lead to recompilation and then the `too_many_traps` check will trigger. >> >>> The other thing is shouldn't the cast be added in generate_non_array_guard()? I see it's used elsewhere (LibraryCallKit::inline_native_getLength()): couldn't the same bug occur there? >> >> Right, good catch. I think the use in `LibraryCallKit::inline_native_getLength` has the same problem. We can't easily put the cast into `generate_non_array_guard` though because it operates on the Klass and not on the object. The other `generate*array*guard` methods potentially have the same issue but current uses look good. I guess it's best to fix the `LibraryCallKit::inline_native_getLength` as well, i.e., make it the caller's responsibility to add a cast. What do you think? > > Hmm, maybe `inline_getObjectSize` is affected as well: > > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L8535-L8543 > > And `LibraryCallKit::inline_native_clone` as well: > > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5257-L5262 > I guess it's best to fix the `LibraryCallKit::inline_native_getLength` as well, i.e., make it the caller's responsibility to add a cast. What do you think? Maybe the methods need to take an extra parameter (the object to cast)? 
Having the cast in the method would lead to less code duplication and a lower risk of forgetting the cast when new calls of the method are added so that's what I would go with unless it's really a pain. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907249957 From coleenp at openjdk.org Wed Jan 8 14:49:35 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 14:49:35 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata In-Reply-To: References: Message-ID: <7NzZ7Lj7Aapsnt9zTgcJ7KUzLWNHNiskGiePGkpQIxU=.7c6779dd-6194-4a20-a598-c65bffff1c89@github.com> On Wed, 8 Jan 2025 13:38:09 GMT, Coleen Phillimore wrote: > This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. > > A tier1 sanity test complete. PR for original review here: https://github.com/openjdk/jdk/pull/22246 ------------- PR Comment: https://git.openjdk.org/jdk/pull/22968#issuecomment-2577853778 From thartmann at openjdk.org Wed Jan 8 15:55:49 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 8 Jan 2025 15:55:49 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 14:13:32 GMT, Roland Westrelin wrote: >> Hmm, maybe `inline_getObjectSize` is affected as well: >> >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L8535-L8543 >> >> And `LibraryCallKit::inline_native_clone` as well: >> >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5257-L5262 > >> I guess it's best to fix the `LibraryCallKit::inline_native_getLength` as well, i.e., make it the caller's responsibility to add a cast. What do you think? > > Maybe the methods need to take an extra parameter (the object to cast)? 
> Having the cast in the method would lead to less code duplication and a lower risk of forgetting the cast when new calls of the method are added so that's what I would go with unless it's really a pain. Right, I'll give that a try. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907411409 From qamai at openjdk.org Wed Jan 8 16:17:04 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Wed, 8 Jan 2025 16:17:04 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:08:52 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. 
>>
>> Best regards,
>> Tobias
>
> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added missing stopped checks, refactoring and updated copyright dates

src/hotspot/share/opto/library_call.cpp line 5921:

> 5919:   generate_non_array_guard(load_object_klass(src), slow_region);
> 5920:   if (!stopped()) {
> 5921:     src = _gvn.transform(new CheckCastPPNode(control(), src, TypeAryPtr::BOTTOM));

Why is this a `CheckCastPP` and not a `CastPP`? My understanding is that a `CheckCastPP` is used when we force changing the type of a node (e.g. a raw pointer of `Allocate` into a typed pointer), so we do not join the type of the input with that of the output.

src/hotspot/share/opto/type.hpp line 1476:

> 1474:
> 1475:   // Convenience common pre-built types.
> 1476:   static const TypeAryPtr* BOTTOM;

While you are here, it may be better to change the other constants to `TypeAryPtr*` instead of `TypeAryPtr *`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907435011
PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1907440010

From coleenp at openjdk.org Wed Jan 8 16:40:59 2025
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Wed, 8 Jan 2025 16:40:59 GMT
Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata [v2]
In-Reply-To:
References:
Message-ID: <8tNmyDP0TIeRCdxsXomkLvU5Xqq3ACegEUHDPe-jPSc=.df292c1b-f901-4068-8915-c501d4ee8ab1@github.com>

> This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes.
>
> A tier1 sanity test complete.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  Fix some more triggerUnloading() in tests.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22968/files - new: https://git.openjdk.org/jdk/pull/22968/files/71419187..9b2abf9c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22968&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22968&range=00-01 Stats: 13 lines in 3 files changed: 5 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/22968.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22968/head:pull/22968 PR: https://git.openjdk.org/jdk/pull/22968 From coleenp at openjdk.org Wed Jan 8 16:43:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 16:43:51 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata [v3] In-Reply-To: References: Message-ID: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> > This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. > > A tier1 sanity test complete. Coleen Phillimore has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. 
The pull request now contains one commit: 8347147: [REDO] AccessFlags can be u2 in metadata ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22968/files - new: https://git.openjdk.org/jdk/pull/22968/files/9b2abf9c..71419187 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22968&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22968&range=01-02 Stats: 13 lines in 3 files changed: 0 ins; 5 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/22968.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22968/head:pull/22968 PR: https://git.openjdk.org/jdk/pull/22968 From vlivanov at openjdk.org Wed Jan 8 19:17:11 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 8 Jan 2025 19:17:11 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata [v3] In-Reply-To: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> References: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> Message-ID: On Wed, 8 Jan 2025 16:43:51 GMT, Coleen Phillimore wrote: >> This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. >> >> A tier1 sanity test complete. > > Coleen Phillimore has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. The pull request now contains one commit: > > 8347147: [REDO] AccessFlags can be u2 in metadata Reviewed. ------------- Marked as reviewed by vlivanov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22968#pullrequestreview-2537996774 From yzheng at openjdk.org Wed Jan 8 19:50:44 2025 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 8 Jan 2025 19:50:44 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata [v3] In-Reply-To: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> References: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> Message-ID: On Wed, 8 Jan 2025 16:43:51 GMT, Coleen Phillimore wrote: >> This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. >> >> A tier1 sanity test complete. > > Coleen Phillimore has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. The pull request now contains one commit: > > 8347147: [REDO] AccessFlags can be u2 in metadata Marked as reviewed by yzheng (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22968#pullrequestreview-2538071009 From coleenp at openjdk.org Wed Jan 8 19:50:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 19:50:44 GMT Subject: RFR: 8347147: [REDO] AccessFlags can be u2 in metadata [v3] In-Reply-To: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> References: <3FLD6UOms5vFKQ7Mkm61hJrSBlppwtY5IwZo40Vyly8=.1070a4eb-ecc4-4158-9c32-0982dd8612ed@github.com> Message-ID: On Wed, 8 Jan 2025 16:43:51 GMT, Coleen Phillimore wrote: >> This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. >> >> A tier1 sanity test complete. > > Coleen Phillimore has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available. 
The pull request now contains one commit: > > 8347147: [REDO] AccessFlags can be u2 in metadata Thank you for the re-review Vladimir and Yudi. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22968#issuecomment-2578500217 From coleenp at openjdk.org Wed Jan 8 19:50:45 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 8 Jan 2025 19:50:45 GMT Subject: Integrated: 8347147: [REDO] AccessFlags can be u2 in metadata In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 13:38:09 GMT, Coleen Phillimore wrote: > This is the same change that I pushed as commit 098afc8b7d0e7caa82999fb9d4e319ea8aed09a1 but now coordinated with @mur47x111 for the graal changes. > > A tier1 sanity test complete. This pull request has now been integrated. Changeset: 6ee2bd2f Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/6ee2bd2f33e38c13f93fba9953b33850828d031b Stats: 328 lines in 58 files changed: 45 ins; 50 del; 233 mod 8347147: [REDO] AccessFlags can be u2 in metadata Co-authored-by: Amit Kumar Reviewed-by: vlivanov, yzheng ------------- PR: https://git.openjdk.org/jdk/pull/22968 From dholmes at openjdk.org Thu Jan 9 00:50:39 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Jan 2025 00:50:39 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> Message-ID: <4EpMWy1cYrxFRM28fNjJwB4_UXwfLR9Y5hFhjXWeVBk=.4da7362c-2384-44ca-b8a4-1e001aaf35f6@github.com> On Wed, 8 Jan 2025 13:23:29 GMT, Coleen Phillimore wrote: >> I disagree. It seems to me that breaking the abstraction like that is just asking for trouble. > > Yes, I think it's fine to say !WINDOWS instead of listing all the posix ports. 
We've been avoiding dispatch files and once it reaches a threshold of too many #ifdef !WINDOWS #include posix.hpp one, then we could add another macro like the OS_CPU one. > > Also the copyright script added 2025 for these because they started with 2024 so it's sort of a bug in the script but not really solvable because it doesn't know that you didn't check this in in 2024. > It seems to me that breaking the abstraction like that is just asking for trouble. Sorry but what "abstraction" are you referring to? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908044703 From dholmes at openjdk.org Thu Jan 9 01:21:15 2025 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Jan 2025 01:21:15 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Tue, 7 Jan 2025 09:13:06 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/interpreter/abstractInterpreter.cpp line 137: >> >>> 135: case vmIntrinsics::_floatToRawIntBits: return java_lang_Float_floatToRawIntBits; >>> 136: case vmIntrinsics::_longBitsToDouble: return java_lang_Double_longBitsToDouble; >>> 137: case vmIntrinsics::_doubleToRawLongBits: return java_lang_Double_doubleToRawLongBits; >> >> Why are these intrinsics for the Java methods disappearing? > > These are interpreter "intrinsics" that are only implemented on x86_32 to handle x87 FPU pecularities. Look around for `TemplateInterpreterGenerator::generate_Float_intBitsToFloat_entry`, for example. Hmmm ... okay ... I see something "special" is done only on x86_32, but what is done seems to have nothing to do with x87 code. Just to be clear these Java methods still get intrinsified, it is just handled in a different way - right? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1908061750 From kbarrett at openjdk.org Thu Jan 9 06:34:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 06:34:48 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <4EpMWy1cYrxFRM28fNjJwB4_UXwfLR9Y5hFhjXWeVBk=.4da7362c-2384-44ca-b8a4-1e001aaf35f6@github.com> References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> <4EpMWy1cYrxFRM28fNjJwB4_UXwfLR9Y5hFhjXWeVBk=.4da7362c-2384-44ca-b8a4-1e001aaf35f6@github.com> Message-ID: On Thu, 9 Jan 2025 00:46:51 GMT, David Holmes wrote: >> Yes, I think it's fine to say !WINDOWS instead of listing all the posix ports. We've been avoiding dispatch files and once it reaches a threshold of too many #ifdef !WINDOWS #include posix.hpp one, then we could add another macro like the OS_CPU one. >> >> Also the copyright script added 2025 for these because they started with 2024 so it's sort of a bug in the script but not really solvable because it doesn't know that you didn't check this in in 2024. > >> It seems to me that breaking the abstraction like that is just asking for trouble. > > Sorry but what "abstraction" are you referring to? @dholmes-ora - OS_HEADER and CPU_HEADER instead of explicit knowledge of what platforms exist and which ones share some code in a "_posix" file and which ones have some unshared code. IMO the includer shouldn't need to know that kind of implementation detail. But oh well, all y'all seem to really hate the simple forwarding files and would prefer to skip that in favor of hard-coding the file organization. In the interest of making progress, I'll do that. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908229538 From kbarrett at openjdk.org Thu Jan 9 06:34:49 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 06:34:49 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> Message-ID: On Wed, 8 Jan 2025 13:23:29 GMT, Coleen Phillimore wrote: >> I disagree. It seems to me that breaking the abstraction like that is just asking for trouble. > > Yes, I think it's fine to say !WINDOWS instead of listing all the posix ports. We've been avoiding dispatch files and once it reaches a threshold of too many #ifdef !WINDOWS #include posix.hpp one, then we could add another macro like the OS_CPU one. > > Also the copyright script added 2025 for these because they started with 2024 so it's sort of a bug in the script but not really solvable because it doesn't know that you didn't check this in in 2024. @coleenp The copyrights on the new files being 2024-2025 is intentional. They were first published (in this PR) in 2024, so I think are supposed to have that starting year. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908231286 From kbarrett at openjdk.org Thu Jan 9 07:08:37 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 07:08:37 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v4] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 04:24:46 GMT, Kim Barrett wrote: > I filed an enhancement to remove the use of ::strdup. https://bugs.openjdk.org/browse/JDK-8347157 Per discussion in that bug, it's been closed as not an issue, and I'm changing the code here to add and use the permit wrapper for strdup. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2579311852 From shade at openjdk.org Thu Jan 9 09:38:38 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 9 Jan 2025 09:38:38 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 01:17:49 GMT, David Holmes wrote: >> These are interpreter "intrinsics" that are only implemented on x86_32 to handle x87 FPU peculiarities. Look around for `TemplateInterpreterGenerator::generate_Float_intBitsToFloat_entry`, for example. > > Hmmm ... okay ... I see something "special" is done only on x86_32, but what is done seems to have nothing to do with x87 code. > > Just to be clear these Java methods still get intrinsified, it is just handled in a different way - right? It *is* about x87 handling of NaNs, a common problem for x86_32 code in HotSpot; you can read about this mess in [JDK-8076373](https://bugs.openjdk.org/browse/JDK-8076373) if you are interested. If we allowed the native implementations of these conversion methods to be used, we would get into trouble with NaNs. What these interpreter intrinsics do on x86_32 is use SSE if available, thus avoiding x87. Since this is a correctness problem, these intrinsics go all the way down to the interpreter as well. There is still a gaping hole when SSE is not available, but then we have no choice but to use x87 and live with all the relevant issues. But all of this is only a headache for x86_32; no other platform implements these interpreter intrinsics. With x86_32 going away, we can finally yank these and the relevant scaffolding out.
The C1/C2 intrinsics are still up and enabled for supported platforms: those are for performance :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1908436596 From kbarrett at openjdk.org Thu Jan 9 10:02:15 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 10:02:15 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v6] In-Reply-To: References: Message-ID: <4FG-nUwG3rnddQFgdiZ-vtnGqtm_Ij0LbmQ6nX96fdw=.8b63229c-0686-43bd-b783-9f6402ee0b0d@github.com> > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. 
That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros is used to demark the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo...
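The deprecation-based poisoning and the permit-wrapper idea described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code from the PR: `legacy_length`, `call_legacy_length`, `sketch_self_check` and the `permit_sketch` namespace are hypothetical names, and a locally declared function stands in for a C library function so that the sketch does not depend on how any particular libc declares its functions. The real FORBID_C_FUNCTION macros hide compiler differences that are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a C library function that should be poisoned.
extern "C" std::size_t legacy_length(const char* s);

// Step 1: redeclare the function as deprecated. Later references are then
// diagnosed during semantic analysis, provided deprecation warnings are on.
extern "C" {
[[deprecated("forbidden here; use permit_sketch::call_legacy_length")]]
std::size_t legacy_length(const char* s);
}

// Step 2: a wrapper that may call the poisoned function, with the
// deprecation warning suppressed only around the wrapper itself.
namespace permit_sketch {
#if defined(__GNUC__)  // gcc and clang
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)  // C4996: use of a deprecated declaration
#endif
inline std::size_t call_legacy_length(const char* s) {
  return ::legacy_length(s);
}
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif
}  // namespace permit_sketch

// Definition of the stand-in. Defining a deprecated function is not a use.
extern "C" std::size_t legacy_length(const char* s) {
  return std::strlen(s);
}

// Small self-check that only goes through the permitted wrapper.
inline void sketch_self_check() {
  assert(permit_sketch::call_legacy_length("abc") == 3);
  assert(permit_sketch::call_legacy_length("") == 0);
}
```

Calling `sketch_self_check()` from a `main` function exercises the wrapper. A direct call to `legacy_length` placed after the redeclaration should draw a deprecation warning from gcc and clang in their default configuration, while calls through the wrapper compile without one.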
Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: - add permit wrapper for strdup and use in aix - remove os-specific posix forwarding headers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/b774f14c..000aca91 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=04-05 Stats: 100 lines in 6 files changed: 6 ins; 92 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From thartmann at openjdk.org Thu Jan 9 10:13:21 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 9 Jan 2025 10:13:21 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v3] In-Reply-To: References: Message-ID: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
> > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. > > Best regards, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Moved cast into guard ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22967/files - new: https://git.openjdk.org/jdk/pull/22967/files/5c0292a8..3b465a4b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=01-02 Stats: 55 lines in 4 files changed: 8 ins; 8 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/22967.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22967/head:pull/22967 PR: https://git.openjdk.org/jdk/pull/22967 From stefank at openjdk.org Thu Jan 9 10:13:44 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 9 Jan 2025 10:13:44 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v6] In-Reply-To: <4FG-nUwG3rnddQFgdiZ-vtnGqtm_Ij0LbmQ6nX96fdw=.8b63229c-0686-43bd-b783-9f6402ee0b0d@github.com> References: <4FG-nUwG3rnddQFgdiZ-vtnGqtm_Ij0LbmQ6nX96fdw=.8b63229c-0686-43bd-b783-9f6402ee0b0d@github.com> Message-ID: On Thu, 9 Jan 2025 10:02:15 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. 
Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. 
For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros is used to demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with two additional commits since the last revision: > > - add permit wrapper for strdup and use in aix > - remove os-specific posix forwarding headers src/hotspot/os/posix/forbiddenFunctions_posix.hpp line 29: > 27: > 28: #include "utilities/compilerWarnings.hpp" > 29: #include // for size_t Suggestion: #include "utilities/compilerWarnings.hpp" #include // for size_t src/hotspot/os/windows/forbiddenFunctions_windows.hpp line 29: > 27: > 28: #include "utilities/compilerWarnings.hpp" > 29: #include // for size_t Suggestion: #include "utilities/compilerWarnings.hpp" #include // for size_t src/hotspot/share/utilities/forbiddenFunctions.hpp line 30: > 28: #include "utilities/compilerWarnings.hpp" > 29: #include "utilities/macros.hpp" > 30: #include // for va_list Suggestion: #include "utilities/macros.hpp" #include // for va_list src/hotspot/share/utilities/permitForbiddenFunctions.hpp line 70: > 68: > 69: #endif // SHARE_UTILITIES_PERMITFORBIDDENFUNCTIONS_HPP > 70: Suggestion: #endif // SHARE_UTILITIES_PERMITFORBIDDENFUNCTIONS_HPP test/hotspot/gtest/unittest.hpp line 27: > 25: #define UNITTEST_HPP > 26: > 27: #include "utilities/globalDefinitions.hpp" Suggestion: #include "utilities/globalDefinitions.hpp" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908485156 PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908485732 PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908486633 PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908487438 PR Review Comment:
https://git.openjdk.org/jdk/pull/22890#discussion_r1908488569 From thartmann at openjdk.org Thu Jan 9 10:22:12 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 9 Jan 2025 10:22:12 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. > > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. 
> > Best regards, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Copyright date ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22967/files - new: https://git.openjdk.org/jdk/pull/22967/files/3b465a4b..0a1fe387 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22967&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22967.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22967/head:pull/22967 PR: https://git.openjdk.org/jdk/pull/22967 From thartmann at openjdk.org Thu Jan 9 10:22:13 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 9 Jan 2025 10:22:13 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v3] In-Reply-To: References: Message-ID: <7Sz-NJmdqr3NavPPGz7eNcUbCjybbVoFVA6OKll1t8I=.7686c684-71fa-4e37-8dd6-bb722b044d82@github.com> On Thu, 9 Jan 2025 10:13:21 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. 
>> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Moved cast into guard Roland, Quan Anh, thanks for the reviews! I pushed a new version that should address all comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2579707064 From thartmann at openjdk.org Thu Jan 9 10:22:13 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 9 Jan 2025 10:22:13 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 16:09:28 GMT, Quan Anh Mai wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Added missing stopped checks, refactoring and updated copyright dates > > src/hotspot/share/opto/library_call.cpp line 5921: > >> 5919: generate_non_array_guard(load_object_klass(src), slow_region); >> 5920: if (!stopped()) { >> 5921: src = _gvn.transform(new CheckCastPPNode(control(), src, TypeAryPtr::BOTTOM)); > > Why is this a `CheckCastPP` and not a `CastPP`? My understanding is that a `CheckCastPP` is used when we force changing the type of a node (e.g a raw pointer of `Allocate` into a typed pointer), so we do not join the type of the input with that of the output. Good point, I changed that. > src/hotspot/share/opto/type.hpp line 1476: > >> 1474: >> 1475: // Convenience common pre-built types. 
>> 1476: static const TypeAryPtr* BOTTOM; > > While you are here it may be better to change the other constant to `TypeAryPtr*` instead of `TypeAryPtr *` Right, done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1908496196 PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1908496011 From kbarrett at openjdk.org Thu Jan 9 10:28:28 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 10:28:28 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v7] In-Reply-To: References: Message-ID: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. 
> Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros is used to demark the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo...
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: stefank whitespace suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/000aca91..97a56ae6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=05-06 Stats: 5 lines in 5 files changed: 4 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From kbarrett at openjdk.org Thu Jan 9 10:31:57 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 10:31:57 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v7] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:28:28 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. 
This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > stefank whitespace suggestions @stefank Someday that long, long-ago discussion about include ordering and formatting needs to get cleaned up and added to the style guide.
------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2579749535 From galder at openjdk.org Thu Jan 9 10:43:15 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Thu, 9 Jan 2025 10:43:15 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v7] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: <1NAcKoms9A361Do3Vi6f4xT_euhDNTrotPWMskOsi70=.ca927892-1d5b-4a0d-b07a-f4f987f824a7@github.com> > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g. > > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. 
Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add any platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ...
Galder Zamarreño has updated the pull request incrementally with two additional commits since the last revision: - Fix license header - Tests should also run on aarch64 asimd=true envs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/130b4755..fb0f731f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=05-06 Stats: 9 lines in 2 files changed: 4 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From galder at openjdk.org Thu Jan 9 11:28:40 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Thu, 9 Jan 2025 11:28:40 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <9uGYNmVdvCXvyYSOAfwmvD70nWkimOFIlQJolQWa_Z4=.c6ffbfa0-5eb1-40a4-83a4-b657f57c9836@github.com> Message-ID: On Fri, 3 Jan 2025 08:48:37 GMT, Emanuel Peter wrote: >> That's right. Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits. >> That comment is "interesting". Maybe it should be tunable by the back end. Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might still be a win. >> >> Galder, how about you disable that line and give it another try? > > FYI: I'm working on removing the line [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). > > The issue is that on some platforms 2-element vectors are somehow really slower, and we need a cost-model to give us a better heuristic, rather than the hard "no". See my draft https://github.com/openjdk/jdk/pull/20964.
> > But yes: why don't you remove the line, and see if that makes it work. If so, then don't worry about this case for now, and maybe leave a comment in the test. We can then fix that later. Yeah, this limit prevents reductions like this from working on 128-bit registers: // Length 2 reductions of INT/LONG do not offer performance benefits if (((arith_type->basic_type() == T_INT) || (arith_type->basic_type() == T_LONG)) && (size == 2)) { retValue = false; I tried removing that today, but then the profitable checks fail to pass. So, I'm not going down that route now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1908608309 From galder at openjdk.org Thu Jan 9 11:47:58 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Thu, 9 Jan 2025 11:47:58 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v8] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g.
> > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add any platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ... Galder Zamarreño has updated the pull request incrementally with one additional commit since the last revision: Test can only run with 256 bit registers or bigger * Remove platform-dependent check and use platform-independent configuration instead.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/fb0f731f..c0491987 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=06-07 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From galder at openjdk.org Thu Jan 9 12:06:41 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Thu, 9 Jan 2025 12:06:41 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Wed, 18 Dec 2024 06:22:56 GMT, Emanuel Peter wrote: >> @eme64 I've addressed all your comments except aarch64 testing. `asimd` is not enough, you need `sve` for this, but I'm yet to make it work even with `sve`, something's up and need to debug it further. >> >> @jaskarth FYI I've adjusted the expectations in `TestMinMaxIdentities` after this change (thx for adding the test!). Check if there's any comments/changes you'd like. > > @galderz Nice, thanks for the updates. I gave the patch a quick scan and I think it looks really good. Just ping me again when you are done with your aarch64 investigations, and you think I should review again :) @eme64 aarch64 work for this is now complete. I tweaked the `applyIf` condition `MinMaxRed_Long` to make sure `MaxVectorSize` is 32 or higher. 
I verified this on both Graviton 3 (256 bit register, `MaxVectorSize=32`) and an AVX-512 Intel (512 bit register, `MaxVectorSize=64`) ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2579989885 From epeter at openjdk.org Thu Jan 9 12:21:44 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 9 Jan 2025 12:21:44 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Thu, 9 Jan 2025 12:03:55 GMT, Galder Zamarreño wrote: >> @galderz Nice, thanks for the updates. I gave the patch a quick scan and I think it looks really good. Just ping me again when you are done with your aarch64 investigations, and you think I should review again :) > > @eme64 aarch64 work for this is now complete. I tweaked the `applyIf` condition `MinMaxRed_Long` to make sure `MaxVectorSize` is 32 or higher. I verified this on both Graviton 3 (256 bit register, `MaxVectorSize=32`) and an AVX-512 Intel (512 bit register, `MaxVectorSize=64`) @galderz So you want me to review again? ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2580018680 From stefank at openjdk.org Thu Jan 9 13:03:49 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 9 Jan 2025 13:03:49 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v7] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:29:14 GMT, Kim Barrett wrote: > @stefank Someday that long, long-ago discussion about include ordering and formatting needs to get cleaned up and added to the style guide. Yes.
The guide that system includes should come last and be separated from the rest of the HotSpot includes was never written down here: https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md#source-files ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2580098775 From coleenp at openjdk.org Thu Jan 9 13:03:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 9 Jan 2025 13:03:51 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v7] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:28:28 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. 
Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > stefank whitespace suggestions src/hotspot/os/bsd/permitForbiddenFunctions_bsd.hpp line 30: > 28: #include "permitForbiddenFunctions_posix.hpp" > 29: > 30: #endif // OS_BSD_PERMITFORBIDDENFUNCTIONS_BSD_HPP I thought you were eliminating these in favor of #ifdef where included?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1908753962 From qamai at openjdk.org Thu Jan 9 13:23:37 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 9 Jan 2025 13:23:37 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:22:12 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Copyright date Marked as reviewed by qamai (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/22967#pullrequestreview-2539897229 From thartmann at openjdk.org Thu Jan 9 13:23:38 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 9 Jan 2025 13:23:38 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 13:18:32 GMT, Quan Anh Mai wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Copyright date > > Marked as reviewed by qamai (Committer). Thanks again for the review, @merykitty! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2580140315 From epeter at openjdk.org Thu Jan 9 13:28:50 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 9 Jan 2025 13:28:50 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: Message-ID: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> On Fri, 3 Jan 2025 20:42:15 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. 
New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Updating copyright year of modified files. We are on the final approach. Just a few small comments / suggestions left. src/hotspot/share/opto/convertnode.cpp line 991: > 989: return Op_MinHF; > 990: default: > 991: return false; Is that a sane return value? Should we not assert here? src/hotspot/share/opto/library_call.cpp line 8665: > 8663: fatal_unexpected_iid(id); > 8664: break; > 8665: } Suggestion: switch (id) { // Unary operations case vmIntrinsics::_sqrt_float16: result = _gvn.transform(new SqrtHFNode(C, control(), fld1)); break; // Ternary operations case vmIntrinsics::_fma_float16: result = _gvn.transform(new FmaHFNode(control(), fld1, fld2, fld3)); break; default: fatal_unexpected_iid(id); break; } Formatting could be improved. In the other switch you indent the cases. The lines are also a little long. 
src/hotspot/share/opto/mulnode.cpp line 560: > 558: // Compute the product type of two half float ranges into this node. > 559: const Type* MulHFNode::mul_ring(const Type* t0, const Type* t1) const { > 560: if(t0 == Type::HALF_FLOAT || t1 == Type::HALF_FLOAT) return Type::HALF_FLOAT; Suggestion: if(t0 == Type::HALF_FLOAT || t1 == Type::HALF_FLOAT) { return Type::HALF_FLOAT; } src/hotspot/share/opto/superword.cpp line 2567: > 2565: // half float to float, in such a case back propagation of narrow type (SHORT) > 2566: // may not be possible. > 2567: if (n->Opcode() == Op_ConvF2HF || n->Opcode() == Op_ReinterpretHF2S) { Is this relevant, or does that belong to a different (vector) RFE? src/hotspot/share/opto/type.cpp line 460: > 458: RETURN_ADDRESS=make(Return_Address); > 459: FLOAT = make(FloatBot); // All floats > 460: HALF_FLOAT = make(HalfFloatBot); // All half floats Suggestion: HALF_FLOAT = make(HalfFloatBot); // All half floats src/hotspot/share/opto/type.cpp line 1092: > 1090: if (_base == DoubleTop || _base == DoubleBot) return Type::BOTTOM; > 1091: typerr(t); > 1092: return Type::BOTTOM; Please use curly-braces even for single-line ifs src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1434: > 1432: return float16ToRawShortBits(valueOf(product + float16ToFloat(f16c))); > 1433: }); > 1434: return shortBitsToFloat16(res); I don't understand what is happening here. But I leave this to @PaulSandoz to review ------------- Changes requested by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2539863536 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908759602 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908769721 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908771698 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908776380 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908777422 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908779530 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908792237 From epeter at openjdk.org Thu Jan 9 13:28:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 9 Jan 2025 13:28:52 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v3] In-Reply-To: References: Message-ID: On Mon, 16 Dec 2024 18:42:48 GMT, Joe Darcy wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Adding more test points > > src/java.base/share/classes/jdk/internal/vm/vector/Float16Math.java line 35: > >> 33: * The class {@code Float16Math} constains intrinsic entry points corresponding >> 34: * to scalar numeric operations defined in Float16 class. >> 35: * @author > > Please remove all author tags. We haven't used them in new code in the JDK for some time. @jatin-bhateja did you remove them? 
I still see an `@author` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908788132 From epeter at openjdk.org Thu Jan 9 13:28:51 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 9 Jan 2025 13:28:51 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Thu, 9 Jan 2025 13:14:13 GMT, Emanuel Peter wrote: >> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> Updating copyright year of modified files. > > src/hotspot/share/opto/type.cpp line 460: > >> 458: RETURN_ADDRESS=make(Return_Address); >> 459: FLOAT = make(FloatBot); // All floats >> 460: HALF_FLOAT = make(HalfFloatBot); // All half floats > > Suggestion: > > HALF_FLOAT = make(HalfFloatBot); // All half floats If alignment is already broken, we might as well just use single spaces. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1908778435 From coleenp at openjdk.org Thu Jan 9 13:34:39 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 9 Jan 2025 13:34:39 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 08:32:27 GMT, Kim Barrett wrote: >> This is a relic and not the legal copyright that got updated since nobody noticed. Until you did. Removed. > > Not sure we're allowed to remove a copyright statement, even if not in the usual place. put copyright back. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1908806441 From kvn at openjdk.org Thu Jan 9 16:31:41 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 9 Jan 2025 16:31:41 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: <9OqXAax9IpkALXJHRnSuMqUSpo5VJbTVgR-REsMUT3o=.47dab670-dbac-4667-b746-c00992bdeb6a@github.com> On Thu, 9 Jan 2025 10:22:12 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. 
>> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Copyright date src/hotspot/share/opto/library_call.cpp line 4307: > 4305: // Keep track of the fact that 'obj' is an array to prevent > 4306: // array specific accesses from floating above the guard. > 4307: *obj = _gvn.transform(new CastPPNode(is_array_ctrl, *obj, TypeAryPtr::BOTTOM)); Should we do this for above code when layout is known for compiler (`layout_con` is checked)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1909125151 From kbarrett at openjdk.org Thu Jan 9 18:32:42 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 18:32:42 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v7] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 12:58:48 GMT, Coleen Phillimore wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> stefank whitespace suggestions > > src/hotspot/os/bsd/permitForbiddenFunctions_bsd.hpp line 30: > >> 28: #include "permitForbiddenFunctions_posix.hpp" >> 29: >> 30: #endif // OS_BSD_PERMITFORBIDDENFUNCTIONS_BSD_HPP > > I thought you were eliminating these in favor of #ifdef where included? Removed the forbid files, forgot the permit files. Sigh! I'll push a new commit after running tests. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1909272432 From matsaave at openjdk.org Thu Jan 9 19:04:51 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 9 Jan 2025 19:04:51 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. From what I've looked at so far it looks good! I noticed there are several cases where you mix format specifiers with macros. I understand that replacing other macros may not be in the scope of this change but I find it inconsistent in places where we have both. I listed out some of the cases below, but if you don't believe this to be necessary you can ignore me. src/hotspot/os/bsd/os_bsd.cpp line 2527: > 2525: "\n\n" > 2526: "Do you want to debug the problem?\n\n" > 2527: "To debug, run 'gdb /proc/%d/exe %d'; then switch to thread %zd (" INTPTR_FORMAT ")\n" There are both `%zd` and `INTPTR_FORMAT` in this line. I think it would be more consistent to convert both to format specifiers here.
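Editorial sketch: the mixed style this comment describes is possible because a format macro is just a string literal that the compiler concatenates with the literals around it. The following is a minimal illustration under stated assumptions: `SKETCH_SIZE_FORMAT`, `describe_mixed` and `describe_plain` are hypothetical names, and the macro is assumed to expand to `"%zu"`, which may not match the real HotSpot definitions.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a HotSpot-style format macro. It is assumed to
// expand to "%zu"; the real macro definitions are not reproduced here.
#define SKETCH_SIZE_FORMAT "%zu"

// Mixed style: a plain specifier (%zu) next to a format macro. Adjacent
// string literals are concatenated by the compiler into one format string.
std::string describe_mixed(size_t tid, size_t stack_k) {
  char buf[64];
  std::snprintf(buf, sizeof(buf),
                "tid: %zu, stacksize: " SKETCH_SIZE_FORMAT "k", tid, stack_k);
  return buf;
}

// Uniform style: both values written with plain format specifiers.
std::string describe_plain(size_t tid, size_t stack_k) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "tid: %zu, stacksize: %zuk", tid, stack_k);
  return buf;
}
```

Under that assumption both functions build the same format string, so the choice between them is purely one of readability and consistency.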
src/hotspot/os/linux/os_linux.cpp line 5276: > 5274: "\n\n" > 5275: "Do you want to debug the problem?\n\n" > 5276: "To debug, run 'gdb /proc/%d/exe %d'; then switch to thread %zu (" INTPTR_FORMAT ")\n" Same as above src/hotspot/os/windows/os_windows.cpp line 533: > 531: } > 532: > 533: log_info(os, thread)("Thread is alive (tid: %zu, stacksize: " SIZE_FORMAT "k).", os::current_thread_id(), thread->stack_size() / K); Same as above, this time with `SIZE_FORMAT` src/hotspot/os/windows/os_windows.cpp line 618: > 616: thread->set_osthread(osthread); > 617: > 618: log_info(os, thread)("Thread attached (tid: %zu, stack: " This line also mixes format specifiers and macros src/hotspot/os/windows/os_windows.cpp line 3340: > 3338: if (Verbose && PrintMiscellaneous) { > 3339: reserveTimer.stop(); > 3340: tty->print_cr("reserve_memory of %zx bytes took " JLONG_FORMAT " ms (" JLONG_FORMAT " ticks)", bytes, Here too src/hotspot/share/classfile/classLoaderStats.cpp line 115: > 113: Klass* parent_klass = (cls._parent == nullptr ? nullptr : cls._parent->klass()); > 114: > 115: _out->print(INTPTR_FORMAT " " INTPTR_FORMAT " " INTPTR_FORMAT " %6zu " SIZE_FORMAT_W(8) " " SIZE_FORMAT_W(8) " ", Here too src/hotspot/share/classfile/classLoaderStats.cpp line 126: > 124: _out->cr(); > 125: if (cls._hidden_classes_count > 0) { > 126: _out->print_cr(SPACE SPACE SPACE " %6zu " SIZE_FORMAT_W(8) " " SIZE_FORMAT_W(8) " + hidden classes", And here src/hotspot/share/classfile/classLoaderStats.cpp line 140: > 138: _out->print("Total = %-6zu", _total_loaders); > 139: _out->print(SPACE SPACE SPACE " ", "", "", ""); > 140: _out->print_cr("%6zu " SIZE_FORMAT_W(8) " " SIZE_FORMAT_W(8) " ", And here src/hotspot/share/code/vtableStubs.cpp line 82: > 80: > 81: void VtableStub::print_on(outputStream* st) const { > 82: st->print("vtable stub (index = %d, receiver_location = %zd, code = [" INTPTR_FORMAT ", " INTPTR_FORMAT "])", And here ------------- Changes requested by matsaave (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2540706941 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909299619 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909300550 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909300883 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909301552 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909301678 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909303066 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909303216 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909303480 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909303991 From psandoz at openjdk.org Thu Jan 9 19:25:53 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Thu, 9 Jan 2025 19:25:53 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Thu, 9 Jan 2025 13:23:19 GMT, Emanuel Peter wrote: >> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> Updating copyright year of modified files. > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1434: > >> 1432: return float16ToRawShortBits(valueOf(product + float16ToFloat(f16c))); >> 1433: }); >> 1434: return shortBitsToFloat16(res); > > I don't understand what is happening here. 
But I leave this to @PaulSandoz to review Uncertain on what bits, but I am guessing it's mostly related to the fallback code in the lambda. To avoid the intrinsics operating on Float16 instances we instead "unpack" the carrier (16-bit) values and pass those as arguments to the intrinsic. The fallback (when intrinsification is not supported) also accepts those carrier values as arguments and we convert the carriers to floats, operate on them, convert to the carrier, and then back to float16 on the result. The code in the lambda could potentially be simplified if `Float16Math.fma` accepted six arguments, the first three being the carrier values used by the intrinsic, and the subsequent three being the float16 values used by the fallback. Then we could express the code in the original source in the lambda. I believe when intrinsified there would be no penalty for those extra arguments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1909327094 From coleenp at openjdk.org Thu Jan 9 20:39:41 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 9 Jan 2025 20:39:41 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: <3rsYHTsq8K_5SIPzeMJQJFM6HMWNTz7OdCBgVBwUUD8=.f3b67c30-8ecb-4034-b0b7-8396c5f8b531@github.com> On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. The intention is to keep INTPTR_FORMAT and some of the other format specifiers that vary by platform. I have another issue to remove the SIZE_FORMAT ones but that's a bigger change. So this mixture is intentional. JLONG_FORMAT might be something we can remove too but I didn't want to do it all at once. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2581199763 From matsaave at openjdk.org Thu Jan 9 21:52:47 2025 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 9 Jan 2025 21:52:47 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: <2NN6jS-4TNxlwq8K0ovl2o9A3ZdCsTVJJ6NcOWDh-P8=.069b6da4-4c08-4cc6-9532-2b1f96a1793a@github.com> On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. Looks good! I saw the discussion on `UINTPTR_FORMAT_X_0` so I left it alone. src/hotspot/share/runtime/objectMonitor.cpp line 2500: > 2498: // The minimal things to print for markWord printing, more can be added for debugging and logging. > 2499: st->print("{contentions=0x%08x,waiters=0x%08x" > 2500: ",recursions=%zd,owner=" INT64_FORMAT "}", Is `INT64_FORMAT` different from `INTX_FORMAT`? 
------------- Marked as reviewed by matsaave (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2540981143 PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909469703 From kbarrett at openjdk.org Thu Jan 9 22:00:59 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 22:00:59 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: <2NN6jS-4TNxlwq8K0ovl2o9A3ZdCsTVJJ6NcOWDh-P8=.069b6da4-4c08-4cc6-9532-2b1f96a1793a@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> <2NN6jS-4TNxlwq8K0ovl2o9A3ZdCsTVJJ6NcOWDh-P8=.069b6da4-4c08-4cc6-9532-2b1f96a1793a@github.com> Message-ID: On Thu, 9 Jan 2025 21:47:47 GMT, Matias Saavedra Silva wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore copyright and macro. > > src/hotspot/share/runtime/objectMonitor.cpp line 2500: > >> 2498: // The minimal things to print for markWord printing, more can be added for debugging and logging. >> 2499: st->print("{contentions=0x%08x,waiters=0x%08x" >> 2500: ",recursions=%zd,owner=" INT64_FORMAT "}", > > Is `INT64_FORMAT` different from `INTX_FORMAT`? Currently yes. The type underlying [u]intx varies by platform, being a 32-bit type on 32-bit platforms and a 64-bit type on 64-bit platforms. We've been trimming the set of supported 32-bit platforms though, so maybe someday we won't need that distinction any more. 
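To make the distinction above concrete, the following is a small standalone sketch, not taken from HotSpot. `sketch_intx` and `SKETCH_INT64_FORMAT` are hypothetical stand-ins: the first models a pointer-sized signed integer whose width follows the platform, the second a format for a value that is always 64 bits wide. Printing the pointer-sized value with `%zd` assumes it has the same representation as the signed counterpart of `size_t` on the target, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-ins, named for illustration only.
typedef std::intptr_t sketch_intx;         // 32 bits on 32-bit targets, 64 bits on 64-bit targets
#define SKETCH_INT64_FORMAT "%" PRId64     // always describes a 64-bit value

static_assert(sizeof(sketch_intx) == sizeof(void*), "sketch_intx tracks the pointer width");
static_assert(sizeof(std::int64_t) == 8, "int64_t is 64 bits on every platform");

std::string describe_monitor(sketch_intx recursions, std::int64_t owner) {
  char buf[96];
  // %zd is used for the pointer-sized value (assumed to match the signed
  // counterpart of size_t); the 64-bit value needs its own fixed-width format.
  int n = std::snprintf(buf, sizeof(buf),
                        "recursions=%zd,owner=" SKETCH_INT64_FORMAT,
                        recursions, owner);
  assert(n >= 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

On a 32-bit target the `owner` value above would not fit in the pointer-sized type, which is why a single specifier cannot serve both arguments as long as such targets are supported.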
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1909478987 From kbarrett at openjdk.org Thu Jan 9 22:07:12 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 9 Jan 2025 22:07:12 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: References: Message-ID: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated. This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) 
> > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo... Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: - Merge branch 'master' into new-poison - Merge branch 'master' into new-poison - remove more os-specific posix forwarding headers - stefank whitespace suggestions - add permit wrapper for strdup and use in aix - remove os-specific posix forwarding headers - aix permit patches - more fixes for clang noreturn issues - Merge branch 'master' into new-poison - update copyrights - ...
and 5 more: https://git.openjdk.org/jdk/compare/02b290ae...6d49abbb ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22890/files - new: https://git.openjdk.org/jdk/pull/22890/files/97a56ae6..6d49abbb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22890&range=06-07 Stats: 18325 lines in 471 files changed: 5136 ins; 11130 del; 2059 mod Patch: https://git.openjdk.org/jdk/pull/22890.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22890/head:pull/22890 PR: https://git.openjdk.org/jdk/pull/22890 From thartmann at openjdk.org Fri Jan 10 07:08:51 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 10 Jan 2025 07:08:51 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: <9OqXAax9IpkALXJHRnSuMqUSpo5VJbTVgR-REsMUT3o=.47dab670-dbac-4667-b746-c00992bdeb6a@github.com> References: <9OqXAax9IpkALXJHRnSuMqUSpo5VJbTVgR-REsMUT3o=.47dab670-dbac-4667-b746-c00992bdeb6a@github.com> Message-ID: <8QuLkvAR5QduPWwpwK-Q72lhDHuLDXAfglCIoBb0WyU=.323176ab-777b-4aa6-9a39-50ca78c4effd@github.com> On Thu, 9 Jan 2025 16:28:46 GMT, Vladimir Kozlov wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Copyright date > > src/hotspot/share/opto/library_call.cpp line 4307: > >> 4305: // Keep track of the fact that 'obj' is an array to prevent >> 4306: // array specific accesses from floating above the guard. >> 4307: *obj = _gvn.transform(new CastPPNode(is_array_ctrl, *obj, TypeAryPtr::BOTTOM)); > > Should we do this for above code when layout is known for compiler (`layout_con` is checked)? 
I thought about this as well but I don't think it's necessary because: - No cast is needed if we know the type already - We don't emit a guard and only one branch remains, so there is no risk of the array specific access floating above So I went with the simplest changes for now, also since we need to backport this change (it got already much more complicated than I was aiming for). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1909916119 From dholmes at openjdk.org Fri Jan 10 07:18:37 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 10 Jan 2025 07:18:37 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> Message-ID: <012w8uAs0fzFh52ycdTUSBLh0Kv-TgwiCSAdWA5sBDM=.16e4e1be-6fbe-4543-b565-499fd7c40f63@github.com> On Thu, 9 Jan 2025 06:31:55 GMT, Kim Barrett wrote: >> Yes, I think it's fine to say !WINDOWS instead of listing all the posix ports. We've been avoiding dispatch files and once it reaches a threshold of too many #ifdef !WINDOWS #include posix.hpp one, then we could add another macro like the OS_CPU one. >> >> Also the copyright script added 2025 for these because they started with 2024 so it's sort of a bug in the script but not really solvable because it doesn't know that you didn't check this in in 2024. > > @coleenp The copyrights on the new files being 2024-2025 is intentional. They were first published (in this PR) > in 2024, so I think are supposed to have that starting year. @kimbarrett I understand. I also don't like the includer needing to know about this, and maybe it is time to add OS_FAMILY_HEADER to deal with it. But in the absence of the new macro I prefer this break of abstraction to the creation of a bunch of tiny forwarding files. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1909925627 From dholmes at openjdk.org Fri Jan 10 07:21:42 2025 From: dholmes at openjdk.org (David Holmes) Date: Fri, 10 Jan 2025 07:21:42 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 09:36:07 GMT, Aleksey Shipilev wrote: >> Hmmm ... okay ... I see something "special" is done only on x86_32, but what is done seems to have nothing to do with x87 code. >> >> Just to be clear these Java methods still get intrinsified, it is just handled in a different way - right? > > It *is* about x87 handling of NaNs, a common problem for x86_32 code in Hotspot, you can read about this mess in [JDK-8076373](https://bugs.openjdk.org/browse/JDK-8076373), if you are interested. If we allow to use native implementations of these conversion methods, we get into trouble with NaNs. What these interpreter intrinsics do on x86_32: going for SSE if available, thus avoiding x87. Since this is a correctness problem, these intrinsics go all the way down to interpreter as well. There is still a gaping hole when SSE is not available, but then we have no choice than to use x87 and have all the relevant issues. > > But all of this is only a headache for x86_32, all other platforms do not have these interpreter intrinsics implemented. With x86_32 going away, we can finally yank these and relevant scaffolding out. > > The C1/C2 intrinsics are still up and enabled for supported platforms: those are for performance :) Okay now I get. 
Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1909928264 From coleenp at openjdk.org Fri Jan 10 12:57:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 10 Jan 2025 12:57:51 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v2] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Sat, 4 Jan 2025 09:41:29 GMT, Kim Barrett wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix %Ix to %zx. > > src/hotspot/share/oops/klass.cpp line 1308: > >> 1306: if (secondary_supers() != nullptr) { >> 1307: st->print(" - "); st->print("%d elements;", _secondary_supers->length()); >> 1308: st->print_cr(" bitmap: " LP64_ONLY("0x%016zu") NOT_LP64("0x%08zu"), _secondary_supers_bitmap); > > Same as in instanceKlass - maybe this shouldn't be changed at all. I restored this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22916#discussion_r1910340969 From coleenp at openjdk.org Fri Jan 10 13:03:44 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 10 Jan 2025 13:03:44 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: <012w8uAs0fzFh52ycdTUSBLh0Kv-TgwiCSAdWA5sBDM=.16e4e1be-6fbe-4543-b565-499fd7c40f63@github.com> References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> <012w8uAs0fzFh52ycdTUSBLh0Kv-TgwiCSAdWA5sBDM=.16e4e1be-6fbe-4543-b565-499fd7c40f63@github.com> Message-ID: On Fri, 10 Jan 2025 07:15:51 GMT, David Holmes wrote: >> @coleenp The copyrights on the new files being 2024-2025 is intentional. They were first published (in this PR) >> in 2024, so I think are supposed to have that starting year. 
> > @kimbarrett I understand. I also don't like the includer needing to know about this, and maybe it is time to add OS_FAMILY_HEADER to deal with it. But in the absence of the new macro I prefer this break of abstraction to the creation of a bunch of tiny forwarding files. I prefer the opposite. If you break these up into a bunch of forwarding files, do all the rest of the VM the same way. It may be my problem but I don't have a good IDE to have to navigate through a slew of similarly named files to find the real implementation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1910347603 From coleenp at openjdk.org Fri Jan 10 13:32:40 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 10 Jan 2025 13:32:40 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. 
I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... I reviewed the template interpreter changes. They look great. src/hotspot/cpu/x86/templateTable_x86.cpp line 330: > 328: void TemplateTable::dconst(int value) { > 329: transition(vtos, dtos); > 330: if (UseSSE >= 2) { I admit that I don't know what UseSSE is but now this is unconditional? Is there a further cleanup necessary for this option? 
------------- PR Review: https://git.openjdk.org/jdk/pull/22567#pullrequestreview-2542434532 PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1910374250 From shade at openjdk.org Fri Jan 10 13:57:46 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 10 Jan 2025 13:57:46 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 13:23:46 GMT, Coleen Phillimore wrote: >> **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** >> >> My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. >> >> This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. >> >> Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. >> >> The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. >> >> x86_32 is the only platform that has special cases for x87 FPU. 
>> >> C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. >> >> Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. >> >> x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. >> >> The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of... > > src/hotspot/cpu/x86/templateTable_x86.cpp line 330: > >> 328: void TemplateTable::dconst(int value) { >> 329: transition(vtos, dtos); >> 330: if (UseSSE >= 2) { > > I admit that I don't know what UseSSE is but now this is unconditional? Is there a further cleanup necessary for this option? Yes, now it is unconditional. x86_64 [requires](https://github.com/openjdk/jdk/blob/ec7393e9190c1b93ca08e1107f734c869f400b89/src/hotspot/cpu/x86/vm_version_x86.cpp#L896-L903) UseSSE >= 2. Only x86_32 cared about UseSSE < 2, so now we can eliminate these checks. I think I got the majority, if not all of the cases where these checks are now redundant: there are more in various assemblers and compiler code. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1910408581 From coleenp at openjdk.org Fri Jan 10 16:24:47 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 10 Jan 2025 16:24:47 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 13:54:32 GMT, Aleksey Shipilev wrote: >> src/hotspot/cpu/x86/templateTable_x86.cpp line 330: >> >>> 328: void TemplateTable::dconst(int value) { >>> 329: transition(vtos, dtos); >>> 330: if (UseSSE >= 2) { >> >> I admit that I don't know what UseSSE is but now this is unconditional? Is there a further cleanup necessary for this option? > > Yes, now it is unconditional. x86_64 [requires](https://github.com/openjdk/jdk/blob/ec7393e9190c1b93ca08e1107f734c869f400b89/src/hotspot/cpu/x86/vm_version_x86.cpp#L896-L903) UseSSE >= 2. Only x86_32 cared about UseSSE < 2, so now we can eliminate these checks. I think I got the majority, if not all of the cases where these checks are now redundant: there are more in various assemblers and compiler code. Maybe this should change from range (2,4) then. product(int, UseSSE, 4, \ "Highest supported SSE instructions set on x86/x64") \ range(0, 4) \ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1910614336 From ihse at openjdk.org Fri Jan 10 17:17:46 2025 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 10 Jan 2025 17:17:46 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. 
These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. > > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. 
Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... Don't forget the 32-bit x86 classes under `src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot`. There might be other x86-specific code in other JDK libraries as well, and not just in Hotspot. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2583294294 From lmesnik at openjdk.org Fri Jan 10 17:40:36 2025 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 10 Jan 2025 17:40:36 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: <3d7ezYP3-By8ArAoM6IkKGH8XHZ4VNVcviiMzMy2EPQ=.d82874c8-40e0-42ce-808c-527e44aac2dc@github.com> Message-ID: On Wed, 8 Jan 2025 12:45:48 GMT, Tobias Hartmann wrote: >> OK, I was confused by this in PR body then: >> >>> I was able to reliably reproduce the issue with compiler/arraycopy/TestArrayCopyNoInit.java and -XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedClassPointers on Linux AArch64 and verified that the fix solves the problem. >> >> But fine, if it reproduces with +UCOH, let it be there. > > Ah, that's actually a typo, good catch. Should be `-XX:+UseCompactObjectHeaders`. I'll fix it in the description. Can you please add this '@run' as a separate test case with its own id, so that it is easier to identify and exclude the failures? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22967#discussion_r1910706521 From shade at openjdk.org Fri Jan 10 18:23:49 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 10 Jan 2025 18:23:49 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 16:22:06 GMT, Coleen Phillimore wrote: >> Yes, now it is unconditional.
x86_64 [requires](https://github.com/openjdk/jdk/blob/ec7393e9190c1b93ca08e1107f734c869f400b89/src/hotspot/cpu/x86/vm_version_x86.cpp#L896-L903) UseSSE >= 2. Only x86_32 cared about UseSSE < 2, so now we can eliminate these checks. I think I got the majority, if not all of the cases where these checks are now redundant: there are more in various assemblers and compiler code. > > Maybe this should change from range (2,4) then. > product(int, UseSSE, 4, \ > "Highest supported SSE instructions set on x86/x64") \ > range(0, 4) \ Right. Now that I am thinking more deeply about it, maybe that would be a first step here: lift UseSSE >= 2 for x86_32 ahead of this JEP, eliminate all UseSSE < 2 parts. I can see how intrusive this gets. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1910843574 From kbarrett at openjdk.org Fri Jan 10 18:33:47 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 10 Jan 2025 18:33:47 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 18:21:25 GMT, Aleksey Shipilev wrote: >> Maybe this should change from range (2,4) then. >> product(int, UseSSE, 4, \ >> "Highest supported SSE instructions set on x86/x64") \ >> range(0, 4) \ > > Right. Now that I am thinking more deeply about it, maybe that would be a first step here: lift UseSSE >= 2 for x86_32 ahead of this JEP, eliminate all UseSSE < 2 parts. I can see how intrusive this gets. [not reviewing, just a drive-by comment] Does UseSSE < 2 provide a way to _avoid_ using relevant parts of SSE on x86_64, perhaps for debugging? Or does x86_64 effectively hard-wire UseSSE >= 2? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1910878630 From kvn at openjdk.org Fri Jan 10 20:06:38 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 10 Jan 2025 20:06:38 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:22:12 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Copyright date Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/22967#pullrequestreview-2543636498 From vlivanov at openjdk.org Fri Jan 10 20:28:49 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 10 Jan 2025 20:28:49 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Thu, 5 Dec 2024 08:26:10 GMT, Aleksey Shipilev wrote: > **NOTE: This is work-in-progress draft for interested parties. The JEP is not even submitted, let alone targeted.** > > My plan is to to get this done in a quiet time in mainline to limit the ongoing conflicts with mainline. Feel free to comment in this PR, if you see something ahead of time. These comments might adjust the trajectory we take to implement this removal and/or allows us submit and work out more RFEs ahead of this removal. I plan to re-open a clean PR after this preliminary PR is done, maybe after the round of preliminary reviews. > > This removes the 32-bit x86 port and does a deeper cleaning in Hotspot. The following paragraphs describe what and why was being done. > > Easy stuff first: all files named `*_x86_32` are gone. Those are only built when build system knows we are compiling for x86_32. There is therefore no impact on x86_64. > > The code under `!LP64`, `!AMD64` and `IA32` is removed in `x86`-specific files. There is quite a bit of the code, especially around `Assembler` and `MacroAssembler`. I think these removals make the whole thing cleaner. The downside is that some of the `MacroAssembler::*ptr` functions that were used to select the "machine pointer" instructions either from x86_64 or x86_32 are now exclusively for x86_64. I don't think we want to rewrite `*ptr` -> `*q` at this point. I think we gradually morph the code base to use `*q`-flavored methods in new code. > > x86_32 is the only platform that has special cases for x87 FPU. 
> > C1 even implements the whole separate thing to deal with x87 FPU: the parts of regalloc treat it specially, there is `FpuStackSim`, there is `VerifyFPU` family of flags, etc. There are also peculiarities with FP conversions that use FPU, that's why x86_32 used to have template interpreter stubs for FP conversion methods. None of that is needed anymore without x86_32. This cleans up some arch-specific code as well. > > Both C1 and C2 implement the workarounds for non-IEEE compliant rounding of x87 FPU. After x86_32 is gone, these are not needed anymore. This removes some C2 nodes, removes the rounding instructions in C1. > > x86_64 is baselined on SSE2+, the VM would not even start if SSE2 is not supported. Most of the checks that we have for `UseSSE < 2` are for the benefit of x86_32. Because of this I folded redundant `UseSSE` checks around Hotspot. > > The one thing I _deliberately_ avoided doing is merging `x86.ad` and `x86_64.ad`. It would likely introduce uncomfortable amount of conflicts with pending work in mainli... Personally, I'd prefer to see the initial x86-32 removal changeset as straightforward as possible: x86-32-specific files, plus (optionally) x86-32-specific code in x86-specific files. IMO it's better to cover the rest (getting rid of unused features after x86-32 removal) as follow-up cleanups. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22567#issuecomment-2584013803 From kvn at openjdk.org Fri Jan 10 20:28:49 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 10 Jan 2025 20:28:49 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 18:30:04 GMT, Kim Barrett wrote: >> Right. Now that I am thinking more deeply about it, maybe that would be a first step here: lift UseSSE >= 2 for x86_32 ahead of this JEP, eliminate all UseSSE < 2 parts. I can see how intrusive this gets.
> > [not reviewing, just a drive-by comment] Does UseSSE < 2 provide a way to _avoid_ using relevant parts of > SSE on x86_64, perhaps for debugging? Or does x86_64 effectively hard-wire UseSSE >= 2? All 64-bit x86 CPUs (starting from AMD64) support all instructions up to SSE2. A 32-bit x86 CPU may not support SSE2. We could generate SSE1 or use FPU instructions in the 64-bit VM, but we decided not to do that: the SSE2 versions of the instructions were much easier to use. We purged all uses of the FPU in JDK 15 ([JDK-7175279](https://bugs.openjdk.org/browse/JDK-7175279)) by using the SSE instruction set, because we did not want to mess with the FPU (save/restore state) anymore in the 64-bit VM. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1911258635 From kvn at openjdk.org Fri Jan 10 20:33:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 10 Jan 2025 20:33:46 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 20:23:28 GMT, Vladimir Kozlov wrote: >> [not reviewing, just a drive-by comment] Does UseSSE < 2 provide a way to _avoid_ using relevant parts of >> SSE on x86_64, perhaps for debugging? Or does x86_64 effectively hard-wire UseSSE >= 2? > > By default all 64-bits x86 CPU (starting from AMD64) supports all instructions up to SSE2. 32-bit x86 CPU may not support SSE2. > > We can generated sse1 or use FPU instructions in 64-bit VM but we decided not to do that - SSE2 instructions version were much easier to use. We purged all uses of FPU in JDK 15: [JDK-7175279](https://bugs.openjdk.org/browse/JDK-7175279) by using SSE set of instructions because we did not want to mess (save/restore state) with FPU anymore in 64-bit VM. I think there are several places in the 64-bit VM where we assume SSE2 instructions are always available. So if you set `UseSSE=1` or `UseSSE=0` in a debugger, the VM may crash.
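The constraint discussed here can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not HotSpot code: `effective_use_sse` and the `k*` constants are hypothetical names, and the only facts taken from the thread are the existing `range(0, 4)` on `UseSSE` and the statement that x86_64 requires `UseSSE >= 2`.

```cpp
#include <algorithm>

// Hypothetical stand-ins for illustration; the real flag is declared with
// product(int, UseSSE, 4, ...) and range(0, 4), as quoted in this thread.
const int kMinSSE32 = 0;  // x86_32 may run on CPUs without SSE2
const int kMinSSE64 = 2;  // x86_64 is baselined on SSE2
const int kMaxSSE   = 4;  // upper bound of the existing flag range

// Clamp a requested UseSSE level to what the platform can honor.
// On a 64-bit VM, values below 2 are raised to 2 instead of being accepted.
int effective_use_sse(int requested, bool is_64bit) {
  const int lo = is_64bit ? kMinSSE64 : kMinSSE32;
  return std::min(std::max(requested, lo), kMaxSSE);
}
```

With a lower bound like this enforced at flag-parsing time, a 64-bit VM would never observe `UseSSE < 2`, which is why checks for that condition become redundant once x86_32 is gone.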
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1911279275 From kbarrett at openjdk.org Sat Jan 11 14:11:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 11 Jan 2025 14:11:48 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: <0T6dqXyqum7hEpCvg97-WsP_zVfOO9JkBCnze1f3sxE=.9b5c6ba1-9f58-4e74-bee8-5478809216cc@github.com> On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2544851802 From coleenp at openjdk.org Sat Jan 11 15:29:48 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Sat, 11 Jan 2025 15:29:48 GMT Subject: RFR: 8345169: Implement JEP XXX: Remove the 32-bit x86 Port In-Reply-To: References: Message-ID: On Fri, 10 Jan 2025 20:30:32 GMT, Vladimir Kozlov wrote: >> By default all 64-bits x86 CPU (starting from AMD64) supports all instructions up to SSE2. 32-bit x86 CPU may not support SSE2. >> >> We can generated sse1 or use FPU instructions in 64-bit VM but we decided not to do that - SSE2 instructions version were much easier to use. 
We purged all uses of FPU in JDK 15: [JDK-7175279](https://bugs.openjdk.org/browse/JDK-7175279) by using SSE set of instructions because we did not want to mess (save/restore state) with FPU anymore in 64-bit VM. > > I think there are several places in 64-bit VM where we assume SSE2 instructions are always available. > So if you set `UseSSE=1 or = 0` in debugger VM may crash. Having some kind of pre-JEP patch for this might be helpful so that we don't drill down on this rather than the whole patch. Maybe the JEP patch could simply be what @iwanowww suggests. Then have a post-JEP patch to remove everything else. Sort of like what we did with Security Manager. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22567#discussion_r1912067239 From qamai at openjdk.org Sun Jan 12 13:48:03 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Sun, 12 Jan 2025 13:48:03 GMT Subject: RFR: 8347481: C2: Remove the control input of some nodes Message-ID: Hi, While working on [JDK-8347365](https://bugs.openjdk.org/browse/JDK-8347365), I noticed that there are some nodes that have their control inputs being set in a seemingly erroneous manner. This patch removes the control inputs for those nodes. Please review this PR, thanks a lot.
------------- Commit messages: - remove control inputs from several nodes Changes: https://git.openjdk.org/jdk/pull/23055/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23055&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347481 Stats: 58 lines in 8 files changed: 0 ins; 2 del; 56 mod Patch: https://git.openjdk.org/jdk/pull/23055.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23055/head:pull/23055 PR: https://git.openjdk.org/jdk/pull/23055 From dholmes at openjdk.org Mon Jan 13 02:36:42 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Jan 2025 02:36:42 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 22:07:12 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. 
>> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into new-poison > - Merge branch 'master' into new-poison > - remove more os-specific posix forwarding headers > - stefank whitespace suggestions > - add permit wrapper for strdup and use in aix > - remove os-specific posix forwarding headers > - aix permit patches > - more fixes for clang noreturn issues > - Merge branch 'master' into new-poison > - update copyrights > - ... and 5 more: https://git.openjdk.org/jdk/compare/5e6fd8ba...6d49abbb Nothing further from me. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22890#pullrequestreview-2545615287 From dholmes at openjdk.org Mon Jan 13 02:36:43 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Jan 2025 02:36:43 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v2] In-Reply-To: References: <8gxnsZYbqEJ7T3N637tjijrVbmQgbu8BrHHmVAjCt5M=.f98893f3-692a-4168-80f0-997f522ec4b0@github.com> <7w5uOtMR8ROpBIYdJHf4L44ANnxawhijM_iBCHpUtcI=.46e22653-732f-4edd-ad2f-c947d5928f6d@github.com> <012w8uAs0fzFh52ycdTUSBLh0Kv-TgwiCSAdWA5sBDM=.16e4e1be-6fbe-4543-b565-499fd7c40f63@github.com> Message-ID: <-kyK-7ucrlR3C41y8P9m1FnG4TGZhPNEqWYdd0bWIT8=.e475314f-3b78-43a5-80ca-5e53bb100bf7@github.com> On Fri, 10 Jan 2025 13:00:31 GMT, Coleen Phillimore wrote: >> @kimbarrett I understand. I also don't like the includer needing to know about this, and maybe it is time to add OS_FAMILY_HEADER to deal with it. But in the absence of the new macro I prefer this break of abstraction to the creation of a bunch of tiny forwarding files. > > I prefer the opposite. If you break these up into a bunch of forwarding files, do all the rest of the VM the same way. > > It may be my problem but I don't have a good IDE to have to navigate through a slew of similarly named files to find the real implementation. 
@coleenp I thought you and I were on the same page not liking the creation of all these little files that just forward to the posix one. ??? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22890#discussion_r1912608160 From dholmes at openjdk.org Mon Jan 13 05:05:49 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Jan 2025 05:05:49 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. Sorry for a "dumb" question but `%z` is for size_t arguments, so why are we using it to replace INTX/UINTX_FORMAT ??? I get that size_t and intx happen to be the same size but still ... if I see `%z` I expect to see a size_t argument passed in. 
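The width assumption behind this question can be made explicit with a small C++ sketch. It assumes that `intx` and `uintx` are typedefs for `intptr_t` and `uintptr_t`; they are redeclared locally here so the example is self-contained, and `format_pair` is a hypothetical helper, not a HotSpot function.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumed to mirror HotSpot's typedefs; declared locally for this sketch.
typedef intptr_t  intx;
typedef uintptr_t uintx;

// %zd / %zu are specified for size_t-sized integers, so they only stand in
// for the old INTX_FORMAT / UINTX_FORMAT macros where the widths agree.
static_assert(sizeof(intx)  == sizeof(size_t), "intx must be size_t-sized");
static_assert(sizeof(uintx) == sizeof(size_t), "uintx must be size_t-sized");

// Hypothetical helper: formats a signed/unsigned pair as a log line might.
std::string format_pair(intx a, uintx b) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%zd/%zu", a, b);
  return std::string(buf);
}
```

The `static_assert`s turn the "happen to be the same size" observation into a compile-time check. They do not address the readability concern that a `%z` specifier suggests a `size_t` argument to the reader.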
------------- PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2545711471 From thartmann at openjdk.org Mon Jan 13 06:11:36 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 06:11:36 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: <_NHQ3uM6mAOsnk_h6vMfR_7rScZq0_utgLE37hTBakM=.f00fe780-d4fb-450a-a474-269294f50e07@github.com> On Wed, 8 Jan 2025 12:17:14 GMT, Roland Westrelin wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Copyright date > > Looks good to me. Thanks for the review, Vladimir! @rwestrel are you okay with the changes as well? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2586236419 From roland at openjdk.org Mon Jan 13 08:14:40 2025 From: roland at openjdk.org (Roland Westrelin) Date: Mon, 13 Jan 2025 08:14:40 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:22:12 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. 
That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Copyright date Looks good to me. ------------- Marked as reviewed by roland (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22967#pullrequestreview-2545954242 From jbhateja at openjdk.org Mon Jan 13 09:06:12 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 13 Jan 2025 09:06:12 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v10] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. 
New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review comments resolutions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/175f4ed2..43aa3eb7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=08-09 Stats: 22 lines in 5 files changed: 5 ins; 2 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Mon Jan 13 09:06:13 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 13 Jan 2025 09:06:13 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> References: 
<_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: <88_pE_E7P1iOkpSUuLuou6wH9UxWvPx83MFo033dY2Y=.d942086a-e87f-45dd-8c1d-72b8fd9c85d6@github.com> On Thu, 9 Jan 2025 13:13:30 GMT, Emanuel Peter wrote: >> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> Updating copyright year of modified files. > > src/hotspot/share/opto/superword.cpp line 2567: > >> 2565: // half float to float, in such a case back propagation of narrow type (SHORT) >> 2566: // may not be possible. >> 2567: if (n->Opcode() == Op_ConvF2HF || n->Opcode() == Op_ReinterpretHF2S) { > > Is this relevant, or does that belong to a different (vector) RFE? It makes sure to assign a SHORT container type to the ReinterpretHF2S node which could be succeeded by a ConvHF2F IR which expects its inputs to be of SHORT type. 
During early phase of SLP extraction we get into a control flow querying the implemented vector IR opcode through split_packs_only_implemented_with_smaller_size https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/vectornode.cpp#L1446 This scenario is tested by following JTREG [test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java](https://github.com/openjdk/jdk/pull/22754/files#diff-7e7404a977d8ca567f8005b80bd840ea2e722c022e7187fa2dd21df4a5837faaR49) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1912858395 From jbhateja at openjdk.org Mon Jan 13 09:06:14 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 13 Jan 2025 09:06:14 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Thu, 9 Jan 2025 19:22:35 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java line 1434: >> >>> 1432: return float16ToRawShortBits(valueOf(product + float16ToFloat(f16c))); >>> 1433: }); >>> 1434: return shortBitsToFloat16(res); >> >> I don't understand what is happening here. But I leave this to @PaulSandoz to review > > Uncertain on what bits, but i am guessing it's mostly related to the fallback code in the lambda. To avoid the intrinsics operating on Float16 instances we instead "unpack" the carrier (16bits) values and pass those as arguments to the intrinsic. The fallback (when intrinsification is not supported) also accepts those carrier values as arguments and we convert the carriers to floats, operate on then, convert to the carrier, and then back to float16 on the result. 
> > The code in the lambda could potentially be simplified if `Float16Math.fma` accepted six arguments the first three being the carrier values used by the intrinsic, and the subsequent three being the float16 values used by the fallback. Then we could express the code in the original source in the lambda. I believe when intrinsified there would be no penalty for those extra arguments. Hi @PaulSandoz , In the current scheme we are passing unboxed carriers to intrinsic entry point, in the fallback implementation carrier type is first converted to floating point value using Float.float16ToFloat API which expects to receive a short type argument, after the operation we again convert float value to carrier type (short) using Float.floatToFloat16 API which expects a float argument, thus our intent here is to perform unboxing and boxing outside the intrinsic thereby avoiding all complexities around boxing by compiler. Even if we pass 3 additional parameters we still need to use Float16.floatValue which invokes Float.float16ToFloat underneath, thus this minor modification on Java side is on account of optimizing the intrinsic interface. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1912858286 From thartmann at openjdk.org Mon Jan 13 09:52:51 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 09:52:51 GMT Subject: RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic [v4] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 10:22:12 GMT, Tobias Hartmann wrote: >> C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 >> >> If these guards pass, the array length is loaded: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 >> >> But since the `LoadRangeNode` is not pinned, it might float above the array guard: >> https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 >> >> If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. >> >> The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. >> >> Thanks to @shipilev for identifying the root cause! >> >> I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Copyright date Thanks again, Roland! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22967#issuecomment-2586636675 From thartmann at openjdk.org Mon Jan 13 09:52:52 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 09:52:52 GMT Subject: Integrated: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: References: Message-ID: On Wed, 8 Jan 2025 12:07:16 GMT, Tobias Hartmann wrote: > C2's arraycopy intrinsic adds guards that check that the source and destination objects are arrays: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5917-L5919 > > If these guards pass, the array length is loaded: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/library_call.cpp#L5930-L5933 > > But since the `LoadRangeNode` is not pinned, it might float above the array guard: > https://github.com/openjdk/jdk/blob/afe543414f58a04832d4f07dea88881d64954a0b/src/hotspot/share/opto/graphKit.cpp#L1214 > > If the object is not an array, we will read garbage. That's usually fine because the result will not be used (the array guard will trigger) but with `-XX:+UseCompactObjectHeaders` it can happen that the memory right after the header is not mapped and we crash. > > The fix is to add a `CheckCastPPNode` to propagate the information that the operand is an array and prevent the load from floating. > > Thanks to @shipilev for identifying the root cause! > > I was able to reliably reproduce the issue with `compiler/arraycopy/TestArrayCopyNoInit.java` and `-XX:-UseTLAB -XX:+UnlockExperimentalVMOptions -XX:+UseCompactObjectHeaders` on Linux AArch64 and verified that the fix solves the problem. > > Best regards, > Tobias This pull request has now been integrated. 
Changeset: 82e2a791 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/82e2a791225a289ba32360bf415274c4b48b9e00 Stats: 58 lines in 5 files changed: 14 ins; 0 del; 44 mod 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic Reviewed-by: roland, qamai, kvn ------------- PR: https://git.openjdk.org/jdk/pull/22967 From thartmann at openjdk.org Mon Jan 13 10:03:38 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 10:03:38 GMT Subject: [jdk24] RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic Message-ID: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> Hi all, This pull request contains a backport of commit [82e2a791](https://github.com/openjdk/jdk/commit/82e2a791225a289ba32360bf415274c4b48b9e00) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Tobias Hartmann on 13 Jan 2025 and was reviewed by Roland Westrelin, Quan Anh Mai and Vladimir Kozlov. Thanks! 
------------- Commit messages: - Backport 82e2a791225a289ba32360bf415274c4b48b9e00 Changes: https://git.openjdk.org/jdk/pull/23063/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23063&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347006 Stats: 58 lines in 5 files changed: 14 ins; 0 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/23063.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23063/head:pull/23063 PR: https://git.openjdk.org/jdk/pull/23063 From chagedorn at openjdk.org Mon Jan 13 10:21:50 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 13 Jan 2025 10:21:50 GMT Subject: [jdk24] RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> References: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> Message-ID: <0Ne6wgfK-3iq9UW6gnlTcAeb1inawpuPZAj9Fqa7ArI=.d8779111-c32e-44f7-8705-888ef669d8b5@github.com> On Mon, 13 Jan 2025 09:57:01 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [82e2a791](https://github.com/openjdk/jdk/commit/82e2a791225a289ba32360bf415274c4b48b9e00) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jan 2025 and was reviewed by Roland Westrelin, Quan Anh Mai and Vladimir Kozlov. > > Thanks! Looks good! ------------- Marked as reviewed by chagedorn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23063#pullrequestreview-2546218675 From thartmann at openjdk.org Mon Jan 13 10:27:50 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 10:27:50 GMT Subject: [jdk24] RFR: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> References: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> Message-ID: <2uNuo6we9RJ4orQPuGCa-GGyTSgHvsGdfH4nRsjqUEQ=.db8bb385-bdc4-4870-ad08-b09b85afafc1@github.com> On Mon, 13 Jan 2025 09:57:01 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [82e2a791](https://github.com/openjdk/jdk/commit/82e2a791225a289ba32360bf415274c4b48b9e00) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jan 2025 and was reviewed by Roland Westrelin, Quan Anh Mai and Vladimir Kozlov. > > Thanks! Thanks for the review, Christian! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23063#issuecomment-2586720650 From coleenp at openjdk.org Mon Jan 13 13:12:57 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Jan 2025 13:12:57 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: References: Message-ID: <0werdci3K7vyrFZq2vjxDyVRQf5u6_2_mfwriae1Ds8=.c74f077e-7325-454a-b611-684db9f5b583@github.com> On Thu, 9 Jan 2025 22:07:12 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. 
Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. 
For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into new-poison > - Merge branch 'master' into new-poison > - remove more os-specific posix forwarding headers > - stefank whitespace suggestions > - add permit wrapper for strdup and use in aix > - remove os-specific posix forwarding headers > - aix permit patches > - more fixes for clang noreturn issues > - Merge branch 'master' into new-poison > - update copyrights > - ... and 5 more: https://git.openjdk.org/jdk/compare/3c60d213...6d49abbb This looks good for me too. I think you should allow GHA to run for this change. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22890#pullrequestreview-2546569987 From coleenp at openjdk.org Mon Jan 13 13:29:45 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Jan 2025 13:29:45 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v5] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Tue, 7 Jan 2025 12:51:33 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine.
This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Restore copyright and macro. They are interchangeable and some places used UINTX_FORMAT when they should have used SIZE_FORMAT. Better to have just one and just use %zu, which looks better in the format specifiers. I'm going to do SIZE_FORMAT next but still negotiating how to handle review tedium. The error message can be confusing though because the error message for %z refers to size_t. But some of our use of intx should probably be size_t. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2587101349 From thartmann at openjdk.org Mon Jan 13 13:48:47 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 13 Jan 2025 13:48:47 GMT Subject: [jdk24] Integrated: 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic In-Reply-To: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> References: <-9ijGzFfYh1nwsa9JmJPDLbujCJUTAwUNffcu0XcA1g=.346d2d7a-49fa-4e44-ad63-948ae96550a3@github.com> Message-ID: On Mon, 13 Jan 2025 09:57:01 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [82e2a791](https://github.com/openjdk/jdk/commit/82e2a791225a289ba32360bf415274c4b48b9e00) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 13 Jan 2025 and was reviewed by Roland Westrelin, Quan Anh Mai and Vladimir Kozlov. > > Thanks! This pull request has now been integrated. 
Changeset: da74fbd9 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/da74fbd920cdcd16f6097fcb8488a061f4753be5 Stats: 58 lines in 5 files changed: 14 ins; 0 del; 44 mod 8347006: LoadRangeNode floats above array guard in arraycopy intrinsic Reviewed-by: chagedorn Backport-of: 82e2a791225a289ba32360bf415274c4b48b9e00 ------------- PR: https://git.openjdk.org/jdk/pull/23063 From kbarrett at openjdk.org Mon Jan 13 15:23:45 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 15:23:45 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: <0werdci3K7vyrFZq2vjxDyVRQf5u6_2_mfwriae1Ds8=.c74f077e-7325-454a-b611-684db9f5b583@github.com> References: <0werdci3K7vyrFZq2vjxDyVRQf5u6_2_mfwriae1Ds8=.c74f077e-7325-454a-b611-684db9f5b583@github.com> Message-ID: On Mon, 13 Jan 2025 13:09:49 GMT, Coleen Phillimore wrote: > This looks good for me too. I think you should allow GHA to run for this change. I ran GHA tests on at least one early version, but I'll do another run before integrating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2587404905 From galder at openjdk.org Mon Jan 13 15:37:44 2025 From: galder at openjdk.org (Galder Zamarreño) Date: Mon, 13 Jan 2025 15:37:44 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Thu, 9 Jan 2025 12:18:48 GMT, Emanuel Peter wrote: >> @eme64 aarch64 work for this is now complete. I tweaked the `applyIf` condition `MinMaxRed_Long` to make sure `MaxVectorSize` is 32 or higher. I verified this on both Graviton 3 (256 bit register, `MaxVectorSize=32`) and an AVX-512 intel (512 bit register, `MaxVectorSize=64`) > > @galderz So you want me to review again?
@eme64 Yes please ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2587450313 From galder at openjdk.org Mon Jan 13 15:45:46 2025 From: galder at openjdk.org (Galder Zamarreño) Date: Mon, 13 Jan 2025 15:45:46 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Thu, 9 Jan 2025 12:18:48 GMT, Emanuel Peter wrote: >> @eme64 aarch64 work for this is now complete. I tweaked the `applyIf` condition `MinMaxRed_Long` to make sure `MaxVectorSize` is 32 or higher. I verified this on both Graviton 3 (256 bit register, `MaxVectorSize=32`) and an AVX-512 intel (512 bit register, `MaxVectorSize=64`) > > @galderz So you want me to review again? @eme64 I've noticed some failures in CI, I'll check those and ping you when it's ready for a review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2587472207 From coleenp at openjdk.org Mon Jan 13 15:49:15 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Jan 2025 15:49:15 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v6] In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: > There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add Oracle copyright to shenandoah files for this change. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22916/files - new: https://git.openjdk.org/jdk/pull/22916/files/ae9d9f6f..763c3908 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22916&range=04-05 Stats: 4 lines in 4 files changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/22916.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22916/head:pull/22916 PR: https://git.openjdk.org/jdk/pull/22916 From psandoz at openjdk.org Mon Jan 13 16:53:45 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Mon, 13 Jan 2025 16:53:45 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Mon, 13 Jan 2025 09:02:24 GMT, Jatin Bhateja wrote: >> Uncertain on what bits, but I am guessing it's mostly related to the fallback code in the lambda. To avoid the intrinsics operating on Float16 instances we instead "unpack" the carrier (16-bit) values and pass those as arguments to the intrinsic. The fallback (when intrinsification is not supported) also accepts those carrier values as arguments and we convert the carriers to floats, operate on them, convert to the carrier, and then back to float16 on the result. >> >> The code in the lambda could potentially be simplified if `Float16Math.fma` accepted six arguments, the first three being the carrier values used by the intrinsic, and the subsequent three being the float16 values used by the fallback. Then we could express the code in the original source in the lambda. I believe when intrinsified there would be no penalty for those extra arguments.
> > Hi @PaulSandoz, In the current scheme we are passing unboxed carriers to intrinsic entry point, in the fallback implementation carrier type is first converted to floating point value using Float.float16ToFloat API which expects to receive a short type argument, after the operation we again convert float value to carrier type (short) using Float.floatToFloat16 API which expects a float argument, thus our intent here is to perform unboxing and boxing outside the intrinsic thereby avoiding all complexities around boxing by compiler. Even if we pass 3 additional parameters we still need to use Float16.floatValue which invokes Float.float16ToFloat underneath, thus this minor modification on Java side is on account of optimizing the intrinsic interface. Yes, I understand the approach. It's about clarity of the fallback implementation retaining what was expressed in the original code: short res = Float16Math.fma(fa, fb, fc, a, b, c, (a_, b_, c_) -> { double product = (double)(a_.floatValue() * b_.floatValue()); return valueOf(product + c_.doubleValue()); }); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1913502565 From kbarrett at openjdk.org Mon Jan 13 16:54:53 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 16:54:53 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v6] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 13 Jan 2025 15:49:15 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing.
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add Oracle copyright to shenandoah files for this change. Still good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2547230951 From galder at openjdk.org Mon Jan 13 17:12:31 2025 From: galder at openjdk.org (Galder Zamarreño) Date: Mon, 13 Jan 2025 17:12:31 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the Java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes with MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g.
> > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add any platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ...
Galder Zamarreño has updated the pull request incrementally with one additional commit since the last revision: Make sure it runs with cpus with either avx512 or asimd ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/c0491987..abbaf875 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=07-08 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From kbarrett at openjdk.org Mon Jan 13 18:28:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 18:28:48 GMT Subject: Integrated: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION In-Reply-To: References: Message-ID: On Sun, 29 Dec 2024 08:11:07 GMT, Kim Barrett wrote: > Please review this change to how HotSpot prevents the use of certain C library > functions (e.g. poisons references to those functions), while permitting a > subset to be used in restricted circumstances. Reasons for poisoning a > function include it being considered obsolete, or a security concern, or there > is a HotSpot function (typically in the os:: namespace) providing similar > functionality that should be used instead. > > The old mechanism, based on -Wattribute-warning and the associated attribute, > only worked for gcc. (Clang's implementation differs in an important way from > gcc, which is the subject of a clang bug that has been open for years. MSVC > doesn't provide a similar mechanism.) It also had problems with LTO, due to a > gcc bug. > > The new mechanism is based on deprecation warnings, using [[deprecated]] > attributes. We redeclare or forward declare the functions we want to prevent > use of as being deprecated.
This relies on deprecation warnings being > enabled, which they already are in our build configuration. All of our > supported compilers support the [[deprecated]] attribute. > > Another benefit of using deprecation warnings rather than warning attributes > is the time when the check is performed. Warning attributes are checked only > if the function is referenced after all optimizations have been performed. > Deprecation is checked during initial semantic analysis. That's better for > our purposes here. (This is also part of why gcc LTO has problems with the > old mechanism, but not the new.) > > Adding these redeclarations or forward declarations isn't as simple as > expected, due to differences between the various compilers. We hide the > differences behind a set of macros, FORBID_C_FUNCTION and related macros. See > the compiler-specific parts of those macros for details. > > In some situations we need to allow references to these poisoned functions. > > One common case is where our poisoning is visible to some 3rd party code we > don't want to modify. This is typically 3rd party headers included in HotSpot > code, such as from Google Test or the C++ Standard Library. For these the > BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context > where such references are permitted. > > Some of the poisoned functions are needed to implement associated HotSpot os:: > functions, or in other similarly restricted contexts. For these, a wrapper > function is provided that calls the poisoned function with the warning > suppressed. These wrappers are defined in the permit_fo... This pull request has now been integrated.
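The deprecation-based poisoning and the "permit" wrappers described in the quoted text can be illustrated with a minimal sketch. This is not HotSpot's actual FORBID_C_FUNCTION implementation: the names `SKETCH_FORBID`, `legacy_copy`, `permit_legacy_copy` and `check_permitted_wrapper` are hypothetical, and a stand-in function is used instead of a real C library function in order to sidestep the compiler-specific redeclaration differences the description mentions.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a C library function that should no longer be
// called directly. It is declared and defined normally first.
char* legacy_copy(char* dst, const char* src) {
  char* out = dst;
  while ((*out++ = *src++) != '\0') {
  }
  return dst;
}

// Hypothetical macro (an assumption for this sketch): redeclare a function as
// deprecated, so that later references draw a deprecation warning.
#define SKETCH_FORBID(signature, alternative) \
  [[deprecated(alternative)]] signature

SKETCH_FORBID(char* legacy_copy(char* dst, const char* src),
              "use permit_legacy_copy in restricted contexts");

// Wrapper for restricted contexts: it calls the poisoned function with the
// deprecation warning suppressed for just this definition.
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif
inline char* permit_legacy_copy(char* dst, const char* src) {
  return legacy_copy(dst, src);
}
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#elif defined(_MSC_VER)
#pragma warning(pop)
#endif

// Exercises the wrapper. Calling legacy_copy() directly at this point would
// be expected to produce a deprecation warning instead.
void check_permitted_wrapper() {
  char buf[8];
  assert(std::strcmp(permit_legacy_copy(buf, "ok"), "ok") == 0);
}
```

With deprecation warnings enabled (and treated as errors, if the build is configured that way), code that references `legacy_copy` directly is flagged during semantic analysis, while code that goes through `permit_legacy_copy` compiles cleanly.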
Changeset: e0f2f4b2 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/e0f2f4b216bc9358caa65975204aee086e4fcbd2 Stats: 592 lines in 32 files changed: 415 ins; 64 del; 113 mod 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION Co-authored-by: Martin Doerr Reviewed-by: coleenp, dholmes, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/22890 From kbarrett at openjdk.org Mon Jan 13 18:28:46 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Jan 2025 18:28:46 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 22:07:12 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. 
Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used to demarcate the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into new-poison > - Merge branch 'master' into new-poison > - remove more os-specific posix forwarding headers > - stefank whitespace suggestions > - add permit wrapper for strdup and use in aix > - remove os-specific posix forwarding headers > - aix permit patches > - more fixes for clang noreturn issues > - Merge branch 'master' into new-poison > - update copyrights > - ...
and 5 more: https://git.openjdk.org/jdk/compare/f7e23973...6d49abbb Thanks for reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2587889007 From dholmes at openjdk.org Mon Jan 13 21:07:44 2025 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Jan 2025 21:07:44 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v6] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: <9iKSUKvDXxUoqoAxOYVEqzUGmf2PQwDLy_MfbFABs88=.30a14b62-a21e-4a6e-bf32-31454689a33f@github.com> On Mon, 13 Jan 2025 15:49:15 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add Oracle copyright to shenandoah files for this change. Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/22916#pullrequestreview-2547892700 From coleenp at openjdk.org Mon Jan 13 22:06:49 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Jan 2025 22:06:49 GMT Subject: Integrated: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros In-Reply-To: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Fri, 3 Jan 2025 14:32:39 GMT, Coleen Phillimore wrote: > There are a lot of format modifiers that are noisy and unnecessary in the code. 
This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. > > Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. This pull request has now been integrated. Changeset: 379d05bc Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/379d05bcc130446086786ecf6ca5a6b8e977386c Stats: 344 lines in 83 files changed: 6 ins; 19 del; 319 mod 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros Reviewed-by: kbarrett, dholmes, matsaave ------------- PR: https://git.openjdk.org/jdk/pull/22916 From coleenp at openjdk.org Mon Jan 13 22:06:48 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Jan 2025 22:06:48 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v6] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 13 Jan 2025 15:49:15 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add Oracle copyright to shenandoah files for this change. Thank you Matias, Kim and David. 
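For readers converting format strings to the `%z` length modifier as in this change, a small sketch of how it combines with the hexadecimal conversion may be useful. The helper names below are hypothetical and not part of HotSpot. The behaviour shown follows the C standard's definition of the `#` flag, under which only a nonzero value gets the `0x` prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a size_t with a literal "0x" in the format string, so the prefix
// is printed for every value, including zero.
std::string hex_literal_prefix(size_t value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "0x%zx", value);
  return buf;
}

// Formats a size_t with the '#' (alternate form) flag, which adds the "0x"
// prefix only when the converted value is nonzero.
std::string hex_alternate_form(size_t value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%#zx", value);
  return buf;
}

// The two spellings agree for nonzero values and differ only for zero.
void check_zero_handling() {
  assert(hex_literal_prefix(255) == "0xff");
  assert(hex_alternate_form(255) == "0xff");
  assert(hex_literal_prefix(0) == "0x0");
  assert(hex_alternate_form(0) == "0");
}
```

Any consumer that parses such output and expects a `0x` prefix on every value would therefore need the literal-prefix form, or would need to accept a bare `0`.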
------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2588312123 From dholmes at openjdk.org Tue Jan 14 01:18:43 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Jan 2025 01:18:43 GMT Subject: RFR: 8346990: Remove INTX_FORMAT and UINTX_FORMAT macros [v6] In-Reply-To: References: <3DB-2pH7wwVWDuJfkD1XoQwGKJOYxJKhuDQ0UeuxBC4=.03b5f432-6051-49d9-8ea9-34a9ea769ad1@github.com> Message-ID: On Mon, 13 Jan 2025 15:49:15 GMT, Coleen Phillimore wrote: >> There are a lot of format modifiers that are noisy and unnecessary in the code. This change removes the INTX variants. It's not that disruptive even for backporting because %z modifier has been available for a long time so should backport fine. This was mostly done with a sed script plus some hand fixups. >> >> Testing mach5 and other platform cross compilations in progress. Opening this for GHA testing. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add Oracle copyright to shenandoah files for this change. We have belatedly discovered that `0x%zx` and `%#zx` behave differently in their handling of zero. The former prints `0x0` while the latter just prints `0`. This has broken the compiler replay tests as the parsing of 0 no longer works. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22916#issuecomment-2588550581 From galder at openjdk.org Tue Jan 14 05:10:41 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Tue, 14 Jan 2025 05:10:41 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: On Thu, 9 Jan 2025 12:18:48 GMT, Emanuel Peter wrote: >> @eme64 aarch64 work for this is now complete. I tweaked the `applyIf` condition `MinMaxRed_Long` to make sure `MaxVectorSize` is 32 or higher. 
I verified this on both Graviton 3 (256 bit register, `MaxVectorSize=32`) and an AVX-512 intel (512 bit register, `MaxVectorSize=64`) > > @galderz So you want me to review again? @eme64 I've fixed the test issue, it's ready to be reviewed ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2589014197 From jwaters at openjdk.org Tue Jan 14 07:51:03 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 14 Jan 2025 07:51:03 GMT Subject: RFR: 8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION [v8] In-Reply-To: References: Message-ID: On Thu, 9 Jan 2025 22:07:12 GMT, Kim Barrett wrote: >> Please review this change to how HotSpot prevents the use of certain C library >> functions (e.g. poisons references to those functions), while permitting a >> subset to be used in restricted circumstances. Reasons for poisoning a >> function include it being considered obsolete, or a security concern, or there >> is a HotSpot function (typically in the os:: namespace) providing similar >> functionality that should be used instead. >> >> The old mechanism, based on -Wattribute-warning and the associated attribute, >> only worked for gcc. (Clang's implementation differs in an important way from >> gcc, which is the subject of a clang bug that has been open for years. MSVC >> doesn't provide a similar mechanism.) It also had problems with LTO, due to a >> gcc bug. >> >> The new mechanism is based on deprecation warnings, using [[deprecated]] >> attributes. We redeclare or forward declare the functions we want to prevent >> use of as being deprecated. This relies on deprecation warnings being >> enabled, which they already are in our build configuration. All of our >> supported compilers support the [[deprecated]] attribute. >> >> Another benefit of using deprecation warnings rather than warning attributes >> is the time when the check is performed. 
Warning attributes are checked only >> if the function is referenced after all optimizations have been performed. >> Deprecation is checked during initial semantic analysis. That's better for >> our purposes here. (This is also part of why gcc LTO has problems with the >> old mechanism, but not the new.) >> >> Adding these redeclarations or forward declarations isn't as simple as >> expected, due to differences between the various compilers. We hide the >> differences behind a set of macros, FORBID_C_FUNCTION and related macros. See >> the compiler-specific parts of those macros for details. >> >> In some situations we need to allow references to these poisoned functions. >> >> One common case is where our poisoning is visible to some 3rd party code we >> don't want to modify. This is typically 3rd party headers included in HotSpot >> code, such as from Google Test or the C++ Standard Library. For these the >> BEGIN/END_ALLOW_FORBIDDEN_FUNCTIONS pair of macros are used demark the context >> where such references are permitted. >> >> Some of the poisoned functions are needed to implement associated HotSpot os:: >> functions, or in other similarly restricted contexts. For these, a wrapper >> function is provided that calls the poison... > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - Merge branch 'master' into new-poison > - Merge branch 'master' into new-poison > - remove more os-specific posix forwarding headers > - stefank whitespace suggestions > - add permit wrapper for strdup and use in aix > - remove os-specific posix forwarding headers > - aix permit patches > - more fixes for clang noreturn issues > - Merge branch 'master' into new-poison > - update copyrights > - ... 
and 5 more: https://git.openjdk.org/jdk/compare/a835d49f...6d49abbb Curious question: Did we ever find out why the attribute warning didn't fire off at the sites of forbidden methods being called inside gtest? ------------- PR Comment: https://git.openjdk.org/jdk/pull/22890#issuecomment-2589232654 From epeter at openjdk.org Tue Jan 14 08:23:40 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 14 Jan 2025 08:23:40 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9] In-Reply-To: <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> Message-ID: On Mon, 13 Jan 2025 17:12:31 GMT, Galder Zamarre?o wrote: >> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. >> >> Currently vectorization does not kick in for loops containing either of these calls because of the following error: >> >> >> VLoop::check_preconditions: failed: control flow in loop not allowed >> >> >> The control flow is due to the java implementation for these methods, e.g. >> >> >> public static long max(long a, long b) { >> return (a >= b) ? a : b; >> } >> >> >> This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. >> By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. >> E.g. 
>> >> >> SuperWord::transform_loop: >> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined >> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) >> >> >> Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1155 >> long max 1173 >> >> >> After the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1042 >> long max 1042 >> >> >> This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. >> Therefore, it still relies on the macro expansion to transform those into CMoveL. >> >> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PA... > > Galder Zamarre?o has updated the pull request incrementally with one additional commit since the last revision: > > Make sure it runs with cpus with either avx512 or asimd test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 2: > 1: /* > 2: * Copyright (c) 2024, Red Hat, Inc. All rights reserved. Should probably be 2025 by now, right? 
At least that is our policy for Oracle, not sure what you want to do at Red Hat ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914414960 From epeter at openjdk.org Tue Jan 14 08:41:45 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 14 Jan 2025 08:41:45 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9] In-Reply-To: <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> Message-ID: On Mon, 13 Jan 2025 17:12:31 GMT, Galder Zamarre?o wrote: >> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. >> >> Currently vectorization does not kick in for loops containing either of these calls because of the following error: >> >> >> VLoop::check_preconditions: failed: control flow in loop not allowed >> >> >> The control flow is due to the java implementation for these methods, e.g. >> >> >> public static long max(long a, long b) { >> return (a >= b) ? a : b; >> } >> >> >> This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. >> By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. >> E.g. >> >> >> SuperWord::transform_loop: >> Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined >> 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) >> >> >> Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. 
Before the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1155 >> long max 1173 >> >> >> After the patch, on darwin/aarch64 (M1): >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java >> 1 1 0 0 >> ============================== >> TEST SUCCESS >> >> long min 1042 >> long max 1042 >> >> >> This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. >> Therefore, it still relies on the macro expansion to transform those into CMoveL. >> >> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: >> >> >> ============================== >> Test summary >> ============================== >> TEST TOTAL PA... > > Galder Zamarre?o has updated the pull request incrementally with one additional commit since the last revision: > > Make sure it runs with cpus with either avx512 or asimd Changes requested by epeter (Reviewer). test/hotspot/jtreg/compiler/intrinsics/math/TestMinMaxInlining.java line 2: > 1: /* > 2: * Copyright (c) 2024, Red Hat, Inc. All rights reserved. Year 2025? No idea what the Red Hat policy is though. test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 104: > 102: } > 103: > 104: public static void ReductionInit(long[] longs, int probability) { Suggestion: public static void reductionInit(long[] longs, int probability) { This is a method name, not a class - so I think it should start lower-case, right? 
test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 107: > 105: int aboveCount, abovePercent; > 106: > 107: // Iterate until you find a set that matches the requirement probability Can you give a high-level definition / explanation what this does? Also: what is the expected number of rounds you iterate here? I'm asking because I would like to be sure that a timeout is basically impossible because the probability is too low. test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 121: > 119: } else { > 120: // Decrement by at least 1 > 121: long decrement = random.nextLong(10) + 1; Nit: I would call it `diffToMax`, because you are really just going to get a value below the `max`, and you are not decrementing the `max`. But up to you if you want to change it. ------------- PR Review: https://git.openjdk.org/jdk/pull/20098#pullrequestreview-2549073452 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914424784 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914426190 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914431043 PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914434395 From epeter at openjdk.org Tue Jan 14 08:41:46 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 14 Jan 2025 08:41:46 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> Message-ID: On Tue, 14 Jan 2025 08:29:25 GMT, Emanuel Peter wrote: >> Galder Zamarre?o has updated the pull request incrementally with one additional commit since the last revision: >> >> Make sure it runs with cpus with either avx512 or asimd > > test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 104: > >> 102: } >> 103: 
>> 104:     public static void ReductionInit(long[] longs, int probability) {
>
> Suggestion:
>
>     public static void reductionInit(long[] longs, int probability) {
>
> This is a method name, not a class - so I think it should start lower-case, right?

And the method might as well allocate the array too. But up to you.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1914427359

From epeter at openjdk.org Tue Jan 14 08:46:38 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Tue, 14 Jan 2025 08:46:38 GMT
Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
In-Reply-To:
References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>
Message-ID:

On Tue, 14 Jan 2025 05:07:55 GMT, Galder Zamarreño wrote:

>> @galderz So you want me to review again?
>
> @eme64 I've fixed the test issue, it's ready to be reviewed

@galderz I don't remember from above, but did you ever run the Long Min/Max benchmarks from this? https://github.com/openjdk/jdk/pull/21032 Would just be nice to see that they have an improvement after this change :)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2589327111

From epeter at openjdk.org Tue Jan 14 08:57:45 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Tue, 14 Jan 2025 08:57:45 GMT
Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
In-Reply-To:
References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>
Message-ID:

On Tue, 14 Jan 2025 05:07:55 GMT, Galder Zamarreño wrote:

>> @galderz So you want me to review again?
>
> @eme64 I've fixed the test issue, it's ready to be reviewed

@galderz I ran some testing on our side, feel free to ping me in 1-2 days for the results.
------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2589351747 From jbhateja at openjdk.org Tue Jan 14 13:09:45 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 14 Jan 2025 13:09:45 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Mon, 13 Jan 2025 16:51:02 GMT, Paul Sandoz wrote: >> Hi @PaulSandoz , In the current scheme we are passing unboxed carriers to intrinsic entry point, in the fallback implementation carrier type is first converted to floating point value using Float.float16ToFloat API which expects to receive a short type argument, after the operation we again convert float value to carrier type (short) using Float.floatToFloat16 API which expects a float argument, thus our intent here is to perform unboxing and boxing outside the intrinsic thereby avoiding all complexities around boxing by compiler. Even if we pass 3 additional parameters we still need to use Float16.floatValue which invokes Float.float16ToFloat underneath, thus this minor modification on Java side is on account of optimizing the intrinsic interface. > > Yes, i understand the approach. It's about clarity of the fallback implementation retaining what was expressed in the original code: > > short res = Float16Math.fma(fa, fb, fc, a, b, c, > (a_, b_, c_) -> { > double product = (double)(a_.floatValue() * b._floatValue()); > return valueOf(product + c_.doubleValue()); > }); Hi @PaulSandoz , In above code snippet the return type 'short' of intrinsic call does not comply with the value being returned which is of box type, thereby mandating addition glue code. Regular primitive type boxing APIs are lazily intrinsified, thereby generating an intrinsifiable Call IR during parsing. 
LoadNode's idealization can fetch a boxed value from the input of the boxing call IR and forward it directly to its users.

Q1. What is the problem in directly passing Float16 boxes to FMA and SQRT intrinsic entry points?

A. The compiler will have to unbox them before the actual operation. There are multiple schemes to perform unboxing, such as name-based, offset-based, and index-based field lookup.
Vector API unbox expansion uses an offset-based payload field lookup; for this the VM bookkeeps the payload's offset in the runtime representation of the VectorPayload class, created as part of VM initialization.
However, the VM can only bookkeep this information for classes that are part of the java.base module. Float16, being part of an incubation module, cannot use offset-based field lookup. Thus the only viable alternative is to unbox using a field name or index based lookup.
For this, the compiler will first verify that the incoming oop is of Float16 type and then use a hardcoded name-based lookup to load the field value. This looks fragile, as it establishes an unwanted dependency between Float16 field names and the compiler implementation. The same applies to index-based lookup, as index values depend on the combined field count of class and instance-specific fields; thus any addition or deletion of a class-level static helper field before the field of interest can invalidate any hardcoded index value used by the compiler.
All in all, for safe and reliable unboxing by the compiler, it's necessary to create an upfront VM representation like vector_VectorPayload.

Q2. What are the pros and cons of passing both the unboxed value and boxed values to the intrinsic entry point?
A.
Pros:
- This will save an unsafe unboxing implementation if the holder class is not part of the java.base module.
- We can leverage the existing box intrinsification infrastructure, which directly forwards the embedded values to its users.
- Also, it will minimize the changes in the Java side implementation.

Cons:
- It's suboptimal in case the call is neither intrinsified nor inlined, as it will add additional spills before the call.

Q3. Primitive box class boxing API 'valueOf' accepts an argument of the corresponding primitive type. How different are Float16 boxing APIs?

A. Unlike primitive box classes, Float16 has multiple boxing APIs and none of them accept a short type argument.

    public static Float16 valueOf(int value)
    public static Float16 valueOf(long value)
    public static Float16 valueOf(float f)
    public static Float16 valueOf(double d)
    public static Float16 valueOf(String s) throws NumberFormatException
    public static Float16 valueOf(BigDecimal v)
    public static Float16 valueOf(BigInteger bi)

Thus, we need to add special handling to first downcast the parameter value to the short type carrier, otherwise it will pose problems in forwarding the boxed values. Existing LoadNode idealization directly forwards the input of unboxed Call IR to its users. To use existing idealization, we need to massage the input of unboxed Call IR to the exact carrier size, so it's not a meager one-line change in the following methods to enable seamless intrinsification of Float16 boxing APIs.

    bool ciMethod::is_boxing_method() const
    bool ciMethod::is_unboxing_method() const

Given the above observations passing 3 additional box arguments to intrinsic and returning a box value needs additional changes in the compiler while minor re-structuring in Java implementation packed with in the glue logic looks like a reasonable approach.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1914782512

From kbarrett at openjdk.org Tue Jan 14 16:43:10 2025
From: kbarrett at openjdk.org (Kim Barrett)
Date: Tue, 14 Jan 2025 16:43:10 GMT
Subject: RFR: 8347720: [BACKOUT] Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION
Message-ID:

Clean backout of JDK-8313396.

This reverts commit e0f2f4b216bc9358caa65975204aee086e4fcbd2.
Testing: mach5 tier1 ------------- Commit messages: - Revert "8313396: Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION" Changes: https://git.openjdk.org/jdk/pull/23110/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23110&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8347720 Stats: 584 lines in 32 files changed: 56 ins; 407 del; 121 mod Patch: https://git.openjdk.org/jdk/pull/23110.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23110/head:pull/23110 PR: https://git.openjdk.org/jdk/pull/23110 From coleenp at openjdk.org Tue Jan 14 16:47:36 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 14 Jan 2025 16:47:36 GMT Subject: RFR: 8347720: [BACKOUT] Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION In-Reply-To: References: Message-ID: <4wLUBhLUtRiVjQ0n63dvDoQxfQQ4JS0kWMxLREfTiFg=.df683c61-8036-499c-a2fe-7ae2dab6fbdf@github.com> On Tue, 14 Jan 2025 16:37:47 GMT, Kim Barrett wrote: > Clean backout of JDK-8313396. > > This reverts commit e0f2f4b216bc9358caa65975204aee086e4fcbd2. > > Testing: mach5 tier1 Backout looks good. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23110#pullrequestreview-2550391406 From kbarrett at openjdk.org Tue Jan 14 17:46:48 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Jan 2025 17:46:48 GMT Subject: RFR: 8347720: [BACKOUT] Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION In-Reply-To: <4wLUBhLUtRiVjQ0n63dvDoQxfQQ4JS0kWMxLREfTiFg=.df683c61-8036-499c-a2fe-7ae2dab6fbdf@github.com> References: <4wLUBhLUtRiVjQ0n63dvDoQxfQQ4JS0kWMxLREfTiFg=.df683c61-8036-499c-a2fe-7ae2dab6fbdf@github.com> Message-ID: On Tue, 14 Jan 2025 16:44:58 GMT, Coleen Phillimore wrote: >> Clean backout of JDK-8313396. >> >> This reverts commit e0f2f4b216bc9358caa65975204aee086e4fcbd2. >> >> Testing: mach5 tier1 > > Backout looks good. 
Thanks for review @coleenp ------------- PR Comment: https://git.openjdk.org/jdk/pull/23110#issuecomment-2590686426 From kbarrett at openjdk.org Tue Jan 14 17:46:49 2025 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Jan 2025 17:46:49 GMT Subject: Integrated: 8347720: [BACKOUT] Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION In-Reply-To: References: Message-ID: On Tue, 14 Jan 2025 16:37:47 GMT, Kim Barrett wrote: > Clean backout of JDK-8313396. > > This reverts commit e0f2f4b216bc9358caa65975204aee086e4fcbd2. > > Testing: mach5 tier1 This pull request has now been integrated. Changeset: db76f47f Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/db76f47f27c46ea89cd7c08b0de6d6fa032ffb4d Stats: 584 lines in 32 files changed: 56 ins; 407 del; 121 mod 8347720: [BACKOUT] Portable implementation of FORBID_C_FUNCTION and ALLOW_C_FUNCTION Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/23110 From psandoz at openjdk.org Wed Jan 15 00:31:40 2025 From: psandoz at openjdk.org (Paul Sandoz) Date: Wed, 15 Jan 2025 00:31:40 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9] In-Reply-To: References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com> Message-ID: On Tue, 14 Jan 2025 13:07:27 GMT, Jatin Bhateja wrote: > Given the above observations passing 3 additional box arguments to intrinsic and returning a box value needs additional changes in the compiler while minor re-structuring in Java implementation packed with in the glue logic looks like a reasonable approach. Did you mean to say *no* additional changes in the compiler? Otherwise, if not what would those changes be? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1915791046 From galder at openjdk.org Fri Jan 17 15:31:46 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Fri, 17 Jan 2025 15:31:46 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v9] In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <5LxTINpzicabL2086ATQoudlqUMANni986510N1nk_k=.499ecef8-bb25-41de-91d6-b368888c90d2@github.com> Message-ID: On Tue, 14 Jan 2025 08:33:36 GMT, Emanuel Peter wrote: >> Galder Zamarre?o has updated the pull request incrementally with one additional commit since the last revision: >> >> Make sure it runs with cpus with either avx512 or asimd > > test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 107: > >> 105: int aboveCount, abovePercent; >> 106: >> 107: // Iterate until you find a set that matches the requirement probability > > Can you give a high-level definition / explanation what this does? > Also: what is the expected number of rounds you iterate here? I'm asking because I would like to be sure that a timeout is basically impossible because the probability is too low. Sure I'll add. It's an approximation to make it run fast as sizes increase. In the worst case I've seen it take 15 rounds when size was 100, 50% probability and got 50 below max and 50 above. But with bigger array sizes, say 10'000, and 50% probability aim, it can take 1 or 2 rounds ending up with 5027 above max, 4973 below max. 
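
For readers following along, the retry loop described above can be sketched roughly as follows. This is a hypothetical reconstruction and not the code in `MinMaxRed_Long.java`: the class name, `countNewMax`, the `maxRounds` cap and the use of `Math.round` for the percentage are all assumptions made for illustration. Each round draws a fresh data set, counts how many elements exceed the running maximum, and accepts the set once the rounded percentage equals the requested probability.

```java
import java.util.Random;

// Hypothetical sketch of probability-driven test data, not the actual jtreg test.
class ReductionInitSketch {

    // Counts how often `v > max` is true in a running-max reduction starting at 0.
    static int countNewMax(long[] values) {
        long max = 0;
        int taken = 0;
        for (long v : values) {
            if (v > max) {
                max = v;
                taken++;
            }
        }
        return taken;
    }

    // Fills `values` so that roughly `probability` percent of the elements are a
    // new running maximum. Returns the number of rounds that were needed.
    // `maxRounds` is an assumed safety cap so the loop cannot spin forever.
    static int reductionInit(long[] values, int probability, Random random, int maxRounds) {
        for (int round = 1; round <= maxRounds; round++) {
            long max = 0;
            int aboveCount = 0;
            for (int i = 0; i < values.length; i++) {
                if (random.nextInt(100) < probability) {
                    // New maximum: increment by at least 1.
                    max += random.nextInt(10) + 1;
                    values[i] = max;
                    aboveCount++;
                } else {
                    // Stay below the current maximum by at least 1.
                    long diffToMax = random.nextInt(10) + 1;
                    values[i] = max - diffToMax;
                }
            }
            int abovePercent = Math.round(100.0f * aboveCount / values.length);
            if (abovePercent == probability) {
                return round;
            }
        }
        throw new IllegalStateException("no matching data set after " + maxRounds + " rounds");
    }

    public static void main(String[] args) {
        long[] values = new long[10_000];
        int rounds = reductionInit(values, 50, new Random(), 1_000);
        System.out.println("rounds=" + rounds + ", new-max branches=" + countNewMax(values));
    }
}
```

With a cap like `maxRounds`, an unlucky run would fail fast with an exception instead of timing out, which is one possible way to address the timeout concern raised in the review.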
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1920359437

From galder at openjdk.org Fri Jan 17 15:48:43 2025
From: galder at openjdk.org (Galder Zamarreño)
Date: Fri, 17 Jan 2025 15:48:43 GMT
Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
In-Reply-To:
References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com>
Message-ID: <3rstCr_f43zEJRn6I1q5D-oUKH84wNWO4h_I6ZrOGc4=.014f8a93-2471-45d0-9634-73dbd086c428@github.com>

On Tue, 14 Jan 2025 05:07:55 GMT, Galder Zamarreño wrote:

>> @galderz So you want me to review again?
>
> @eme64 I've fixed the test issue, it's ready to be reviewed

> @galderz I don't remember from above, but did you ever run the Long Min/Max benchmarks from this? https://github.com/openjdk/jdk/pull/21032 Would just be nice to see that they have an improvement after this change :)

Looking at the benchmark, the arrays are loaded with random data with no control over which branch side will be taken. So there are no guarantees that you will see an improvement, for the reasons I explained in https://github.com/openjdk/jdk/pull/20098#issuecomment-2379386872. To summarise what was observed there:

* In AVX-512 you will only see an improvement when one of the min/max branches is taken ~100% of the time.
* In non-AVX-512 this patch will create a regression when one of the min/max branches is taken ~100% of the time.

If it helps I'm happy to document this in detail in the `MinMaxVector` benchmark added here.

I would expect a similar thing to happen when it comes to asimd envs with max vector size >= 32 (e.g. Graviton 3). Those will see vectorization occur and improvements kick in at 100%. Other systems (e.g. Graviton 4) will see a regression at 100%. This means that your work in https://github.com/openjdk/jdk/pull/20098#discussion_r1901576209 to avoid the max vector size limitation might become more important once my PR here goes in.
I'm wondering if the long min/max benchmarks introduced in https://github.com/openjdk/jdk/pull/21032 should remain because their results are not predictable and that's not a good situation.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2598645987

From jbhateja at openjdk.org Fri Jan 17 16:02:55 2025
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Fri, 17 Jan 2025 16:02:55 GMT
Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v9]
In-Reply-To:
References: <_SCKY9fuTqNDfR6K1y-FuMvursDMuOx39sKrXMj0Tdg=.225da2f1-fcdc-4418-a753-6d7404b4a83e@github.com>
Message-ID:

On Wed, 15 Jan 2025 00:28:50 GMT, Paul Sandoz wrote:

>> Hi @PaulSandoz ,
>>
>> In above code snippet the return type 'short' of intrinsic call does not comply with the value being returned which is of box type, thereby mandating additional glue code.
>>
>> Regular primitive type boxing APIs are lazily intrinsified, thereby generating an intrinsifiable Call IR during parsing.
>> LoadNode's idealization can fetch a boxed value from the input of boxing call IR and directly forward it to users.
>>
>> Q1. What is the problem in directly passing Float16 boxes to FMA and SQRT intrinsic entry points?
>>
>> A. The compiler will have to unbox them before the actual operation. There are multiple schemes to perform unboxing, such as name-based, offset-based, and index-based field lookup.
>> Vector API unbox expansion uses an offset-based payload field lookup, for this it bookkeeps the payload's offset over runtime representation of VectorPayload class created as part of VM initialization.
>> However, VM can only bookkeep this information for classes that are part of java.base module, Float16 being part of incubation module cannot use offset-based field lookup. Thus only viable alternative is to unbox using field name/index based lookup.
>> For this compiler will first verify that the incoming oop is of Float16 type and then use a hardcoded name-based lookup to Load the field value. This looks fragile as it establishes an unwanted dependency b/w Float16 field names and compiler implementation, same applies to index-based lookup as index values are dependent on the combined field count of class and instance-specific fields, thus any addition or deletion of a class-level static helper field before the field of interest can invalidate any hardcoded index value used by the compiler.
>> All in all, for safe and reliable unboxing by compiler, it's necessary to create an upfront VM representation like vector_VectorPayload.
>>
>> Q2. What are the pros and cons of passing both the unboxed value and boxed values to the intrinsic entry point?
>> A.
>> Pros:
>> - This will save unsafe unboxing implementation if the holder class is not part of java.base module.
>> - We can leverage existing box intrinsification infrastructure which directly forwards the embedded values to its users.
>> - Also, it will minimize the changes in the Java side implementation.
>>
>> Cons:
>> - It's suboptimal in case the call is neither intrinsified or inlined, as it will add additional spills before the call.
>>
>> Q3. Primitive box class boxing API 'valueOf' accepts...
>
>> Given the above observations passing 3 additional box arguments to intrinsic and returning a box value needs additional changes in the compiler while minor re-structuring in Java implementation packed with in the glue logic looks like a reasonable approach.
>
> Did you mean to say *no* additional changes in the compiler? Otherwise, if not what would those changes be?

Hi @PaulSandoz ,

Many thanks for your suggestion. From a compiler standpoint, changes are mainly around unboxing/boxing arguments and return values.
Since the _Float16_ class is defined in the incubation module, we cannot refer to it in the wrapper class _Float16Math_ which is part of _java.base_ module. Thus, lambda must pass the boxes as _java.lang.Object_ type arguments will introduce additional type sharpening casts in lambda and defeat our purpose of preserving unmodified code in lambda. The current scheme only deals in intrinsic with short parameters and return values and is free from these concerns. The user may declare MUL and ADD as separate operations, and to honor JLS specification and precise floating-point model semantics we should not infer an FMA scalar operation, thus, intrinsification is appropriate here rather than a pattern match. With this incremental commit, I am trying to minimize the Java side changes. Let me know if this looks fine to you. Verified that auto-vectorization planned for follow-on patch is aligned with new changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1920398466 From jbhateja at openjdk.org Fri Jan 17 16:02:55 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 17 Jan 2025 16:02:55 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. 
New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Review suggestions incorporated. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/43aa3eb7..692de9c0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=09-10 Stats: 116 lines in 6 files changed: 67 ins; 17 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From galder at openjdk.org Fri Jan 17 17:45:25 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Fri, 17 Jan 2025 17:45:25 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v10] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g. 
> > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ... 
Galder Zamarreño has updated the pull request incrementally with two additional commits since the last revision: - Renaming methods and variables and add docu on algorithms - Fix copyright years ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/abbaf875..f83d8863 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=08-09 Stats: 38 lines in 3 files changed: 25 ins; 2 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From galder at openjdk.org Fri Jan 17 17:45:25 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Fri, 17 Jan 2025 17:45:25 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: <4NGZx_gqvc7xMcXCTef2c_ns-nMxznsB42NnlQJqX4Q=.8cdc00f9-c9a7-409a-b5c3-885d0677b952@github.com> On Tue, 14 Jan 2025 08:54:34 GMT, Emanuel Peter wrote: >> @eme64 I've fixed the test issue, it's ready to be reviewed > > @galderz I ran some testing on our side, feel free to ping me in 1-2 days for the results. @eme64 I've addressed all the comments. I've not run the `VectorReduction2` for the reasons explained in the previous comment. Happy to add more details to `MinMaxVector` if you feel it's necessary.
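For readers following the 8307513 thread, the loop shapes under discussion can be sketched roughly as below. This is only an illustration, not code from the patch or from the `ReductionPerf` / `MinMaxVector` tests; the class name `LongMinMaxSketch`, its method names and the `sampleInput` helper are hypothetical. The ternary form mirrors the Java implementation of `Math.max(long, long)` quoted in the PR description; whether a given loop actually vectorizes depends on the platform and JVM build, which the sketch does not check.

```java
// Hypothetical illustration of the loops discussed in the thread; not part of the patch.
final class LongMinMaxSketch {

    // Same shape as the Java implementation of Math.max(long, long)
    // quoted in the PR description: a compare plus a branch.
    static long maxTernary(long a, long b) {
        return (a >= b) ? a : b;
    }

    // Reduction loop using Math.max(long, long).
    static long maxReduction(long[] values) {
        long result = Long.MIN_VALUE;
        for (int i = 0; i < values.length; i++) {
            result = Math.max(result, values[i]);
        }
        return result;
    }

    // The same reduction written with the explicit ternary.
    static long maxReductionTernary(long[] values) {
        long result = Long.MIN_VALUE;
        for (int i = 0; i < values.length; i++) {
            result = maxTernary(result, values[i]);
        }
        return result;
    }

    // Element-wise (non-reduction) loop using Math.min(long, long).
    static long[] minElementWise(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = Math.min(a[i], b[i]);
        }
        return out;
    }

    // Deterministic pseudo-random input so repeated runs see the same values.
    static long[] sampleInput(int size, long seed) {
        long[] values = new long[size];
        long state = seed;
        for (int i = 0; i < size; i++) {
            state = state * 6364136223846793005L + 1442695040888963407L;
            values[i] = state;
        }
        return values;
    }

    public static void main(String[] args) {
        long[] data = sampleInput(1025, 42L);
        System.out.println(maxReduction(data) == maxReductionTernary(data));
    }
}
```

Both reductions compute the same value; the difference the thread cares about is only how C2 represents the comparison inside the loop.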
------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2598873155 From galder at openjdk.org Fri Jan 17 17:53:24 2025 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Fri, 17 Jan 2025 17:53:24 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v11] In-Reply-To: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> Message-ID: <6-Fgj-Lrd7GSpR0ZAi8YFlOZB12hCBB6p3oGZ1xodvA=.1ce2fa12-daff-4459-8fb8-1052acaf5639@github.com> > This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in order to help improve vectorization performance. > > Currently vectorization does not kick in for loops containing either of these calls because of the following error: > > > VLoop::check_preconditions: failed: control flow in loop not allowed > > > The control flow is due to the java implementation for these methods, e.g. > > > public static long max(long a, long b) { > return (a >= b) ? a : b; > } > > > This patch intrinsifies the calls to replace the CmpL + Bool nodes for MaxL/MinL nodes respectively. > By doing this, vectorization no longer finds the control flow and so it can carry out the vectorization. > E.g. > > > SuperWord::transform_loop: > Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined > 518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21) > > > Applying the same changes to `ReductionPerf` as in https://github.com/openjdk/jdk/pull/13056, we can compare the results before and after. 
Before the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1155 > long max 1173 > > > After the patch, on darwin/aarch64 (M1): > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java > 1 1 0 0 > ============================== > TEST SUCCESS > > long min 1042 > long max 1042 > > > This patch does not add an platform-specific backend implementations for the MaxL/MinL nodes. > Therefore, it still relies on the macro expansion to transform those into CMoveL. > > I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these results: > > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 2500 2500 0 0 >>> jtreg:test/jdk:tier1 ... 
Galder Zamarreño has updated the pull request incrementally with one additional commit since the last revision: Fix typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20098/files - new: https://git.openjdk.org/jdk/pull/20098/files/f83d8863..724a346a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20098.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098 PR: https://git.openjdk.org/jdk/pull/20098 From epeter at openjdk.org Mon Jan 20 08:03:44 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 20 Jan 2025 08:03:44 GMT Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) In-Reply-To: <4NGZx_gqvc7xMcXCTef2c_ns-nMxznsB42NnlQJqX4Q=.8cdc00f9-c9a7-409a-b5c3-885d0677b952@github.com> References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <4NGZx_gqvc7xMcXCTef2c_ns-nMxznsB42NnlQJqX4Q=.8cdc00f9-c9a7-409a-b5c3-885d0677b952@github.com> Message-ID: On Fri, 17 Jan 2025 17:41:47 GMT, Galder Zamarreño wrote: >> @galderz I ran some testing on our side, feel free to ping me in 1-2 days for the results. > > @eme64 I've addressed all the comments. I've not run the `VectorReduction2` for the reasons explained in the previous comment. Happy to add more details to `MinMaxVector` if you feel it's necessary. @galderz Ah, right. I understand about the branch probability. Hmm, maybe we should eventually change the `VectorReduction2` benchmark, or just remove the `min/max` benchmark there completely, as it depends on the random input values. Ah, though we have a fixed `seed`, so rerunning the benchmark would at least have consistent branching characteristics. So then it could make sense to run the benchmark, we just don't know the probability.
I mean I ran it before for the `int/float/double min/max`, and all of them see a solid speedup. So I would expect the same for `long`, it would be nice to at least see the numbers. You could extend your benchmark to `float / double` as well, to make it complete. But that could also be a follow-up RFE. >I would expect a similar thing to happen when it comes to asimd envs with max vector size >= 32 (e.g. Graviton 3). Those will see vectorization occur and improvements kick in at 100%. Other systems (e.g. Graviton 4) will see a regression at 100%. This means that your work in https://github.com/openjdk/jdk/pull/20098#discussion_r1901576209 to avoid the max vector size limitation might become more important once my PR here goes in. So are you saying there are machines where we are now getting some regressions with your patch (2-element cases)? It would be nice to see the numbers summarized here. I'm losing the overview a little over the 50+ messages now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2601678386 From epeter at openjdk.org Wed Jan 22 08:17:43 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 22 Jan 2025 08:17:43 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> References: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> Message-ID: On Mon, 16 Dec 2024 14:19:49 GMT, Jatin Bhateja wrote: >>> > Can you quickly summarize what tests you have, and what they test? >>> >>> Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it.
That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > >> > > Can you quickly summarize what tests you have, and what they test? >> > >> > >> > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > > > Validations details:- > > A) x86 backend changes > - new assembler instruction > - macro assembly routines. > Test point:- test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java > - This test is based on a testng framework and includes new DataProviders to generate test vectors. > - Test vectors cover the entire float16 value range and also special floating point values (NaN, +Int, -Inf, 0.0 and -0.0) > B) GVN transformations:- > - Value Transforms > Test point:- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers all the constant folding scenarios for add, sub, mul, div, sqrt, fma, min, and max operations addressed by this patch. > - It also tests special case scenarios for each operation as specified by Java language specification. > - identity Transforms > Test point:- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers identity transformation for ReinterpretS2HFNode, DivHFNode > - idealization Transforms > Test points:- test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java > :- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Contains test point for the following transform > MulHF idealization i.e. 
MulHF * 2 => AddHF > - Contains test point for the following transform > DivHF SRC , PoT(constant) => MulHF SRC * reciprocal (constant) > - Contains idealization test points for the following transform > ConvF2HF(FP32BinOp(ConvHF2F(x), ConvHF2F(y))) => > ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y))) @jatin-bhateja just ping me again whenever this is ready for a re-review :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2606558812 From epeter at openjdk.org Wed Jan 22 12:37:42 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 22 Jan 2025 12:37:42 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> References: <03ozC1NfpoBMN8fyLJY6gt2_7GZQpDtTHEj8cgxD_dU=.dd851537-820d-4b72-acf9-b170aa756e4b@github.com> Message-ID: On Mon, 16 Dec 2024 14:19:49 GMT, Jatin Bhateja wrote: >>> > Can you quickly summarize what tests you have, and what they test? >>> >>> Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > >> > > Can you quickly summarize what tests you have, and what they test? >> > >> > >> > Patch includes functional and performance tests, as per your suggestions IR framework-based tests now cover various special cases for constant folding transformation. Let me know if you see any gaps. >> >> I was hoping that you could make a list of all optimizations that are included here, and tell me where the tests are for it. 
That would significantly reduce the review time on my end. Otherwise I have to correlate everything myself, and that will take me hours. > > > Validations details:- > > A) x86 backend changes > - new assembler instruction > - macro assembly routines. > Test point:- test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java > - This test is based on a testng framework and includes new DataProviders to generate test vectors. > - Test vectors cover the entire float16 value range and also special floating point values (NaN, +Int, -Inf, 0.0 and -0.0) > B) GVN transformations:- > - Value Transforms > Test point:- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers all the constant folding scenarios for add, sub, mul, div, sqrt, fma, min, and max operations addressed by this patch. > - It also tests special case scenarios for each operation as specified by Java language specification. > - identity Transforms > Test point:- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Covers identity transformation for ReinterpretS2HFNode, DivHFNode > - idealization Transforms > Test points:- test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java > :- test test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java > - Contains test point for the following transform > MulHF idealization i.e. MulHF * 2 => AddHF > - Contains test point for the following transform > DivHF SRC , PoT(constant) => MulHF SRC * reciprocal (constant) > - Contains idealization test points for the following transform > ConvF2HF(FP32BinOp(ConvHF2F(x), ConvHF2F(y))) => > ReinterpretHF2S(FP16BinOp(ReinterpretS2HF(x), ReinterpretS2HF(y))) @jatin-bhateja can you also merge here, please? 
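The two arithmetic idealizations listed in the quoted validation details (`MulHF * 2 => AddHF`, and division by a power-of-two constant turned into multiplication by its reciprocal) rest on identities that hold bit-for-bit under IEEE 754 rounding. The sketch below checks those identities on plain `float` values rather than on `Float16`, since `Float16` lives in an incubator module; that substitution is an assumption made here for illustration, and the class and method names (`FloatIdealizationSketch`, `sameBits`, ...) are hypothetical. It says nothing about how C2 implements the transforms.

```java
// Hypothetical illustration using float; the patch itself operates on Float16 IR nodes.
final class FloatIdealizationSketch {

    // Bitwise comparison so that NaN and signed zeros are handled explicitly.
    // Float.floatToIntBits collapses all NaN payloads to one canonical NaN.
    static boolean sameBits(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    // x * 2 and x + x have the same exact mathematical value,
    // so they round to the same float.
    static boolean mulByTwoMatchesAdd(float x) {
        return sameBits(x * 2.0f, x + x);
    }

    // 4 is a power of two and 0.25 is exactly representable,
    // so x / 4 and x * 0.25 round the same exact value.
    static boolean divByFourMatchesMul(float x) {
        return sameBits(x / 4.0f, x * 0.25f);
    }

    public static void main(String[] args) {
        float[] samples = {
            1.5f, -0.0f, 0.0f, Float.NaN, Float.MAX_VALUE, Float.MIN_VALUE,
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY, 65504.0f
        };
        for (float x : samples) {
            System.out.println(x + " " + mulByTwoMatchesAdd(x) + " " + divByFourMatchesMul(x));
        }
    }
}
```

The reciprocal rewrite is only exact when the reciprocal of the constant is itself exactly representable, which is why the quoted description restricts it to power-of-two constants.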
------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2607132748 From mli at openjdk.org Thu Jan 23 08:16:52 2025 From: mli at openjdk.org (Hamlin Li) Date: Thu, 23 Jan 2025 08:16:52 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: On Fri, 17 Jan 2025 16:02:55 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review suggestions incorporated. src/hotspot/share/opto/library_call.cpp line 8670: > 8668: > 8669: const TypeInstPtr* box_type = _gvn.type(argument(0))->isa_instptr(); > 8670: if (box_type == nullptr || box_type->const_oop() == nullptr) { Hi, this is not a review comment. Just curious, to continue the following code path why does `box_type` must have a valid `const_oop`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1926540874 From jbhateja at openjdk.org Thu Jan 23 08:36:01 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 23 Jan 2025 08:36:01 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: <9kjqXKPElm3xOqpAi2u7FNf9G4ATlLNMc6Tf0genUwA=.6de34b54-b56d-4f15-91f6-90ef2d358476@github.com> On Thu, 23 Jan 2025 08:14:13 GMT, Hamlin Li wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review suggestions incorporated. > > src/hotspot/share/opto/library_call.cpp line 8670: > >> 8668: >> 8669: const TypeInstPtr* box_type = _gvn.type(argument(0))->isa_instptr(); >> 8670: if (box_type == nullptr || box_type->const_oop() == nullptr) { > > Hi, this is not a review comment. > Just curious, to continue the following code path why does `box_type` must have a valid `const_oop`? @Hamlin-Li , Class types are passed as constant oop, this check is added for argument validation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1926564701 From mli at openjdk.org Thu Jan 23 10:53:53 2025 From: mli at openjdk.org (Hamlin Li) Date: Thu, 23 Jan 2025 10:53:53 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: On Fri, 17 Jan 2025 16:02:55 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. 
>> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review suggestions incorporated. One minor comment about the test. test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java line 48: > 46: @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "f16c", "true"}, > 47: counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"}) > 48: @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "zvfh", "true"}, Is `"avx512_fp16", "false"` necessary at this line? ------------- PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2569517611 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1926770427 From mli at openjdk.org Thu Jan 23 10:57:49 2025 From: mli at openjdk.org (Hamlin Li) Date: Thu, 23 Jan 2025 10:57:49 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: <9kjqXKPElm3xOqpAi2u7FNf9G4ATlLNMc6Tf0genUwA=.6de34b54-b56d-4f15-91f6-90ef2d358476@github.com> References: <9kjqXKPElm3xOqpAi2u7FNf9G4ATlLNMc6Tf0genUwA=.6de34b54-b56d-4f15-91f6-90ef2d358476@github.com> Message-ID: On Thu, 23 Jan 2025 08:32:47 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/library_call.cpp line 8670: >> >>> 8668: >>> 8669: const TypeInstPtr* box_type = _gvn.type(argument(0))->isa_instptr(); >>> 8670: if (box_type == nullptr || box_type->const_oop() == nullptr) { >> >> Hi, this is not a review comment. >> Just curious, to continue the following code path why does `box_type` must have a valid `const_oop`? > > @Hamlin-Li , Class types are passed as constant oop, this check is added for argument validation. Thanks! Seems it could be an assert instead? Or maybe I could have misunderstood your above explanation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1926780216 From jbhateja at openjdk.org Thu Jan 23 11:01:49 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 23 Jan 2025 11:01:49 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: <9kjqXKPElm3xOqpAi2u7FNf9G4ATlLNMc6Tf0genUwA=.6de34b54-b56d-4f15-91f6-90ef2d358476@github.com> Message-ID: On Thu, 23 Jan 2025 10:54:54 GMT, Hamlin Li wrote: >> @Hamlin-Li , Class types are passed as constant oop, this check is added for argument validation. > > Thanks! > Seems it could be an assert instead? Or maybe I could have misunderstood your above explanation. Hi @Hamlin-Li, We intend to disable intrinsification if constraints are not met. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1926786421 From qamai at openjdk.org Thu Jan 23 17:28:27 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 23 Jan 2025 17:28:27 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode Message-ID: Hi, This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is: // We are allowed to use the constant type only if cast succeeded But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`. Please take a look and leave your reviews, thanks a lot. 
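As background for the 8348411 discussion: the check compiled by `Parse::array_store_check()` corresponds to the Java-level rule that a reference stored into an array must be assignable to the array's runtime element type, otherwise `ArrayStoreException` is thrown. The sketch below only demonstrates that observable behaviour; it does not show the C2 IR or the `LoadKlassNode` change, and the class name `ArrayStoreCheckSketch` and its method are hypothetical.

```java
// Hypothetical illustration of the Java-level semantics behind the array store check.
final class ArrayStoreCheckSketch {

    // Attempts the store and reports whether the runtime check rejected it.
    // The static type of 'array' is Object[], so the check needs the
    // array's runtime element type.
    static boolean storeRejected(Object[] array, Object value) {
        try {
            array[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeRejected(new String[1], Integer.valueOf(1)));
        System.out.println(storeRejected(new String[1], "ok"));
        System.out.println(storeRejected(new Object[1], Integer.valueOf(1)));
    }
}
```

Storing `null` always passes the check, and a store into an `Object[]` instance can never fail it.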
------------- Commit messages: - remove control input of LoadKlassNode Changes: https://git.openjdk.org/jdk/pull/23274/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348411 Stats: 46 lines in 10 files changed: 5 ins; 14 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/23274.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23274/head:pull/23274 PR: https://git.openjdk.org/jdk/pull/23274 From vlivanov at openjdk.org Thu Jan 23 18:31:48 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 23 Jan 2025 18:31:48 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 17:22:02 GMT, Quan Anh Mai wrote: > Hi, > > This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is: > > // We are allowed to use the constant type only if cast succeeded > > But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`. > > Please take a look and leave your reviews, thanks a lot. src/hotspot/share/opto/parseHelper.cpp line 229: > 227: int element_klass_offset = in_bytes(ObjArrayKlass::element_klass_offset()); > 228: Node* p2 = basic_plus_adr(array_klass, array_klass, element_klass_offset); > 229: Node* a_e_klass = _gvn.transform(LoadKlassNode::make(_gvn, immutable_memory(), p2, tak)); It looks like you are reverting the fix for JDK-8057622 here. How is it intended to work now? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1927470735 From qamai at openjdk.org Thu Jan 23 19:08:25 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 23 Jan 2025 19:08:25 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2] In-Reply-To: References: Message-ID: <-gX_tt4TqHEDFRKAEG7-UWUbtnukhCkjpHV03wWK4Xc=.803af5fc-8b8f-44a0-9cc8-2b10f0cf2002@github.com> > Hi, > > This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is: > > // We are allowed to use the constant type only if cast succeeded > > But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`. > > Please take a look and leave your reviews, thanks a lot. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: remove always_see_exact_class ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23274/files - new: https://git.openjdk.org/jdk/pull/23274/files/79cf1990..ff924dea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=00-01 Stats: 34 lines in 1 file changed: 0 ins; 7 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/23274.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23274/head:pull/23274 PR: https://git.openjdk.org/jdk/pull/23274 From qamai at openjdk.org Thu Jan 23 19:17:50 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 23 Jan 2025 19:17:50 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2] In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 18:28:56 GMT, Vladimir Ivanov wrote: >> Quan Anh Mai has updated the pull request 
incrementally with one additional commit since the last revision:
>>
>>   remove always_see_exact_class
>
> src/hotspot/share/opto/parseHelper.cpp line 229:
>
>> 227:     int element_klass_offset = in_bytes(ObjArrayKlass::element_klass_offset());
>> 228:     Node* p2 = basic_plus_adr(array_klass, array_klass, element_klass_offset);
>> 229:     Node* a_e_klass = _gvn.transform(LoadKlassNode::make(_gvn, immutable_memory(), p2, tak));
>
> It looks like you are reverting the fix for JDK-8057622 here. How is it intended to work now?

Thanks for noticing. IIUC, the issue in JDK-8057622 is that we see that the bottom type of the array is `java.lang.Object`, and we then try to find the element type of `java.lang.Object`, which is an invalid operation. The fix for that issue seems to do two things. First, it avoids the optimistic check if the bottom type is `java.lang.Object`. Second, it loads the element type from the constant only if the type check succeeds. The second part seems unnecessary: if the bottom type is not `java.lang.Object`, then it must be an array type, so the load must succeed. I added `is_aryklassptr` there to ensure that we do not have a bogus constant klass pointer.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1927557841

From cslucas at openjdk.org Thu Jan 23 20:07:23 2025
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Thu, 23 Jan 2025 20:07:23 GMT
Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations
Message-ID:

Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompileBroker` but e.g. by the Truffle framework). On the other hand, if an nmethod that results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log). This pull request is intended to fix that.
The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 764 jvmci:3812 ??$ 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) 767 jvmci:4899 ??$ 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) 769 jvmci:4944 ??$ 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) ------------- Commit messages: - Print JVMCI hosted compilations during code installation. Changes: https://git.openjdk.org/jdk/pull/23278/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23278&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336760 Stats: 39 lines in 2 files changed: 32 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/23278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23278/head:pull/23278 PR: https://git.openjdk.org/jdk/pull/23278 From cslucas at openjdk.org Thu Jan 23 23:15:03 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 23 Jan 2025 23:15:03 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v2] In-Reply-To: References: Message-ID: > Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. > > This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 
> > > 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) > 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) > 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) > 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) Cesar Soares Lucas has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Print JVMCI hosted compilations during code installation. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/23278/files - new: https://git.openjdk.org/jdk/pull/23278/files/c672e4d8..92f15db7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23278&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23278&range=00-01 Stats: 36 lines in 2 files changed: 1 ins; 27 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/23278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23278/head:pull/23278 PR: https://git.openjdk.org/jdk/pull/23278 From cslucas at openjdk.org Thu Jan 23 23:28:01 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 23 Jan 2025 23:28:01 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: References: Message-ID: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> > Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. > > This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 
> > > 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) > 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) > 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) > 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix typo. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23278/files - new: https://git.openjdk.org/jdk/pull/23278/files/92f15db7..f49ca1ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23278&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23278&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/23278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23278/head:pull/23278 PR: https://git.openjdk.org/jdk/pull/23278 From kvn at openjdk.org Fri Jan 24 00:14:58 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 24 Jan 2025 00:14:58 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path Message-ID: C2's Escape Analysis does not recognize pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. 
I suggest adding a second `MergeMem` between the Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes the allocations.

I checked the generated assembly for the pinning code and it is the same as before. The only difference in the test is the eliminated allocations.

I moved the uncommon trap code up to avoid resetting memory - it is already done at the beginning of this intrinsic's code.

We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - it is used for other actions to prevent changing them to `Action_none`.

Tested tier1-5, hs-xcomp, hs-comp-stress
Added a new regression test based on the reproducer from the bug report.

-------------

Commit messages:
 - Remove trailing space
 - 8347997: assert(false) failed: EA: missing memory path

Changes: https://git.openjdk.org/jdk/pull/23284/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8347997
  Stats: 119 lines in 2 files changed: 104 ins; 13 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/23284.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23284/head:pull/23284

PR: https://git.openjdk.org/jdk/pull/23284

From dnsimon at openjdk.org Fri Jan 24 01:50:49 2025
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 24 Jan 2025 01:50:49 GMT
Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3]
In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com>
References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com>
Message-ID:

On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote:

>> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompileBroker` but e.g. by the Truffle framework).
On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. >> >> >> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) >> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) >> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) >> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo. Looks reasonable to me. ------------- Marked as reviewed by dnsimon (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23278#pullrequestreview-2571477934

From alanb at openjdk.org Fri Jan 24 07:42:51 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Fri, 24 Jan 2025 07:42:51 GMT
Subject: RFR: 8347997: assert(false) failed: EA: missing memory path
In-Reply-To:
References:
Message-ID:

On Fri, 24 Jan 2025 00:07:43 GMT, Vladimir Kozlov wrote:

> C2's Escape Analysis does not recognize the pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result, EA complains about a strange memory graph.
>
> I suggest adding a second `MergeMem` between the Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes the allocations.
>
> I checked the generated assembly for the pinning code and it is the same as before. The only difference in the test is the eliminated allocations.
>
> I moved the uncommon trap code up to avoid resetting memory - it is already done at the beginning of this intrinsic's code.
>
> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - it is used for other actions to prevent changing them to `Action_none`.
>
> Tested tier1-5, hs-xcomp, hs-comp-stress
> Added a new regression test based on the reproducer from the bug report.

If you want, the reproducer can be simplified to invoke pin/unpin directly, which avoids needing the method handle code to invoke them reflectively, e.g.

    import jdk.internal.vm.Continuation;

    static class FailsEA {
        final Object o;

        public FailsEA() throws Throwable {
            o = new Object();
            Continuation.pin();
            Continuation.unpin();
        }
    }

    static class Crashes {
        final Object o;

        public Crashes() throws Throwable {
            Continuation.pin();
            Continuation.unpin();
            o = new Object();
        }
    }

If you add `@modules java.base/jdk.internal.vm` to the test description then jtreg will compile and run the test with this package exported, so there is no need to open the package with the `@run` tag.
------------- PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2611857331 From jbhateja at openjdk.org Fri Jan 24 10:39:34 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 24 Jan 2025 10:39:34 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v12] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: - Rebasing to jdk mainline - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 - Refining IR match rule - Review suggestions incorporated. - Review comments resolutions - Updating copyright year of modified files. - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 - Review suggestions incorporated. - Review comments resolutions - Addressing review comments - ... and 4 more: https://git.openjdk.org/jdk/compare/4a375e5b...e0602c1d ------------- Changes: https://git.openjdk.org/jdk/pull/22754/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=11 Stats: 2864 lines in 56 files changed: 2797 ins; 0 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Fri Jan 24 10:39:34 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 24 Jan 2025 10:39:34 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: On Fri, 17 Jan 2025 16:02:55 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. 
>> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Review suggestions incorporated. > test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java Hi @eme64 , Rebased to the latest mainline code please proceed with test runs. 
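For readers following the quoted summary: the remark that the intrinsics receive unwrapped `short` arguments encoding IEEE 754 binary16 values can be made concrete with a small sketch. This is illustrative only and is not code from the patch; the class and method names (`Binary16Sketch`, `halfBitsToFloat`) are hypothetical. It decodes the 1 sign bit, 5 exponent bits (bias 15) and 10 fraction bits carried in a Java `short`. On JDK 20 and later, `Float.float16ToFloat(short)` should provide the same conversion in the library.

```java
// Hypothetical illustration, not part of PR 22754: decode binary16 bits held in a short.
final class Binary16Sketch {

    // Layout assumed here is IEEE 754 binary16: 1 sign bit, 5 exponent bits, 10 fraction bits.
    static float halfBitsToFloat(short bits) {
        int h = bits & 0xFFFF;          // treat the short as an unsigned 16-bit pattern
        int sign = (h >>> 15) & 0x1;
        int exp = (h >>> 10) & 0x1F;
        int frac = h & 0x3FF;

        float magnitude;
        if (exp == 0) {
            // Zero or subnormal: value is frac * 2^-24.
            magnitude = frac * 0x1.0p-24f;
        } else if (exp == 0x1F) {
            // All-ones exponent: infinity when the fraction is zero, NaN otherwise.
            magnitude = (frac == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal: (1 + frac / 2^10) * 2^(exp - 15).
            magnitude = Math.scalb(1.0f + frac / 1024.0f, exp - 15);
        }
        return (sign == 0) ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        // 0x3C00 is the binary16 encoding of 1.0.
        System.out.println(halfBitsToFloat((short) 0x3C00));
    }
}
```

Every binary16 value is exactly representable as a `float`, so the decode above is lossless; that is also why a conversion out of the `short` form followed immediately by the reverse conversion carries no information and is a natural candidate for the chain-folding idealization mentioned in item 8.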
------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2612203710 From kvn at openjdk.org Fri Jan 24 16:56:45 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 24 Jan 2025 16:56:45 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote: >> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 
>> >> >> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) >> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) >> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) >> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo. What about `-XX:+PrintCompilation2` output? And `-XX:+CITime`? Are these flags applicable for "hosted" compilation? ------------- PR Review: https://git.openjdk.org/jdk/pull/23278#pullrequestreview-2573149899 From kvn at openjdk.org Fri Jan 24 18:02:15 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 24 Jan 2025 18:02:15 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v2] In-Reply-To: References: Message-ID: > C2's Escape Analysis does not recognize pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. > > I suggest to add second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognize such patter and removes allocations. > > I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. 
> > I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. > > We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. > > Tested tier1-5, hs-xcomp, hs-comp-stress > Added new regression test based on reproducer from bug report. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23284/files - new: https://git.openjdk.org/jdk/pull/23284/files/dd1b1b9e..fad42ecb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=00-01 Stats: 22 lines in 1 file changed: 0 ins; 15 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/23284.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23284/head:pull/23284 PR: https://git.openjdk.org/jdk/pull/23284 From kvn at openjdk.org Fri Jan 24 18:02:15 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 24 Jan 2025 18:02:15 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path In-Reply-To: References: Message-ID: <7bPzRL0zhNeHdBoAymANldp5ypmJ6Ix0iyaTKJR3dsw=.2195ee63-9397-4610-83c8-11ac312c5cc2@github.com> On Fri, 24 Jan 2025 07:39:48 GMT, Alan Bateman wrote: >> C2's Escape Analysis does not recognize pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. >> >> I suggest to add second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognize such patter and removes allocations. >> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. 
>> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. >> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > If you want, the reproducer can be simplified to just invoke pin/unpin, avoids needing the method handle code to invoke them reflectively, e.g. > > > import jdk.internal.vm.Continuation; > > static class FailsEA { > final Object o; > > public FailsEA() throws Throwable { > o = new Object(); > Continuation.pin(); > Continuation.unpin(); > } > } > > static class Crashes { > final Object o; > > public Crashes() throws Throwable { > Continuation.pin(); > Continuation.unpin(); > o = new Object(); > } > } > > > If you add `@modules java.base/jdk.internal.vm` to the test description then jtreg will compile and run the test with this package exported, no need be open the package with the `@run` tag. Thank you @AlanBateman for suggestions about the test. I implemented them and verified that the test still catches the failure without fix. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2613094931 From alanb at openjdk.org Fri Jan 24 18:30:48 2025 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 24 Jan 2025 18:30:48 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path In-Reply-To: References: Message-ID: <2yIGskBqmb9THoxIGbfymSakjN864UaGmgAl2fsQzIU=.9ecf1b08-c820-42b2-a75c-22be5faf91d3@github.com> On Fri, 24 Jan 2025 07:39:48 GMT, Alan Bateman wrote: >> C2's Escape Analysis does not recognize pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. 
>> >> I suggest to add second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognize such patter and removes allocations. >> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. >> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. >> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > If you want, the reproducer can be simplified to just invoke pin/unpin, avoids needing the method handle code to invoke them reflectively, e.g. > > > import jdk.internal.vm.Continuation; > > static class FailsEA { > final Object o; > > public FailsEA() throws Throwable { > o = new Object(); > Continuation.pin(); > Continuation.unpin(); > } > } > > static class Crashes { > final Object o; > > public Crashes() throws Throwable { > Continuation.pin(); > Continuation.unpin(); > o = new Object(); > } > } > > > If you add `@modules java.base/jdk.internal.vm` to the test description then jtreg will compile and run the test with this package exported, no need be open the package with the `@run` tag. > Thank you @AlanBateman for suggestions about the test. I implemented them and verified that the test still catches the failure without fix. Good, much simpler. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2613144898 From cslucas at openjdk.org Sat Jan 25 18:11:46 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Sat, 25 Jan 2025 18:11:46 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Fri, 24 Jan 2025 16:54:34 GMT, Vladimir Kozlov wrote: > What about `-XX:+PrintCompilation2` output? And `-XX:+CITime`? Are these flags applicable for "hosted" compilation? I created another RFE to evaluate if we need that and if so work on it: https://bugs.openjdk.org/browse/JDK-8348621 ------------- PR Comment: https://git.openjdk.org/jdk/pull/23278#issuecomment-2614051732 From epeter at openjdk.org Mon Jan 27 06:52:06 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 27 Jan 2025 06:52:06 GMT Subject: RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory Message-ID: Note: the approach with Predicates and Multiversioning prepares us well for Runtime Checks for Aliasing Analysis, see more below. **Background** With `-XX:+AlignVector`, all vector loads/stores must be aligned. We try to statically determine if we can always align the vectors. One condition is that the address `base` is already aligned. For arrays, we know that this always holds, because they are `ObjectAlignmentInBytes` aligned. But with native memory, the `base` is just some arbitrarily aligned pointer. **Problem** So far, we have just naively assumed that the `base` is always `ObjectAlignmentInBytes` aligned. But that does not hold for `native` memory segments: the `base` can also be unaligned. I had constructed such an example, and with `-XX:+AlignVector -XX:+VerifyAlignVector` this example hits the verification code. 
    MemorySegment nativeAligned = Arena.ofAuto().allocate(RANGE * 4 + 1);
    MemorySegment nativeUnaligned = nativeAligned.asSlice(1);
    test3(nativeUnaligned);

When compiling the test method, we assume that the `nativeUnaligned.address()` is aligned - but it is not!

    static void test3(MemorySegment ms) {
        for (int i = 0; i < RANGE; i++) {
            long adr = i * 4L;
            int v = ms.get(ELEMENT_LAYOUT, adr);
            ms.set(ELEMENT_LAYOUT, adr, (int)(v + 1));
        }
    }

**Solution: Runtime Checks - Predicate and Multiversioning**

Of course we could just forbid cases where we have a `native` base from vectorizing. But that would lead to regressions currently - in most cases we do get aligned `base`s, and we currently vectorize those. We cannot statically determine if the `base` is aligned, we need a runtime check.

I came up with 2 options where to place the runtime checks:
- A new "auto vectorization" Parse Predicate:
  - This only works when predicates are available.
  - If we fail the predicate, then we recompile without the predicate. That means we cannot add a check to the predicate any more, and we would have to do multiversioning at that point if we still want to have a vectorized loop.
- Multiversion the loop:
  - Create 2 copies of the loop (fast and slow loops).
  - The `fast_loop` can make speculative alignment assumptions, and add the corresponding check to the `multiversion_if` which decides which loop we take
  - In the `slow_loop`, we make no assumption which means we can not vectorize, but we still compile - so even unaligned `base`s would end up with reasonably fast code.
  - We "stall" the `slow_loop` from optimizing until we have fully vectorized the `fast_loop`, and know that we actually are adding runtime checks to the `multiversion_if`, and we really need the `slow_loop`.

Hence, the goal is that we compile like this:
- First with predicate: if we are lucky we never see an unaligned `base`.
- If we fail the check at the predicate: deopt, next time do not use the predicate for that loop.
- When we recompile, we find no predicate, and instead multiversion the loop, so that we can compile both for aligned (vectorize) and unaligned (not vectorize) `base`. **Future Work: Runtime Check for Aliasing Analysis** See: [JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751): C2 SuperWord: Aliasing Analysis runtime check This whole infrastructure with "auto vectorization" Parse Predicate and Multiversioning can be used when we implement Runtime Checks for Aliasing Analysis: We speculate that there is no aliasing. If the runtime check fails, we deopt at the predicate, or take the `slow_loop` for Multiversioning. ------------- Commit messages: - remove multiversion mark if we break the structure - register opaque with igvn - copyright and rm CFG check - IR rules for all cases - 3 test versions - test changed to unaligned ints - stub for slicing - add Verify/AlignVector runs to test - refactor verify - rm TODO - ... and 52 more: https://git.openjdk.org/jdk/compare/16dcf15a...c53985f6 Changes: https://git.openjdk.org/jdk/pull/22016/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=22016&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323582 Stats: 1074 lines in 27 files changed: 951 ins; 28 del; 95 mod Patch: https://git.openjdk.org/jdk/pull/22016.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22016/head:pull/22016 PR: https://git.openjdk.org/jdk/pull/22016 From epeter at openjdk.org Mon Jan 27 06:52:07 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 27 Jan 2025 06:52:07 GMT Subject: RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory In-Reply-To: References: Message-ID: On Mon, 11 Nov 2024 14:40:09 GMT, Emanuel Peter wrote: > Note: the approach with Predicates and Multiversioning prepares us well for Runtime Checks for Aliasing Analysis, see more below. > > **Background** > > With `-XX:+AlignVector`, all vector loads/stores must be aligned. 
We try to statically determine if we can always align the vectors. One condition is that the address `base` is already aligned. For arrays, we know that this always holds, because they are `ObjectAlignmentInBytes` aligned. But with native memory, the `base` is just some arbitrarily aligned pointer. > > **Problem** > > So far, we have just naively assumed that the `base` is always `ObjectAlignmentInBytes` aligned. But that does not hold for `native` memory segments: the `base` can also be unaligned. I had constructed such an example, and with `-XX:+AlignVector -XX:+VerifyAlignVector` this example hits the verification code. > > > MemorySegment nativeAligned = Arena.ofAuto().allocate(RANGE * 4 + 1); > MemorySegment nativeUnaligned = nativeAligned.asSlice(1); > test3(nativeUnaligned); > > > When compiling the test method, we assume that the `nativeUnaligned.address()` is aligned - but it is not! > > static void test3(MemorySegment ms) { > for (int i = 0; i < RANGE; i++) { > long adr = i * 4L; > int v = ms.get(ELEMENT_LAYOUT, adr); > ms.set(ELEMENT_LAYOUT, adr, (int)(v + 1)); > } > } > > > **Solution: Runtime Checks - Predicate and Multiversioning** > > Of course we could just forbid cases where we have a `native` base from vectorizing. But that would lead to regressions currently - in most cases we do get aligned `base`s, and we currently vectorize those. We cannot statically determine if the `base` is aligned, we need a runtime check. > > I came up with 2 options where to place the runtime checks: > - A new "auto vectorization" Parse Predicate: > - This only works when predicates are available. > - If we fail the predicate, then we recompile without the predicate. That means we cannot add a check to the predicate any more, and we would have to do multiversioning at that point if we still want to have a vectorized loop. > - Multiversion the loop: > - Create 2 copies of the loop (fast and slow loops). 
> - The `fast_loop` can make speculative alignment assumptions, and add the corresponding check to the `multiversion_if` which decides which loop we take > - In the `slow_loop`, we make no assumption which means we can not vectorize, but we still compile - so even unaligned `base`s would end up with reasonably fast code. > - We "stall" the `... src/hotspot/share/opto/loopnode.cpp line 5016: > 5014: tty->print_cr("PredicatesOff"); > 5015: } > 5016: C->set_major_progress(); fix indentation ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22016#discussion_r1850505281 From epeter at openjdk.org Mon Jan 27 07:32:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 27 Jan 2025 07:32:52 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 10:36:29 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review suggestions incorporated. > >> test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java > > Hi @eme64 , Rebased to the latest mainline code please proceed with test runs. @jatin-bhateja Thanks for the merge, testing is running! ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2615015991 From epeter at openjdk.org Mon Jan 27 07:45:54 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 27 Jan 2025 07:45:54 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v11] In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 10:36:29 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Review suggestions incorporated. 
> >> test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java > > Hi @eme64 , Rebased to the latest mainline code please proceed with test runs. @jatin-bhateja I'm getting a bad Copyright issue: [2025-01-27T07:30:55,551Z] BAD COPYRIGHT LINE: .../test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java [2025-01-27T07:30:55,552Z] 1 header format error(s). ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2615032561 From epeter at openjdk.org Mon Jan 27 07:45:57 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 27 Jan 2025 07:45:57 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v12] In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 10:39:34 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: > > - Rebasing to jdk mainline > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 > - Refining IR match rule > - Review suggestions incorporated. > - Review comments resolutions > - Updating copyright year of modified files. > - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 > - Review suggestions incorporated. > - Review comments resolutions > - Addressing review comments > - ... and 4 more: https://git.openjdk.org/jdk/compare/4a375e5b...e0602c1d test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java line 2: > 1: /* > 2: * Copyright (c) 2025, Oracle and/or its affiliates. All rights vectorReserved. Suggestion: * Copyright (c) 2025, Oracle and/or its affiliates. All rights Reserved. Looks like a "vector" snuck in there ? 
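As a small illustration of the quoted summary's point that Float16 values are stored as `short` bit patterns encoding IEEE 754 binary16 values, the sketch below shows two such encodings and checks that they survive a trip through `float`. It is only an assumption-laden sketch: it uses the public `Float.float16ToFloat` and `Float.floatToFloat16` conversions (available since JDK 20), which are value conversions and not the reinterpretation IR nodes from the patch, and the class and method names are made up for this example.

```java
// Sketch only: uses the public Float.float16ToFloat / Float.floatToFloat16
// conversions (JDK 20+), not the C2 IR nodes discussed in the patch.
public class Float16RoundTrip {

    // Widen binary16 bits to a float value, then narrow back to binary16 bits.
    // Every non-NaN binary16 value is exactly representable as a float,
    // so the bits are expected to come back unchanged.
    static short roundTrip(short fp16Bits) {
        float widened = Float.float16ToFloat(fp16Bits);
        return Float.floatToFloat16(widened);
    }

    public static void main(String[] args) {
        short one = (short) 0x3C00;      // binary16 encoding of 1.0
        short minusTwo = (short) 0xC000; // binary16 encoding of -2.0

        System.out.println(Float.float16ToFloat(one));
        System.out.println(Float.float16ToFloat(minusTwo));
        System.out.println(roundTrip(one) == one);
        System.out.println(roundTrip(minusTwo) == minusTwo);
    }
}
```

The `short` here plays the role of the storage type mentioned in item 7; the JIT-level movement between general purpose and floating point registers has no direct counterpart in this source-level sketch.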
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1930090362 From jbhateja at openjdk.org Mon Jan 27 08:35:44 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 27 Jan 2025 08:35:44 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v13] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. 
> > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Copyright header fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/e0602c1d..4f22ed85 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Mon Jan 27 08:35:47 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 27 Jan 2025 08:35:47 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v12] In-Reply-To: References: Message-ID: On Mon, 27 Jan 2025 07:42:48 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits: >> >> - Rebasing to jdk mainline >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 >> - Refining IR match rule >> - Review suggestions incorporated. >> - Review comments resolutions >> - Updating copyright year of modified files. >> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8342103 >> - Review suggestions incorporated. >> - Review comments resolutions >> - Addressing review comments >> - ... and 4 more: https://git.openjdk.org/jdk/compare/4a375e5b...e0602c1d > > test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java line 2: > >> 1: /* >> 2: * Copyright (c) 2025, Oracle and/or its affiliates. All rights vectorReserved. > > Suggestion: > > * Copyright (c) 2025, Oracle and/or its affiliates. All rights Reserved. > > Looks like a "vector" snuck in there ? 
DONE ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1930140699 From simonis at openjdk.org Mon Jan 27 15:55:46 2025 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 27 Jan 2025 15:55:46 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote: >> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 
>>
>>
>> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation)
>> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation)
>> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation)
>> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation)
>> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation)
>> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation)
>> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation)
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix typo.

Notice that `-XX:+CITime` is already aware of "hosted" JVMCI compilations. This is how the output looks if we only have "hosted" JVMCI compilations (i.e. `-XX:-UseJVMCICompiler`):

    JVMCI CompileBroker Time:
      Compile: 0,000 s
      Install Code: 0,000 s (installs: 0, CodeBlob total size: 0, CodeBlob code size: 0)
    JVMCI Hosted Time:
      Install Code: 0,153 s (installs: 28, CodeBlob total size: 120888, CodeBlob code size: 80840)

And this is how it looks with `-XX:+UseJVMCICompiler`:

    JVMCI CompileBroker Time:
      Compile: 21,054 s
      Install Code: 0,307 s (installs: 841, CodeBlob total size: 985832, CodeBlob code size: 673184)
    JVMCI Hosted Time:
      Install Code: 0,165 s (installs: 31, CodeBlob total size: 146368, CodeBlob code size: 94808)

The "hosted" section has no entry for the compilation time, but it still prints the number of compiled methods and their size.
------------- PR Comment: https://git.openjdk.org/jdk/pull/23278#issuecomment-2616137006 From simonis at openjdk.org Mon Jan 27 16:19:47 2025 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 27 Jan 2025 16:19:47 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote: >> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. 
>> >> >> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) >> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) >> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) >> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo. Can you please include `compileTask.hpp` where `CompileTask::print(..)` is declared? I understand that it somehow gets included implicitly already, but I'd like to make the dependency explicit. And also updated the copyright year please :) Otherwise looks good. ------------- Changes requested by simonis (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23278#pullrequestreview-2575892285 From kvn at openjdk.org Mon Jan 27 16:40:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 27 Jan 2025 16:40:46 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote: >> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. 
JVMCI compilations not triggered by the `CompilerBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log. >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. >> >> >> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) >> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) >> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) >> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo. Marked as reviewed by kvn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/23278#pullrequestreview-2575951742 From duke at openjdk.org Mon Jan 27 18:23:43 2025 From: duke at openjdk.org (Ferenc Rakoczi) Date: Mon, 27 Jan 2025 18:23:43 GMT Subject: RFR: 8348561: Add aarch64 intrinsics for ML-DSA Message-ID: By using the aarch64 vector registers the speed of the computation of the ML-DSA algorithms (key generation, document signing, signature verification) can be approximately doubled. ------------- Commit messages: - fixing whitespace errors - 8348561: Add aarch64 intrinsics for ML-DSA Changes: https://git.openjdk.org/jdk/pull/23300/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23300&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348561 Stats: 2040 lines in 18 files changed: 1987 ins; 4 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/23300.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23300/head:pull/23300 PR: https://git.openjdk.org/jdk/pull/23300 From cslucas at openjdk.org Mon Jan 27 18:39:46 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 27 Jan 2025 18:39:46 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Mon, 27 Jan 2025 16:16:58 GMT, Volker Simonis wrote: > And also updated the copyright year please :) I'm not sure I understand what you mean here? Shouldn't it be 2025? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23278#issuecomment-2616603684 From sviswanathan at openjdk.org Mon Jan 27 23:29:50 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 27 Jan 2025 23:29:50 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v13] In-Reply-To: References: Message-ID: On Mon, 27 Jan 2025 08:35:44 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. 
>> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Copyright header fix test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java line 2: > 1: /* > 2: * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. Year should be 2024, 2025, ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1931316524 From sviswanathan at openjdk.org Tue Jan 28 01:34:52 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 28 Jan 2025 01:34:52 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v13] In-Reply-To: References: Message-ID: On Mon, 27 Jan 2025 08:35:44 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Copyright header fix Some more minor comments. src/hotspot/share/opto/addnode.cpp line 1546: > 1544: > 1545: // As per IEEE 754 specification, floating point comparison consider +ve and -ve > 1546: // zeros as equals. Thus, performing signed integral comparison for max value Should be "min value detection". src/hotspot/share/opto/addnode.cpp line 1624: > 1622: // As per IEEE 754 specification, floating point comparison consider +ve and -ve > 1623: // zeros as equals. Thus, performing signed integral comparison for min value > 1624: // detection. Should be "max value detection". 
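The reasoning in the quoted `addnode.cpp` comments can be checked at the Java level. The sketch below is not the C2 code; it only demonstrates, with 32-bit `float` for convenience, why a floating point comparison cannot tell the two zeros apart while a signed comparison of the raw bit patterns can. The class and helper names are assumptions made for this example.

```java
public class SignedZeroOrdering {

    // Signed integer comparison of the raw IEEE 754 bit patterns.
    // Only meant for the two zeros here: -0.0f has just the sign bit set,
    // so its bits are Integer.MIN_VALUE, while +0.0f has all bits clear.
    static boolean rawBitsLess(float a, float b) {
        return Float.floatToRawIntBits(a) < Float.floatToRawIntBits(b);
    }

    public static void main(String[] args) {
        float posZero = 0.0f;
        float negZero = -0.0f;

        // Floating point comparison treats the two zeros as equal.
        System.out.println(posZero == negZero);

        // The signed comparison of the bit patterns orders -0.0 before +0.0.
        System.out.println(rawBitsLess(negZero, posZero));

        // Math.max and Math.min follow the same ordering of the zeros.
        System.out.println(Float.floatToRawIntBits(Math.max(negZero, posZero)) == 0);
        System.out.println(Float.floatToRawIntBits(Math.min(negZero, posZero)) == Integer.MIN_VALUE);
    }
}
```

All four lines print `true`, which matches the ordering that `Math.max` and `Math.min` are specified to respect for zeros of different sign.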
src/hotspot/share/opto/divnode.cpp line 848:

> 846: // If the dividend is a constant zero
> 847: // Note: if t1 and t2 are zero then result is NaN (JVMS page 213)
> 848: // Test TypeF::ZERO is not sufficient as it could be negative zero

Comment should be "TypeH::ZERO is not sufficient".

-------------

PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2576801437
PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1931347845
PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1931347430
PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1931376932

From galder at openjdk.org Tue Jan 28 06:10:49 2025
From: galder at openjdk.org (Galder Zamarreño)
Date: Tue, 28 Jan 2025 06:10:49 GMT
Subject: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
In-Reply-To:
References: <6uzJCMkW_tFnyxzMbFGYfs7p3mezuBhizHl9dkR1Jro=.2da99701-7b40-492f-b15a-ef1ff7530ef7@github.com> <4NGZx_gqvc7xMcXCTef2c_ns-nMxznsB42NnlQJqX4Q=.8cdc00f9-c9a7-409a-b5c3-885d0677b952@github.com>
Message-ID: <2Eoj5haSgq0ueNj4nGE-7nqR-4t0xhnmmAUAsFuNjck=.1e748c98-8c34-42c9-ae80-d460e41e3ba3@github.com>

On Mon, 20 Jan 2025 08:00:52 GMT, Emanuel Peter wrote:

>> @eme64 I've addressed all the comments. I've not run the `VectorReduction2` for the reasons explained in the previous comment. Happy to add more details to `MinMaxVector` if you feel it's necessary.
>
> @galderz Ah, right. I understand about the branch probability.
>
> Hmm, maybe we should eventually change the `VectorReduction2` benchmark, or just remove the `min/max` benchmark there completely, as it depends on the random input values.
>
> Ah, though we have a fixed `seed`, so rerunning the benchmark would at least have consistent branching characteristics. So then it could make sense to run the benchmark, we just don't know the probability. I mean I ran it before for the `int/float/double min/max`, and all of them see a solid speedup.
So I would expect the same for `long`, it would be nice to at least see the numbers. > > You could extend your benchmark to `float / double` as well, to make it complete. But that could also be a follow-up RFE. > >>I would expect a similar thing to happen when it comes to asimd envs with max vector size >= 32 (e.g. Graviton 3). Those will see vectorization occur and improvements kick in at 100%. Other systems (e.g. Graviton 4) will see a regression at 100%. This means that your work in https://github.com/openjdk/jdk/pull/20098#discussion_r1901576209 to avoid the max vector size limitation might become more important once my PR here goes in. > > So are you saying there are machines where we are now getting some regressions with your patch (2-element cases)? It would be nice to see the numbers summarized here. I'm losing the overview a little over the 50+ messages now ? @eme64 Fair points. I'll provide a detailed summary with some final numbers after FOSDEM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2617968437 From jbhateja at openjdk.org Tue Jan 28 06:26:11 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 28 Jan 2025 06:26:11 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. 
> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Updating typos in comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/4f22ed85..854fc73f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=12-13 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From thartmann at openjdk.org Tue Jan 28 06:33:01 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 06:33:01 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v2] In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 18:02:15 GMT, Vladimir Kozlov wrote: >> C2's Escape Analysis does not recognize 
pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. >> >> I suggest to add second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognize such patter and removes allocations. >> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. >> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. >> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update test Looks good to me otherwise. > The only difference in the test is eliminated allocations. Sounds like a good opportunity for an IR framework test but could be done as follow-up (starter) RFE. test/hotspot/jtreg/compiler/intrinsics/TestContinuationPinningAndEA.java line 27: > 25: * @test > 26: * @bug 8347997 > 27: * @summary Test that Cintinuation.pin() and unpin() intrinsics work with EA. Suggestion: * @summary Test that Continuation.pin() and unpin() intrinsics work with EA. test/hotspot/jtreg/compiler/intrinsics/TestContinuationPinningAndEA.java line 36: > 34: public class TestContinuationPinningAndEA { > 35: > 36: static class FailsEA { Should use 4-whitespace indentation since it's Java code. ------------- Marked as reviewed by thartmann (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23284#pullrequestreview-2577227944 PR Review Comment: https://git.openjdk.org/jdk/pull/23284#discussion_r1931596532 PR Review Comment: https://git.openjdk.org/jdk/pull/23284#discussion_r1931597080 From chagedorn at openjdk.org Tue Jan 28 06:52:51 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 28 Jan 2025 06:52:51 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v2] In-Reply-To: References: Message-ID: On Fri, 24 Jan 2025 18:02:15 GMT, Vladimir Kozlov wrote: >> C2's Escape Analysis does not recognize a pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result EA complains about a strange memory graph. >> >> I suggest adding a second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes allocations. >> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. >> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. >> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update test Otherwise, the fix looks good to me, too! src/hotspot/share/opto/library_call.cpp line 3775: > 3773: IfNode* iff_pin_count_over_underflow = create_and_map_if(control(), test_pin_count_over_underflow, PROB_MIN, COUNT_UNKNOWN); > 3774: > 3775: // True branch, pin count over/underflow. Maybe you can add a comment here about why we do the trap first (as described in the PR description). 
------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23284#pullrequestreview-2577234825 PR Review Comment: https://git.openjdk.org/jdk/pull/23284#discussion_r1931618277 From simonis at openjdk.org Tue Jan 28 07:55:51 2025 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 28 Jan 2025 07:55:51 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Mon, 27 Jan 2025 18:37:38 GMT, Cesar Soares Lucas wrote: > > And also updated the copyright year please :) > > I'm not sure I understand what you mean here? Shouldn't it be 2025? Sorry, I missed that the copyright was already updated this year. That happens if you don't pull every day :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/23278#issuecomment-2618147688 From thartmann at openjdk.org Tue Jan 28 15:40:28 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 15:40:28 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 Message-ID: We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). 
Now unfortunately, control via the layout helper check is not (yet) folded due to: https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. Big thanks to @cushon for reporting this just in time for fixing in JDK 24! Best regards, Tobias ------------- Commit messages: - Trailing ws - Fix - 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 Changes: https://git.openjdk.org/jdk/pull/23331/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23331&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348631 Stats: 77 lines in 4 files changed: 76 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23331.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23331/head:pull/23331 PR: https://git.openjdk.org/jdk/pull/23331 From epeter at openjdk.org Tue Jan 28 15:40:28 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 28 Jan 2025 15:40:28 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 13:10:37 GMT, Tobias Hartmann wrote: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > 
https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias Looks good. Thanks for the offline explanation about how you found the `failing` bailout cases you need to catch: you just hard-coded the bailout for all such intrinsics, and ran testing. I think that should be sufficient for now. ------------- Marked as reviewed by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/23331#pullrequestreview-2578658856 From kvn at openjdk.org Tue Jan 28 17:27:47 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 28 Jan 2025 17:27:47 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 13:10:37 GMT, Tobias Hartmann wrote: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias General question: how in other part of VM (runtime, gc) layout helper was changed for JDK-8297933? 
src/hotspot/share/opto/library_call.cpp line 4308: > 4306: if (!cast->is_top()) { > 4307: *obj = cast; > 4308: } Add comment why it could be TOP. ------------- PR Review: https://git.openjdk.org/jdk/pull/23331#pullrequestreview-2578931111 PR Review Comment: https://git.openjdk.org/jdk/pull/23331#discussion_r1932562269 From cushon at openjdk.org Tue Jan 28 17:42:46 2025 From: cushon at openjdk.org (Liam Miller-Cushon) Date: Tue, 28 Jan 2025 17:42:46 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 13:10:37 GMT, Tobias Hartmann wrote: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. 
I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias I tested these changes against the original issue that prompted JDK-8348631, and everything looks good. Thanks for the quick fix! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23331#issuecomment-2619665413 From kvn at openjdk.org Tue Jan 28 17:44:32 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 28 Jan 2025 17:44:32 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v3] In-Reply-To: References: Message-ID: > C2's Escape Analysis does not recognize a pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result EA complains about a strange memory graph. > > I suggest adding a second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes allocations. > > I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. > > I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. > > We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. > > Tested tier1-5, hs-xcomp, hs-comp-stress > Added new regression test based on reproducer from bug report. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update test/hotspot/jtreg/compiler/intrinsics/TestContinuationPinningAndEA.java Co-authored-by: Tobias Hartmann ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23284/files - new: https://git.openjdk.org/jdk/pull/23284/files/fad42ecb..fee3b721 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23284.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23284/head:pull/23284 PR: https://git.openjdk.org/jdk/pull/23284 From kvn at openjdk.org Tue Jan 28 18:02:37 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 28 Jan 2025 18:02:37 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v4] In-Reply-To: References: Message-ID: > C2's Escape Analysis does not recognize a pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result EA complains about a strange memory graph. > > I suggest adding a second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes allocations. > > I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. > > I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. > > We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. > > Tested tier1-5, hs-xcomp, hs-comp-stress > Added new regression test based on reproducer from bug report. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23284/files - new: https://git.openjdk.org/jdk/pull/23284/files/fee3b721..e502568c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23284&range=02-03 Stats: 40 lines in 2 files changed: 10 ins; 8 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/23284.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23284/head:pull/23284 PR: https://git.openjdk.org/jdk/pull/23284 From kvn at openjdk.org Tue Jan 28 18:06:47 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 28 Jan 2025 18:06:47 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v2] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 06:29:50 GMT, Tobias Hartmann wrote: > Sounds like a good opportunity for an IR framework test but could be done as follow-up (starter) RFE. I created [JDK-8348887](https://bugs.openjdk.org/browse/JDK-8348887) Thank you @TobiHartmann and @chhagedorn for reviews. I addressed your comments. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2619714077 PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2619716065 From duke at openjdk.org Tue Jan 28 18:38:52 2025 From: duke at openjdk.org (duke) Date: Tue, 28 Jan 2025 18:38:52 GMT Subject: RFR: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations [v3] In-Reply-To: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> References: <9UQRXClTNNnZ_nJx4lXKKbGArPRauMknoiasTPsULOo=.721379b8-cf86-4fcb-8a0b-65f08d617d9b@github.com> Message-ID: On Thu, 23 Jan 2025 23:28:01 GMT, Cesar Soares Lucas wrote: >> Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. 
JVMCI compilations not triggered by the `CompileBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log). >> >> This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. >> >> >> 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) >> 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) >> 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) >> 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) >> 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo. @JohnTortugo Your change (at version f49ca1ee56cda14ef53bbb137b0bdd27af0edce8) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23278#issuecomment-2619782232 From cslucas at openjdk.org Tue Jan 28 19:21:57 2025 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 28 Jan 2025 19:21:57 GMT Subject: Integrated: 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 20:01:23 GMT, Cesar Soares Lucas wrote: > Currently, `-XX:+PrintCompilation` does not print "hosted" JVMCI compilations (i.e. JVMCI compilations not triggered by the `CompileBroker` but e.g. by the Truffle framework). On the other hand, if such an nmethod which results from a "hosted" compilation gets deoptimized, it will be printed by `-XX:+PrintCompilation` (with a compilation ID that doesn't appear anywhere before in the compilation log). > > This pull request is intended to fix that. The snippet below is an example of output printed (with PrintCompilation and CIPrintCompilerName enabled) for Truffle hosted compilations using this patch. > > > 783 JVMCI:4667 4 com.oracle.truffle.runtime.OptimizedCallTarget::callBoundary (19 bytes) (hosted JVMCI compilation) > 785 JVMCI:5342 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::get (4 bytes) (hosted JVMCI compilation) > 786 JVMCI:5411 4 com.oracle.truffle.runtime.hotspot.HotSpotFastThreadLocal::set (5 bytes) (hosted JVMCI compilation) > 1582 JVMCI:10125 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1591 JVMCI:10899 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1652 JVMCI:11064 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) > 1656 JVMCI:11175 4 com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot (51 bytes) (hosted JVMCI compilation) This pull request has now been integrated. 
Changeset: c3c38887 Author: Cesar Soares Lucas Committer: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/c3c3888762712e455757e4a52de8d680d58b8883 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod 8336760: [JVMCI] -XX:+PrintCompilation should also print "hosted" JVMCI compilations Reviewed-by: dnsimon, kvn ------------- PR: https://git.openjdk.org/jdk/pull/23278 From thartmann at openjdk.org Tue Jan 28 20:41:07 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 20:41:07 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 [v2] In-Reply-To: References: Message-ID: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. 
I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Added comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23331/files - new: https://git.openjdk.org/jdk/pull/23331/files/4cf7f864..54fd3894 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23331&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23331&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23331.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23331/head:pull/23331 PR: https://git.openjdk.org/jdk/pull/23331 From thartmann at openjdk.org Tue Jan 28 20:41:07 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 20:41:07 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 13:10:37 GMT, Tobias Hartmann wrote: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). 
Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias Thanks for the reviews Emanuel and Vladimir! > General question: how in other part of VM (runtime, gc) layout helper was changed for JDK-8297933? @vnkozlov The layout helper was not changed but IIUC (@rwestrel, please correct me if I'm wrong), the type system has now enough information about interfaces that it can determine that casting an object implementing an interface to an array must be TOP. However, the layout helper check is not folded. > I tested these changes against the original issue that prompted JDK-8348631, and everything looks good. @cushon Thanks for checking! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/23331#issuecomment-2620005934 From thartmann at openjdk.org Tue Jan 28 20:41:07 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 20:41:07 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 [v2] In-Reply-To: References: Message-ID: <6suKgwihKrn1VcNRQkgEXLCSPlv7sSi1lXGhaLmDSq4=.01ef3098-3e58-4ee4-8051-87e65c196e22@github.com> On Tue, 28 Jan 2025 17:13:34 GMT, Vladimir Kozlov wrote: >> Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment > > src/hotspot/share/opto/library_call.cpp line 4308: > >> 4306: if (!cast->is_top()) { >> 4307: *obj = cast; >> 4308: } > > Add comment why it could be TOP. I added a comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23331#discussion_r1932825818 From thartmann at openjdk.org Tue Jan 28 20:43:46 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 28 Jan 2025 20:43:46 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v4] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 18:02:37 GMT, Vladimir Kozlov wrote: >> C2's Escape Analysis does not recognize a pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result EA complains about a strange memory graph. >> >> I suggest adding a second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes allocations. >> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. >> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. 
>> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments Marked as reviewed by thartmann (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/23284#pullrequestreview-2579370357 From vlivanov at openjdk.org Tue Jan 28 21:01:46 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 28 Jan 2025 21:01:46 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2] In-Reply-To: <-gX_tt4TqHEDFRKAEG7-UWUbtnukhCkjpHV03wWK4Xc=.803af5fc-8b8f-44a0-9cc8-2b10f0cf2002@github.com> References: <-gX_tt4TqHEDFRKAEG7-UWUbtnukhCkjpHV03wWK4Xc=.803af5fc-8b8f-44a0-9cc8-2b10f0cf2002@github.com> Message-ID: On Thu, 23 Jan 2025 19:08:25 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is: >> >> // We are allowed to use the constant type only if cast succeeded >> >> But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`. >> >> Please take a look and leave your reviews, thanks a lot. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > remove always_see_exact_class src/hotspot/share/opto/parseHelper.cpp line 168: > 166: // succeeds. 
> 167: if (MonomorphicArrayCheck && !too_many_traps(Deoptimization::Reason_array_check) && !tak->klass_is_exact() > 168: && tak != TypeInstKlassPtr::OBJECT) { I'd also turn `tak != TypeInstKlassPtr::OBJECT` into `tak->isa_aryklassptr()` to stress the intention. (Please, keep the original formatting here.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1932847737 From vlivanov at openjdk.org Tue Jan 28 21:01:47 2025 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 28 Jan 2025 21:01:47 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2] In-Reply-To: References: Message-ID: On Thu, 23 Jan 2025 19:14:42 GMT, Quan Anh Mai wrote: >> src/hotspot/share/opto/parseHelper.cpp line 229: >> >>> 227: int element_klass_offset = in_bytes(ObjArrayKlass::element_klass_offset()); >>> 228: Node* p2 = basic_plus_adr(array_klass, array_klass, element_klass_offset); >>> 229: Node* a_e_klass = _gvn.transform(LoadKlassNode::make(_gvn, immutable_memory(), p2, tak)); >> >> It looks like you are reverting the fix for JDK-8057622 here. How is it intended to work now? > > Thanks for noticing, IIUC the issue with JDK-8057622 is that we see that the bottom type of the array is `java.lang.Object`, we then try to find the element type of `java.lang.Object` which is an invalid operation. The fix for the issue seems to do 2 things. It avoids the optimistic check if the bottom type is `java.lang.Object`, then it loads the element type from the constant only if the type check succeeds. The second part seems unnecessary, if the bottom type is not `java.lang.Object`, then it must be an array type, so the load must succeed. I added `is_aryklassptr` there to ensure that we are not having a bogus constant klass pointer. Indeed, control does look redundant here. 
Once `array_klass` becomes a constant, the corresponding `LoadKlass` node should constant fold into a constant as well and the original control is gone anyway. (It's worth adding an assert here that `a_e_klass` is a constant when `array_klass` is a constant (and `StressReflectiveCode == false`). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1932843755 From kvn at openjdk.org Tue Jan 28 21:27:46 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 28 Jan 2025 21:27:46 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 [v2] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 20:41:07 GMT, Tobias Hartmann wrote: >> We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: >> https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 >> >> This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: >> https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 >> >> This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. 
>> >> We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. >> >> Big thanks to @cushon for reporting this just in time for fixing in JDK 24! >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Added comment Good ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23331#pullrequestreview-2579444898 From sviswanathan at openjdk.org Wed Jan 29 00:39:55 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 29 Jan 2025 00:39:55 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 06:26:11 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Updating typos in comments Some more minor comments below. Rest of the PR looks good to me. src/hotspot/share/opto/subnode.hpp line 549: > 547: SqrtHFNode(Compile* C, Node* c, Node* in1) : Node(c, in1) { > 548: init_flags(Flag_is_expensive); > 549: C->add_expensive_node(this); Do we need to set SqrtHF as expensive node? It translates to a single instruction. src/hotspot/share/opto/type.hpp line 2031: > 2029: > 2030: inline const TypeH* Type::is_half_float_constant() const { > 2031: assert( _base == HalfFloatCon, "Not a Float" ); Should be "Not a HalfFloat" here. test/hotspot/jtreg/compiler/c2/irTests/ConvF2HFIdealizationTests.java line 32: > 30: /* > 31: * @test > 32: * @bug 8338061 This should now refer to bug 8342103. test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java line 33: > 31: /* > 32: * @test > 33: * @bug 8336406 This should now refer to bug 8342103. test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 27: > 25: /** > 26: * @test > 27: * @bug 8308363 8336406 This should now refer to bug 8342103. 
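The reinterpretation folding described in item 8 of the quoted summary (HF2S + S2HF = HF) can be pictured with a toy model. This is only a sketch under stated assumptions, not the patch's code: `ToyNode`, `Op` and `fold_reinterpret_chain` are hypothetical names, and the real idealization works on C2's own node classes.

```cpp
#include <cassert>

// Toy IR, NOT C2's Node hierarchy: every name here is illustrative.
enum class Op { HalfFloatInput, ReinterpretHF2S, ReinterpretS2HF, AddHF };

struct ToyNode {
    Op op;
    const ToyNode* in1;  // first input, nullptr for leaf nodes
    const ToyNode* in2;  // second input, nullptr when unused
};

// Fold S2HF(HF2S(x)) to x; any other node is returned unchanged.
constexpr const ToyNode* fold_reinterpret_chain(const ToyNode* n) {
    return (n->op == Op::ReinterpretS2HF && n->in1 != nullptr &&
            n->in1->op == Op::ReinterpretHF2S && n->in1->in1 != nullptr)
               ? n->in1->in1
               : n;
}

// Builds sum -> HF2S -> S2HF and checks that the pair of moves disappears.
inline void fold_example() {
    const ToyNode a{Op::HalfFloatInput, nullptr, nullptr};
    const ToyNode b{Op::HalfFloatInput, nullptr, nullptr};
    const ToyNode sum{Op::AddHF, &a, &b};
    const ToyNode to_short{Op::ReinterpretHF2S, &sum, nullptr};
    const ToyNode back{Op::ReinterpretS2HF, &to_short, nullptr};
    assert(fold_reinterpret_chain(&back) == &sum);           // redundant pair removed
    assert(fold_reinterpret_chain(&to_short) == &to_short);  // a lone move is kept
}
```

In this model the result of one Float16 operation feeds the next one directly, so the value would not need to leave the floating point register between the two operations.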
------------- PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2579221309 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1932741039 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1932906990 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933045205 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933045649 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933045837 From jbhateja at openjdk.org Wed Jan 29 06:26:41 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 29 Jan 2025 06:26:41 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v15] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. 
Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Fixing a typo error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/854fc73f..19fc6c2d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=13-14 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From jbhateja at openjdk.org Wed Jan 29 06:26:43 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 29 Jan 2025 06:26:43 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14] In-Reply-To: References: Message-ID: <3qPSmi5JurXgnyLO6Vpr_GYzolS_gO25eiPCsohEglg=.3f240424-c59f-4105-89f8-7f3b56f7ddb0@github.com> On Wed, 29 Jan 2025 00:36:54 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Updating typos in comments > > Some more minor comments below. Rest of the PR looks good to me. Hi @sviswa7 , your comments have been addressed. 
> src/hotspot/share/opto/subnode.hpp line 549: > >> 547: SqrtHFNode(Compile* C, Node* c, Node* in1) : Node(c, in1) { >> 548: init_flags(Flag_is_expensive); >> 549: C->add_expensive_node(this); > > Do we need to set SqrtHF as expensive node? It translates to a single instruction. Its latency is around 15 cycles. Thus, GVN commoning, which leads to its placement onto a frequently executed control path, may be costly. In addition, it is also marked as an expensive node by ADLC to prevent rematerialization by the allocator. > src/hotspot/share/opto/type.hpp line 2031: > >> 2029: >> 2030: inline const TypeH* Type::is_half_float_constant() const { >> 2031: assert( _base == HalfFloatCon, "Not a Float" ); > > Should be "Not a HalfFloat" here. Fixed this typo error. > test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java line 27: > >> 25: /** >> 26: * @test >> 27: * @bug 8308363 8336406 > > This should now refer to bug 8342103. Test was added on lworld+fp16 branch and appropriately reflects the JBS entry. Same, applies to above two comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2620812050 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933322657 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933322637 PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1933322576 From thartmann at openjdk.org Wed Jan 29 07:18:53 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 29 Jan 2025 07:18:53 GMT Subject: RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 [v2] In-Reply-To: References: Message-ID: <76Jc4fmzh5oMJi4AmdblzeoFyBK5QHzJwZt58aOOUT4=.01549ec4-9578-45b9-a05b-fff0c06428b1@github.com> On Tue, 28 Jan 2025 20:41:07 GMT, Tobias Hartmann wrote: >> We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: >> https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 >> >> This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: >> https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 >> >> This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. 
>> >> We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. >> >> Big thanks to @cushon for reporting this just in time for fixing in JDK 24! >> >> Best regards, >> Tobias > > Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: > > Added comment Thanks again, Vladimir! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23331#issuecomment-2620866749 From thartmann at openjdk.org Wed Jan 29 07:18:54 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 29 Jan 2025 07:18:54 GMT Subject: Integrated: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 13:10:37 GMT, Tobias Hartmann wrote: > We crash / assert during C2 compilation of intrinsics like `_getLength` because the cast emitted by the array guard added by [JDK-8347006](https://bugs.openjdk.org/browse/JDK-8347006) is folded to top: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/library_call.cpp#L4302-L4305 > > This happens when C2's type system determines that the type of the object that we cast implements an interface other than `Serializable` or `Cloneable` and therefore can't be an array. This is possible since [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Now unfortunately, control via the layout helper check is not (yet) folded due to: > https://github.com/openjdk/jdk/blob/c33c1cfe7349ac657cd7bf54861227709d3c8f1b/src/hotspot/share/opto/memnode.cpp#L2215-L2223 > > This is probably an oversight from [JDK-8297933](https://bugs.openjdk.org/browse/JDK-8297933). Given that this is a regression in JDK 24, I'm going with a conservative approach of simply checking the cast for top and not using it if that's the case. 
In addition, I made the code more robust and added a compilation bailout (assert in debug) if an intrinsic produces a `top` result. > > We should then properly fix this by making sure that the layout helper check is folded. I filed [JDK-8348853](https://bugs.openjdk.org/browse/JDK-8348853) for this. > > Big thanks to @cushon for reporting this just in time for fixing in JDK 24! > > Best regards, > Tobias This pull request has now been integrated. Changeset: 55c3e78f Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/55c3e78f4ec982908e9a4b5e64b8be89717c49f4 Stats: 79 lines in 4 files changed: 78 ins; 0 del; 1 mod 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 Reviewed-by: kvn, epeter ------------- PR: https://git.openjdk.org/jdk/pull/23331 From thartmann at openjdk.org Wed Jan 29 07:23:59 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 29 Jan 2025 07:23:59 GMT Subject: [jdk24] RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 Message-ID: Hi all, This pull request contains a backport of commit [55c3e78f](https://github.com/openjdk/jdk/commit/55c3e78f4ec982908e9a4b5e64b8be89717c49f4) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Tobias Hartmann on 29 Jan 2025 and was reviewed by Vladimir Kozlov and Emanuel Peter. Thanks! 
------------- Commit messages: - Backport 55c3e78f4ec982908e9a4b5e64b8be89717c49f4 Changes: https://git.openjdk.org/jdk/pull/23346/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23346&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8348631 Stats: 79 lines in 4 files changed: 78 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/23346.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23346/head:pull/23346 PR: https://git.openjdk.org/jdk/pull/23346 From chagedorn at openjdk.org Wed Jan 29 07:36:00 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 29 Jan 2025 07:36:00 GMT Subject: [jdk24] RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Wed, 29 Jan 2025 07:19:27 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [55c3e78f](https://github.com/openjdk/jdk/commit/55c3e78f4ec982908e9a4b5e64b8be89717c49f4) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 29 Jan 2025 and was reviewed by Vladimir Kozlov and Emanuel Peter. > > Thanks! Looks good! ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23346#pullrequestreview-2580206866 From thartmann at openjdk.org Wed Jan 29 07:36:00 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 29 Jan 2025 07:36:00 GMT Subject: [jdk24] RFR: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Wed, 29 Jan 2025 07:19:27 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [55c3e78f](https://github.com/openjdk/jdk/commit/55c3e78f4ec982908e9a4b5e64b8be89717c49f4) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 29 Jan 2025 and was reviewed by Vladimir Kozlov and Emanuel Peter. 
> > Thanks! Thanks Christian! ------------- PR Comment: https://git.openjdk.org/jdk/pull/23346#issuecomment-2620894236 From thartmann at openjdk.org Wed Jan 29 07:36:01 2025 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 29 Jan 2025 07:36:01 GMT Subject: [jdk24] Integrated: 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 In-Reply-To: References: Message-ID: On Wed, 29 Jan 2025 07:19:27 GMT, Tobias Hartmann wrote: > Hi all, > > This pull request contains a backport of commit [55c3e78f](https://github.com/openjdk/jdk/commit/55c3e78f4ec982908e9a4b5e64b8be89717c49f4) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Tobias Hartmann on 29 Jan 2025 and was reviewed by Vladimir Kozlov and Emanuel Peter. > > Thanks! This pull request has now been integrated. Changeset: 2d473191 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/2d4731917dff510509eba0b8993a0a6e1dd91017 Stats: 79 lines in 4 files changed: 78 ins; 0 del; 1 mod 8348631: Crash in PredictedCallGenerator::generate after JDK-8347006 Reviewed-by: chagedorn Backport-of: 55c3e78f4ec982908e9a4b5e64b8be89717c49f4 ------------- PR: https://git.openjdk.org/jdk/pull/23346 From chagedorn at openjdk.org Wed Jan 29 08:18:54 2025 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Wed, 29 Jan 2025 08:18:54 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v4] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 18:02:37 GMT, Vladimir Kozlov wrote: >> C2's Escape Analysis does not recognize pattern where one input of memory `Phi` node is `MergeMem` node and an other is RAW store. This pattern is created by Continuation pinning intrinsic. As result EA complains about strange memory graph. >> >> I suggest to add second `MergeMem` between Store and Phi nodes by calling `reset_memory()`. EA recognize such patter and removes allocations. 
>> >> I checked generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. >> >> I moved Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. >> >> We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - It is used for other actions to prevent changing them to `Action_none`. >> >> Tested tier1-5, hs-xcomp, hs-comp-stress >> Added new regression test based on reproducer from bug report. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments Update looks good, thanks! ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/23284#pullrequestreview-2580289161 From sviswanathan at openjdk.org Wed Jan 29 17:06:53 2025 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 29 Jan 2025 17:06:53 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v15] In-Reply-To: References: Message-ID: <2mwDB7u60CMOcPcuaaWxn9XblgXILZtgOvNwQqQyBno=.0f042b54-c6b5-4bbe-bd7a-23792b7774f9@github.com> On Wed, 29 Jan 2025 06:26:41 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. 
New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Fixing a typo error Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22754#pullrequestreview-2581694270 From kvn at openjdk.org Wed Jan 29 17:26:53 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 29 Jan 2025 17:26:53 GMT Subject: RFR: 8347997: assert(false) failed: EA: missing memory path [v4] In-Reply-To: References: Message-ID: <04d9GIlRxMhuci0uqGJNUt8eFAj7Bm6EWIZG-UrrFic=.692701a5-c418-4966-bffd-c74641182e0e@github.com> On Tue, 28 Jan 2025 20:40:40 GMT, Tobias Hartmann wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comments > > Marked as reviewed by thartmann (Reviewer). 
Thank you, @TobiHartmann and @chhagedorn, for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23284#issuecomment-2622331272 From kvn at openjdk.org Wed Jan 29 17:26:54 2025 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 29 Jan 2025 17:26:54 GMT Subject: Integrated: 8347997: assert(false) failed: EA: missing memory path In-Reply-To: References: Message-ID: <2UV_kczyr9AsOO1M0Tiog1zv2phF_mSOY9V5ZgCTNEw=.d077042d-06cb-4b3d-bace-d96b5e610c8b@github.com> On Fri, 24 Jan 2025 00:07:43 GMT, Vladimir Kozlov wrote: > C2's Escape Analysis does not recognize the pattern where one input of a memory `Phi` node is a `MergeMem` node and another is a RAW store. This pattern is created by the Continuation pinning intrinsic. As a result, EA complains about a strange memory graph. > > I suggest adding a second `MergeMem` between the Store and Phi nodes by calling `reset_memory()`. EA recognizes such a pattern and removes allocations. > > I checked the generated assembler pinning code and it is the same as before. The only difference in the test is eliminated allocations. > > I moved the Uncommon code up to avoid resetting memory - it is already done at the beginning of this intrinsic code. > > We should not use `uncommon_trap_exact()` for `Deoptimization::Action_none` - it is used for other actions to prevent changing them to `Action_none`. > > Tested tier1-5, hs-xcomp, hs-comp-stress > Added a new regression test based on the reproducer from the bug report. This pull request has now been integrated.
Changeset: 6b581d22 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/6b581d22e13599b16b38aff1ca5a795c6a910d30 Stats: 106 lines in 2 files changed: 91 ins; 13 del; 2 mod 8347997: assert(false) failed: EA: missing memory path Reviewed-by: thartmann, chagedorn ------------- PR: https://git.openjdk.org/jdk/pull/23284 From qamai at openjdk.org Thu Jan 30 07:56:29 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 30 Jan 2025 07:56:29 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v3] In-Reply-To: References: Message-ID: > Hi, > > This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`; the reason given is: > > // We are allowed to use the constant type only if cast succeeded > > But this seems incorrect: the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleans up `LoadKlassNode::can_remove_control`. > > Please take a look and leave your reviews, thanks a lot.
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: clearer intention, revert formatting, add assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/23274/files - new: https://git.openjdk.org/jdk/pull/23274/files/ff924dea..c6761889 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=01-02 Stats: 22 lines in 1 file changed: 1 ins; 0 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/23274.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23274/head:pull/23274 PR: https://git.openjdk.org/jdk/pull/23274 From qamai at openjdk.org Thu Jan 30 07:56:29 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 30 Jan 2025 07:56:29 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2] In-Reply-To: References: <-gX_tt4TqHEDFRKAEG7-UWUbtnukhCkjpHV03wWK4Xc=.803af5fc-8b8f-44a0-9cc8-2b10f0cf2002@github.com> Message-ID: On Tue, 28 Jan 2025 20:56:30 GMT, Vladimir Ivanov wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> remove always_see_exact_class > > src/hotspot/share/opto/parseHelper.cpp line 168: > >> 166: // succeeds. >> 167: if (MonomorphicArrayCheck && !too_many_traps(Deoptimization::Reason_array_check) && !tak->klass_is_exact() >> 168: && tak != TypeInstKlassPtr::OBJECT) { > > I'd also turn `tak != TypeInstKlassPtr::OBJECT` into `tak->isa_aryklassptr()` to stress the intention. > (Please, keep the original formatting here.) Done, I assume you meant the formatting of the comment below, right? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1935156156 From qamai at openjdk.org Thu Jan 30 07:56:29 2025 From: qamai at openjdk.org (Quan Anh Mai) Date: Thu, 30 Jan 2025 07:56:29 GMT Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v3] In-Reply-To: References: Message-ID: On Tue, 28 Jan 2025 20:52:52 GMT, Vladimir Ivanov wrote: >> Thanks for noticing. IIUC, the issue with JDK-8057622 is that we see that the bottom type of the array is `java.lang.Object`, and we then try to find the element type of `java.lang.Object`, which is an invalid operation. The fix for the issue seems to do 2 things. It avoids the optimistic check if the bottom type is `java.lang.Object`, then it loads the element type from the constant only if the type check succeeds. The second part seems unnecessary: if the bottom type is not `java.lang.Object`, then it must be an array type, so the load must succeed. I added `is_aryklassptr` there to ensure that we do not have a bogus constant klass pointer. > > Indeed, control does look redundant here. Once `array_klass` becomes a constant, the corresponding `LoadKlass` node should constant fold into a constant as well and the original control is gone anyway. > > (It's worth adding an assert here that `a_e_klass` is a constant when `array_klass` is a constant (and `StressReflectiveCode == false`).) I assume that asserting `StressReflectiveCode || array_klass->is_Con() == a_e_klass->is_Con()` is correct.
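To make the shape of that assertion concrete, here is a small stand-alone sketch. It uses plain booleans in place of `StressReflectiveCode`, `array_klass->is_Con()` and `a_e_klass->is_Con()`; the function names `proposed_check` and `one_way_check` are made up for illustration and do not exist in HotSpot. It shows that `==` binds tighter than `||` in C++, and that the equality form is stricter than the one-way "constant array klass implies constant element klass" wording of the review comment.

```cpp
#include <cassert>

// Equality form, as written above. `==` binds tighter than `||`, so this
// parses as: stress || (array_is_con == elem_is_con).
constexpr bool proposed_check(bool stress, bool array_is_con, bool elem_is_con) {
    return stress || array_is_con == elem_is_con;
}

// One-way form: without stress, a constant array klass implies a constant
// element klass. Nothing is required when the array klass is not constant.
constexpr bool one_way_check(bool stress, bool array_is_con, bool elem_is_con) {
    return stress || !array_is_con || elem_is_con;
}

// Both forms accept "both constant" and reject "array constant, element not".
static_assert(proposed_check(false, true, true), "both constant");
static_assert(!proposed_check(false, true, false), "element klass not folded");
static_assert(!one_way_check(false, true, false), "element klass not folded");

// The stress flag disables both forms.
static_assert(proposed_check(true, true, false), "stress mode");

// They differ only when the element klass is constant but the array klass is not.
static_assert(!proposed_check(false, false, true), "equality form rejects");
static_assert(one_way_check(false, false, true), "one-way form accepts");
```

Whether the stricter equality form holds for every graph that reaches this point is a question for the actual code; the sketch only spells out what each expression checks.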
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1935155414 From epeter at openjdk.org Thu Jan 30 10:01:52 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 30 Jan 2025 10:01:52 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14] In-Reply-To: <3qPSmi5JurXgnyLO6Vpr_GYzolS_gO25eiPCsohEglg=.3f240424-c59f-4105-89f8-7f3b56f7ddb0@github.com> References: <3qPSmi5JurXgnyLO6Vpr_GYzolS_gO25eiPCsohEglg=.3f240424-c59f-4105-89f8-7f3b56f7ddb0@github.com> Message-ID: On Wed, 29 Jan 2025 06:23:25 GMT, Jatin Bhateja wrote: >> Some more minor comments below. Rest of the PR looks good to me. > > Hi @sviswa7 , your comments have been addressed. The updates look good, thanks @jatin-bhateja ! Running one more round of testing to verify copyright etc. ------------- PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2624028178 From jbhateja at openjdk.org Thu Jan 30 11:03:43 2025 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 30 Jan 2025 11:03:43 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v16] In-Reply-To: References: Message-ID: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. > 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. > 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. > - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. > 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. > 6. 
New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. > 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. > 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF > 9. X86 backend implementation for all supported intrinsics. > 10. Functional and Performance validation tests. > > Kindly review the patch and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Update test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java Co-authored-by: Emanuel Peter ------------- Changes: - all: https://git.openjdk.org/jdk/pull/22754/files - new: https://git.openjdk.org/jdk/pull/22754/files/19fc6c2d..8207c9ff Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22754&range=14-15 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22754.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22754/head:pull/22754 PR: https://git.openjdk.org/jdk/pull/22754 From epeter at openjdk.org Thu Jan 30 11:03:45 2025 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 30 Jan 2025 11:03:45 GMT Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v15] In-Reply-To: References: Message-ID: On Wed, 29 Jan 2025 06:26:41 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by [PR#22128](https://github.com/openjdk/jdk/pull/22128) 
>> >> Following is the summary of changes included with this patch:- >> >> 1. Detection of various Float16 operations through inline expansion or pattern folding idealizations. >> 2. Float16 operations like add, sub, mul, div, max, and min are inferred through pattern folding idealization. >> 3. Float16 SQRT and FMA operation are inferred through inline expansion and their corresponding entry points are defined in the newly added Float16Math class. >> - These intrinsics receive unwrapped short arguments encoding IEEE 754 binary16 values. >> 5. New specialized IR nodes for Float16 operations, associated idealizations, and constant folding routines. >> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please refer to [FAQs ](https://github.com/openjdk/jdk/pull/22754#issuecomment-2543982577)for more details. >> 7. Since Float16 uses short as its storage type, hence raw FP16 values are always loaded into general purpose register, but FP16 ISA generally operates over floating point registers, thus the compiler injects reinterpretation IR before and after Float16 operation nodes to move short value to floating point register and vice versa. >> 8. New idealization routines to optimize redundant reinterpretation chains. HF2S + S2HF = HF >> 9. X86 backend implementation for all supported intrinsics. >> 10. Functional and Performance validation tests. >> >> Kindly review the patch and share your feedback. 
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fixing a typo error

`[2025-01-30T10:01:50,247Z] BAD COPYRIGHT: ...test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java`

There is yet another `vector` that snuck in later:

    19c19
    < * Please contact Oracle, 500 Oracle Parkway, Redwood ShovectorRes, CA 94065 USA
    ---
    > * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA

test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java line 19:

> 17:  * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
> 18:  *
> 19:  * Please contact Oracle, 500 Oracle Parkway, Redwood ShovectorRes, CA 94065 USA

Suggestion:

     * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2624164650
PR Review Comment: https://git.openjdk.org/jdk/pull/22754#discussion_r1935412209

From epeter at openjdk.org Thu Jan 30 11:08:55 2025
From: epeter at openjdk.org (Emanuel Peter)
Date: Thu, 30 Jan 2025 11:08:55 GMT
Subject: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14]
In-Reply-To: <3qPSmi5JurXgnyLO6Vpr_GYzolS_gO25eiPCsohEglg=.3f240424-c59f-4105-89f8-7f3b56f7ddb0@github.com>
References: <3qPSmi5JurXgnyLO6Vpr_GYzolS_gO25eiPCsohEglg=.3f240424-c59f-4105-89f8-7f3b56f7ddb0@github.com>
Message-ID:

On Wed, 29 Jan 2025 06:23:25 GMT, Jatin Bhateja wrote:

>> Some more minor comments below. Rest of the PR looks good to me.
>
> Hi @sviswa7, your comments have been addressed.

@jatin-bhateja Launched testing again.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22754#issuecomment-2624183368

From duke at openjdk.org Thu Jan 30 16:14:27 2025
From: duke at openjdk.org (Ferenc Rakoczi)
Date: Thu, 30 Jan 2025 16:14:27 GMT
Subject: RFR: 8348561: Add aarch64 intrinsics for ML-DSA [v2]
In-Reply-To:
References:
Message-ID:

> By using the aarch64 vector registers the speed of the computation of the ML-DSA algorithms (key generation, document signing, signature verification) can be approximately doubled.

Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:

  Use SHA3Parallel for matrix generation

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23300/files
  - new: https://git.openjdk.org/jdk/pull/23300/files/eb7cc430..36151635

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23300&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23300&range=00-01

  Stats: 81 lines in 1 file changed: 49 ins; 5 del; 27 mod
  Patch: https://git.openjdk.org/jdk/pull/23300.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23300/head:pull/23300

PR: https://git.openjdk.org/jdk/pull/23300

From adinn at openjdk.org Thu Jan 30 16:26:46 2025
From: adinn at openjdk.org (Andrew Dinn)
Date: Thu, 30 Jan 2025 16:26:46 GMT
Subject: RFR: 8348561: Add aarch64 intrinsics for ML-DSA [v2]
In-Reply-To:
References:
Message-ID: <7UgNYEuTu6rj7queOgM9xIy-6kQMdACrZiDLtlniMYw=.dff6f18b-1236-43b1-8280-2bce9160f32a@github.com>

On Thu, 30 Jan 2025 16:14:27 GMT, Ferenc Rakoczi wrote:

>> By using the aarch64 vector registers the speed of the computation of the ML-DSA algorithms (key generation, document signing, signature verification) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
>
>   Use SHA3Parallel for matrix generation

@ferakocz I'm afraid you lucked out on getting your change committed before my reorganization of the stub generation code. If you are unsure of how to do the merge so your new stub is declared and generated following the new model (see the doc comments in stubDeclarations.hpp for details) let me know and I'll be happy to help you sort it out.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23300#issuecomment-2624962113

From vlivanov at openjdk.org Thu Jan 30 16:41:59 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 30 Jan 2025 16:41:59 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v2]
In-Reply-To:
References: <-gX_tt4TqHEDFRKAEG7-UWUbtnukhCkjpHV03wWK4Xc=.803af5fc-8b8f-44a0-9cc8-2b10f0cf2002@github.com>
Message-ID:

On Thu, 30 Jan 2025 07:53:33 GMT, Quan Anh Mai wrote:

>> src/hotspot/share/opto/parseHelper.cpp line 168:
>>
>>> 166:   // succeeds.
>>> 167:   if (MonomorphicArrayCheck && !too_many_traps(Deoptimization::Reason_array_check) && !tak->klass_is_exact()
>>> 168:       && tak != TypeInstKlassPtr::OBJECT) {
>>
>> I'd also turn `tak != TypeInstKlassPtr::OBJECT` into `tak->isa_aryklassptr()` to stress the intention.
>> (Please, keep the original formatting here.)
>
> Done, I assume you meant the formatting of the comment below, right?

No, I referred to the original shape of the condition (1 check per line). IMO it is easier to follow.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1935926644

From vlivanov at openjdk.org Thu Jan 30 16:41:58 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 30 Jan 2025 16:41:58 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 30 Jan 2025 07:56:29 GMT, Quan Anh Mai wrote:

>> Hi,
>>
>> This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is:
>>
>>     // We are allowed to use the constant type only if cast succeeded
>>
>> But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`.
>>
>> Please take a look and leave your reviews, thanks a lot.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
>   clearer intention, revert formatting, add assert

Looks good (w/ minor cleanup suggestions).

Testing (hs-tier1 - hs-tier4) results are clean.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23274#pullrequestreview-2584425369

From vlivanov at openjdk.org Thu Jan 30 16:42:00 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 30 Jan 2025 16:42:00 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 30 Jan 2025 07:52:45 GMT, Quan Anh Mai wrote:

>> Indeed, control does look redundant here. Once `array_klass` becomes a constant, the corresponding `LoadKlass` node should constant fold into a constant as well and the original control is gone anyway.
>>
>> (It's worth adding an assert here that `a_e_klass` is a constant when `array_klass` is a constant (and `StressReflectiveCode == false`).)
>
> I assume that asserting `StressReflectiveCode || array_klass->is_Con() == a_e_klass->is_Con()` is correct.

Yes, it looks good. I suggest inverting the order of the checks to make the intention clearer (`a_e_klass->is_Con() == array_klass->is_Con() || StressReflectiveCode`).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23274#discussion_r1935922289

From qamai at openjdk.org Thu Jan 30 17:11:08 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Thu, 30 Jan 2025 17:11:08 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v4]
In-Reply-To:
References:
Message-ID: <9IQfA04XvFFNu5R3LMh6dYWPAbf7D-VoAb2O0Gt0b8Q=.22b568c9-2a45-41b7-91d7-11c22bb6abe0@github.com>

> Hi,
>
> This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`. They can only have a control input if created inside `Parse::array_store_check()`, the reason given is:
>
>     // We are allowed to use the constant type only if cast succeeded
>
> But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`.
>
> Please take a look and leave your reviews, thanks a lot.

Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:

  format

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23274/files
  - new: https://git.openjdk.org/jdk/pull/23274/files/c6761889..175232a6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23274&range=02-03

  Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/23274.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23274/head:pull/23274

PR: https://git.openjdk.org/jdk/pull/23274

From qamai at openjdk.org Thu Jan 30 17:11:08 2025
From: qamai at openjdk.org (Quan Anh Mai)
Date: Thu, 30 Jan 2025 17:11:08 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 30 Jan 2025 16:39:22 GMT, Vladimir Ivanov wrote:

>> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   clearer intention, revert formatting, add assert
>
> Looks good (w/ minor cleanup suggestions).
>
> Testing (hs-tier1 - hs-tier4) results are clean.

@iwanowww Thanks a lot for your reviews and testing, I hope I have addressed all of your concerns.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23274#issuecomment-2625077117

From vlivanov at openjdk.org Thu Jan 30 18:26:52 2025
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Thu, 30 Jan 2025 18:26:52 GMT
Subject: RFR: 8348411: C2: Remove the control input of LoadKlassNode and LoadNKlassNode [v4]
In-Reply-To: <9IQfA04XvFFNu5R3LMh6dYWPAbf7D-VoAb2O0Gt0b8Q=.22b568c9-2a45-41b7-91d7-11c22bb6abe0@github.com>
References: <9IQfA04XvFFNu5R3LMh6dYWPAbf7D-VoAb2O0Gt0b8Q=.22b568c9-2a45-41b7-91d7-11c22bb6abe0@github.com>
Message-ID:

On Thu, 30 Jan 2025 17:11:08 GMT, Quan Anh Mai wrote:

>> Hi,
>>
>> This patch removes the control input of `LoadKlassNode` and `LoadNKlassNode`.
>> They can only have a control input if created inside `Parse::array_store_check()`, the reason given is:
>>
>>     // We are allowed to use the constant type only if cast succeeded
>>
>> But this seems incorrect, the load from the constant type can be done regardless, and it will be constant-folded. This patch only makes that more formal and cleanup `LoadKlassNode::can_remove_control`.
>>
>> Please take a look and leave your reviews, thanks a lot.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
>   format

Marked as reviewed by vlivanov (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/23274#pullrequestreview-2584713287
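
Editorial illustration (not part of the thread): the invariant agreed on in the 8348411 review is that `a_e_klass` and `array_klass` are either both constants or both non-constants, unless `StressReflectiveCode` is set. The sketch below models only that boolean shape, in the order the reviewer suggested. `FakeNode`, `klass_constness_agrees` and the plain `bool` standing in for the VM flag are hypothetical stand-ins, not HotSpot's real classes or the code that was committed.

```cpp
#include <cassert>

// Hypothetical stand-in for a C2 node; it only models "is this a constant?".
struct FakeNode {
  bool con;
  bool is_Con() const { return con; }
};

// Hypothetical stand-in for the StressReflectiveCode VM flag.
static bool StressReflectiveCode = false;

// Invariant in the suggested order: the constness comparison comes first and
// the stress-flag escape hatch last. `==` binds tighter than `||`, so no
// extra parentheses are needed.
bool klass_constness_agrees(const FakeNode& array_klass, const FakeNode& a_e_klass) {
  return a_e_klass.is_Con() == array_klass.is_Con() || StressReflectiveCode;
}
```

Both orderings of the `||` operands evaluate to the same value; putting the comparison first only makes the main condition easier to spot, which is the reason given in the review.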