From mseledtsov at openjdk.org Wed Feb 1 02:31:52 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 02:31:52 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v2] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: <67kznn0IPgH3_IYGjItwlol3GUcfcwscPskRIMNNTqc=.b57148e8-fb03-4bb0-9c00-c0626788caa7@github.com> On Tue, 31 Jan 2023 04:35:30 GMT, David Holmes wrote: >> Files containing redirected stdout and stderr will be in the jtreg working directory, usually JTwork/scratch/. >> I tried to keep the code as simple as possible, and found that redirecting the output of a child process to a file is the simplest option. Capturing child process output might involve quite a bit of code, pumping/draining the output to a buffer then printing the buffer out to the main logging method. E.g. see OutputAnalyzer. >> >> Also note that we cannot simply use a test utility library such as OutputAnalyzer. Every non-core library that VMProps depends on has to be declared in TEST.ROOT as: >> requires.extraPropDefns.libs = \ >> ../../lib/jdk/test/lib/Platform.java \ >> ../../lib/jdk/test/lib/Container.java >> >> I tried to add OutputAnalyzer here, but it requires lots of dependencies and the list grew large. >> >> Let me know if I should do this anyway even if it adds more code to VMProps. > > Granted creating the file is much simpler than capturing the child process output directly, but can we not read the file back in and then include it in the main log? I'm assuming the output of container-ps is going to be quite small. We could also choose to include it only if there appears to have been an error. > Though we could always expand on this later if the current simple approach doesn't suffice. Reading the file back into the main output sounds reasonable. Will update the code and retest.
------------- PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Wed Feb 1 02:36:33 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 02:36:33 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v3] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: <842OMUjSBTQeiXs2TPp52yPhkyiJMg6h1u4X9ZqCXj4=.c89fbfff-a93f-4d99-b166-68937b134a6e@github.com> > Add log() and logToFile() methods to VMProps for diagnostic purposes. Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Updated comment as per review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/881a547e..02d6833a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Wed Feb 1 03:47:14 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 03:47:14 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v4] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: > Add log() and logToFile() methods to VMProps for diagnostic purposes. 
Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Review feedback: printing stdout and stderr files to log() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/02d6833a..b1aa1c4b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=02-03 Stats: 33 lines in 1 file changed: 29 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From david.holmes at oracle.com Wed Feb 1 04:24:30 2023 From: david.holmes at oracle.com (David Holmes) Date: Wed, 1 Feb 2023 14:24:30 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> Message-ID: <637c4388-4c02-c654-6166-23491d797fce@oracle.com> On 1/02/2023 5:23 am, Ioi Lam wrote: > CC-ing hotspot-dev for wider discussion. > > On 1/31/2023 12:43 AM, Thomas Stüfe wrote: >> Hi Ioi, >> >> a short question. You did the JVM flag handling revamp, right? >> >> I recently ran into a weird problem where FLAG_SET_ERGO would not >> work, turned out it ends up using the constraint function, which >> silently refused to set the argument if the argument violates the >> constraint. >> >> That makes perfect sense, I had an error in my code. But if we set a >> flag with ERGO origin, we set it in the JVM, and the JVM should adhere >> to constraints, not doing so is a programming error and I would expect >> an assert here. The same logic may be applied to other origins too, >> e.g. DEFAULT. >> Do you think this makes sense? >> >> BTW I added debug output (setting verbose to true for ERGO), and I see >> other flag's ergonomics also getting ignored, which would have to be >> fixed too.
>> >> Cheers, Thomas >> > > > This problem is also reported in > https://bugs.openjdk.org/browse/JDK-8224980 > > There are proposals for making the debug message more obvious, and/or > assert(). > > Currently we don't have documentation of what the caller of > FLAG_SET_ERGO is supposed to do. > > There are 3 ways to set a flag. The current behavior of the following > two ways seems reasonable > > FLAG_SET_CMDLINE - rejects user input and exits the VM > FLAG_SET_MGMT - returns JVMFlag::Error > > With command-line, the user should inspect the limits for the current > environment and restart the app with an appropriate value > > With FLAG_SET_MGMT, I suppose it's similar -- whoever is using the > management API should detect the error and adjust their parameter values > (by hand, or programmatically) > > However, FLAG_SET_ERGO also returns JVMFlag::Error, but I don't see any > of our code checking for the return value. So the unwritten convention > is that the call must not fail. > However, if the specified value is out of range, should we abort the VM, > ignore the value, or cap the value? E.g., if the range is 0~100, but you > call SET_ERGO with 101, should we cap the value to 100? Generally, if our code tries to set a flag's value to something invalid, that is a programming error that should be reported right away. If the value is a "constant" then we should just assert that setting the value worked. If the value is determined by external factors that might vary at runtime (so no guarantee the end result will be valid) then we should have logic that can correct for the invalid value (or we should change the way we do the calculation). Cheers, David > Perhaps capping would make the ergo code easier to write, especially if > the constraint function produces a dynamic range (depending on the > machine's RAM size, for example). Without capping, you will need to > duplicate (or share) the range detection logic between the ergo setter > and the constraint checker.
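
The options weighed above (silently reject, assert, or cap) can be compared in a small standalone sketch. It is not HotSpot code: `Flag`, `Range`, `FlagError`, `set_checked`, `set_ergo_asserted`, `set_ergo_capped` and `demo_range` are hypothetical names invented for illustration, and the real `FLAG_SET_ERGO` and `JVMFlag` machinery is more involved. The range is supplied by a function so that it could be computed dynamically, as in the RAM-size example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the HotSpot flag API.
enum class FlagError { Success, OutOfBounds };

struct Range {
  int64_t min;
  int64_t max;
};

struct Flag {
  const char* name;
  int64_t value;
  Range (*range)();  // called on every set, so the range may be dynamic
};

// The 0~100 range used as the example in the thread.
Range demo_range() { return Range{0, 100}; }

// Behaviour described in the thread: a value that violates the constraint
// leaves the flag unchanged and only reports an error to the caller.
FlagError set_checked(Flag& f, int64_t v) {
  const Range r = f.range();
  if (v < r.min || v > r.max) {
    return FlagError::OutOfBounds;
  }
  f.value = v;
  return FlagError::Success;
}

// Option "assert": suitable when the value is effectively a constant, so a
// failure can only be a programming error.
void set_ergo_asserted(Flag& f, int64_t v) {
  const FlagError e = set_checked(f, v);
  assert(e == FlagError::Success && "ergonomic value violates constraint");
  (void)e;  // avoid an unused-variable warning when NDEBUG is defined
}

// Option "cap": suitable when the value is computed from the environment.
// The setter reuses the same range function as the checker, so the range
// logic is not duplicated. Returns the value that was actually stored.
int64_t set_ergo_capped(Flag& f, int64_t v) {
  const Range r = f.range();
  const int64_t capped = std::min(std::max(v, r.min), r.max);
  f.value = capped;
  return capped;
}
```

With the 0~100 range, `set_checked(f, 101)` leaves the flag untouched and returns `FlagError::OutOfBounds`, whereas `set_ergo_capped(f, 101)` stores 100. Note that capping would not help with a "special" out-of-range value that carries meaning of its own; that case needs a change to the constraint itself.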
> > Thanks > - Ioi > > From dholmes at openjdk.org Wed Feb 1 04:46:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 04:46:52 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v4] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 03:47:14 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Review feedback: printing stdout and stderr files to log() test/jtreg-ext/requires/VMProps.java line 519: > 517: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { > 518: if (!Boolean.getBoolean("jtreg.log.vmprops")) { > 519: return ""; I think having to explicitly enable this somewhat defeats the purpose of using it for diagnostics. If we get an unexpected failure we will have to reproduce it with logging enabled to see what went wrong. test/jtreg-ext/requires/VMProps.java line 542: > 540: try { > 541: Files.lines(Path.of(logFileNames.split(",")[0])) > 542: .forEach(line -> log(line)); Nit: style convention is to align '.' in stream operations test/jtreg-ext/requires/VMProps.java line 568: > 566: > 567: log(String.format("checkDockerSupport(): exitValue = %s, pid = %s", exitValue, p.pid())); > 568: printLogfileContent(logFileNames); If we are concerned there may be too much output then I'd make this logging conditional on the process failing. 
------------- PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Wed Feb 1 07:48:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 07:48:52 GMT Subject: RFR: 8301244: Replace legacy pragma implementations in HotSpot's compilerWarnings.hpp [v8] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 09:17:22 GMT, Julian Waters wrote: >> HotSpot has a few legacy implementations of pragmas in some of its macros in compilerWarnings.hpp, they should be cleaned up and use a more modern approach. Rename PRAGMA macro in compilerWarnings_gcc.hpp as well since it can be used for more than just compiler warnings > > Julian Waters has updated the pull request incrementally with four additional commits since the last revision: > > - gcc > - VISCPP > - Comment > - Re-add PRAGMA So can I assume you have some uses for this in the pipeline? ------------- PR: https://git.openjdk.org/jdk/pull/12255 From mbaesken at openjdk.org Wed Feb 1 08:22:49 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 1 Feb 2023 08:22:49 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: <2KnE9yBsTkWcHt6Tauj-StpmdIZHGLeCPOILE5WwjcQ=.17df20d7-d339-485c-8771-d6f64b02e6cc@github.com> On Thu, 26 Jan 2023 13:22:48 GMT, Matthias Baesken wrote: > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. > more /sys/hypervisor/type > xen > seems to be available there. Hi, some reviews would be nice ! 
Thanks Matthias ------------- PR: https://git.openjdk.org/jdk/pull/12217 From clanger at openjdk.org Wed Feb 1 09:12:53 2023 From: clanger at openjdk.org (Christoph Langer) Date: Wed, 1 Feb 2023 09:12:53 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Thu, 26 Jan 2023 13:22:48 GMT, Matthias Baesken wrote: > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. > more /sys/hypervisor/type > xen > seems to be available there. Hi Matthias, with this code, you'd open and iterate /sys/devices/virtual/dmi/id/product_name twice. Wouldn't it be better to modify the code to first go through /sys/devices/virtual/dmi/id/product_name and check for anything it could find there in one go? ------------- PR: https://git.openjdk.org/jdk/pull/12217 From dholmes at openjdk.org Wed Feb 1 09:15:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 09:15:53 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Thu, 26 Jan 2023 13:22:48 GMT, Matthias Baesken wrote: > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. 
> more /sys/hypervisor/type > xen > seems to be available there. This seems inefficient as we now open and read the file twice. src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 590: > 588: const char* tname_file = "/sys/hypervisor/type"; > 589: if (check_info_file(pname_file, "KVM")) { > 590: Abstract_VM_Version::_detected_virtualization = KVM; return; Please put the return on a new line. Thanks ------------- PR: https://git.openjdk.org/jdk/pull/12217 From ayang at openjdk.org Wed Feb 1 09:45:50 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 1 Feb 2023 09:45:50 GMT Subject: RFR: 8301446: Remove unused includes of gc/shared/genOopClosures In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 07:25:05 GMT, Albert Mingkun Yang wrote: > Simple removing unused includes. Thanks for the review. ------------- PR: https://git.openjdk.org/jdk/pull/12307 From thomas.stuefe at gmail.com Wed Feb 1 09:48:33 2023 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 1 Feb 2023 10:48:33 +0100 Subject: JVM Flag ergonomics and constraints In-Reply-To: <637c4388-4c02-c654-6166-23491d797fce@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> Message-ID: On Wed, Feb 1, 2023 at 5:24 AM David Holmes wrote: > On 1/02/2023 5:23 am, Ioi Lam wrote: > > CC-ing hotspot-dev for wider discussion. > > > > On 1/31/2023 12:43 AM, Thomas Stüfe wrote: > >> Hi Ioi, > >> > >> a short question. You did the JVM flag handling revamp, right? > >> > >> I recently ran into a weird problem where FLAG_SET_ERGO would not > >> work, turned out it ends up using the constraint function, which > >> silently refused to set the argument if the argument violates the > >> constraint. > >> > >> That makes perfect sense, I had an error in my code.
But if we set a > >> flag with ERGO origin, we set it in the JVM, and the JVM should adhere > >> to constraints, not doing so is a programming error and I would expect > >> an assert here. The same logic may be applied to other origins too, > >> e.g. DEFAULT. > >> Do you think this makes sense? > >> > >> BTW I added debug output (setting verbose to true for ERGO), and I see > >> other flag's ergonomics also getting ignored, which would have to be > >> fixed too. > >> > >> Cheers, Thomas > >> > > > > > > This problem is also reported in > > https://bugs.openjdk.org/browse/JDK-8224980 > > > > There are proposals for making the debug message more obvious, and/or > > assert(). > > > > Currently we don't have documentation of what the caller of > > FLAG_SET_ERGO is supposed to do. > > > > There are 3 ways to set a flag. The current behavior of the following > > two ways seems reasonable > > > > FLAG_SET_CMDLINE - rejects user input and exits the VM > > FLAG_SET_MGMT - returns JVMFlag::Error > > > > With command-line, the user should inspect the limits for the current > > environment and restart the app with an appropriate value > > > > With FLAG_SET_MGMT, I suppose it's similar -- whoever is using the > > management API should detect the error and adjust their parameter values > > (by hand, or programmatically) > > > > However, FLAG_SET_ERGO also returns JVMFlag::Error, but I don't see any > > of our code checking for the return value. So the unwritten convention > > is that the call must not fail. > > However, if the specified value is out of range, should we abort the VM, > > ignore the value, or cap the value? E.g., if the range is 0~100, but you > > call SET_ERGO with 101, should we cap the value to 100? > > Generally, if our code tries to set a flag's value to something invalid, > that is a programming error that should be reported right away. If the > value is a "constant" then we should just assert that setting the value > worked.
If the value is determined by external factors that might vary > at runtime (so no guarantee the end result will be valid) then we should > have logic that can correct for the invalid value (or we should change > the way we do the calculation). > Agreed, but how do you want to tell them apart? > Cheers, > David > > > > Perhaps capping would make the ergo code easier to write, especially if > > the constraint function produces a dynamic range (depending on the > > machine's RAM size, for example). Without capping, you will need to > > duplicate (or share) the range detection logic between the ergo setter > > and the constraint checker. > I like capping as a pragmatic solution. The only problem I see is for "special" values that are outside the allowed range but still hold information. For example, NonNMethodCodeHeapSize is set to 0 via FLAG_SET_ERGO (SegmentedCodeCache off), which gets ignored since it violates VMPageSizeConstraintFunc (valid range starts at 4096). Though, while typing, I realized that this may be a simple error, and the answer would be to either allow 0 as a value in the constraint function or to not use this constraint function. Thanks, Thomas From ayang at openjdk.org Wed Feb 1 09:49:05 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 1 Feb 2023 09:49:05 GMT Subject: Integrated: 8301446: Remove unused includes of gc/shared/genOopClosures In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 07:25:05 GMT, Albert Mingkun Yang wrote: > Simple removing unused includes. This pull request has now been integrated.
Changeset: 4f6f3cc6 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/4f6f3cc642d95c5a2a8068b6dc07cca4fda74bc9 Stats: 5 lines in 5 files changed: 0 ins; 5 del; 0 mod 8301446: Remove unused includes of gc/shared/genOopClosures Reviewed-by: stefank, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/12307 From ayang at openjdk.org Wed Feb 1 09:50:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 1 Feb 2023 09:50:27 GMT Subject: RFR: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel [v2] In-Reply-To: References: Message-ID: > Simple removing unnecessary code and adding API documentation. > > Test: hotspot_gc Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12311/files - new: https://git.openjdk.org/jdk/pull/12311/files/49ac215a..c78f41b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12311&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12311&range=00-01 Stats: 13 lines in 2 files changed: 7 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12311.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12311/head:pull/12311 PR: https://git.openjdk.org/jdk/pull/12311 From ayang at openjdk.org Wed Feb 1 09:50:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 1 Feb 2023 09:50:27 GMT Subject: RFR: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel [v2] In-Reply-To: <0EDU-jWBNGm1-FgdQsSJrr1ssmFWl4YQgGtSkDbGtd0=.332b49ca-bb04-409e-a8f5-c6417c55b833@github.com> References: <0EDU-jWBNGm1-FgdQsSJrr1ssmFWl4YQgGtSkDbGtd0=.332b49ca-bb04-409e-a8f5-c6417c55b833@github.com> Message-ID: <1AFbP4sN2SEQaba2dNL6M0RZlwYOY9W4-w8K4DembSY=.83a5081b-41e6-44a0-9d66-7acfecad30ab@github.com> On Tue, 31 Jan 2023 10:00:44 GMT, Thomas Schatzl wrote: >> Albert Mingkun Yang has updated the pull request incrementally with one 
additional commit since the last revision: >> >> review > > src/hotspot/share/code/codeCache.cpp line 880: > >> 878: // nmethods are visited. >> 879: // 2. Used at the end of (stw/concurrent) marking so that nmethod::_gc_epoch is >> 880: // up-to-date, which provides more accurate estimate of nmethod::is_cold. > > I would prefer to have general documentation and usage information about a method in the .hpp file. Moved. ------------- PR: https://git.openjdk.org/jdk/pull/12311 From tschatzl at openjdk.org Wed Feb 1 10:01:52 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 1 Feb 2023 10:01:52 GMT Subject: RFR: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel [v2] In-Reply-To: References: Message-ID: <-lzzNug5CSPWxqLwFYl7MkRkJOJXlhqZ2sK9hTmW1Rs=.f2ea41e6-c45a-41c4-929d-77122cb9fff1@github.com> On Wed, 1 Feb 2023 09:50:27 GMT, Albert Mingkun Yang wrote: >> Simple removing unnecessary code and adding API documentation. >> >> Test: hotspot_gc > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.org/jdk/pull/12311 From stuefe at openjdk.org Wed Feb 1 10:22:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 1 Feb 2023 10:22:44 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v7] In-Reply-To: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: > This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless whether from JVM- or user code), this could recover significant amounts of memory. 
> > We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore it took longer than I thought. > > ### Motivation > > The Glibc allocator is reluctant to return memory to the OS, much more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. > > Note that C-heap retention is difficult to observe. Since it is freed memory, it won't show up in NMT, it is just a part of private RSS. > > Theoretically, retained memory is not lost since it will be reused by future mallocs. Retaining memory is therefore a bet on the future behavior of the app. The allocator bets on the application needing memory in the near future, and to satisfy that need via malloc. > > But an app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. And if the app rolls its own custom allocators atop of mmap, as hotspot does, a lot of that memory cannot be reused even though it counts toward its memory footprint. > > To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. > > #### Is this even a problem? > > Do we have high malloc spikes in the JVM process? We assume that malloc load from hotspot is usually low since hotspot typically clusters allocations into custom areas - metaspace, code heap, arenas. > > But arenas are subject to Glibc mem retention too. I was surprised by that since I assumed 32k arena chunks were too big to be subject of Glibc retention. But I saw in experiments that high arena peaks often cause lasting RSS increase. > > And of course, both hotspot and JDK do a lot of finer-granular mallocs outside of custom allocators. > > But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. 
In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. > > Therefore I think while high malloc spikes are atypical for a JVM process, they can happen. Having a way to auto-trim the native heap makes sense. > > ### When should we trim? > > We want to trim when we know there is a lull in malloc activity coming. But we have no knowledge of the future. > > We could build a heuristic based on malloc frequency. But on closer inspection that is difficult. We cannot use NMT, since NMT has no complete picture (only knows hotspot) and is usually disabled in production anyway. The only way to get *all* mallocs would be to use Glibc malloc hooks. We have done so in desperate cases at SAP, but Glibc removed malloc hooks in 2.35. It would be a messy solution anyway; best to avoid it. > > The next best thing is synchronizing with the larger C-heap users in the VM: compiler and GC. But compiler turns out not to be such a problem, since the compiler uses arenas, and arena chunks are buffered in a free pool with a five-second delay. That means compiler activity that happens in bursts, like at VM startup, will just shuffle arena chunks around from/to the arena free pool, never bothering to call malloc or free. > > That leaves the GC, which was also the experts' recommendation in last year's discussion [1]. Most GCs do uncommit, and trimming the native heap fits well into this. And we want to time the trim to not get into the way of a GC. Plus, integrating trims into the GC cycle lets us reuse GC logging and timing, thereby making RSS changes caused by trim-native visible to the analyst. > > > ### How it works: > > Patch adds new options (experimental for now, and shared among all GCs): > > > -XX:+GCTrimNativeHeap > -XX:GCTrimNativeHeapInterval= (defaults to 60) > > > `GCTrimNativeHeap` is off by default. 
If enabled, it will cause the VM to trim the native heap on full GCs as well as periodically. The period is defined by `GCTrimNativeHeapInterval`. Periodic trimming can be completely switched off with `GCTrimNativeHeapInterval=0`; in that case, we will only trim on full GCs. > > ### Examples: > > This is an artificial test that causes two high malloc spikes with long idle periods. Observe how RSS recovers with trim but stays up without trim. The trim interval was set to 15 seconds for the test, and no GC was invoked here; this is periodic trimming. > > ![alloc-test](http://cr.openjdk.java.net/~stuefe/other/autotrim/rss-all-collectors.png) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-all.sh) ) > > Spring pet clinic boots up, then idles. Once with, once without trim, with the trim interval at 60 seconds default. Of course, if it were actually doing something instead of idling, trim effects would be smaller. But the point of trimming is to recover memory in idle periods. > > ![petclinic bootup](http://cr.openjdk.java.net/~stuefe/other/autotrim/spring-petclinic-rss-with-and-without-trim.png)) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-petclinic-boot.sh) ) > > > > ### Implementation > > One problem I faced when implementing this was that trimming was non-interruptable. GCs usually split the uncommit work into smaller portions, which is impossible for `malloc_trim()`. > > So very slow trims could introduce longer GC pauses. I did not want this, therefore I implemented two ways to trim: > 1) GCs can opt to trim asynchronously. In that case, a `NativeTrimmer` thread runs on behalf of the GC and takes care of all trimming. The GC just nudges the `NativeTrimmer` at the end of its GC cycle, but the trim itself runs concurrently. > 2) GCs can do the trim inside their own thread, synchronously. It will have to wait until the trim is done. 
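
The two ways to trim listed above might look roughly like this in a standalone program. This is only a sketch under stated assumptions, not the code in the patch: `TrimmerSketch`, `nudge()` and `trim_native_heap_sync()` are hypothetical names (the patch's class is called `NativeTrimmer`, whose interface is not shown in this mail), and periodic trimming and pausing during a GC are left out. The only real API used is glibc's `malloc_trim(3)`, guarded so that the code still compiles on other C libraries.

```cpp
#include <cstdlib>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Way (2): synchronous trim. The caller blocks until the trim is done.
// Returns true if glibc reports that memory was released to the OS.
inline bool trim_native_heap_sync() {
#if defined(__GLIBC__)
  return malloc_trim(0) != 0;
#else
  return false;  // assumption: no trim support on other C libraries
#endif
}

// Way (1): asynchronous trim. A helper thread performs the trim so that the
// nudging thread (the GC in the patch) does not wait for it.
class TrimmerSketch {
 public:
  explicit TrimmerSketch(std::function<void()> trim)
      : trim_(std::move(trim)), worker_([this] { run(); }) {}

  // Runs any still-pending trim, then stops and joins the helper thread.
  ~TrimmerSketch() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Called at the end of a GC cycle; returns immediately.
  // Several nudges that arrive before the helper wakes up are coalesced.
  void nudge() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      pending_ = true;
    }
    cv_.notify_one();
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      cv_.wait(lock, [this] { return pending_ || stop_; });
      if (!pending_) {
        return;  // stop requested and nothing left to do
      }
      pending_ = false;
      lock.unlock();
      trim_();  // potentially slow; runs without holding the lock
      lock.lock();
    }
  }

  std::function<void()> trim_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool pending_ = false;
  bool stop_ = false;
  std::thread worker_;  // declared last so it starts after the other members
};
```

A caller would construct `TrimmerSketch t([] { trim_native_heap_sync(); });` once and call `t.nudge()` after each collection. Passing the trim action in as a function keeps the threading logic independent of glibc.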
> > (1) has the advantage of giving us periodic trims even without GC activity (Shenandoah does this out of the box). > > #### Serial > > Serial does the trimming synchronously as part of a full GC, and only then. I did not want to spawn a separate thread for the SerialGC. Therefore Serial is the only GC that does not offer periodic trimming, it just trims on full GC. > > #### Parallel, G1, Z > > All of them do the trimming asynchronously via `NativeTrimmer`. They schedule the native trim at the end of a full collection. They also pause the trimming at the beginning of a cycle to not trim during GCs. > > #### Shenandoah > > Shenandoah does the trimming synchronously in its service thread, similar to how it handles uncommits. Since the service thread already runs concurrently and continuously, it can do periodic trimming; no need to spin a new thread. And this way we can reuse the Shenandoah timing classes. > > ### Patch details > > - adds three new functions to the `os` namespace: > - `os::trim_native_heap()` implementing trim > - `os::can_trim_native_heap()` and `os::should_trim_native_heap()` to return whether platform supports trimming resp. whether the platform considers trimming to be useful. > - replaces implementation of the cmd "System.trim_native_heap" with the new `os::trim_native_heap` > - provides a new wrapper function wrapping the tedious `mallinfo()` vs `mallinfo2()` business: `os::Linux::get_mallinfo()` > - adds a GC-shared utility class, `GCTrimNative`, that takes care of trimming and GC-logging and houses the `NativeTrimmer` thread class. > - adds a regression test > > > ### Tests > > Tested older Glibc (2.31), and newer Glibc (2.35) (`mallinfo()` vs` mallinfo2()`), on Linux x64. > > The rest of the tests will be done by GHA and in our SAP nightlies. > > > ### Remarks > > #### How about other allocators? > > I have seen this retention problem mainly with the Glibc and the AIX libc. Muslc returns memory more eagerly to the OS. 
I also tested with jemalloc and found it also reclaims more aggressively, therefore I don't think MacOS or BSD are affected that much by retention either. > > #### Trim costs? > > Trim-native is a tradeoff between memory and performance. We pay > - The cost to do the trim depends on how much is trimmed. Time ranges on my machine between < 1ms for no-op trims, to ~800ms for 32GB trims. > - The cost for re-acquiring the memory, should the memory be needed again, is the second cost factor. > > #### Predicting malloc_trim effects? > > `ShenandoahUncommit` avoids uncommits if they are not necessary, thus avoiding work and gc log spamming. I liked that and tried to follow that example. Tried to devise a way to predict the effect trim could have based on allocator info from mallinfo(3). That was quite frustrating since the documentation was confusing and I had to do a lot of experimenting. In the end, I came up with a heuristic to prevent obviously pointless trim attempts; see `os::should_trim_native_heap()`. I am not completely happy with it. > > #### glibc.malloc.trim_threshold? > > glibc has a tunable that looks like it could influence the willingness of Glibc to return memory to the OS, the "trim_threshold". In practice, I could not get it to do anything useful. Regardless of the setting, it never seemed to influence the trimming behavior. Even if it would work, I'm not sure we'd want to use that, since by doing malloc_trim manually we can space out the trims as we see fit, instead of paying the trim price for free(3). > > > - [1] https://mail.openjdk.org/pipermail/hotspot-dev/2021-August/054323.html > - [2] https://bugs.openjdk.org/browse/JDK-8269345 Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 23 commits: - wip - wip - wip - Merge branch 'master' into JDK-8293114-GC-trim-native - wip - Merge branch 'master' into JDK-8293114-GC-trim-native - revamp wip - Merge branch 'master' into JDK-8293114-GC-trim-native - make tests for ppc more lenient - Merge - ... and 13 more: https://git.openjdk.org/jdk/compare/af564e46...599573d9 ------------- Changes: https://git.openjdk.org/jdk/pull/10085/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=06 Stats: 1069 lines in 23 files changed: 1065 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/10085.diff Fetch: git fetch https://git.openjdk.org/jdk pull/10085/head:pull/10085 PR: https://git.openjdk.org/jdk/pull/10085 From tschatzl at openjdk.org Wed Feb 1 10:42:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 1 Feb 2023 10:42:32 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 Message-ID: Hi all, can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? The change is fairly straightforward except maybe for the following: * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). 
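The claimer idea from the first bullet can be illustrated with a minimal sketch. This is not the G1 code from the patch: `ChunkedClaimer`, `claim`, `process_claimed` and the chunk size are hypothetical names chosen for illustration. The assumption is that workers claim whole ranges of thread indices through one atomic counter, so the claiming cost is amortized over several threads' worth of (very small) per-thread work.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not the claimer from the patch: hands out
// [begin, end) index ranges so that a worker pays for one atomic
// operation per chunk instead of one per item.
class ChunkedClaimer {
  std::atomic<std::size_t> _next;
  const std::size_t _total;
  const std::size_t _chunk;  // must be greater than zero

public:
  ChunkedClaimer(std::size_t total, std::size_t chunk)
    : _next(0), _total(total), _chunk(chunk) {}

  // Claims the next unclaimed range; returns false once everything is claimed.
  bool claim(std::size_t& begin, std::size_t& end) {
    const std::size_t start = _next.fetch_add(_chunk, std::memory_order_relaxed);
    if (start >= _total) {
      return false;
    }
    begin = start;
    end = std::min(start + _chunk, _total);
    return true;
  }
};

// The loop each worker would run. Claimed ranges are disjoint, so several
// workers could execute this concurrently on the same claimer.
void process_claimed(ChunkedClaimer& claimer, std::vector<int>& visits) {
  std::size_t begin = 0;
  std::size_t end = 0;
  while (claimer.claim(begin, end)) {
    for (std::size_t i = begin; i < end; i++) {
      visits[i]++;  // stands in for the per-thread work, e.g. a TLAB resize
    }
  }
}
```

With a chunk size of one this degenerates to per-item claiming, which is the case where the claiming overhead can dominate very small work items.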
Testing: gha, local perf testing, tier1 Thanks, Thomas ------------- Commit messages: - Cleanup - Fixes, refactoring - Fixes - Cleanup - initial version Changes: https://git.openjdk.org/jdk/pull/12360/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12360&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301116 Stats: 113 lines in 12 files changed: 77 ins; 20 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/12360.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12360/head:pull/12360 PR: https://git.openjdk.org/jdk/pull/12360 From kbarrett at openjdk.org Wed Feb 1 12:17:53 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 1 Feb 2023 12:17:53 GMT Subject: RFR: 8301244: Replace legacy pragma implementations in HotSpot's compilerWarnings.hpp [v7] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 09:06:09 GMT, Julian Waters wrote: > The aim was to introduce a way to easily use pragmas inside macro implementations in HotSpot which behave identically to if one had used `#pragma`, regardless of any quirks on any platform/compiler, competing with the regular `#pragma` preprocessor directive is otherwise not the goal of this change. Defining differing pragmas which are compiler specific based on `#ifdef` blocks are a different issue too, all this proposal intends to do is make it easy for defining pragmas inside macros, as mentioned above. The vast majority of the compilers that can be used to compile HotSpot (that is to say, all of them except for Visual C++) don't have an easy way to do this that doesn't involve the hassle of defining one-off helper macros directly where they're used to expand the input to properly feed into `_Pragma` (see compilerWarnings_gcc.hpp for instance). I get that this might sound slightly petty, but I'd prefer if macros defined for the sole purpose of solving a local issue were somewhat kept to a minimum in HotSpot if they could be avoided. 
This can also be especially helpful in shared code where a one-for-all macro that contains a pragma is needed, where __pragma wouldn't be accepted by other compilers anyway. Regarding Kim's concerns, I could indeed make the implementation use __pragma if `#ifdef _MSC_VER` is true, but I don't really see the need for doing that, since Visual C++ pretty much treats __pragma and `_Pragma` identically (to the point that they share the same bugs), and doing so would just be extra code which would behave the same > > I forgot to mention `PRAGMA_UTIL` was just `PRAGMA` earlier on and in macros.hpp, I changed it after hearing Kim's concerns, but now after looking back at it I think I should've kept it for review The existing one-off helper macro (PRAGMA_DISABLE_GCC_WARNING_AUX) exists in its current form because there is exactly one use of that behavior, by the associated no-_AUX macro. If there were multiple uses then there would be reason for a common helper macro, and such would likely already exist. That hasn't happened yet. If it did, I'd probably call it PRAGMA_STRINGIZE or something like that. (Coming up with a good name for that utility is another reason why we have the existing _AUX macro.) But until there is such a need, I don't see the point. It's just premature generalization. To reiterate David, is there some new use-case that you have in mind? Maybe we need to discuss that first. I also don't see the point of making MSVC jump through that extra hoop. The pragma tokens being used for a given PRAGMA_ macro are likely always going to be different between MSVC and gcc (or any other compiler), so letting MSVC-specific code use an MSVC-specific extension doesn't bother me at all. And because the needed tokens for a particular effect are always different for different compilers, the idea of a one-for-all macro containing a "bare" pragma that works for all platforms is just not a thing. 
(There are a small number of standard pragmas, where one would likely just directly use _Pragma with the appropriate string on all platforms.) And what MSVC bugs in __pragma/_Pragma are you referring to? If it's the macro expansion of the pragma tokens, that's not a bug, it's an intentional feature that is permitted by the Standard(s). It's implementation-defined whether those tokens are subject to expansion. ------------- PR: https://git.openjdk.org/jdk/pull/12255 From mbaesken at openjdk.org Wed Feb 1 13:09:19 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 1 Feb 2023 13:09:19 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. > more /sys/hypervisor/type > xen > seems to be available there. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Do not scan the product_name file twice ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12217/files - new: https://git.openjdk.org/jdk/pull/12217/files/b71a062b..3ef5c503 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12217&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12217&range=00-01 Stats: 17 lines in 1 file changed: 8 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/12217.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12217/head:pull/12217 PR: https://git.openjdk.org/jdk/pull/12217 From mbaesken at openjdk.org Wed Feb 1 13:12:52 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 1 Feb 2023 13:12:52 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> Message-ID: <3I6VBgm275T5ZNmpp_jEhl8UB3zkKIyO6tX9TUlIYeA=.0752c2e2-5b68-4e8c-ac6d-81211bf1ed9c@github.com> On Wed, 1 Feb 2023 13:09:19 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations. >> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do not scan the product_name file twice Hi Christoph/David , I adjusted the file reading so that iteration of /sys/devices/virtual/dmi/id/product_name is done once now, not twice. 
Should be more efficient. ------------- PR: https://git.openjdk.org/jdk/pull/12217 From mbaesken at openjdk.org Wed Feb 1 13:12:55 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 1 Feb 2023 13:12:55 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Wed, 1 Feb 2023 09:10:30 GMT, David Holmes wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> Do not scan the product_name file twice > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 590: > >> 588: const char* tname_file = "/sys/hypervisor/type"; >> 589: if (check_info_file(pname_file, "KVM")) { >> 590: Abstract_VM_Version::_detected_virtualization = KVM; return; > > Please put the return on a new line. Thanks Hi David, I moved the returns. ------------- PR: https://git.openjdk.org/jdk/pull/12217 From aboldtch at openjdk.org Wed Feb 1 14:48:32 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 1 Feb 2023 14:48:32 GMT Subject: RFR: 8301612: OopLoadProxy constructor should be explicit Message-ID: The implicit conversion via constructor `OopLoadProxy(P* addr)` can cause problems when using the access API and nullptr. For example, code like
```c++
oop o = (_obj == NULL) ? nullptr : NativeAccess<>::oop_load(_obj);
```
will compile but is wrong and will crash. This is because it will be interpreted as:
```c++
oop o = static_cast<oop>((_obj == NULL) ? OopLoadProxy(nullptr) : NativeAccess<>::oop_load(_obj));
```
while the intent of the programmer probably was:
```c++
oop o = (_obj == NULL) ? (oop)nullptr : NativeAccess<>::oop_load(_obj);
```
or more explicitly:
```c++
oop o = (_obj == NULL) ? static_cast<oop>(nullptr) : static_cast<oop>(NativeAccess<>::oop_load(_obj));
```
Marking `OopLoadProxy(P* addr)` explicit will make this example, and similar code, fail to compile.
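The conversion problem described for `OopLoadProxy` can be reproduced outside HotSpot with two small stand-in types. This is only a sketch of the language rule: `ImplicitProxy`, `ExplicitProxy` and `load_or_zero` are hypothetical names, and `int` plays the role of `oop`. It assumes nothing about the real access API beyond what the message above states.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Stand-in for a load proxy with a converting (non-explicit) constructor.
struct ImplicitProxy {
  ImplicitProxy(int* addr) : _addr(addr) {}
  operator int() const { return *_addr; }  // dereferences on conversion
  int* _addr;
};

// Same type, but the constructor is marked explicit.
struct ExplicitProxy {
  explicit ExplicitProxy(int* addr) : _addr(addr) {}
  operator int() const { return *_addr; }
  int* _addr;
};

// nullptr converts silently to the implicit proxy ...
static_assert(std::is_convertible<std::nullptr_t, ImplicitProxy>::value,
              "implicit constructor accepts nullptr");
// ... so a conditional mixing nullptr and the proxy has the proxy's type,
// and the nullptr branch would later be dereferenced.
static_assert(std::is_same<decltype(true ? nullptr : std::declval<ImplicitProxy>()),
                           ImplicitProxy>::value,
              "conditional yields a proxy wrapping nullptr");
// With an explicit constructor that silent conversion is gone.
static_assert(!std::is_convertible<std::nullptr_t, ExplicitProxy>::value,
              "explicit constructor rejects implicit conversion");

// The spelled-out form: both branches have the value type, so no proxy is
// ever built from a null address.
int load_or_zero(int* addr) {
  return (addr == nullptr) ? 0 : static_cast<int>(ExplicitProxy(addr));
}
```

With `ExplicitProxy`, an expression of the form `cond ? nullptr : ExplicitProxy(p)` has no common type and is rejected by the compiler, which is the effect the change aims for.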
------------- Commit messages: - 8301612: OopLoadProxy constructor should be explicit Changes: https://git.openjdk.org/jdk/pull/12364/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12364&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301612 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12364.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12364/head:pull/12364 PR: https://git.openjdk.org/jdk/pull/12364 From stefank at openjdk.org Wed Feb 1 15:00:49 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 1 Feb 2023 15:00:49 GMT Subject: RFR: 8301612: OopLoadProxy constructor should be explicit In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 14:41:35 GMT, Axel Boldt-Christmas wrote: > The implicit conversion via constructor `OopLoadProxy(P* addr)` can cause problems when using the access API and nullptr. > For example code like > ```c++ > oop o = (_obj == NULL) ? nullptr : NativeAccess<>::oop_load(_obj); > ``` > > will compile but is wrong and will crash. > This is because it will be interpreted as: > ```c++ > oop o = static_cast<oop>((_obj == NULL) ? OopLoadProxy(nullptr) : NativeAccess<>::oop_load(_obj)); > ``` > > while the intent of the programmer probably was: > ```c++ > oop o = (_obj == NULL) ? (oop)nullptr : NativeAccess<>::oop_load(_obj); > ``` > > or more explicitly: > ```c++ > oop o = (_obj == NULL) ? static_cast<oop>(nullptr) : static_cast<oop>(NativeAccess<>::oop_load(_obj)); > ``` > > > Marking `OopLoadProxy(P* addr)` explicit will make this example, and similar code not compile. Looks good and can be considered "trivial". ------------- Marked as reviewed by stefank (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12364 From jsjolen at openjdk.org Wed Feb 1 15:25:50 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 1 Feb 2023 15:25:50 GMT Subject: RFR: 8301612: OopLoadProxy constructor should be explicit In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 14:41:35 GMT, Axel Boldt-Christmas wrote: > The implicit conversion via constructor `OopLoadProxy(P* addr)` can cause problems when using the access API and nullptr. > For example code like > ```c++ > oop o = (_obj == NULL) ? nullptr : NativeAccess<>::oop_load(_obj); > ``` > > will compile but is wrong and will crash. > This is because it will be interpreted as: > ```c++ > oop o = static_cast<oop>((_obj == NULL) ? OopLoadProxy(nullptr) : NativeAccess<>::oop_load(_obj)); > ``` > > while the intent of the programmer probably was: > ```c++ > oop o = (_obj == NULL) ? (oop)nullptr : NativeAccess<>::oop_load(_obj); > ``` > > or more explicitly: > ```c++ > oop o = (_obj == NULL) ? static_cast<oop>(nullptr) : static_cast<oop>(NativeAccess<>::oop_load(_obj)); > ``` > > > Marking `OopLoadProxy(P* addr)` explicit will make this example, and similar code not compile. Thanks for this. ------------- Marked as reviewed by jsjolen (Committer). PR: https://git.openjdk.org/jdk/pull/12364 From jcking at openjdk.org Wed Feb 1 15:26:27 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 1 Feb 2023 15:26:27 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster.
> > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled. This is necessary to ensure all pointers to memory which could possibly contain pointers to malloc memory are properly aligned, which is an LSan requirement. > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > - There is a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. In the future, ideally these tests would be skipped nicely when LSan is enabled.
Justin King has updated the pull request incrementally with one additional commit since the last revision: Support CDS Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12229/files - new: https://git.openjdk.org/jdk/pull/12229/files/15f060e8..4559ef4c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=00-01 Stats: 6 lines in 2 files changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12229.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12229/head:pull/12229 PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Wed Feb 1 15:26:27 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 1 Feb 2023 15:26:27 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot In-Reply-To: References: Message-ID: On Thu, 26 Jan 2023 17:33:28 GMT, Justin King wrote: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. > > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled.
This is necessary to ensure all pointers to memory which could possibly contain pointers to malloc memory are properly aligned, which is an LSan requirement. > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > - There is a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. In the future, ideally these tests would be skipped nicely when LSan is enabled. Updated. Was able to get it cooperating with CDS. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From clanger at openjdk.org Wed Feb 1 16:42:51 2023 From: clanger at openjdk.org (Christoph Langer) Date: Wed, 1 Feb 2023 16:42:51 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> Message-ID: On Wed, 1 Feb 2023 13:09:19 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations. >> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do not scan the product_name file twice What about modifying Abstract_VM_Version::_detected_virtualization directly in check_info_file?
You could also consider passing varargs to check_info_file for the string to look for and the resulting value of _detected_virtualization. ------------- PR: https://git.openjdk.org/jdk/pull/12217 From rkennke at openjdk.org Wed Feb 1 17:09:21 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 1 Feb 2023 17:09:21 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Remove stale method ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/bbdecd7a..24a95a7c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=17-18 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From tschatzl at openjdk.org Wed Feb 1 17:11:48 2023 From: tschatzl at openjdk.org 
(Thomas Schatzl) Date: Wed, 1 Feb 2023 17:11:48 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 10:34:36 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? > > The change is fairly straightforward except maybe for the following: > * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. > * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). > > Testing: gha, local perf testing, tier1 > > Thanks, > Thomas The reason for this change (and the follow-up(s)) is that in applications with a large number of Java threads (20k+), TLAB retirement, log buffer flushing and TLAB resizing takes 3-7% of total pause time. Parallelization can reduce this time significantly. ------------- PR: https://git.openjdk.org/jdk/pull/12360 From duke at openjdk.org Wed Feb 1 18:28:25 2023 From: duke at openjdk.org (Scott Gibbons) Date: Wed, 1 Feb 2023 18:28:25 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. 
> > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Handle AVX2 URL; address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/3e66f7be..4007e984 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=05-06 Stats: 94 lines in 4 files changed: 48 ins; 32 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From stuefe at openjdk.org Wed Feb 1 18:47:18 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 1 Feb 2023 18:47:18 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v8] In-Reply-To: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: > This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless whether from JVM- or user code), this could recover significant amounts of memory.
> > We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore it took longer than I thought. > > ### Motivation > > The Glibc allocator is reluctant to return memory to the OS, much more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. > > Note that C-heap retention is difficult to observe. Since it is freed memory, it won't show up in NMT, it is just a part of private RSS. > > Theoretically, retained memory is not lost since it will be reused by future mallocs. Retaining memory is therefore a bet on the future behavior of the app. The allocator bets on the application needing memory in the near future, and to satisfy that need via malloc. > > But an app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. And if the app rolls its own custom allocators atop of mmap, as hotspot does, a lot of that memory cannot be reused even though it counts toward its memory footprint. > > To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. > > #### Is this even a problem? > > Do we have high malloc spikes in the JVM process? We assume that malloc load from hotspot is usually low since hotspot typically clusters allocations into custom areas - metaspace, code heap, arenas. > > But arenas are subject to Glibc mem retention too. I was surprised by that since I assumed 32k arena chunks were too big to be subject of Glibc retention. But I saw in experiments that high arena peaks often cause lasting RSS increase. > > And of course, both hotspot and JDK do a lot of finer-granular mallocs outside of custom allocators. > > But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. 
In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. > > Therefore I think while high malloc spikes are atypical for a JVM process, they can happen. Having a way to auto-trim the native heap makes sense. > > ### When should we trim? > > We want to trim when we know there is a lull in malloc activity coming. But we have no knowledge of the future. > > We could build a heuristic based on malloc frequency. But on closer inspection that is difficult. We cannot use NMT, since NMT has no complete picture (only knows hotspot) and is usually disabled in production anyway. The only way to get *all* mallocs would be to use Glibc malloc hooks. We have done so in desperate cases at SAP, but Glibc removed malloc hooks in 2.35. It would be a messy solution anyway; best to avoid it. > > The next best thing is synchronizing with the larger C-heap users in the VM: compiler and GC. But compiler turns out not to be such a problem, since the compiler uses arenas, and arena chunks are buffered in a free pool with a five-second delay. That means compiler activity that happens in bursts, like at VM startup, will just shuffle arena chunks around from/to the arena free pool, never bothering to call malloc or free. > > That leaves the GC, which was also the experts' recommendation in last year's discussion [1]. Most GCs do uncommit, and trimming the native heap fits well into this. And we want to time the trim to not get into the way of a GC. Plus, integrating trims into the GC cycle lets us reuse GC logging and timing, thereby making RSS changes caused by trim-native visible to the analyst. > > > ### How it works: > > Patch adds new options (experimental for now, and shared among all GCs): > > > -XX:+GCTrimNativeHeap > -XX:GCTrimNativeHeapInterval= (defaults to 60) > > > `GCTrimNativeHeap` is off by default. 
If enabled, it will cause the VM to trim the native heap on full GCs as well as periodically. The period is defined by `GCTrimNativeHeapInterval`. Periodic trimming can be completely switched off with `GCTrimNativeHeapInterval=0`; in that case, we will only trim on full GCs. > > ### Examples: > > This is an artificial test that causes two high malloc spikes with long idle periods. Observe how RSS recovers with trim but stays up without trim. The trim interval was set to 15 seconds for the test, and no GC was invoked here; this is periodic trimming. > > ![alloc-test](http://cr.openjdk.java.net/~stuefe/other/autotrim/rss-all-collectors.png) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-all.sh) ) > > Spring pet clinic boots up, then idles. Once with, once without trim, with the trim interval at 60 seconds default. Of course, if it were actually doing something instead of idling, trim effects would be smaller. But the point of trimming is to recover memory in idle periods. > > ![petclinic bootup](http://cr.openjdk.java.net/~stuefe/other/autotrim/spring-petclinic-rss-with-and-without-trim.png)) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-petclinic-boot.sh) ) > > > > ### Implementation > > One problem I faced when implementing this was that trimming was non-interruptable. GCs usually split the uncommit work into smaller portions, which is impossible for `malloc_trim()`. > > So very slow trims could introduce longer GC pauses. I did not want this, therefore I implemented two ways to trim: > 1) GCs can opt to trim asynchronously. In that case, a `NativeTrimmer` thread runs on behalf of the GC and takes care of all trimming. The GC just nudges the `NativeTrimmer` at the end of its GC cycle, but the trim itself runs concurrently. > 2) GCs can do the trim inside their own thread, synchronously. It will have to wait until the trim is done. 
> > (1) has the advantage of giving us periodic trims even without GC activity (Shenandoah does this out of the box). > > #### Serial > > Serial does the trimming synchronously as part of a full GC, and only then. I did not want to spawn a separate thread for the SerialGC. Therefore Serial is the only GC that does not offer periodic trimming, it just trims on full GC. > > #### Parallel, G1, Z > > All of them do the trimming asynchronously via `NativeTrimmer`. They schedule the native trim at the end of a full collection. They also pause the trimming at the beginning of a cycle to not trim during GCs. > > #### Shenandoah > > Shenandoah does the trimming synchronously in its service thread, similar to how it handles uncommits. Since the service thread already runs concurrently and continuously, it can do periodic trimming; no need to spin a new thread. And this way we can reuse the Shenandoah timing classes. > > ### Patch details > > - adds three new functions to the `os` namespace: > - `os::trim_native_heap()` implementing trim > - `os::can_trim_native_heap()` and `os::should_trim_native_heap()` to return whether platform supports trimming resp. whether the platform considers trimming to be useful. > - replaces implementation of the cmd "System.trim_native_heap" with the new `os::trim_native_heap` > - provides a new wrapper function wrapping the tedious `mallinfo()` vs `mallinfo2()` business: `os::Linux::get_mallinfo()` > - adds a GC-shared utility class, `GCTrimNative`, that takes care of trimming and GC-logging and houses the `NativeTrimmer` thread class. > - adds a regression test > > > ### Tests > > Tested older Glibc (2.31), and newer Glibc (2.35) (`mallinfo()` vs` mallinfo2()`), on Linux x64. > > The rest of the tests will be done by GHA and in our SAP nightlies. > > > ### Remarks > > #### How about other allocators? > > I have seen this retention problem mainly with the Glibc and the AIX libc. Muslc returns memory more eagerly to the OS. 
I also tested with jemalloc and found it also reclaims more aggressively, therefore I don't think MacOS or BSD are affected that much by retention either. > > #### Trim costs? > > Trim-native is a tradeoff between memory and performance. We pay > - The cost to do the trim depends on how much is trimmed. Time ranges on my machine between < 1ms for no-op trims, to ~800ms for 32GB trims. > - The cost for re-acquiring the memory, should the memory be needed again, is the second cost factor. > > #### Predicting malloc_trim effects? > > `ShenandoahUncommit` avoids uncommits if they are not necessary, thus avoiding work and gc log spamming. I liked that and tried to follow that example. Tried to devise a way to predict the effect trim could have based on allocator info from mallinfo(3). That was quite frustrating since the documentation was confusing and I had to do a lot of experimenting. In the end, I came up with a heuristic to prevent obviously pointless trim attempts; see `os::should_trim_native_heap()`. I am not completely happy with it. > > #### glibc.malloc.trim_threshold? > > glibc has a tunable that looks like it could influence the willingness of Glibc to return memory to the OS, the "trim_threshold". In practice, I could not get it to do anything useful. Regardless of the setting, it never seemed to influence the trimming behavior. Even if it would work, I'm not sure we'd want to use that, since by doing malloc_trim manually we can space out the trims as we see fit, instead of paying the trim price for free(3). 
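The trim-cost remark above can be tried outside the JVM. The following is a minimal standalone sketch, not code from the patch: `timed_trim`, `malloc_spike`, and `TrimResult` are hypothetical names chosen for illustration, and the only real API it relies on is glibc's `malloc_trim(3)`. It assumes Linux with glibc; on other C libraries it reports the trim as unsupported.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Outcome of one trim attempt (hypothetical type, for illustration only).
struct TrimResult {
  bool supported;  // false when not built against glibc
  bool released;   // true when malloc_trim reported that memory was given back
  double millis;   // wall-clock duration of the malloc_trim call
};

// Calls malloc_trim(0) once and measures how long the call takes.
TrimResult timed_trim() {
  TrimResult r{false, false, 0.0};
#if defined(__GLIBC__)
  r.supported = true;
  auto start = std::chrono::steady_clock::now();
  int rc = malloc_trim(0);  // returns 1 if memory was released, 0 otherwise
  auto end = std::chrono::steady_clock::now();
  r.released = (rc == 1);
  r.millis = std::chrono::duration<double, std::milli>(end - start).count();
#endif
  return r;
}

// Produces a malloc spike: allocates and touches many blocks, then frees
// them all, which may leave retained free memory for a later trim to return.
void malloc_spike(std::size_t blocks, std::size_t block_size) {
  assert(block_size > 0);
  std::vector<void*> ptrs;
  ptrs.reserve(blocks);
  for (std::size_t i = 0; i < blocks; i++) {
    void* p = std::malloc(block_size);
    if (p != nullptr) {
      std::memset(p, 1, block_size);  // touch the memory so it becomes resident
      ptrs.push_back(p);
    }
  }
  for (void* p : ptrs) {
    std::free(p);
  }
}
```

Calling `malloc_spike` with increasing block counts and then `timed_trim` gives a rough feel for how trim duration grows with the amount of retained memory; absolute numbers will differ from the ones quoted above and depend on machine and glibc version.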
> > > - [1] https://mail.openjdk.org/pipermail/hotspot-dev/2021-August/054323.html > - [2] https://bugs.openjdk.org/browse/JDK-8269345 Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp - wip ------------- Changes: - all: https://git.openjdk.org/jdk/pull/10085/files - new: https://git.openjdk.org/jdk/pull/10085/files/599573d9..c765621b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=06-07 Stats: 235 lines in 1 file changed: 128 ins; 67 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/10085.diff Fetch: git fetch https://git.openjdk.org/jdk pull/10085/head:pull/10085 PR: https://git.openjdk.org/jdk/pull/10085 From duke at openjdk.org Wed Feb 1 19:07:17 2023 From: duke at openjdk.org (Scott Gibbons) Date: Wed, 1 Feb 2023 19:07:17 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v8] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ? 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ? 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ? 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ? 
24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Change break-even buffer size for AVX512 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/4007e984..80fbf4ef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From iklam at openjdk.org Wed Feb 1 19:11:41 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 1 Feb 2023 19:11:41 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor Message-ID: To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). I also had to change `CodeBuffer::SharedTrampolineRequests` because it's `V` type is `LinkListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++-20. 
See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) Testing with tier1~4 + builds-tier5 ------------- Commit messages: - fixed aarch64 and risvc64 - fixed typo - Fixed ClassLoaderStatsClosure::StatsTable - 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor Changes: https://git.openjdk.org/jdk/pull/12355/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12355&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301564 Stats: 22 lines in 6 files changed: 14 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12355.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12355/head:pull/12355 PR: https://git.openjdk.org/jdk/pull/12355 From coleenp at openjdk.org Wed Feb 1 19:42:14 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 1 Feb 2023 19:42:14 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 23:23:16 GMT, Ioi Lam wrote: > To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. > > The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). > > I also had to change `CodeBuffer::SharedTrampolineRequests` because it's `V` type is `LinkListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++-20. See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) > > Testing with tier1~4 + builds-tier5 Looks good. 
Glad there weren't more hashtables breaking this assumption. src/hotspot/share/classfile/classLoaderStats.hpp line 115: > 113: > 114: typedef ResourceHashtable 115: 256, AnyObj::C_HEAP, mtInternal, There's an mtStatistics NMT category, can you use that for this? and the instance below? ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12355 From redestad at openjdk.org Wed Feb 1 20:57:05 2023 From: redestad at openjdk.org (Claes Redestad) Date: Wed, 1 Feb 2023 20:57:05 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Wed, 1 Feb 2023 18:28:25 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Handle AVX2 URL; address review comments src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: > 2200: } > 2201: > 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL decoding alike.
------------- PR: https://git.openjdk.org/jdk/pull/12126 From duke at openjdk.org Wed Feb 1 21:02:31 2023 From: duke at openjdk.org (Scott Gibbons) Date: Wed, 1 Feb 2023 21:02:31 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Wed, 1 Feb 2023 20:53:54 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Handle AVX2 URL; address review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: > >> 2200: } >> 2201: >> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { > > Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL decoding alike. These tables are used for both URL and non-URL based on the parameter, and they are only two of the three lut tables used (the other is in `base64_AVX2_decode_tables_addr` ). Both names are essentially incorrect. Does the name really matter that much? It's the same as `base64_AVX2_decode_tables_addr` with the addition of URL tables. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From duke at openjdk.org Wed Feb 1 21:02:38 2023 From: duke at openjdk.org (Scott Gibbons) Date: Wed, 1 Feb 2023 21:02:38 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v6] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <0oERFNuGS7s0R-naHhheLiPGTQvo7jHbTtR47LGOSE8=.1401d6bb-7095-425d-ad45-83fbed127c3c@github.com> Message-ID: On Fri, 27 Jan 2023 21:45:37 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 13 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into Base64-AVX2 >> - Merge branch 'openjdk:master' into Base64-AVX2 >> - Merge branch 'openjdk:master' into Base64-AVX2 >> - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2 >> - Merge branch 'openjdk:master' into Base64-AVX2 >> - Address review comment >> - Remove whitespace >> - Fix wrong register usage >> - Working version of Base64 decode with AVX2 (4x perf improvement). No URL support >> - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2 >> - ... and 3 more: https://git.openjdk.org/jdk/compare/52737e0d...3e66f7be > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2651: > >> 2649: __ jcc(Assembler::notZero, L_tailProc); >> 2650: >> 2651: __ cmpl(length, 44); > > Perform `length` checks first to avoid unnecessary branches on small inputs? > > Ideal might be to move this length check up just before the `_cmpl(length, 128)` in the AVX-512 block, so that if `AVX=3` short inputs branch directly to the scalar tail procedure without jumping around. This might also apply to the encode stub, though that's pre-existing. Done. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jvernee at openjdk.org Wed Feb 1 21:18:59 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 1 Feb 2023 21:18:59 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor In-Reply-To: References: Message-ID: <1mPho2AFEhUa0e9cSa-jHAN6ky3cU2UHXpXR12ACqKE=.a38a8cc3-8c61-4433-a4c1-c60b711a28ba@github.com> On Tue, 31 Jan 2023 23:23:16 GMT, Ioi Lam wrote: > To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. 
> > The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). > > I also had to change `CodeBuffer::SharedTrampolineRequests` because it's `V` type is `LinkListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++-20. See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) > > Testing with tier1~4 + builds-tier5 This looks good, thanks. I'll put this here for posterity: we discussed changing `ResourceHashtable` to always call the destructor of the key and value (sketch [here](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:Always_Dtor)). That would allow using `K` and `V` with non-trivial destructors as well. But, this discussion didn't reach a conclusion yet (in particular, there were some doubts expressed over whether a container should be calling destructors at all). In the mean time, this patch improves upon the status quo. ------------- Marked as reviewed by jvernee (Reviewer). 
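The rule discussed in this thread can be illustrated with a small standalone sketch. It is not the HotSpot code: `SketchTable`, `AllocKind`, `table_types_ok`, and `WithDtor` are hypothetical names, and the real `ResourceHashtableBase` has more template parameters. The sketch only assumes the idea stated above, namely that key and value types must be trivially destructible unless the table is C-heap allocated.

```cpp
#include <type_traits>

// Hypothetical stand-in for the allocation-type template argument.
enum class AllocKind { ResourceArea, CHeap };

// Compile-time predicate expressing the rule: any K and V are fine for
// C-heap tables; otherwise both must have trivial destructors, because
// nobody will run those destructors when the table's memory goes away.
template <typename K, typename V, AllocKind KIND>
constexpr bool table_types_ok() {
  return KIND == AllocKind::CHeap ||
         (std::is_trivially_destructible<K>::value &&
          std::is_trivially_destructible<V>::value);
}

// Minimal table shell that enforces the rule with a static assert.
template <typename K, typename V, AllocKind KIND>
class SketchTable {
  static_assert(table_types_ok<K, V, KIND>(),
                "non-C-heap table requires trivially destructible K and V");
};

// A user-provided destructor is non-trivial even when its body is empty.
struct WithDtor {
  ~WithDtor() {}
};

SketchTable<int, long, AllocKind::ResourceArea> ok_resource_table;
SketchTable<int, WithDtor, AllocKind::CHeap> ok_c_heap_table;
// SketchTable<int, WithDtor, AllocKind::ResourceArea> rejected;  // would not compile
```

The `WithDtor` case mirrors the situation described for the trampoline-request table: a destructor that does nothing still counts as non-trivial, so such a type is rejected for non-C-heap tables until the destructor is removed or defaulted.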
PR: https://git.openjdk.org/jdk/pull/12355 From dholmes at openjdk.org Wed Feb 1 22:02:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 22:02:27 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> Message-ID: <2hTfLN6mLaACw972Oj7lmYRQOiu1gU8OyQ5lXoptTGU=.d9ec8d1d-0eef-46c2-b3bc-c20267dcc270@github.com> On Wed, 1 Feb 2023 13:09:19 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations. >> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do not scan the product_name file twice src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 596: > 594: int resp = check_info_file(pname_file, "KVM", "VMWare"); > 595: if (resp == 1) { > 596: Abstract_VM_Version::_detected_virtualization = KVM; This may be what Christoph suggested, but why not keep this in check_info_file so you don't need to map return codes? check_info_file could just be a bool function then. ------------- PR: https://git.openjdk.org/jdk/pull/12217 From sjohanss at openjdk.org Wed Feb 1 22:07:19 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Wed, 1 Feb 2023 22:07:19 GMT Subject: RFR: 8301641: NativeMemoryUsageTotal event uses reserved value for committed field Message-ID: Please review this small fix for the NativeMemoryUsageTotal JFR event.
**Summary** The NativeMemoryUsageTotal event uses the reserved value for both reserved and committed. **Testing** Added new test and verified that it works locally. Now running it in mach5 to make sure it doesn't see any unexpected issues there. ------------- Commit messages: - Update test comment - Add test - 8301641: NativeMemoryUsageTotal event uses reserved value for committed field Changes: https://git.openjdk.org/jdk/pull/12377/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12377&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301641 Stats: 13 lines in 2 files changed: 12 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12377.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12377/head:pull/12377 PR: https://git.openjdk.org/jdk/pull/12377 From dholmes at openjdk.org Wed Feb 1 22:10:23 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 22:10:23 GMT Subject: RFR: JDK-8301480: Replace NULL with nullptr in os/posix In-Reply-To: References: Message-ID: <-MMDkiHj6LQwNy4i4DOdjHH9OfyQlI47_u4Ey7ElBNw=.7ecb7c52-2800-496c-945f-ac76982efa30@github.com> On Tue, 31 Jan 2023 10:16:00 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/posix. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
> > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Generally good but one bug. Thanks src/hotspot/os/posix/signals_posix.cpp line 1347: > 1345: EXC_MASK_BAD_ACCESS | EXC_MASK_ARITHMETIC > 1346: AARCH64_ONLY(| EXC_MASK_BAD_INSTRUCTION), > 1347: MACH_PORT_nullptr, This is wrong. Needs reverting. (How did this build on macOS?) ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12316 From dholmes at openjdk.org Wed Feb 1 22:35:26 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 22:35:26 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 10:16:14 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
> > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Looks good! Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12317 From mseledtsov at openjdk.org Wed Feb 1 22:58:46 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 22:58:46 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v5] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: > Add log() and logToFile() methods to VMProps for diagnostic purposes. Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Addressing review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/b1aa1c4b..ceb0ed65 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=03-04 Stats: 6 lines in 1 file changed: 3 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Wed Feb 1 23:03:26 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 23:03:26 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v4] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 04:42:18 GMT, David Holmes wrote: 
>> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Review feedback: printing stdout and stderr files to log() > > test/jtreg-ext/requires/VMProps.java line 519: > >> 517: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { >> 518: if (!Boolean.getBoolean("jtreg.log.vmprops")) { >> 519: return ""; > > I think having to explicitly enable this somewhat defeats the purpose of using it for diagnostics. If we get an unexpected failure we will have to reproduce it with logging enabled to see what went wrong. If I remove this conditional guard then we will see extra output at the beginning of test run which might be undesirable. For instance: jtreg -v1 -conc:4 -Djtreg.log.vmprops=true open/test/hotspot/jtreg/containers/docker/ VMProps: Entering call() VMProps: Entering dockerSupport() VMProps: dockerSupport(): platform check: isSupported = false VMProps: dockerSupport(): returning isSupported = false VMProps: Leaving call() Test results: no tests selected I am not sure what is the best thing to do here, so I stayed on a "safe" side and added this conditional guard. The model in my mind is that such logging is normally off but can be turned on in case the issue is suspected. However if you believe it should be always on I can remove the guard. ------------- PR: https://git.openjdk.org/jdk/pull/12239 From dcubed at openjdk.org Wed Feb 1 23:21:25 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Wed, 1 Feb 2023 23:21:25 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v4] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 22:59:47 GMT, Mikhailo Seledtsov wrote: >> test/jtreg-ext/requires/VMProps.java line 519: >> >>> 517: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { >>> 518: if (!Boolean.getBoolean("jtreg.log.vmprops")) { >>> 519: return ""; >> >> I think having to explicitly enable this somewhat defeats the purpose of using it for diagnostics. If we get an unexpected failure we will have to reproduce it with logging enabled to see what went wrong. > > If I remove this conditional guard then we will see extra output at the beginning of test run which might be undesirable. For instance: > > jtreg -v1 -conc:4 -Djtreg.log.vmprops=true open/test/hotspot/jtreg/containers/docker/ > > VMProps: Entering call() > VMProps: Entering dockerSupport() > VMProps: dockerSupport(): platform check: isSupported = false > VMProps: dockerSupport(): returning isSupported = false > VMProps: Leaving call() > Test results: no tests selected > > > I am not sure what is the best thing to do here, so I stayed on a "safe" side and added this conditional guard. The model in my mind is that such logging is normally off but can be turned on in case the issue is suspected. However if you believe it should be always on I can remove the guard. The failure happens so rarely I think you need to leave the option on in order to get any useful information about this issue when it occurs in Mach5. If a sighting happens in Mach5, I don't think rerunning another task on the same machine will necessarily fail so you need to get the information from the failures that do occur. 
------------- PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Wed Feb 1 23:53:24 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 23:53:24 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v4] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 23:18:59 GMT, Daniel D. Daugherty wrote: >> If I remove this conditional guard then we will see extra output at the beginning of test run which might be undesirable. For instance: >> >> jtreg -v1 -conc:4 -Djtreg.log.vmprops=true open/test/hotspot/jtreg/containers/docker/ >> >> VMProps: Entering call() >> VMProps: Entering dockerSupport() >> VMProps: dockerSupport(): platform check: isSupported = false >> VMProps: dockerSupport(): returning isSupported = false >> VMProps: Leaving call() >> Test results: no tests selected >> >> >> I am not sure what is the best thing to do here, so I stayed on a "safe" side and added this conditional guard. The model in my mind is that such logging is normally off but can be turned on in case the issue is suspected. However if you believe it should be always on I can remove the guard. > > The failure happens so rarely I think you need to leave the option on in order to get > any useful information about this issue when it occurs in Mach5. If a sighting happens > in Mach5, I don't think rerunning another task on the same machine will necessarily > fail so you need to get the information from the failures that do occur. Thanks Dan and David. I will remove the condition then, and have this logging always on. 
------------- PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Wed Feb 1 23:55:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 1 Feb 2023 23:55:27 GMT Subject: RFR: JDK-8301494: Replace NULL with nullptr in cpu/arm In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:39:38 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Looks good. Thanks. src/hotspot/cpu/arm/interp_masm_arm.cpp line 63: > 61: ldr(Rtemp, Address(FP, frame::interpreter_frame_last_sp_offset * wordSize)); > 62: cbz(Rtemp, L); > 63: stop("InterpreterMacroAssembler::call_VM_helper: last_sp != null"); nullptr ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12322 From mseledtsov at openjdk.org Wed Feb 1 23:59:00 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 1 Feb 2023 23:59:00 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v6] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: > Add log() and logToFile() methods to VMProps for diagnostic purposes. Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Removed logging condition as per review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/ceb0ed65..1eef0f6f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=04-05 Stats: 6 lines in 1 file changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Thu Feb 2 00:01:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 00:01:24 GMT Subject: RFR: JDK-8301499: Replace NULL with nullptr in cpu/zero In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:29 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/zero. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. 
Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Looks good. Thanks. src/hotspot/cpu/zero/javaFrameAnchor_zero.hpp line 38: > 36: // 2 - saving a current state (javaCalls) > 37: // 3 - restoring an old state (javaCalls) > 38: // Note that whenever _last_Java_sp != null other anchor fields nullptr ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12327 From redestad at openjdk.org Thu Feb 2 00:37:29 2023 From: redestad at openjdk.org (Claes Redestad) Date: Thu, 2 Feb 2023 00:37:29 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Wed, 1 Feb 2023 20:59:24 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: >> >>> 2200: } >>> 2201: >>> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { >> >> Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL decoding alike. > > These tables are used for both URL and non-URL based on the parameter, and they are only two of the three lut tables used (the other is in `base64_AVX2_decode_tables_addr` ). Both names are essentially incorrect. Does the name really matter that much? It's the same as `base64_AVX2_decode_tables_addr` with the addition of URL tables. Names are important, but always hard to get right. At the very least they need to be correct. 
Maybe call it something like `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? ------------- PR: https://git.openjdk.org/jdk/pull/12126 From yyang at openjdk.org Thu Feb 2 02:19:18 2023 From: yyang at openjdk.org (Yi Yang) Date: Thu, 2 Feb 2023 02:19:18 GMT Subject: RFR: 8298091: Dump native instruction along with nmethod name when using Compiler.codelist Message-ID: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> Dump native instruction along with nmethod name when using Compiler.codelist. e.g. jcmd Compiler.codelist decode= Output: nmethod_name [addr_begin, code_begin, code_end] ------------- Commit messages: - 8298091: Dump native instruction along with nmethod name when using Compiler.codelist Changes: https://git.openjdk.org/jdk/pull/12381/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12381&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8298091 Stats: 32 lines in 7 files changed: 21 ins; 3 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12381.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12381/head:pull/12381 PR: https://git.openjdk.org/jdk/pull/12381 From mseledtsov at openjdk.org Thu Feb 2 04:00:23 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Thu, 2 Feb 2023 04:00:23 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v6] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 23:59:00 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Removed logging condition as per review feedback Alternatively we could keep the conditional guard using the property -Djtreg.log.vmprops. 
The users will enable this property in the test execution scripts for a test set of interest, such as container testing. ------------- PR: https://git.openjdk.org/jdk/pull/12239 From david.holmes at oracle.com Thu Feb 2 04:30:22 2023 From: david.holmes at oracle.com (David Holmes) Date: Thu, 2 Feb 2023 14:30:22 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> Message-ID: On 1/02/2023 7:48 pm, Thomas Stüfe wrote: > On Wed, Feb 1, 2023 at 5:24 AM David Holmes > wrote: > > On 1/02/2023 5:23 am, Ioi Lam wrote: > > CC-ing hotspot-dev for wider discussion. > > > > On 1/31/2023 12:43 AM, Thomas Stüfe wrote: > >> Hi Ioi, > >> > >> a short question. You did the JVM flag handling revamp, right? > >> > >> I recently ran into a weird problem where FLAG_SET_ERGO would not > >> work, turned out it ends up using the constraint function, which > >> silently refused to set the argument if the argument violates the > >> constraint. > >> > >> That makes perfect sense, I had an error in my code. But if we > set a > >> flag with ERGO origin, we set it in the JVM, and the JVM should > adhere > >> to constraints, not doing so is a programming error and I would > expect > >> an assert here. The same logic may be applied to other origins too, > >> e.g. DEFAULT. > >> Do you think this makes sense? > >> > >> BTW I added debug output (setting verbose to true for ERGO), and > I see > >> other flag's ergonomics also getting ignored, which would have > to be > >> fixed too. > >> > >> Cheers, Thomas > >> > > > > > > This problem is also reported in > > https://bugs.openjdk.org/browse/JDK-8224980 > > > > > There are proposals for making the debug message more obvious, > and/or > > assert(). > > > > Currently we don't have documentation of what the caller of > > FLAG_SET_ERGO is supposed to do. > > > > There are 3 ways to set a flag.
The current behavior of the > following > two ways seems reasonable > > > > FLAG_SET_CMDLINE - rejects user input and exit VM > > FLAG_SET_MGMT - return JVMFlag::Error > > > > With command-line, the user should inspect the limits for the > current > > environment and restart the app with an appropriate value > > > > With FLAG_SET_MGMT, I suppose it's similar -- whoever is using the > > management API should detect the error and adjust their parameter > values > > (by hand, or programmatically) > > > > However, FLAG_SET_ERGO also returns JVMFlag::Error, but I don't > see any > > of our code checking for the return value. So the unwritten > convention > > is that the call must not fail. > > However, if the specified value is out of range, should we abort > the VM, > > ignore the value, or cap the value? E.g., if the range is 0~100, > but you > > call SET_ERGO with 101, should we cap the value to 100? > > Generally if our code tries to sets a flags value to something invalid > that is a programming error that should be reported right away. If the > value is a "constant" then we should just assert that setting the value > worked. If the value is determined by external factors that might vary > at runtime (so no guarantee the end result will be valid) then we > should > have logic that can correct for the invalid value (or we should change > the way we do the calculation). > > > Agreed, but how do you want to tell them apart? That would have to be determined case-by-case. For each ergo flag we would have to decide what value it is calculating and based on what, and determine whether an out-of-range value is an error or something to be wary of. > > > Perhaps capping would make the ergo code easier to write, > especially if > > the constraint function produces a dynamic range (depending on the > > machine's RAM size, for example).
Without capping, you will need to > > duplicate (or share) the range detection logic between the ergo > setter > > and the constraint checker. > > > I like capping as a pragmatic solution. > > The only problem I see is for "special" values that are outside the > allowed range but still hold information. > > For example, NonNMethodCodeHeapSize is set to 0 via FLAG_SET_ERGO > (SegmentedCodeCache off), which gets ignored since it violates > VMPageSizeConstraintFunc (valid range starts at 4096). > > Though, while typing, I realized that this maybe is a simple error, and > the answer would be to either allow 0 as a value in the constraint > function or to not use this constraint function. Yeah that seems like a flat-out bug in the code. Cheers, David > > Thanks, Thomas From dholmes at openjdk.org Thu Feb 2 04:45:22 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 04:45:22 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v6] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 23:59:00 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Removed logging condition as per review feedback Okay I'm struggling with the best way to handle this. I think the problem is that the code currently contains logging output of different levels (ref unified logging in the VM) but we only have a binary on/off selection mechanism. I don't want to see all the logging all the time - too noisy. But I do want to see the logging of the stdout/stderr from the exec'd process any time it fails. But I guess we could keep it all conditional on the property and have the task definition for the container tests set the property as suggested. 
------------- PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Thu Feb 2 05:14:34 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 05:14:34 GMT Subject: RFR: 8298091: Dump native instruction along with nmethod name when using Compiler.codelist In-Reply-To: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> References: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> Message-ID: On Thu, 2 Feb 2023 02:12:06 GMT, Yi Yang wrote: > Dump native instruction along with nmethod name when using Compiler.codelist. > > e.g. > jcmd Compiler.codelist decode= > > Output: > nmethod_name [addr_begin, code_begin, code_end] > I'm not clear exactly how this is intended to be used, and I think there should be usage information to update somewhere. Further coding suggestions below. I'm still mulling this over. Note there is a very high bar for defining new manageable flags (CSR request required) and I'm not sure this use case warrants it. Why do we need to force/retry the load of the disassembler library if it previously failed to load? Why do you think it will load successfully on subsequent attempts? src/hotspot/share/compiler/disassembler.cpp line 773: > 771: // Do not try to load multiple times. Failed once -> fails always. > 772: // To force retry in debugger: assign _tried_to_load_library=0 or > 773: // turn on ForceLoadDisassembler in the fly "on the fly" src/hotspot/share/compiler/disassembler.cpp line 774: > 772: // To force retry in debugger: assign _tried_to_load_library=0 or > 773: // turn on ForceLoadDisassembler in the fly > 774: if (!ForceLoadDisassembler && _tried_to_load_library) { This doesn't seem quite right. If the library has already been loaded and is usable then you don't need to "reload" it even if ForceLoadDisassembler is true.
I would expect: if (_tried_to_load_library) { if (_library_usable) { return true; } else if (!RetryLoadDisassembler) { return false; } } src/hotspot/share/runtime/globals.hpp line 651: > 649: \ > 650: product(bool, ForceLoadDisassembler, false, MANAGEABLE, \ > 651: "Always loading hotspot disassembler") \ That doesn't quite describe what the flag does. I would suggest calling it `RetryLoadDisassembler` with a description: "Always try to reload the disassembler, even if a previous load failed" ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12381 From iklam at openjdk.org Thu Feb 2 05:14:46 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 2 Feb 2023 05:14:46 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor [v2] In-Reply-To: References: Message-ID: > To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. > > The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). > > I also had to change `CodeBuffer::SharedTrampolineRequests` because its `V` type is `LinkedListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++20.
See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) > > Testing with tier1~4 + builds-tier5 Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @coleenp comments - use mtStatistics instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12355/files - new: https://git.openjdk.org/jdk/pull/12355/files/e3ccac72..915f058b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12355&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12355&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12355.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12355/head:pull/12355 PR: https://git.openjdk.org/jdk/pull/12355 From iklam at openjdk.org Thu Feb 2 05:14:47 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 2 Feb 2023 05:14:47 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor [v2] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 19:38:38 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @coleenp comments - use mtStatistics instead > > src/hotspot/share/classfile/classLoaderStats.hpp line 115: > >> 113: >> 114: typedef ResourceHashtable> 115: 256, AnyObj::C_HEAP, mtInternal, > > There's an mtStatistics NMT category, can you use that for this? and the instance below? Fixed. 
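As an illustration of the compile-time check described in the PR summary above, here is a minimal sketch. It is not the code from the pull request: `ToyTable`, `AllocKind`, `Plain` and `Owning` are hypothetical names, and the sketch assumes that entries kept outside the C heap are discarded without their destructors being run.

```cpp
#include <type_traits>

// Hypothetical stand-in for the table's allocation kinds.
enum class AllocKind { C_HEAP, RESOURCE_AREA };

template <typename K, typename V, AllocKind ALLOC>
class ToyTable {
  // Assumption of this sketch: storage that is not C-heap allocated is
  // released in bulk, so destructors of K and V would never be called.
  // Only trivially destructible types are therefore accepted there.
  static_assert(ALLOC == AllocKind::C_HEAP ||
                    std::is_trivially_destructible<K>::value,
                "key needs a trivial destructor unless C_HEAP allocated");
  static_assert(ALLOC == AllocKind::C_HEAP ||
                    std::is_trivially_destructible<V>::value,
                "value needs a trivial destructor unless C_HEAP allocated");

 public:
  // Touching this member forces instantiation and thereby the checks above.
  static constexpr bool ok = true;
};

// A plain aggregate: trivially destructible.
struct Plain { int x; };

// A user-provided destructor makes the type non-trivially destructible,
// even when the destructor body is empty.
struct Owning { ~Owning() {} };
```

With these definitions, `ToyTable<int, Plain, AllocKind::RESOURCE_AREA>` and `ToyTable<int, Owning, AllocKind::C_HEAP>` compile, while `ToyTable<int, Owning, AllocKind::RESOURCE_AREA>` is rejected at compile time.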
------------- PR: https://git.openjdk.org/jdk/pull/12355 From thomas.stuefe at gmail.com Thu Feb 2 05:25:56 2023 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Thu, 2 Feb 2023 06:25:56 +0100 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> Message-ID: On Thu, Feb 2, 2023 at 5:30 AM David Holmes wrote: > > On 1/02/2023 7:48 pm, Thomas Stüfe wrote: > > On Wed, Feb 1, 2023 at 5:24 AM David Holmes > > wrote: > > > > On 1/02/2023 5:23 am, Ioi Lam wrote: > > > CC-ing hotspot-dev for wider discussion. > > > > > > On 1/31/2023 12:43 AM, Thomas Stüfe wrote: > > >> Hi Ioi, > > >> > > >> a short question. You did the JVM flag handling revamp, right? > > >> > > >> I recently ran into a weird problem where FLAG_SET_ERGO would not > > >> work, turned out it ends up using the constraint function, which > > >> silently refused to set the argument if the argument violates the > > >> constraint. > > >> > > >> That makes perfect sense, I had an error in my code. But if we > > set a > > >> flag with ERGO origin, we set it in the JVM, and the JVM should > > adhere > > >> to constraints, not doing so is a programming error and I would > > expect > > >> an assert here. The same logic may be applied to other origins > too, > > >> e.g. DEFAULT. > > >> Do you think this makes sense? > > >> > > >> BTW I added debug output (setting verbose to true for ERGO), and > > I see > > >> other flag's ergonomics also getting ignored, which would have > > to be > > >> fixed too. > > >> > > >> Cheers, Thomas > > >> > > > > > > > > > This problem is also reported in > > > https://bugs.openjdk.org/browse/JDK-8224980 > > > > > > > > There are proposals for making the debug message more obvious, > > and/or > > > assert(). > > > > > > Currently we don't have documentation of what the caller of > > > FLAG_SET_ERGO is supposed to do.
> > > > > > There are 3 ways to set a flag. The current behavior of the > > following > > > two ways seems reasonable > > > > > > FLAG_SET_CMDLINE - rejects user input and exit VM > > > FLAG_SET_MGMT - return JVMFlag::Error > > > > > > With command-line, the user should inspect the limits for the > > current > > > environment and restart the app with an appropriate value > > > > > > With FLAG_SET_MGMT, I suppose it's similar -- whoever is using the > > > management API should detect the error and adjust their parameter > > values > > > (by hand, or programmatically) > > > > > > However, FLAG_SET_ERGO also returns JVMFlag::Error, but I don't > > see any > > > of our code checking for the return value. So the unwritten > > convention > > > is that the call must not fail. > > > However, if the specified value is out of range, should we abort > > the VM, > > > ignore the value, or cap the value? E.g., if the range is 0~100, > > but you > > > call SET_ERGO with 101, should we cap the value to 100? > > > > Generally if our code tries to sets a flags value to something > invalid > > that is a programming error that should be reported right away. If > the > > value is a "constant" then we should just assert that setting the > value > > worked. If the value is determined by external factors that might > vary > > at runtime (so no guarantee the end result will be valid) then we > > should > > have logic that can correct for the invalid value (or we should > change > > the way we do the calculation). > > > > > > Agreed, but how do you want to tell them apart? > > That would have to be determined case-by-case. For each ergo flag we > would have to decide what value it is calculating and based on what, and > determine whether an out-of-range value is an error or something to be > wary of. > Sure, what I meant was how to control it? Two sets of FLAT_SET_... macros, one that asserts on constraint violation, one that caps? 
Cheers, Thomas From dholmes at openjdk.org Thu Feb 2 05:27:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 05:27:24 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 10:34:36 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? > > The change is fairly straightforward except maybe for the following: > * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. > * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). > > Testing: gha, local perf testing, tier1 > > Thanks, > Thomas src/hotspot/share/runtime/threadSMR.hpp line 204: > 202: JavaThread *const thread_at(uint i) const { return _threads[i]; } > 203: > 204: JavaThread *const *threads() const { return _threads; } Seems an unrelated change. ?? ------------- PR: https://git.openjdk.org/jdk/pull/12360 From dholmes at openjdk.org Thu Feb 2 06:13:29 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 06:13:29 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 15:26:27 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far.
It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled. This is necessary to ensure all pointers to memory which could possible container pointers to malloc memory are property aligned, which is an LSan requirement. >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. >> - There are a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. In the future ideally these tests would be skipped nicely when LSan is enabled. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Support CDS > > Signed-off-by: Justin King make/data/lsan/lsan_default_options.c line 49: > 47: // extremely early during library loading, before main is called. We need to override the default > 48: // options because LSan will perform leak checking at program exit. 
Unfortunately Hotspot does not > 49: // shutdown cleaning at the moment and some leaks occur, we want to ignore these. Instead we s/cleaning/cleanly/ src/hotspot/share/runtime/arguments.cpp line 3979: > 3977: // LSAN relies on pointers to be natively aligned. Using compressed class pointers breaks this > 3978: // expectation and results in nondeterministic leak reports. > 3979: if (FLAG_SET_CMDLINE(UseCompressedOops, false) != JVMFlag::SUCCESS) { This positioning doesn't look right. There are a number of inter-dependencies on the compressed oops flags and I don't think you can just turn them off here. Surely `Arguments::set_use_compressed_oops()` is where this needs to be done? src/hotspot/share/runtime/init.cpp line 132: > 130: VirtualSpaceSummary summary = Universe::heap()->create_heap_space_summary(); > 131: LSAN_REGISTER_ROOT_REGION(summary.start(), summary.reserved_size()); > 132: } Why is this unconditional? I expected it to be in `#ifdef LEAK_SANITIZER` src/hotspot/share/runtime/init.cpp line 198: > 196: VirtualSpaceSummary summary = Universe::heap()->create_heap_space_summary(); > 197: LSAN_UNREGISTER_ROOT_REGION(summary.start(), summary.reserved_size()); > 198: } Again Why is this unconditional? I expected it to be in `#ifdef LEAK_SANITIZER` src/hotspot/share/runtime/java.cpp line 432: > 430: static_cast(result); > 431: } > 432: Why is this unconditional? 
I expected it to be in `#ifdef LEAK_SANITIZER` ------------- PR: https://git.openjdk.org/jdk/pull/12229 From dholmes at openjdk.org Thu Feb 2 06:13:30 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 2 Feb 2023 06:13:30 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: <6xcgr1n9j4B5--0YAAqKybMi2Rd4W_OOv0nvQwDWhcY=.b99607ae-d25a-4429-bc73-27ccf5a82ede@github.com> On Thu, 2 Feb 2023 06:01:04 GMT, David Holmes wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Support CDS >> >> Signed-off-by: Justin King > > src/hotspot/share/runtime/arguments.cpp line 3979: > >> 3977: // LSAN relies on pointers to be natively aligned. Using compressed class pointers breaks this >> 3978: // expectation and results in nondeterministic leak reports. >> 3979: if (FLAG_SET_CMDLINE(UseCompressedOops, false) != JVMFlag::SUCCESS) { > > This positioning doesn't look right. There are a number of inter-dependencies on the compressed oops flags and I don't think you can just turn them off here. Surely `Arguments::set_use_compressed_oops()` is where this needs to be done? Not sure FLAG_SET_CMDLINE is right here either (but flag setting is very confusing). ------------- PR: https://git.openjdk.org/jdk/pull/12229 From stuefe at openjdk.org Thu Feb 2 06:24:24 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 2 Feb 2023 06:24:24 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 15:26:27 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. 
It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled. This is necessary to ensure all pointers to memory which could possible container pointers to malloc memory are property aligned, which is an LSan requirement. >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. >> - There are a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. In the future ideally these tests would be skipped nicely when LSan is enabled. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Support CDS > > Signed-off-by: Justin King I think this is very useful, but the reliance on -UseCompressedOops and -UseCompressedClassPointers is unfortunate. Especially the latter. With Project Lilliput, we plan to eliminate uncompressed class pointers altogether (there is really no good reason to not have them even in today's JVM unless you plan to load more than ~4 million classes).
------------- PR: https://git.openjdk.org/jdk/pull/12229 From stuefe at openjdk.org Thu Feb 2 06:37:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 2 Feb 2023 06:37:23 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> On Wed, 1 Feb 2023 15:26:27 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled. This is necessary to ensure all pointers to memory which could possible container pointers to malloc memory are property aligned, which is an LSan requirement. >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. >> - There are a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. 
In the future ideally these tests would be skipped nicely when LSan is enabled. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Support CDS > > Signed-off-by: Justin King Thinking this through some more: Why would UseCompressedClassPointers cause leak analysis to fail? Because you track Metaspace? If so, we should think about whether it makes sense to track Metaspace. Metaspace memory is tied to class loaders. Class loaders may live forever, until VM death; in that case Metaspace will not be released. How exactly does the leak profiler deal with that? Does it report those cases as leaks? Metaspace "leaks" exist, but these are typically class loader leaks at the application level, and those you cannot recognize from the JVM. You need application knowledge for that. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From david.holmes at oracle.com Thu Feb 2 06:37:50 2023 From: david.holmes at oracle.com (David Holmes) Date: Thu, 2 Feb 2023 16:37:50 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> Message-ID: On 2/02/2023 3:25 pm, Thomas Stüfe wrote: > On Thu, Feb 2, 2023 at 5:30 AM David Holmes > wrote: > > That would have to be determined case-by-case. For each ergo flag we > would have to decide what value it is calculating and based on what, > and > determine whether an out-of-range value is an error or something to be > wary of. > > Sure, what I meant was how to control it? Two sets of FLAT_SET_... > macros, one that asserts on constraint violation, one that caps? No, I expect to just see the places where FLAG_SET_ERGO is called do the right thing by checking the return value and proceeding accordingly.
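The two call-site policies discussed in this thread (assert when the value is a programmer-chosen constant, correct the value when it depends on the environment) might look roughly like the toy model below. None of these names are HotSpot's real API: `FlagError`, `RangedFlag`, `set_ergo`, `set_ergo_constant` and `set_ergo_computed` are hypothetical stand-ins, and the setter merely imitates a constraint check that reports failure through its return value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result code, loosely modelled on the idea of JVMFlag::Error.
enum class FlagError { SUCCESS, OUT_OF_BOUNDS };

// Hypothetical flag with a valid range [min, max].
struct RangedFlag {
  uint64_t value;
  uint64_t min;
  uint64_t max;
};

// Toy setter: refuses out-of-range values and says so in its return value,
// leaving the old value in place.
inline FlagError set_ergo(RangedFlag& flag, uint64_t v) {
  if (v < flag.min || v > flag.max) {
    return FlagError::OUT_OF_BOUNDS;
  }
  flag.value = v;
  return FlagError::SUCCESS;
}

// Policy 1: the value is a constant picked by the programmer, so a rejected
// set is treated as a programming error and caught by an assert.
inline void set_ergo_constant(RangedFlag& flag, uint64_t v) {
  FlagError err = set_ergo(flag, v);
  assert(err == FlagError::SUCCESS && "ergonomic constant violates constraint");
  (void)err;  // keeps release builds free of unused-variable warnings
}

// Policy 2: the value was computed from external factors, so it is capped
// into the valid range before the set, which then cannot be rejected.
inline void set_ergo_computed(RangedFlag& flag, uint64_t v) {
  uint64_t capped = std::min(std::max(v, flag.min), flag.max);
  FlagError err = set_ergo(flag, capped);
  assert(err == FlagError::SUCCESS);
  (void)err;
}
```

Using the range from the thread (0~100): calling `set_ergo` with 101 leaves the flag unchanged and returns `OUT_OF_BOUNDS`, whereas `set_ergo_computed` with 101 stores 100. Which policy fits a given flag would still have to be decided case by case, as noted above.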
Cheers, David > > Cheers, Thomas From eosterlund at openjdk.org Thu Feb 2 07:04:24 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 2 Feb 2023 07:04:24 GMT Subject: RFR: 8301641: NativeMemoryUsageTotal event uses reserved value for committed field In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 22:00:47 GMT, Stefan Johansson wrote: > Please review this small fix for the NativeMemoryUsageTotal JFR event. > > **Summary** > The NativeMemoryUsageTotal event uses the reserved value for both reserved and committed. > > **Testing** > Added new test and verified that it works locally. Now running it in mach5 to make sure it doesn't see any unexpected issues there. Marked as reviewed by eosterlund (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12377 From mbaesken at openjdk.org Thu Feb 2 09:14:03 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 2 Feb 2023 09:14:03 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v3] In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. > more /sys/hypervisor/type > xen > seems to be available there. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Move more coding into check_info_file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12217/files - new: https://git.openjdk.org/jdk/pull/12217/files/3ef5c503..f4175a91 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12217&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12217&range=01-02 Stats: 22 lines in 2 files changed: 5 ins; 9 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12217.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12217/head:pull/12217 PR: https://git.openjdk.org/jdk/pull/12217 From mbaesken at openjdk.org Thu Feb 2 09:14:06 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 2 Feb 2023 09:14:06 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v2] In-Reply-To: <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> <2AzgpXplQYjUVCu3UqiVabBzV-a4a5sntGLL5CikDLA=.a7a80f58-4ca4-425b-afa5-9e82e81c3ee6@github.com> Message-ID: On Wed, 1 Feb 2023 13:09:19 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations. >> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do not scan the product_name file twice Not a big fan of varargs, and I think we do not need them here. But sure, we can move modifying Abstract_VM_Version::_detected_virtualization directly into check_info_file; I adjusted the coding.
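For readers following along, the check described in the quoted PR summary (looking for `xen` in `/sys/hypervisor/type`) could be sketched as below. This is not the code from the pull request and does not reproduce its `check_info_file`: the helpers `file_contains` and `detect_xen` are hypothetical, and the path is a parameter only so that the sketch can be exercised against a temporary file.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if any line of the file at `path`
// contains `needle`. A missing or unreadable file yields false.
inline bool file_contains(const std::string& path, const std::string& needle) {
  std::ifstream in(path);
  if (!in) {
    return false;
  }
  std::string line;
  while (std::getline(in, line)) {
    if (line.find(needle) != std::string::npos) {
      return true;
    }
  }
  return false;
}

// Hypothetical wrapper: reports Xen if the hypervisor type file mentions it.
// The default path is the location named in the PR summary.
inline bool detect_xen(const std::string& type_file = "/sys/hypervisor/type") {
  return file_contains(type_file, "xen");
}
```

On a machine without that file, `detect_xen()` simply returns false, which matches the idea of treating an absent info file as "not detected".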
------------- PR: https://git.openjdk.org/jdk/pull/12217 From tschatzl at openjdk.org Thu Feb 2 09:23:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 2 Feb 2023 09:23:29 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 05:24:53 GMT, David Holmes wrote: >> Hi all, >> >> can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? >> >> The change is fairly straightforward except maybe for the following: >> * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. >> * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). >> >> Testing: gha, local perf testing, tier1 >> >> Thanks, >> Thomas > > src/hotspot/share/runtime/threadSMR.hpp line 204: > >> 202: JavaThread *const thread_at(uint i) const { return _threads[i]; } >> 203: >> 204: JavaThread *const *threads() const { return _threads; } > > Seems an unrelated change. ?? No, the operations on the `JavaThread` during these phases change the `JavaThread` state (actually, the referenced TLAB state). E.g. 
`g1YoungGCPostEvacuateTasks.cpp:720`: void do_work(uint worker_id) override { JavaThread* const* list; uint count; while ((list = _claimer.claim(count)) != nullptr) { for (uint i = 0; i < count; i++) { list[i]->tlab().resize(); <--- here, the tlab needs to be modified (but not the JavaThread) } } } I could `const_cast` that one away, but that seemed really ugly to me. In a follow-up change there is a similar access to the threads' TLAB. Maybe there is a better way to do that (apart from `const_cast`ing). ------------- PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Thu Feb 2 09:27:52 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 2 Feb 2023 09:27:52 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: References: Message-ID: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> On Thu, 2 Feb 2023 09:20:18 GMT, Thomas Schatzl wrote: >> src/hotspot/share/runtime/threadSMR.hpp line 204: >> >>> 202: JavaThread *const thread_at(uint i) const { return _threads[i]; } >>> 203: >>> 204: JavaThread *const *threads() const { return _threads; } >> >> Seems an unrelated change. ?? > > No, the operations on the `JavaThread` during these phases change the `JavaThread` state (actually, the referenced TLAB state). > > E.g. `g1YoungGCPostEvacuateTasks.cpp:720`: > > void do_work(uint worker_id) override { > JavaThread* const* list; > uint count; > while ((list = _claimer.claim(count)) != nullptr) { > for (uint i = 0; i < count; i++) { > list[i]->tlab().resize(); <--- here, the tlab needs to be modified (but not the JavaThread) > } > } > } > > > I could `const_cast` that one away, but that seemed really ugly to me. > > In a follow-up change there is a similar access to the threads' TLAB. Maybe there is a better way to do that (apart from `const_cast`ing). 
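For readers less used to this declarator, a small sketch of what `JavaThread* const*` permits, using hypothetical stand-in types (`Thread`, `Tlab`) rather than the HotSpot classes. It assumes only what the quoted loop shows: the array slots must not be reassigned, but each pointed-to object must stay mutable so its TLAB can be resized.

```cpp
#include <cassert>

// Hypothetical stand-ins for JavaThread and its TLAB.
struct Tlab {
  int size = 1;
  void resize() { size *= 2; }
};

struct Thread {
  Tlab _tlab;
  Tlab& tlab() { return _tlab; }  // non-const accessor, as the loop needs
};

// `Thread* const*` is a pointer to const pointers to mutable Threads:
// the slots are read-only, the objects they point to are not.
static void resize_all(Thread* const* list, unsigned count) {
  for (unsigned i = 0; i < count; i++) {
    list[i]->tlab().resize();  // fine: the pointee is not const
    // list[i] = nullptr;      // would not compile: the slot is const
  }
}

// With `const Thread* const*` instead, the call to tlab() above would be
// rejected, because tlab() is a non-const member function.
```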
To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { count = 0; if (Atomic::load(&_cur_claim) >= _list.length()) { return nullptr; } uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); if (claim >= _list.length()) { return nullptr; } count = MIN2(_list.length() - claim, _claim_step); return _list.list()->threads() + claim; <--- here } because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. ------------- PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Thu Feb 2 09:42:51 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 2 Feb 2023 09:42:51 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> Message-ID: <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> On Thu, 2 Feb 2023 09:25:17 GMT, Thomas Schatzl wrote: >> No, the operations on the `JavaThread` during these phases change the `JavaThread` state (actually, the referenced TLAB state). >> >> E.g. `g1YoungGCPostEvacuateTasks.cpp:720`: >> >> void do_work(uint worker_id) override { >> JavaThread* const* list; >> uint count; >> while ((list = _claimer.claim(count)) != nullptr) { >> for (uint i = 0; i < count; i++) { >> list[i]->tlab().resize(); <--- here, the tlab needs to be modified (but not the JavaThread) >> } >> } >> } >> >> >> I could `const_cast` that one away, but that seemed really ugly to me. >> >> In a follow-up change there is a similar access to the threads' TLAB. Maybe there is a better way to do that (apart from `const_cast`ing). 
> > To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: > > inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { > count = 0; > if (Atomic::load(&_cur_claim) >= _list.length()) { > return nullptr; > } > uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); > if (claim >= _list.length()) { > return nullptr; > } > count = MIN2(_list.length() - claim, _claim_step); > return _list.list()->threads() + claim; <--- here > } > > because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it. That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticeable, need to check).
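The chunked claiming scheme in the quoted `claim()` can be sketched in portable C++ as follows. This is an illustration with `std::atomic` and hypothetical names (`ChunkClaimer`, `Item`), not the HotSpot `G1JavaThreadsListClaimer`; it assumes the same idea of handing out fixed-size index ranges with a single fetch-and-add, so a worker never walks the array from the start.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>

// Hypothetical element type standing in for JavaThread.
struct Item {
  int touched = 0;
};

class ChunkClaimer {
  Item* const* _items;
  unsigned _length;
  unsigned _step;
  std::atomic<unsigned> _cur{0};

public:
  ChunkClaimer(Item* const* items, unsigned length, unsigned step)
      : _items(items), _length(length), _step(step) {
    assert(step > 0);
  }

  // Returns the start of a claimed chunk and stores its size in `count`,
  // or returns nullptr (with count == 0) when nothing is left.
  Item* const* claim(unsigned& count) {
    count = 0;
    if (_cur.load() >= _length) {
      return nullptr;  // cheap early exit without modifying the counter
    }
    unsigned start = _cur.fetch_add(_step);
    if (start >= _length) {
      return nullptr;  // another worker took the last chunk first
    }
    count = std::min(_length - start, _step);
    return _items + start;
  }
};
```

Each worker would loop on `claim()` until it returns `nullptr`. The cost per claim is one atomic add regardless of the array length, which is the property the per-element claim tokens lack.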
The other is that with little work per thread, as in this situation, the worker threads seem to contend heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured it to be ~4x slower with 18 threads than the single-threaded version (on ~21k JavaThreads). ------------- PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Thu Feb 2 09:55:25 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 2 Feb 2023 09:55:25 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> Message-ID: <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> On Thu, 2 Feb 2023 09:39:32 GMT, Thomas Schatzl wrote: >> To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: >> >> inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { >> count = 0; >> if (Atomic::load(&_cur_claim) >= _list.length()) { >> return nullptr; >> } >> uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); >> if (claim >= _list.length()) { >> return nullptr; >> } >> count = MIN2(_list.length() - claim, _claim_step); >> return _list.list()->threads() + claim; <--- here >> } >> >> because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. > > Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. > > One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it.
That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticable, need to check). > > The other is that with little work per thread threads like in this situation they seem to be contending heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured ~4x slower with 18 threads than using the single-threaded version (on ~21k JavaThreads) It looks like this has been a leftover of some older version (the `G1JavaThreadClaimer` is fairly "new") - I will remove this change. Thanks for making me look at this again in detail! ------------- PR: https://git.openjdk.org/jdk/pull/12360 From clanger at openjdk.org Thu Feb 2 09:59:11 2023 From: clanger at openjdk.org (Christoph Langer) Date: Thu, 2 Feb 2023 09:59:11 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v3] In-Reply-To: References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Thu, 2 Feb 2023 09:14:03 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations. >> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Move more coding into check_info_file Looks good to me now. ------------- Marked as reviewed by clanger (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12217 From duke at openjdk.org Thu Feb 2 10:24:36 2023 From: duke at openjdk.org (Afshin Zafari) Date: Thu, 2 Feb 2023 10:24:36 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick Message-ID: ### Discussion There are different ways of casting/converting integral types to floating point types and vice versa in the code. These conversions are changed to invocations of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. ### Patch - All instances of the `union` technique that are used for int<->float and long<->double type conversions are replaced by the `PrimitiveConversions::cast(From)` method. - Instances of, for example, converting `int`<->`intptr_t` are not changed. - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make searching/reviewing easier. - The new 'cast(From)' templates are numbered 6-9 for some instances where existing templates (1-5) did not match. - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedefs are added to it to compensate for the missing types. - The following are not changed: - `jni.hpp` - `ciConstant.hpp` is not changed, since a union is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow the setter/getter to be used only for the same types. - `bytecodeInterpreter_zero.hpp` is not changed.
Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. - union with bit fields are not changed. - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. ### Test mach5 tiers 1-5 passed, tiers 6-8 in progress ------------- Commit messages: - 8297539: Consolidate the uses of the int<->float union conversion trick Changes: https://git.openjdk.org/jdk/pull/12384/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12384&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8297539 Stats: 404 lines in 12 files changed: 149 ins; 114 del; 141 mod Patch: https://git.openjdk.org/jdk/pull/12384.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12384/head:pull/12384 PR: https://git.openjdk.org/jdk/pull/12384 From tschatzl at openjdk.org Thu Feb 2 10:38:02 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 2 Feb 2023 10:38:02 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? > > The change is fairly straightforward except maybe for the following: > * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. > * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). 
> > Testing: gha, local perf testing, tier1 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with two additional commits since the last revision: - fix indentation :( - daholmes review, remove unnecessary change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12360/files - new: https://git.openjdk.org/jdk/pull/12360/files/6d8751ef..6bff3cc1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12360&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12360&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12360.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12360/head:pull/12360 PR: https://git.openjdk.org/jdk/pull/12360 From stuefe at openjdk.org Thu Feb 2 13:19:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 2 Feb 2023 13:19:06 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v9] In-Reply-To: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: > This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless whether from JVM- or user code), this could recover significant amounts of memory. > > We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore it took longer than I thought. > > ### Motivation > > The Glibc allocator is reluctant to return memory to the OS, much more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. > > Note that C-heap retention is difficult to observe. Since it is freed memory, it won't show up in NMT, it is just a part of private RSS. > > Theoretically, retained memory is not lost since it will be reused by future mallocs. 
Retaining memory is therefore a bet on the future behavior of the app. The allocator bets on the application needing memory in the near future, and to satisfy that need via malloc. > > But an app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. And if the app rolls its own custom allocators atop of mmap, as hotspot does, a lot of that memory cannot be reused even though it counts toward its memory footprint. > > To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. > > #### Is this even a problem? > > Do we have high malloc spikes in the JVM process? We assume that malloc load from hotspot is usually low since hotspot typically clusters allocations into custom areas - metaspace, code heap, arenas. > > But arenas are subject to Glibc mem retention too. I was surprised by that since I assumed 32k arena chunks were too big to be subject of Glibc retention. But I saw in experiments that high arena peaks often cause lasting RSS increase. > > And of course, both hotspot and JDK do a lot of finer-granular mallocs outside of custom allocators. > > But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. > > Therefore I think while high malloc spikes are atypical for a JVM process, they can happen. Having a way to auto-trim the native heap makes sense. > > ### When should we trim? > > We want to trim when we know there is a lull in malloc activity coming. But we have no knowledge of the future. > > We could build a heuristic based on malloc frequency. But on closer inspection that is difficult. 
We cannot use NMT, since NMT has no complete picture (only knows hotspot) and is usually disabled in production anyway. The only way to get *all* mallocs would be to use Glibc malloc hooks. We have done so in desperate cases at SAP, but Glibc removed malloc hooks in 2.35. It would be a messy solution anyway; best to avoid it. > > The next best thing is synchronizing with the larger C-heap users in the VM: compiler and GC. But compiler turns out not to be such a problem, since the compiler uses arenas, and arena chunks are buffered in a free pool with a five-second delay. That means compiler activity that happens in bursts, like at VM startup, will just shuffle arena chunks around from/to the arena free pool, never bothering to call malloc or free. > > That leaves the GC, which was also the experts' recommendation in last year's discussion [1]. Most GCs do uncommit, and trimming the native heap fits well into this. And we want to time the trim to not get into the way of a GC. Plus, integrating trims into the GC cycle lets us reuse GC logging and timing, thereby making RSS changes caused by trim-native visible to the analyst. > > > ### How it works: > > Patch adds new options (experimental for now, and shared among all GCs): > > > -XX:+GCTrimNativeHeap > -XX:GCTrimNativeHeapInterval= (defaults to 60) > > > `GCTrimNativeHeap` is off by default. If enabled, it will cause the VM to trim the native heap on full GCs as well as periodically. The period is defined by `GCTrimNativeHeapInterval`. Periodic trimming can be completely switched off with `GCTrimNativeHeapInterval=0`; in that case, we will only trim on full GCs. > > ### Examples: > > This is an artificial test that causes two high malloc spikes with long idle periods. Observe how RSS recovers with trim but stays up without trim. The trim interval was set to 15 seconds for the test, and no GC was invoked here; this is periodic trimming. 
> > ![alloc-test](http://cr.openjdk.java.net/~stuefe/other/autotrim/rss-all-collectors.png) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-all.sh) ) > > Spring pet clinic boots up, then idles. Once with, once without trim, with the trim interval at 60 seconds default. Of course, if it were actually doing something instead of idling, trim effects would be smaller. But the point of trimming is to recover memory in idle periods. > > ![petclinic bootup](http://cr.openjdk.java.net/~stuefe/other/autotrim/spring-petclinic-rss-with-and-without-trim.png)) > > (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-petclinic-boot.sh) ) > > > > ### Implementation > > One problem I faced when implementing this was that trimming was non-interruptable. GCs usually split the uncommit work into smaller portions, which is impossible for `malloc_trim()`. > > So very slow trims could introduce longer GC pauses. I did not want this, therefore I implemented two ways to trim: > 1) GCs can opt to trim asynchronously. In that case, a `NativeTrimmer` thread runs on behalf of the GC and takes care of all trimming. The GC just nudges the `NativeTrimmer` at the end of its GC cycle, but the trim itself runs concurrently. > 2) GCs can do the trim inside their own thread, synchronously. It will have to wait until the trim is done. > > (1) has the advantage of giving us periodic trims even without GC activity (Shenandoah does this out of the box). > > #### Serial > > Serial does the trimming synchronously as part of a full GC, and only then. I did not want to spawn a separate thread for the SerialGC. Therefore Serial is the only GC that does not offer periodic trimming, it just trims on full GC. > > #### Parallel, G1, Z > > All of them do the trimming asynchronously via `NativeTrimmer`. They schedule the native trim at the end of a full collection. 
They also pause the trimming at the beginning of a cycle to not trim during GCs. > > #### Shenandoah > > Shenandoah does the trimming synchronously in its service thread, similar to how it handles uncommits. Since the service thread already runs concurrently and continuously, it can do periodic trimming; no need to spin a new thread. And this way we can reuse the Shenandoah timing classes. > > ### Patch details > > - adds three new functions to the `os` namespace: > - `os::trim_native_heap()` implementing trim > - `os::can_trim_native_heap()` and `os::should_trim_native_heap()` to return whether platform supports trimming resp. whether the platform considers trimming to be useful. > - replaces implementation of the cmd "System.trim_native_heap" with the new `os::trim_native_heap` > - provides a new wrapper function wrapping the tedious `mallinfo()` vs `mallinfo2()` business: `os::Linux::get_mallinfo()` > - adds a GC-shared utility class, `GCTrimNative`, that takes care of trimming and GC-logging and houses the `NativeTrimmer` thread class. > - adds a regression test > > > ### Tests > > Tested older Glibc (2.31), and newer Glibc (2.35) (`mallinfo()` vs` mallinfo2()`), on Linux x64. > > The rest of the tests will be done by GHA and in our SAP nightlies. > > > ### Remarks > > #### How about other allocators? > > I have seen this retention problem mainly with the Glibc and the AIX libc. Muslc returns memory more eagerly to the OS. I also tested with jemalloc and found it also reclaims more aggressively, therefore I don't think MacOS or BSD are affected that much by retention either. > > #### Trim costs? > > Trim-native is a tradeoff between memory and performance. We pay > - The cost to do the trim depends on how much is trimmed. Time ranges on my machine between < 1ms for no-op trims, to ~800ms for 32GB trims. > - The cost for re-acquiring the memory, should the memory be needed again, is the second cost factor. > > #### Predicting malloc_trim effects? 
> > `ShenandoahUncommit` avoids uncommits if they are not necessary, thus avoiding work and gc log spamming. I liked that and tried to follow that example. Tried to devise a way to predict the effect trim could have based on allocator info from mallinfo(3). That was quite frustrating since the documentation was confusing and I had to do a lot of experimenting. In the end, I came up with a heuristic to prevent obviously pointless trim attempts; see `os::should_trim_native_heap()`. I am not completely happy with it. > > #### glibc.malloc.trim_threshold? > > glibc has a tunable that looks like it could influence the willingness of Glibc to return memory to the OS, the "trim_threshold". In practice, I could not get it to do anything useful. Regardless of the setting, it never seemed to influence the trimming behavior. Even if it would work, I'm not sure we'd want to use that, since by doing malloc_trim manually we can space out the trims as we see fit, instead of paying the trim price for free(3). 
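As a rough illustration of the manual trim the description relies on, a minimal sketch assuming a glibc-based Linux system. The helper names `trim_native_heap_sketch` and `spike_then_trim` are hypothetical and are not the patch's `os::trim_native_heap()`; only `malloc_trim(3)` itself comes from the text, and the sketch compiles to a no-op where glibc is not in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Returns 1 if memory was released to the OS, 0 if nothing could be
// released, and -1 if trimming is not available on this platform.
static int trim_native_heap_sketch() {
#if defined(__GLIBC__)
  return malloc_trim(0);  // 0 = keep no extra padding at the heap top
#else
  return -1;
#endif
}

// Simulates a temporary malloc spike followed by a trim.
static int spike_then_trim(std::size_t blocks, std::size_t block_size) {
  std::vector<void*> ptrs;
  ptrs.reserve(blocks);
  for (std::size_t i = 0; i < blocks; i++) {
    void* p = std::malloc(block_size);
    if (p == nullptr) {
      break;
    }
    ptrs.push_back(p);
  }
  for (void* p : ptrs) {
    std::free(p);  // freed memory may still be retained by the allocator
  }
  return trim_native_heap_sketch();
}
```

Whether a given call actually returns memory depends on the allocator state, which is why the sketch reports the result instead of assuming success.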
> > > - [1] https://mail.openjdk.org/pipermail/hotspot-dev/2021-August/054323.html > - [2] https://bugs.openjdk.org/browse/JDK-8269345 Thomas Stuefe has updated the pull request incrementally with four additional commits since the last revision: - wip - rename GCTrimNative TrimNative - rename NativeTrimmer - rename ------------- Changes: - all: https://git.openjdk.org/jdk/pull/10085/files - new: https://git.openjdk.org/jdk/pull/10085/files/c765621b..55386d2a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=07-08 Stats: 1226 lines in 26 files changed: 638 ins; 557 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/10085.diff Fetch: git fetch https://git.openjdk.org/jdk pull/10085/head:pull/10085 PR: https://git.openjdk.org/jdk/pull/10085 From coleenp at openjdk.org Thu Feb 2 14:02:29 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Feb 2023 14:02:29 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: <5Aucxw80mvi-IP9E65xis9qu5cEa2FiYXvPJrNoD62g=.53fe9396-ca1c-48ea-8aff-bd01d397af76@github.com> On Thu, 2 Feb 2023 10:17:21 GMT, Afshin Zafari wrote: > ### Discussion > > There are different ways of casting/converting integral types to floating point types and vice versa in the code. > These conversions are changed to invocation of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make this changes build and run, some new `cast` methods are introduced and header dependencies changed. > > ### Patch > > - All instances of using `union` technique that are used for int<->float and long<->double type conversions are replaced by `PrimitiveConversions::cast(From)` method. > - Instances of, for example, converting `int`<->`intptr_t` are not changed. > - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. 
Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. > - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitivEConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make the search/reviews more easier. > - The new 'cast(From)' templates are numbered 6-9 for some instances where existing templates (1-5) did not match. > - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. > - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedef's are added to it to compensate the missing types. > - The followings are not changed: > - `jni.hpp` > - `ciConstant.hpp` is not changed, since uinon is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow setter/getter used only for the same types. > - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. > - union with bit fields are not changed. > - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. > > ### Test > mach5 tiers 1-5 passed, tiers 6-8 in progress src/hotspot/cpu/aarch64/assembler_aarch64.cpp line 414: > 412: > 413: static uint64_t doubleTo64Bits(jdouble d) { > 414: // 8297539. This matches with Template #5 of cast(From). We don't generally put bug numbers in the code. 
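For context, a minimal sketch of a bit-preserving conversion that avoids reading an inactive `union` member. It uses `std::memcpy` and hypothetical names (`bit_cast_sketch`, `double_to_64_bits`); it is not the `PrimitiveConversions::cast` implementation under review, only an instance of the same general technique, and the values checked in practice assume IEEE 754 floating point.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the object representation of `from`
// into a value of type To. Both types must have the same size.
template <typename To, typename From>
static To bit_cast_sketch(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                std::is_trivially_copyable<From>::value,
                "types must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Counterpart of the doubleTo64Bits helper quoted in the review comment.
static uint64_t double_to_64_bits(double d) {
  return bit_cast_sketch<uint64_t>(d);
}
```

Compilers typically turn the `memcpy` into a plain register move, so a helper like this costs nothing at run time compared to the union form.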
------------- PR: https://git.openjdk.org/jdk/pull/12384 From aboldtch at openjdk.org Thu Feb 2 14:07:45 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 2 Feb 2023 14:07:45 GMT Subject: Integrated: 8301612: OopLoadProxy constructor should be explicit In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 14:41:35 GMT, Axel Boldt-Christmas wrote: > The implicit conversion via constructor `OopLoadProxy(P* addr)` can cause problems when using the access API and nullptr. > For example code like > ```c++ > oop o = (_obj == NULL) ? nullptr : NativeAccess<>::oop_load(_obj); > > will compile but is wrong and will crash. > This is because it will be interpreted as: > ```c++ > oop o = static_cast<oop>((_obj == NULL) ? OopLoadProxy(nullptr) : NativeAccess<>::oop_load(_obj)); > > while the intent of the programmer probably was: > ```c++ > oop o = (_obj == NULL) ? (oop)nullptr : NativeAccess<>::oop_load(_obj); > > or more explicitly: > ```c++ > oop o = (_obj == NULL) ? static_cast<oop>(nullptr) : static_cast<oop>(NativeAccess<>::oop_load(_obj)); > > > Marking `OopLoadProxy(P* addr)` explicit will make this example, and similar code not compile. This pull request has now been integrated.
Changeset: 21c1afbc Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/21c1afbc3229e898146022935bc589bcf95aa1f7 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8301612: OopLoadProxy constructor should be explicit Reviewed-by: stefank, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/12364 From lmesnik at openjdk.org Thu Feb 2 15:13:37 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 2 Feb 2023 15:13:37 GMT Subject: Integrated: 8298979: Remove duplicated serviceability/jvmti/thread/GetAllThreads/allthr01/allthr01.java In-Reply-To: <7N18GctUO6Kt5_VJe5lJBV-ga-jZv-z9vtnh99dKHZo=.390887cd-5af8-4a61-bd2c-b4b5f74353e8@github.com> References: <7N18GctUO6Kt5_VJe5lJBV-ga-jZv-z9vtnh99dKHZo=.390887cd-5af8-4a61-bd2c-b4b5f74353e8@github.com> Message-ID: On Fri, 27 Jan 2023 05:06:25 GMT, Leonid Mesnik wrote: > PR adds fix "8284027: vmTestbase/nsk/jvmti/GetAllThreads/allthr001/ is failing" to new test and remove duplication. > > Test allthr002 ported as > serviceability/jvmti/negative/GetAllThreadsNullTest/GetAllThreadsNullTest.java This pull request has now been integrated. Changeset: 2d50c7d4 Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/2d50c7d477b4141d58ae4ad01c254cde03050373 Stats: 844 lines in 11 files changed: 34 ins; 785 del; 25 mod 8298979: Remove duplicated serviceability/jvmti/thread/GetAllThreads/allthr01/allthr01.java Reviewed-by: sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/12240 From jsjolen at openjdk.org Thu Feb 2 15:14:52 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 2 Feb 2023 15:14:52 GMT Subject: RFR: JDK-8301479: Replace NULL with nullptr in os/linux [v2] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/linux. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong.
I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Missed one mistake ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12315/files - new: https://git.openjdk.org/jdk/pull/12315/files/73e001d2..9dd25a01 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12315&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12315&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12315.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12315/head:pull/12315 PR: https://git.openjdk.org/jdk/pull/12315 From mgronlun at openjdk.org Thu Feb 2 15:16:27 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 2 Feb 2023 15:16:27 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java Message-ID: Greetings, please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue.
Testing: jdk_jfr Thanks Markus ------------- Commit messages: - 8301380 Changes: https://git.openjdk.org/jdk/pull/12388/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12388&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301380 Stats: 8 lines in 2 files changed: 6 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12388.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12388/head:pull/12388 PR: https://git.openjdk.org/jdk/pull/12388 From coleenp at openjdk.org Thu Feb 2 15:21:47 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 2 Feb 2023 15:21:47 GMT Subject: RFR: JDK-8301479: Replace NULL with nullptr in os/linux [v2] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 15:14:52 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/linux. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks!
> > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Missed one mistake Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12315 From mgronlun at openjdk.org Thu Feb 2 15:34:14 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 2 Feb 2023 15:34:14 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: References: Message-ID: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> > Greetings, > > please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue. > > Testing: jdk_jfr > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: - typos - unconditional id assignment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12388/files - new: https://git.openjdk.org/jdk/pull/12388/files/d3d42150..34fc58aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12388&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12388&range=00-01 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12388.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12388/head:pull/12388 PR: https://git.openjdk.org/jdk/pull/12388 From duke at openjdk.org Thu Feb 2 15:36:09 2023 From: duke at openjdk.org (Scott Gibbons) Date: Thu, 2 Feb 2023 15:36:09 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Thu, 2 Feb 2023 00:34:32 GMT, Claes Redestad wrote: >> These tables are used for both URL and non-URL based on the parameter, and they are only two of the three lut
tables used (the other is in `base64_AVX2_decode_tables_addr`). Both names are essentially incorrect. Does the name really matter that much? It's the same as `base64_AVX2_decode_tables_addr` with the addition of URL tables. > > Names are important, but always hard to get right. At the very least they need to be correct. Maybe call it something like `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? I prefer leaving them the way they are. I don't think the names, along with the associated comments within the tables, cause any undue confusion as to their function. However I will implement the name change if that's all it takes to procure a review approval. Please provide the specific names you'd like me to use and I'll change them. Or just approve as-is :-). ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jcking at openjdk.org Thu Feb 2 15:52:01 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 15:52:01 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v3] In-Reply-To: References: Message-ID: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that rely on testing compressed oops or compressed class pointers. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. > > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change.
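The "suppress in code" idea from the Suppressing paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions: `SKETCH_LSAN_IGNORE_OBJECT` and `allocate_immortal` are hypothetical names that do not come from the PR, and the `LEAK_SANITIZER` guard is borrowed from the review comments later in this thread. The only real API used is `__lsan_ignore_object` from `<sanitizer/lsan_interface.h>`.

```c++
#include <cassert>
#include <cstdlib>

// Hypothetical wrapper: forwards to LSan when the build defines
// LEAK_SANITIZER, and compiles to a no-op otherwise.
#ifdef LEAK_SANITIZER
#include <sanitizer/lsan_interface.h>
#define SKETCH_LSAN_IGNORE_OBJECT(p) __lsan_ignore_object(p)
#else
#define SKETCH_LSAN_IGNORE_OBJECT(p) ((void)(p))
#endif

// Hypothetical allocation site for memory that intentionally lives until
// process exit. Marking the object here keeps the suppression next to the
// allocation instead of in a name-based suppression file.
void* allocate_immortal(std::size_t size) {
  void* p = std::malloc(size);
  if (p != nullptr) {
    SKETCH_LSAN_IGNORE_OBJECT(p);
  }
  return p;
}

// Small usage check; the buffer is deliberately never freed.
void check_immortal_example() {
  void* p = allocate_immortal(16);
  assert(p != nullptr);
}
```

Because the marker travels with the allocation, renaming or moving the enclosing function would not invalidate it, which matches the "surviving refactors" argument above.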
> > **Caveats:** > - `UseCompressedOops` and `UseCompressedClassPointers` are forced to false when LSan is enabled. This is necessary to ensure all pointers to memory which could possibly contain pointers to malloc memory are properly aligned, which is an LSan requirement. > - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. This is due to the other caveats. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > - There is a series of tests that are upset due to the above flags being forced false, as they rely on the arguments being supported. In the future ideally these tests would be skipped nicely when LSan is enabled. Justin King has updated the pull request incrementally with two additional commits since the last revision: - Fix grammatical error Signed-off-by: Justin King - Update based on review and remove restriction on compressed oops and class pointers Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12229/files - new: https://git.openjdk.org/jdk/pull/12229/files/4559ef4c..93af227a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=01-02 Stats: 32 lines in 4 files changed: 16 ins; 14 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12229.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12229/head:pull/12229 PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Thu Feb 2 15:52:10 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 15:52:10 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: Message-ID: <9Ssbs9Bn5_RKgHETiinaboR6Jca2RuQKCjAiA74XZu0=.57bcfad8-45df-4866-bc38-6ea8beffb8c7@github.com> On Thu, 2 Feb 2023 05:34:57 GMT, David Holmes wrote: >> Justin King has updated the pull request
incrementally with one additional commit since the last revision: >> >> Support CDS >> >> Signed-off-by: Justin King > > make/data/lsan/lsan_default_options.c line 49: > >> 47: // extremely early during library loading, before main is called. We need to override the default >> 48: // options because LSan will perform leak checking at program exit. Unfortunately Hotspot does not >> 49: // shutdown cleaning at the moment and some leaks occur, we want to ignore these. Instead we > > s/cleaning/cleanly/ Fixed. > src/hotspot/share/runtime/init.cpp line 198: > >> 196: VirtualSpaceSummary summary = Universe::heap()->create_heap_space_summary(); >> 197: LSAN_UNREGISTER_ROOT_REGION(summary.start(), summary.reserved_size()); >> 198: } > > Again Why is this unconditional? I expected it to be in `#ifdef LEAK_SANITIZER` Same as other comment. This does more than just the LSAN_*, so I wrapped it now. > src/hotspot/share/runtime/java.cpp line 432: > >> 430: static_cast(result); >> 431: } >> 432: > > Why is this unconditional? I expected it to be in `#ifdef LEAK_SANITIZER` Since this does a little more than just invoke the LSAN_* macros, that is fair. I wrapped it in an ifdef. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Thu Feb 2 15:52:11 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 15:52:11 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: <6xcgr1n9j4B5--0YAAqKybMi2Rd4W_OOv0nvQwDWhcY=.b99607ae-d25a-4429-bc73-27ccf5a82ede@github.com> References: <6xcgr1n9j4B5--0YAAqKybMi2Rd4W_OOv0nvQwDWhcY=.b99607ae-d25a-4429-bc73-27ccf5a82ede@github.com> Message-ID: On Thu, 2 Feb 2023 06:06:04 GMT, David Holmes wrote: >> src/hotspot/share/runtime/arguments.cpp line 3979: >> >>> 3977: // LSAN relies on pointers to be natively aligned. Using compressed class pointers breaks this >>> 3978: // expectation and results in nondeterministic leak reports. 
>>> 3979: if (FLAG_SET_CMDLINE(UseCompressedOops, false) != JVMFlag::SUCCESS) { >> >> This positioning doesn't look right. There are a number of inter-dependencies on the compressed oops flags and I don't think you can just turn them off here. Surely `Arguments::set_use_compressed_oops()` is where this needs to be done? > > Not sure FLAG_SET_CMDLINE is right here either (but flag setting is very confusing). Removed, no longer needed. I'll explain below. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Thu Feb 2 16:05:43 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 16:05:43 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> References: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> Message-ID: On Thu, 2 Feb 2023 06:34:56 GMT, Thomas Stuefe wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Support CDS >> >> Signed-off-by: Justin King > > Thinking this through some more: > > Why would UseCompressedClassPointers cause leak analysis to fail? Because you track Metaspace? If so, we should think whether it makes sense to track Metaspace. > > Metaspace memory is tied to class loaders. Class loaders may live forever, until VM death, in that case Metaspace will not be released. How exactly does the leak profiler deal with that? Does it report those cases as leaks? > > Metaspace "leaks" exist, but these are typically class loader leaks at the application level, and those you cannot recognize from the JVM. You need application knowledge for that. @tstuefe Okay, I removed the forceful disabling of `UseCompressedOops` and `UseCompressedClassPointers` and it looks clean, so I removed that. I think what happened was two things. 
First, when I started this many months ago I did not have a full understanding of the different memory regions used by Hotspot compared to now and second when I was originally doing this I got false positive feedback by disabling these likely due to missing registration of a root region. So I originally chalked it up to being due to misaligned pointers. My original goal was to just get a clean run, so I took the hammer approach. As long as the pointers to malloc-based memory are properly aligned in one of the root regions (ones we register or other memory allocated by malloc), LSan is happy. It shouldn't matter whether pointers to data in metaspace are compressed or not, or oops. Both reside in metaspace and java heap, neither of which LSan is directly concerned about other than looking for pointers to malloc-based memory. So you are correct in that regard that neither `UseCompressedOops` and `UseCompressedClassPointers` should be necessary. However metaspace leaks do exist and this setup actually found one (JDK-8298084), so metaspace being instrumented is definitely useful. To clarify, LSan is only concerned with leaks related to memory obtained from malloc and friends. It does not deal with metaspace leaks directly or java heap leaks. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From iklam at openjdk.org Thu Feb 2 17:22:01 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 2 Feb 2023 17:22:01 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor [v3] In-Reply-To: References: Message-ID: > To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. > > The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. 
The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). > > I also had to change `CodeBuffer::SharedTrampolineRequests` because it's `V` type is `LinkListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++-20. See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) > > Testing with tier1~4 + builds-tier5 Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge branch 'master' into 8301564-forbid-non-trivial-destructor-for-resourcehash-KV-types - @coleenp comments - use mtStatistics instead - fixed aarch64 and risvc64 - fixed typo - Fixed ClassLoaderStatsClosure::StatsTable - 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12355/files - new: https://git.openjdk.org/jdk/pull/12355/files/915f058b..2421a75e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12355&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12355&range=01-02 Stats: 20095 lines in 935 files changed: 5641 ins; 3886 del; 10568 mod Patch: https://git.openjdk.org/jdk/pull/12355.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12355/head:pull/12355 PR: https://git.openjdk.org/jdk/pull/12355 From kbarrett at openjdk.org Thu Feb 2 17:38:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 2 Feb 2023 17:38:26 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 10:17:21 GMT, Afshin Zafari wrote: > ### Discussion > > There are 
different ways of casting/converting integral types to floating point types and vice versa in the code. > These conversions are changed to invocation of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. > > ### Patch > > - All instances of the `union` technique that are used for int<->float and long<->double type conversions are replaced by `PrimitiveConversions::cast(From)` method. > - Instances of, for example, converting `int`<->`intptr_t` are not changed. > - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. > - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make the search/reviews easier. > - The new `cast(From)` templates are numbered 6-9 for some instances where existing templates (1-5) did not match. > - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. > - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedefs are added to it to compensate for the missing types. > - The following are not changed: > - `jni.hpp` > - `ciConstant.hpp` is not changed, since union is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow setter/getter used only for the same types. > - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. > - unions with bit fields are not changed. > - `json.hpp` and `json.cpp` in src/hotspot/share/utilities.
> > ### Test > mach5 tiers 1-5 passed, tiers 6-8 in progress Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/globalDefinitions.hpp line 28: > 26: #define SHARE_UTILITIES_GLOBALDEFINITIONS_HPP > 27: > 28: #include "metaprogramming/primitiveConversions.hpp" I think this should not be done, for the same reasons as not including the suggested bit_cast (https://git.openjdk.org/jdk/pull/11865). There is substantial overlap between this proposed change and that one, though I think the bit_cast proposal is more problematic. Rather than adding a new dependency here, separate out the code that uses the conversions into new files and update the (much smaller number than use globalDefinitions.hpp) users of those functions to include the new headers. ------------- PR: https://git.openjdk.org/jdk/pull/12384 From duke at openjdk.org Thu Feb 2 21:22:26 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 2 Feb 2023 21:22:26 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v2] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 16:11:13 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed windows and 32-bit builds which do not support archive heap src/hotspot/share/cds/archiveHeapWriter.cpp line 175: > 173: > 174: oop ArchiveHeapWriter::heap_roots_requested_address() { > 175: return cast_to_oop(_requested_open_region_bottom + _heap_roots_bottom); A suggestion: it is probably better to use the internal method `requested_obj_from_buffer_offset(_heap_roots_bottom)` instead. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From duke at openjdk.org Thu Feb 2 21:53:27 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 2 Feb 2023 21:53:27 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v2] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 16:11:13 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed windows and 32-bit builds which do not support archive heap src/hotspot/share/cds/archiveHeapWriter.cpp line 266: > 264: HeapShared::CachedOopInfo* info = HeapShared::archived_object_cache()->get(src_obj); > 265: assert(info != nullptr, "must be"); > 266: if (info->in_open_region() == copy_open_region) { This trick `(false == false)` is `true` used to copy closed regions fooled me for a while. This was unusual for me, so pointing out here for anyone else reading it up. This would anyway go away in the future work that combines open and closed regions into a single one. src/hotspot/share/cds/archiveHeapWriter.cpp line 528: > 526: template void do_oop_work(T *p) { > 527: size_t field_offset = pointer_delta(p, _src_obj, sizeof(char)); > 528: ArchiveHeapWriter::relocate_field_in_requested_obj(_requested_obj, field_offset); I understand what we are doing here, but it seems to be taking a bit of a convoluted approach. It performs quite a few operations to calculate `source_referent`, which is already available in *p. 
So an alternate implementation for `do_oop(narrowOop *p)` can be:

    if (!CompressedOops::is_null(*p)) {
      size_t field_offset = pointer_delta(p, _src_obj, sizeof(char));
      oop request_referrent = source_obj_to_requested_obj(CompressedOops::decode(*p)); // source referent is already in *p
      narrowOop* field_address_in_buffer = (narrowOop*)(cast_from_oop<address>(_buffered_obj) + field_offset);
      *field_address_in_buffer = CompressedOops::encode(request_referrent);
      mark_oop_pointer(_requested_obj, field_offset, _oopmap);
    }

The `_buffered_obj` can be calculated in the `iterator` below and passed to `EmbeddedOopRelocator` c'tor. `do_oop(oop *p)` would follow the same. Does this read better? src/hotspot/share/cds/archiveHeapWriter.cpp line 559: > 557: for (int i = 0; i < length; i++) { > 558: if (UseCompressedOops) { > 559: relocate_root_at(requested_roots, i); This can also follow a similar pattern to the code in my comment on `do_oop_work` above, as in:

    if (UseCompressedOops) {
      offset = (size_t)((objArrayOop)requested_roots)->obj_at_offset(i);
      narrowOop* buffered_addr = (narrowOop*)(cast_from_oop<address>(buffered_roots) + offset);
      oop request_referent = source_obj_to_requested_obj(CompressedOops::decode(*buffered_addr));
      *buffered_addr = CompressedOops::encode(request_referent);
      mark_oop_pointer(requested_roots, offset, heap_info->oopmap());
    }

------------- PR: https://git.openjdk.org/jdk/pull/12304 From jrose at openjdk.org Thu Feb 2 22:12:33 2023 From: jrose at openjdk.org (John R Rose) Date: Thu, 2 Feb 2023 22:12:33 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 10:17:21 GMT, Afshin Zafari wrote: > ### Discussion > > There are different ways of casting/converting integral types to floating point types and vice versa in the code. > These conversions are changed to invocation of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. > > ### Patch > > - All instances of the `union` technique that are used for int<->float and long<->double type conversions are replaced by `PrimitiveConversions::cast(From)` method. > - Instances of, for example, converting `int`<->`intptr_t` are not changed. > - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. > - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make the search/reviews easier. > - The new `cast(From)` templates are numbered 6-9 for some instances where existing templates (1-5) did not match. > - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. > - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more.
Some typedefs are added to it to compensate for the missing types. > - The following are not changed: > - `jni.hpp` > - `ciConstant.hpp` is not changed, since union is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow setter/getter used only for the same types. > - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. > - unions with bit fields are not changed. > - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. > > ### Test > mach5 tiers 1-5 passed, tiers 6-8 in progress I think the term "cast" in this API means too many different things, including things that a C programmer would not think of as simple casting. To be clear, this is a problem I have with the existing API, not just this PR. I request we start to make a clearer distinction between conversions which preserve values (perhaps with truncation or loss of precision) and reinterpretations which discard values but keep the underlying bits. Both may be called casts, but it is asking for bugs to use the name for both kinds of requests. (Using the word cast for something that preserves at least some values is how the JLS handles conversion terminology, and the JLS makes reinterpret casts available only via special methods. I know C++ has its own lexicon of terms, but I claim some privilege on this project for Java terms as well.) I know this work here is changing from names like "jint_cast" which perform reinterpretations, but I always viewed those names as suspicious, and had to learn by rote that they refer to reinterpretations and not true C (or Java) casts. Naming it just "cast" with a tricky type parameter is a move even farther in the wrong direction. Having the function named "cast" in a class named PrimitiveConversions (not PrimitiveReinterpretations) makes it dangerously easy to miss that the functions in that API are (apparently?)
all reinterpretations, and none are what a Java programmer would call a cast. I suggest either renaming the class (PrimitiveReinterpretations) or better changing the name from "cast" to something like "reinterpret". Then casual readers of code will be less likely to make errors of... reinterpretation. ------------- PR: https://git.openjdk.org/jdk/pull/12384 From jcking at openjdk.org Thu Feb 2 22:25:35 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 22:25:35 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 21:59:17 GMT, John R Rose wrote: > I think the term "cast" in this API means too many different things, including things that a C programmer would not think of as simple casting. To be clear, this is a problem I have with the existing API, not just this PR. > > I request we start to make a clearer distinction between conversions which preserve values (perhaps with truncation or loss of precision) and reinterpretations which discard values but keep the underlying bits. Both may be called casts, but it is asking for bugs to use the name for both kinds of requests. > > (Using the word cast for something that preserves at least some values is how the JLS handles conversion terminology, and the JLS makes reinterpret casts available only via special methods. I know C++ has its own lexicon of terms, but I claim some privilege on this project for Java terms as well.) > > I know this work here is changing from names like "jint_cast" which perform reinterpretations, but I always viewed those names as suspicious, and had to learn by rote that they refer to reinterpretations and not true C (or Java) casts. Naming it just "cast" with a tricky type parameter is a move even farther in the wrong direction.
> > Having the function named "cast" in a class named PrimitiveConversions (not PrimitiveReinterpretations) makes it dangerously easy to miss that the functions in that API are (apparently?) all reinterpretations, and none are what a Java programmer would call a cast. I suggest either renaming of the class (PrimitiveReinterpretations) or better changing the name from "cast" to something like "reinterpret". Then casual readers of code will be less likely to make errors of? reinterpretation. It should be noted that bit_cast is the chosen term for C++ starting in C++20, which reinterprets. Whether that is clear or not is a separate issue, but using `cast` is consistent at least even if odd. Maybe we just change it to `Bits::cast`? ------------- PR: https://git.openjdk.org/jdk/pull/12384 From iklam at openjdk.org Thu Feb 2 22:27:52 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 2 Feb 2023 22:27:52 GMT Subject: RFR: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor [v3] In-Reply-To: <1mPho2AFEhUa0e9cSa-jHAN6ky3cU2UHXpXR12ACqKE=.a38a8cc3-8c61-4433-a4c1-c60b711a28ba@github.com> References: <1mPho2AFEhUa0e9cSa-jHAN6ky3cU2UHXpXR12ACqKE=.a38a8cc3-8c61-4433-a4c1-c60b711a28ba@github.com> Message-ID: On Wed, 1 Feb 2023 21:16:35 GMT, Jorn Vernee wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into 8301564-forbid-non-trivial-destructor-for-resourcehash-KV-types >> - @coleenp comments - use mtStatistics instead >> - fixed aarch64 and risvc64 >> - fixed typo >> - Fixed ClassLoaderStatsClosure::StatsTable >> - 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor > > This looks good, thanks. 
> > I'll put this here for posterity: we discussed changing `ResourceHashtable` to always call the destructor of the key and value (sketch [here](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:Always_Dtor)). That would allow using `K` and `V` with non-trivial destructors as well. But, this discussion didn't reach a conclusion yet (in particular, there were some doubts expressed over whether a container should be calling destructors at all). > > In the meantime, this patch improves upon the status quo. Thanks @JornVernee and @coleenp for the review. ------------- PR: https://git.openjdk.org/jdk/pull/12355 From jcking at openjdk.org Thu Feb 2 22:30:17 2023 From: jcking at openjdk.org (Justin King) Date: Thu, 2 Feb 2023 22:30:17 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 17:35:27 GMT, Kim Barrett wrote: >> ### Discussion >> >> There are different ways of casting/converting integral types to floating point types and vice versa in the code. >> These conversions are changed to invocation of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. >> >> ### Patch >> >> - All instances of the `union` technique that are used for int<->float and long<->double type conversions are replaced by `PrimitiveConversions::cast(From)` method. >> - Instances of, for example, converting `int`<->`intptr_t` are not changed. >> - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. >> - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches.
Issue number 8297539 is also in the comments to make searching and reviewing easier. >> - The new 'cast(From)' templates are numbered 6-9 for some instances where existing templates (1-5) did not match. >> - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. >> - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedefs are added to it to compensate for the missing types. >> - The following are not changed: >> - `jni.hpp` >> - `ciConstant.hpp` is not changed, since union is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow setter/getter used only for the same types. >> - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. >> - unions with bit fields are not changed. >> - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. >> >> ### Test >> mach5 tiers 1-5 passed, tiers 6-8 in progress > > src/hotspot/share/utilities/globalDefinitions.hpp line 28: > >> 26: #define SHARE_UTILITIES_GLOBALDEFINITIONS_HPP >> 27: >> 28: #include "metaprogramming/primitiveConversions.hpp" > > I think this should not be done, for the same reasons as not including the suggested bit_cast > (https://git.openjdk.org/jdk/pull/11865). There is substantial overlap between this proposed > change and that one, though I think the bit_cast proposal is more problematic. > > Rather than adding a new dependency here, separate out the code that uses the conversions > into new files and update the (much smaller number than use globalDefinitions.hpp) users > of those functions to include the new headers. I previously updated the bit_cast PR to just be the `PrimitiveConversions::cast` implementation, but using `__builtin_bit_cast` when available. So this and that are basically the same.
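For readers unfamiliar with the technique under discussion, here is a minimal standalone sketch of a bit-reinterpreting conversion that prefers `__builtin_bit_cast` when the compiler offers it and otherwise falls back to `memcpy`. It is not the HotSpot `PrimitiveConversions::cast` code: the names `bit_cast_sketch` and `SKETCH_HAS_BUILTIN_BIT_CAST` are hypothetical, and the sketch assumes a compiler that either supports `__has_builtin` or is content with the `memcpy` path.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Detect the builtin without breaking compilers that lack __has_builtin.
// SKETCH_HAS_BUILTIN_BIT_CAST is a hypothetical macro name used only here.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_bit_cast)
#    define SKETCH_HAS_BUILTIN_BIT_CAST 1
#  endif
#endif

// Hypothetical helper: reinterprets the object representation of 'from' as a
// value of type To. Unlike a Java-style cast, no numeric conversion happens.
template <typename To, typename From>
inline To bit_cast_sketch(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
#ifdef SKETCH_HAS_BUILTIN_BIT_CAST
  return __builtin_bit_cast(To, from);
#else
  To to;
  std::memcpy(&to, &from, sizeof(To));  // well-defined, unlike reading an inactive union member
  return to;
#endif
}
```

On a platform with IEEE-754 `float`, `bit_cast_sketch<std::uint32_t>(1.0f)` yields the raw encoding of 1.0f rather than the integer 1, which is exactly the distinction behind the naming concern about "cast" raised earlier in the thread.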
------------- PR: https://git.openjdk.org/jdk/pull/12384 From iklam at openjdk.org Thu Feb 2 22:35:34 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 2 Feb 2023 22:35:34 GMT Subject: Integrated: 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 23:23:16 GMT, Ioi Lam wrote: > To ensure we don't have memory leaks or other more serious memory management bugs, I added static asserts to check that the `K` and `V` types for `ResourceHashtableBase` must have trivial destructors if the table is not `C_HEAP` allocated. Thanks to @JornVernee for the original assert code. > > The asserts actually found a problem with `ClassLoaderStatsClosure::StatsTable`. The space used by the `oop` in the freed hashtable may be trashed when `-XX:+CheckUnhandledOops` is enabled, so live objects that reuse the same space may be corrupted. (I tried but couldn't get it to fail). > > I also had to change `CodeBuffer::SharedTrampolineRequests` because its `V` type is `LinkListImpl`, which has a non-trivial destructor. (Note: in this case, the destructor doesn't do anything; we can probably make it go away with C++20. See https://stackoverflow.com/questions/40094871/sfinae-away-a-destructor) > > Testing with tier1~4 + builds-tier5 This pull request has now been integrated.
Changeset: 04278e6b Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/04278e6bf2da501542feb777ab864bbcc5794fd0 Stats: 21 lines in 6 files changed: 14 ins; 0 del; 7 mod 8301564: Non-C-heap allocated ResourceHashtable keys and values must have trivial destructor Reviewed-by: coleenp, jvernee ------------- PR: https://git.openjdk.org/jdk/pull/12355 From jrose at openjdk.org Thu Feb 2 22:40:13 2023 From: jrose at openjdk.org (John R Rose) Date: Thu, 2 Feb 2023 22:40:13 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 22:22:45 GMT, Justin King wrote: > bit_cast is the chosen term for C++ starting in C++20, which reinterprets Good. I would be very content with bit_cast, or some use of the older word "reinterpret". The problem with plain "cast" is the overloading of meanings. (And then there's the extra push in the wrong direction from the name "PrimitiveConversions".) We are not just C++ programmers here; many of us are Java programmers as well. ------------- PR: https://git.openjdk.org/jdk/pull/12384 From kbarrett at openjdk.org Fri Feb 3 01:36:57 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 3 Feb 2023 01:36:57 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> Message-ID: On Thu, 2 Feb 2023 09:52:29 GMT, Thomas Schatzl wrote: >> Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. 
>> >> One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it. That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticeable, need to check). >> >> The other is that with little work per thread, as in this situation, the threads seem to contend heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured ~4x slower with 18 threads than using the single-threaded version (on ~21k JavaThreads) > > It looks like this has been a leftover of some older version (the `G1JavaThreadClaimer` is fairly "new") - I will remove this change. > Thanks for making me look at this again in detail! > To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: > > ``` > inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { > count = 0; > if (Atomic::load(&_cur_claim) >= _list.length()) { > return nullptr; > } > uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); > if (claim >= _list.length()) { > return nullptr; > } > count = MIN2(_list.length() - claim, _claim_step); > return _list.list()->threads() + claim; <--- here > } > ``` > > because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. I think a better solution might be to declare the `Thread::_tlab` member `mutable` and `Thread::tlab() const`.
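The `mutable` suggestion can be illustrated with a small standalone sketch. It assumes the const problem arises because the TLAB has to be modified through a pointer to a `const` thread; that reading is an assumption, and `TlabSketch`, `ThreadSketch` and `resize_all` are hypothetical stand-ins rather than the real HotSpot `ThreadLocalAllocBuffer` and `Thread` classes.

```cpp
// Hypothetical stand-in for a thread-local allocation buffer.
class TlabSketch {
  unsigned _desired_size = 0;
public:
  void resize(unsigned new_size) { _desired_size = new_size; }
  unsigned desired_size() const { return _desired_size; }
};

// Hypothetical stand-in for a thread that owns a TLAB.
class ThreadSketch {
  // 'mutable' allows the member to be modified even through a const ThreadSketch.
  mutable TlabSketch _tlab;
public:
  // A const accessor can therefore hand out a non-const reference.
  TlabSketch& tlab() const { return _tlab; }
};

// Resizes every TLAB while only holding pointers to const threads, so the
// caller never needs to cast constness away from the thread list.
inline void resize_all(const ThreadSketch* const* threads, unsigned count, unsigned new_size) {
  for (unsigned i = 0; i < count; i++) {
    threads[i]->tlab().resize(new_size);
  }
}
```

The trade-off is that `mutable` moves the TLAB outside the thread's logical constness for every caller, which may or may not be acceptable for the real class.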
------------- PR: https://git.openjdk.org/jdk/pull/12360 From kbarrett at openjdk.org Fri Feb 3 01:49:56 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 3 Feb 2023 01:49:56 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> Message-ID: <58lEutsk9hNEy1-seo_pkH0jO8Fr0mz74XqA-JwOix8=.11ff0722-adf8-4770-b2ac-898460bb43b8@github.com> On Fri, 3 Feb 2023 01:34:26 GMT, Kim Barrett wrote: >> It looks like this has been a leftover of some older version (the `G1JavaThreadClaimer` is fairly "new") - I will remove this change. >> Thanks for making me look at this again in detail! > >> To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: >> >> ``` >> inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { >> count = 0; >> if (Atomic::load(&_cur_claim) >= _list.length()) { >> return nullptr; >> } >> uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); >> if (claim >= _list.length()) { >> return nullptr; >> } >> count = MIN2(_list.length() - claim, _claim_step); >> return _list.list()->threads() + claim; <--- here >> } >> ``` >> >> because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. > > I think a better solution might be to declare the `Thread::_tlab` member `mutable` and `Thread::tlab() const`. > Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. > > One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it. 
That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticeable, need to check). > > The other is that with little work per thread, as in this situation, the threads seem to contend heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured ~4x slower with 18 threads than using the single-threaded version (on ~21k JavaThreads) Looking at `Threads::possibly_parallel_threads_do`, I'm thinking it could use a redesign to use the technique being used in `G1JavaThreadsListClaimer`. The existing design and implementation date from when the threads-list was an actual linked list, rather than the newer ThreadsList. But maybe that's future work. ------------- PR: https://git.openjdk.org/jdk/pull/12360 From iklam at openjdk.org Fri Feb 3 04:06:26 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 3 Feb 2023 04:06:26 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v3] In-Reply-To: References: Message-ID: > Goals > > - Simplify the writing of the CDS archive heap > - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code.
> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector > > Implementation - the following runs inside a safepoint so heap objects aren't moving > > - Find all the objects that should be archived > - Allocate buffer using a GrowableArray > - Copy all "open" objects into the buffer > - Copy all "closed" objects into the buffer > - Make sure the heap image can be mapped with all possible G1 region sizes: > - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) > - Align the bottom of the "open" and "closed" objects at 1MB boundary > - Write the buffer to disk as the image for the archive heap > > Testing: tiers 1 ~ 5 Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @ashu-mehra review; added comments about relocating HeapShared::roots() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12304/files - new: https://git.openjdk.org/jdk/pull/12304/files/664fb922..2b814bdd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=01-02 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12304.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12304/head:pull/12304 PR: https://git.openjdk.org/jdk/pull/12304 From dholmes at openjdk.org Fri Feb 3 04:16:51 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 3 Feb 2023 04:16:51 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> References: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> Message-ID: On Thu, 2 Feb 2023 15:34:14 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this small adjustment to close the gap where iterated threads
that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue. >> >> Testing: jdk_jfr >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: > > - typos > - unconditional id assignment Code changes look good. A couple of suggested comment changes to explain the reasoning more clearly. Thanks. src/hotspot/share/prims/jni.cpp line 3468: > 3466: static void post_thread_start_event(const JavaThread* jt) { > 3467: assert(jt != nullptr, "invariant"); > 3468: // We hoist the read for the thread id to ensure the thread is assigned an id (lazy assignment). Suggestion: // Ensure the thread is assigned an id (lazy assignment) even if the event isn't committed. src/hotspot/share/prims/jni.cpp line 3841: > 3839: } > 3840: > 3841: // Please keep this inside the _attaching_via_jni section. Suggestion: // Ensure the thread_id is set whilst still `_attaching_via_jni` so that it can be seen // once the thread is included by the JFR thread iterator. ? ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12388 From dholmes at openjdk.org Fri Feb 3 04:29:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 3 Feb 2023 04:29:53 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v3] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 15:52:01 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster.
>> >> **Suppressing:** >> Currently the suppression list is only used to suppress known JLI leaks; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > > Justin King has updated the pull request incrementally with two additional commits since the last revision: > > - Fix grammatical error > > Signed-off-by: Justin King > - Update based on review and remove restriction on compressed oops and class pointers > > Signed-off-by: Justin King Updates look good - glad to see the flag changes go away! I suggest factoring out the change to `test/jdk/jni/nullCaller/exeNullCallerTest.cpp` as it is a JDK test and not part of hotspot. Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From dholmes at openjdk.org Fri Feb 3 04:37:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 3 Feb 2023 04:37:54 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v3] In-Reply-To: References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: <_Ymy8VmPPZuULJEbIfrvhiES8R78JbBENkibLeo2rSk=.60184c8b-e931-4c6a-b0ce-767d7a706806@github.com> On Thu, 2 Feb 2023 09:14:03 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations.
>> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Move more coding into check_info_file Changes seem fine as-is, but one small tweak if possible. Thanks. src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 570: > 568: static bool check_info_file(const char* fpath, > 569: const char* virt1, VirtualizationType vt1, > 570: const char* virt2, VirtualizationType vt2) { We should (future RFE) convert VirtualizationType to a class/struct so that we don't need to manually pair the name with the value. src/hotspot/share/runtime/abstract_vm_version.hpp line 74: > 72: static unsigned int _data_cache_line_flush_size; > 73: > 74: public: Would protected work here instead of public? ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12217 From iklam at openjdk.org Fri Feb 3 05:23:26 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 3 Feb 2023 05:23:26 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: References: Message-ID: > Goals > > - Simplify the writing of the CDS archive heap > - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector > > Implementation - the following runs inside a safepoint so heap objects aren't moving > > - Find all the objects that should be archived > - Allocate buffer using a GrowableArray > - Copy all "open" objects into the buffer > - Copy all "closed" objects into the buffer > - Make sure the heap image can be mapped with all possible G1 region sizes: > - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) > - Align the bottom of the "open" and "closed" objects at 1MB boundary > - Write the buffer to disk as the image for the archive heap > > Testing: tiers 1 ~ 5 Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Simplified relocation by using buffered address instead of requested address ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12304/files - new: https://git.openjdk.org/jdk/pull/12304/files/2b814bdd..00dcedd7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=02-03 Stats: 34 lines in 3 files changed: 5 ins; 7 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/12304.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12304/head:pull/12304 PR: https://git.openjdk.org/jdk/pull/12304 From iklam at openjdk.org Fri Feb 3 05:31:51 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 3 Feb 2023 05:31:51 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v2] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 21:38:40 GMT, Ashutosh Mehra wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed windows and 32-bit builds which do not support archive heap > > src/hotspot/share/cds/archiveHeapWriter.cpp line 528: > >> 526: template void do_oop_work(T *p) { >> 527: size_t 
field_offset = pointer_delta(p, _src_obj, sizeof(char)); >> 528: ArchiveHeapWriter::relocate_field_in_requested_obj(_requested_obj, field_offset); > > I understand what we are doing here, but it seems to be taking a bit of a convoluted approach. It performs quite a few operations to calculate `source_referent`, which is already available in *p. > So an alternate implementation for `do_oop(narrowOop *p)` can be: > > if (!CompressedOops::is_null(*p)) { > size_t field_offset = pointer_delta(p, _src_obj, sizeof(char)); > oop request_referrent = source_obj_to_requested_obj(CompressedOops::decode(*p)); // source referent is already in *p > narrowOop* field_address_in_buffer = (narrowOop*)(cast_from_oop
(_buffered_obj) + field_offset); > *field_address_in_buffer = CompressedOops::encode(request_referrent); > mark_oop_pointer(_requested_obj, field_offset, _oopmap); > } > > The `_buffered_obj` can be calculated in the `iterator` below and passed to `EmbeddedOopRelocator` c'tor. > `do_oop(oop *p)` would follow the same. Does this read better? Thanks for the suggestion. I simplified the relocation code by loading/storing the field using the "buffered address" instead of the "requested address". ------------- PR: https://git.openjdk.org/jdk/pull/12304 From stuefe at openjdk.org Fri Feb 3 06:18:51 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 3 Feb 2023 06:18:51 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> References: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> Message-ID: On Thu, 2 Feb 2023 06:34:56 GMT, Thomas Stuefe wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Support CDS >> >> Signed-off-by: Justin King > > Thinking this through some more: > > Why would UseCompressedClassPointers cause leak analysis to fail? Because you track Metaspace? If so, we should think whether it makes sense to track Metaspace. > > Metaspace memory is tied to class loaders. Class loaders may live forever, until VM death, in that case Metaspace will not be released. How exactly does the leak profiler deal with that? Does it report those cases as leaks? > > Metaspace "leaks" exist, but these are typically class loader leaks at the application level, and those you cannot recognize from the JVM. You need application knowledge for that. > @tstuefe > > Okay, I removed the forceful disabling of `UseCompressedOops` and `UseCompressedClassPointers` and it looks clean, so I removed that. I think what happened was two things. 
First, when I started this many months ago I did not have a full understanding of the different memory regions used by Hotspot compared to now, and second, when I was originally doing this I got false positive feedback by disabling these, likely due to missing registration of a root region. So I originally chalked it up to being due to misaligned pointers. My original goal was to just get a clean run, so I took the hammer approach. > > As long as the pointers to malloc-based memory are properly aligned in one of the root regions (ones we register or other memory allocated by malloc), LSan is happy. It shouldn't matter whether pointers to data in metaspace are compressed or not, or oops. Both reside in metaspace and java heap, neither of which LSan is directly concerned about other than looking for pointers to malloc-based memory. So you are correct in that regard that neither `UseCompressedOops` nor `UseCompressedClassPointers` should be necessary. However metaspace leaks do exist and this setup actually found one (JDK-8298084), so metaspace being instrumented is definitely useful. Thanks for the explanation. Do I understand correctly: You register address ranges with LSAN, and at certain points (program exit?) the memory is examined for pointers that still point to unreleased memory? Does LSAN instrument malloc and free? > > To clarify, LSan is only concerned with leaks related to memory obtained from malloc and friends. It does not deal with metaspace leaks directly or java heap leaks. Metaspace objects hold pointers to malloced memory. Metaspace Objects can leak, that's accepted. So these pointers would leak too. Would that not give us tons of false positives? In other words, in Metaspace, we need to distinguish real malloc pointer leaks from allowed ones, otherwise monitoring Metaspace is not useful. Real leaks would be malloc pointers in metaspace objects that had been implicitly released by metaspace arena death. I need to think about this some more.
Tracking leaks from metaspace objects could be useful. We could always postpone tracking Metaspace malloc pointers to another RFE, in which case I would remove the changes to VirtualSpaceNode. @dholmes-ora : Do all the sub-data tied to metaspace objects really have to live in C-heap? They mostly do have the same lifetime as the parent object. It would be logical for them to live in Metaspace too - no need to explicitly release them in the various release_C_heap_structures. Cheaper too, since malloc is comparatively expensive. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From ioi.lam at oracle.com Fri Feb 3 07:12:00 2023 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 2 Feb 2023 23:12:00 -0800 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> Message-ID: <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> On 2/1/2023 10:37 PM, David Holmes wrote: > > > On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >> On Thu, Feb 2, 2023 at 5:30 AM David Holmes wrote: >> >> That would have to be determined case-by-case. For each ergo flag we >> would have to decide what value it is calculating and based on what, >> and >> determine whether an out-of-range value is an error or something >> to be >> wary of. >> >> Sure, what I meant was how to control it? Two sets of FLAG_SET_... >> macros, one that asserts on constraint violation, one that caps? > > No I expect to just see the places where FLAG_SET_ERGO is called do > the right thing by checking the return value and proceeding accordingly. David, are you proposing this? int value = calculate_ergo(); if (FLAG_SET_ERGO (TheFlag, value) == error) { value = do_it_again() if (FLAG_SET_ERGO (TheFlag, value) == error) { huh? } } In the above code, all FLAG_SET_ERGO does is to check the range for you. It doesn't tell you why the range check fails.
So the caller would have to redo the calculation (with no more information, other than that the last value was no good). What if the new value is also no good? With this style, you almost have to wrap every FLAG_SET_ERGO in a loop, keep trying (in the dark) until you succeed. ================== I think it's better to just assert inside FLAG_SET_ERGO if the value is out of range. The caller should make sure it has calculated a correct value. Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the value, but that should be an opt-in. FLAG_SET_ERGO() should not automatically cap the value. Thanks - Ioi From thomas.stuefe at gmail.com Fri Feb 3 07:37:34 2023 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 3 Feb 2023 08:37:34 +0100 Subject: JVM Flag ergonomics and constraints In-Reply-To: <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> Message-ID: On Fri, Feb 3, 2023 at 8:12 AM Ioi Lam wrote: > > > On 2/1/2023 10:37 PM, David Holmes wrote: > > > > > > On 2/02/2023 3:25 pm, Thomas Stüfe wrote: > >> On Thu, Feb 2, 2023 at 5:30 AM David Holmes wrote: > >> > >> That would have to be determined case-by-case. For each ergo flag we > >> would have to decide what value it is calculating and based on what, > >> and > >> determine whether an out-of-range value is an error or something > >> to be > >> wary of. > >> > >> Sure, what I meant was how to control it? Two sets of FLAG_SET_... > >> macros, one that asserts on constraint violation, one that caps? > > > > No I expect to just see the places where FLAG_SET_ERGO is called do > > the right thing by checking the return value and proceeding accordingly. > > David, are you proposing this?
> > int value = calculate_ergo(); > if (FLAG_SET_ERGO (TheFlag, value) == error) { > value = do_it_again() > if (FLAG_SET_ERGO (TheFlag, value) == error) { > huh? > } > } > > In the above code, all FLAG_SET_ERGO does is to check the range for you. > It doesn't tell you why the range check fails. So the caller would have > to redo the calculation (with no more information, other than that the > last value was no good). What if the new value is also no good? > > With this style, you almost have to wrap every FLAG_SET_ERGO in a loop, > keep trying (in the dark) until you succeed. > > ================== > I think it's better to just assert inside FLAG_SET_ERGO if the value is > out of range. The caller should make sure it has calculated a correct > value. > > Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the value, > but that should be an opt-in. FLAG_SET_ERGO() should not automatically > cap the value. > > That sounds like a pragmatic solution. The CAPPED version may not even be needed. I expect the asserts mostly to fire for "special" values, as in the cited case of NonNMethodCodeHeapSize, and point to problems in the constraint function itself, which should accept special values. Cheers, Thomas From mbaesken at openjdk.org Fri Feb 3 07:58:06 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 3 Feb 2023 07:58:06 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v3] In-Reply-To: References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Thu, 2 Feb 2023 09:14:03 GMT, Matthias Baesken wrote: >> With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. >> However it seems that for Xen we should look at other file system locations compared to the other virtualizations.
>> more /sys/hypervisor/type >> xen >> seems to be available there. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Move more coding into check_info_file Hi Christoph and David, thanks for the reviews ! ------------- PR: https://git.openjdk.org/jdk/pull/12217 From mbaesken at openjdk.org Fri Feb 3 07:58:08 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 3 Feb 2023 07:58:08 GMT Subject: RFR: JDK-8301050: Detect Xen Virtualization on Linux aarch64 [v3] In-Reply-To: <_Ymy8VmPPZuULJEbIfrvhiES8R78JbBENkibLeo2rSk=.60184c8b-e931-4c6a-b0ce-767d7a706806@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> <_Ymy8VmPPZuULJEbIfrvhiES8R78JbBENkibLeo2rSk=.60184c8b-e931-4c6a-b0ce-767d7a706806@github.com> Message-ID: On Fri, 3 Feb 2023 04:33:45 GMT, David Holmes wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> Move more coding into check_info_file > > src/hotspot/share/runtime/abstract_vm_version.hpp line 74: > >> 72: static unsigned int _data_cache_line_flush_size; >> 73: >> 74: public: > > Would protected work here instead of public? Hi David, unfortunately protected does not work here. 
------------- PR: https://git.openjdk.org/jdk/pull/12217 From mbaesken at openjdk.org Fri Feb 3 07:58:10 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 3 Feb 2023 07:58:10 GMT Subject: Integrated: JDK-8301050: Detect Xen Virtualization on Linux aarch64 In-Reply-To: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> References: <7EBbyNeCuZPMBzTN8MpQlHnCzJ2myxa5xQh2MfAEKmo=.0a4f74e2-7dd9-4b79-a78b-6bd66380534b@github.com> Message-ID: On Thu, 26 Jan 2023 13:22:48 GMT, Matthias Baesken wrote: > With https://bugs.openjdk.org/browse/JDK-8300266 KVM and VMWare virtualization detection has been added; this should be enhanced by Xen detection. > However it seems that for Xen we should look at other file system locations compared to the other virtualizations. > more /sys/hypervisor/type > xen > seems to be available there. This pull request has now been integrated. Changeset: 11804b24 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/11804b246e8643a3465b9549794ccfb24ccd8fc5 Stats: 30 lines in 2 files changed: 17 ins; 2 del; 11 mod 8301050: Detect Xen Virtualization on Linux aarch64 Reviewed-by: dholmes, clanger ------------- PR: https://git.openjdk.org/jdk/pull/12217 From xlinzheng at openjdk.org Fri Feb 3 08:16:24 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Fri, 3 Feb 2023 08:16:24 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: References: Message-ID: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> > After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. > > > [Stub Code] > ... 
> > __ align() // emit nothing or a 4-byte padding > <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > > Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. > > So originally: > The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: > > We are at (A) -> > do an align() -> > We are at (B) -> > emit lots of relocations bound to this shared trampoline at (B) -> > do an emit_trampoline_stub() -> > We are at (C) > > > After this patch: > > We are at (A) -> > do an emit_trampoline_stub(), which contains an align() already -> > We are at (C) directly -> > reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> > emit lots of relocations bound to this shared trampoline at (B) > > > Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. > > Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. > > > Thanks, > Xiaolin Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - license - Merge branch 'master' into shared-trampoline - __ - Refactor and cleanup for shared trampolines ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11749/files - new: https://git.openjdk.org/jdk/pull/11749/files/aefe74ff..6159a93f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=01-02 Stats: 59929 lines in 3052 files changed: 25333 ins; 10660 del; 23936 mod Patch: https://git.openjdk.org/jdk/pull/11749.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11749/head:pull/11749 PR: https://git.openjdk.org/jdk/pull/11749 From xlinzheng at openjdk.org Fri Feb 3 08:23:55 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Fri, 3 Feb 2023 08:23:55 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> References: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> Message-ID: On Fri, 3 Feb 2023 08:16:24 GMT, Xiaolin Zheng wrote: >> After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. >> >> >> [Stub Code] >> ... 
>> >> __ align() // emit nothing or a 4-byte padding >> <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> >> Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. >> >> So originally: >> The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: >> >> We are at (A) -> >> do an align() -> >> We are at (B) -> >> emit lots of relocations bound to this shared trampoline at (B) -> >> do an emit_trampoline_stub() -> >> We are at (C) >> >> >> After this patch: >> >> We are at (A) -> >> do an emit_trampoline_stub(), which contains an align() already -> >> We are at (C) directly -> >> reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> >> emit lots of relocations bound to this shared trampoline at (B) >> >> >> Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. >> >> Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. >> >> >> Thanks, >> Xiaolin > > Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: > > - license > - Merge branch 'master' into shared-trampoline > - __ > - Refactor and cleanup for shared trampolines Not yet. Ping to keep it active - May I receive another review from the AArch64 side, please? Thanks in advance. ------------- PR: https://git.openjdk.org/jdk/pull/11749 From xlinzheng at openjdk.org Fri Feb 3 08:25:33 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Fri, 3 Feb 2023 08:25:33 GMT Subject: RFR: 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops Message-ID: RISC-V part of [JDK-8300002](https://bugs.openjdk.org/browse/JDK-8300002), one line only. (Did not observe performance impact on JMH mentioned in JDK-8300002.) Tested hotspot tier1 (fastdebug). ------------- Commit messages: - JDK-8300002 Changes: https://git.openjdk.org/jdk/pull/12402/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12402&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301743 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12402.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12402/head:pull/12402 PR: https://git.openjdk.org/jdk/pull/12402 From dholmes at openjdk.org Fri Feb 3 08:34:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 3 Feb 2023 08:34:52 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v3] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 04:26:40 GMT, David Holmes wrote: >> Justin King has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fix grammatical error >> >> Signed-off-by: Justin King >> - Update based on review and remove restriction on compressed oops and class pointers >> >> Signed-off-by: Justin King > > Updates look good - glad to see the flag changes go away! 
> > I suggest factoring out the change to `test/jdk/jni/nullCaller/exeNullCallerTest.cpp` as it is a JDK test and not part of hotspot. Thanks. > @dholmes-ora : Do all the sub-data tied to metaspace objects really have to live in C-heap? @tstuefe - sorry I have no knowledge about that. Perhaps @coleenp can answer? ------------- PR: https://git.openjdk.org/jdk/pull/12229 From david.holmes at oracle.com Fri Feb 3 08:40:00 2023 From: david.holmes at oracle.com (David Holmes) Date: Fri, 3 Feb 2023 18:40:00 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> Message-ID: On 3/02/2023 5:12 pm, Ioi Lam wrote: > > > On 2/1/2023 10:37 PM, David Holmes wrote: >> >> >> On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes wrote: >>> >>> That would have to be determined case-by-case. For each ergo flag we >>> would have to decide what value it is calculating and based on what, >>> and >>> determine whether an out-of-range value is an error or something >>> to be >>> wary of. >>> >>> Sure, what I meant was how to control it? Two sets of FLAG_SET_... >>> macros, one that asserts on constraint violation, one that caps? >> >> No I expect to just see the places where FLAG_SET_ERGO is called do >> the right thing by checking the return value and proceeding accordingly. > > David, are you proposing this? > > int value = calculate_ergo(); > if (FLAG_SET_ERGO (TheFlag, value) == error) { > value = do_it_again() > if (FLAG_SET_ERGO (TheFlag, value) == error) { > huh? > } > } > > In the above code, all FLAG_SET_ERGO does is to check the range for you. > It doesn't tell you why the range check fails.
So the caller would have > to redo the calculation (with no more information, other than that the > last value was no good). What if the new value is also no good? To me that indicates we don't know what the heck we are calculating! How can you calculate a setting for a flag if you don't even understand the valid range of that flag? That may mean the max, for example, needs to be exposed programmatically. > With this style, you almost have to wrap every FLAG_SET_ERGO in a loop, > keep trying (in the dark) until you succeed. > > ================== > I think it's better to just assert inside FLAG_SET_ERGO if the value is > out of range. The caller should make sure it has calculated a correct > value. If the calculation uses some externally settable value then an assert won't help if we're in a product build. As I said it depends on exactly what we are using to do the calculation. If it is "closed world" then it is a fatal error to get it wrong and should be detected during testing. Otherwise the code has to deal with the possibility of error. > > Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the value, > but that should be an opt-in. FLAG_SET_ERGO() should not automatically > cap the value. That might be suitable for some flags/calculations. I don't like trying to talk about this in generalities. Each specific situation needs to be examined to understand what is being calculated and how. But it is the callers of FLAG_SET_ERGO that need to know what they are doing. Cheers, David > > Thanks > - Ioi > From mgronlun at openjdk.org Fri Feb 3 09:32:25 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Fri, 3 Feb 2023 09:32:25 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v3] In-Reply-To: References: Message-ID: > Greetings, > > please help review this small adjustment to close the gap where iterated threads that attach via JNI are left without a valid JFR thread id.
For details, please see the JIRA issue. > > Testing: jdk_jfr > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: revert uncalled for changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12388/files - new: https://git.openjdk.org/jdk/pull/12388/files/34fc58aa..5f129b91 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12388&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12388&range=01-02 Stats: 8 lines in 1 file changed: 2 ins; 5 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12388.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12388/head:pull/12388 PR: https://git.openjdk.org/jdk/pull/12388 From mgronlun at openjdk.org Fri Feb 3 09:32:28 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Fri, 3 Feb 2023 09:32:28 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: References: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> Message-ID: <6Ej-RlKCJbvtXd2kt6D52Ne7QIQLkvKU6K3UItRRYUc=.e0457d0a-f45c-4904-97b6-561d05a18744@github.com> On Fri, 3 Feb 2023 04:06:46 GMT, David Holmes wrote: >> Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: >> >> - typos >> - unconditional id assignment > > src/hotspot/share/prims/jni.cpp line 3468: > >> 3466: static void post_thread_start_event(const JavaThread* jt) { >> 3467: assert(jt != nullptr, "invariant"); >> 3468: // We hoist the read for the thread id to ensure the thread is assigned an id (lazy assignment). > > Suggestion: > > // Ensure the thread is assigned an id (lazy assignment) even if the event isn't committed. Thanks David. I realized there is no need to move anything in jni.cpp, so I rolled them back. The change to the iterator will suffice.
> src/hotspot/share/prims/jni.cpp line 3841: > >> 3839: } >> 3840: >> 3841: // Please keep this inside the _attaching_via_jni section. > > Suggestion: > > // Ensure the thread_id is set whilst still `_attaching_via_jni` so that it can be seen > // once the thread is included by the JFR thread iterator. > > ? Thanks David. I realized there is no need to move anything in jni.cpp, so I rolled them back. The change to the iterator will suffice. ------------- PR: https://git.openjdk.org/jdk/pull/12388 From tschatzl at openjdk.org Fri Feb 3 09:49:51 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 3 Feb 2023 09:49:51 GMT Subject: RFR: 8301371: Interpreter store barriers on x86_64 don't have disjoint temp registers In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 08:08:14 GMT, Erik Österlund wrote: > The interpreter store barriers on x86_64 use temp registers that sometimes intersects with registers that are part of the address. Therefore, when clobbering temp registers, the address gets destroyed. This has happened to be okay so far, based on how these registers are used in existing backends, but it doesn't work well for generational ZGC. This CR selects new temp registers that I have verified works for everyone on x86_64. Marked as reviewed by tschatzl (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12309 From adinn at openjdk.org Fri Feb 3 10:46:53 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 3 Feb 2023 10:46:53 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> References: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> Message-ID: On Fri, 3 Feb 2023 08:16:24 GMT, Xiaolin Zheng wrote: >> After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose.
If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. >> >> >> [Stub Code] >> ... >> >> __ align() // emit nothing or a 4-byte padding >> <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> >> Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. >> >> So originally: >> The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: >> >> We are at (A) -> >> do an align() -> >> We are at (B) -> >> emit lots of relocations bound to this shared trampoline at (B) -> >> do an emit_trampoline_stub() -> >> We are at (C) >> >> >> After this patch: >> >> We are at (A) -> >> do an emit_trampoline_stub(), which contains an align() already -> >> We are at (C) directly -> >> reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> >> emit lots of relocations bound to this shared trampoline at (B) >> >> >> Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. >> >> Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. >> >> >> Thanks, >> Xiaolin > > Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - license > - Merge branch 'master' into shared-trampoline > - __ > - Refactor and cleanup for shared trampolines AArch64 changes look ok to me. src/hotspot/cpu/aarch64/codeBuffer_aarch64.cpp line 71: > 69: > 70: assert(requests->number_of_entries() >= 1, "at least one"); > 71: const int total_requested_size = MacroAssembler::trampoline_stub_size() * requests->number_of_entries(); There's a detail here that is worth mentioning. At first I was concerned that this calculation might overestimate the required size by one 32 bit word -- that happens where we are able generate the first stub without the need for any alignment padding. However, that won't cause any unnecessary failure because the rounding up will be from an odd to an even multiple of 32 bit words i.e. adding this extra half-word will not push the request over to consume an extra 64 bit word. ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.org/jdk/pull/11749 From eosterlund at openjdk.org Fri Feb 3 10:47:51 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 3 Feb 2023 10:47:51 GMT Subject: RFR: 8301371: Interpreter store barriers on x86_64 don't have disjoint temp registers In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 09:47:08 GMT, Thomas Schatzl wrote: >> The interpreter store barriers on x86_64 use temp registers that sometimes intersects with registers that are part of the address. Therefore, when clobbering temp registers, the address gets destroyed. This has happened to be okay so far, based on how these registers are used in existing backends, but it doesn't work well for generational ZGC. This CR selects new temp registers that I have verified works for everyone on x86_64. > > Marked as reviewed by tschatzl (Reviewer). Thanks for the review, @tschatzl! 
------------- PR: https://git.openjdk.org/jdk/pull/12309 From tschatzl at openjdk.org Fri Feb 3 11:28:59 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 3 Feb 2023 11:28:59 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: <58lEutsk9hNEy1-seo_pkH0jO8Fr0mz74XqA-JwOix8=.11ff0722-adf8-4770-b2ac-898460bb43b8@github.com> References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> <58lEutsk9hNEy1-seo_pkH0jO8Fr0mz74XqA-JwOix8=.11ff0722-adf8-4770-b2ac-898460bb43b8@github.com> Message-ID: <69qqMeVrgCWBHALBnDUMllkaE7wKyZ61zfAugbhLGcY=.7295c1ab-c74c-4896-b170-679bdbbc3f64@github.com> On Fri, 3 Feb 2023 01:47:32 GMT, Kim Barrett wrote: >>> To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`: >>> >>> ``` >>> inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) { >>> count = 0; >>> if (Atomic::load(&_cur_claim) >= _list.length()) { >>> return nullptr; >>> } >>> uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step); >>> if (claim >= _list.length()) { >>> return nullptr; >>> } >>> count = MIN2(_list.length() - claim, _claim_step); >>> return _list.list()->threads() + claim; <--- here >>> } >>> ``` >>> >>> because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`. >> >> I think a better solution might be to declare the `Thread::_tlab` member `mutable` and `Thread::tlab() const`. > >> Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. 
>> >> One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it. That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticeable, need to check). >> >> The other is that with little work per thread, as in this situation, the threads seem to be contending heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured ~4x slower with 18 threads than using the single-threaded version (on ~21k JavaThreads) > > Looking at `Threads::possibly_parallel_threads_do`, I'm thinking it could use a redesign to use the technique > being used in `G1JavaThreadsListClaimer`. The existing design and implementation dates from when the > threads-list was an actual linked list, rather than the newer ThreadsList. But maybe that's future work. > > Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. > > One issue is that all threads need to traverse the array from the beginning to get to the current claim position [...] > > The other is that with little work per thread, as in this situation, the threads seem to be contending heavily on the JavaThread claim tokens [...] > > Looking at `Threads::possibly_parallel_threads_do`, I'm thinking it could use a redesign to use the technique being used in `G1JavaThreadsListClaimer`. The existing design and implementation dates from when the threads-list was an actual linked list, rather than the newer ThreadsList. But maybe that's future work. I considered that, but the non-Java threads still use the linked list design, so I moved that to future work :) I can investigate that a bit more though, but would like to keep it as is for this change.
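
For readers following along, here is a minimal sketch of the chunked claiming technique discussed above, modelled on the `G1JavaThreadsListClaimer::claim` snippet quoted earlier in this thread. It is not the HotSpot code: `ChunkClaimer` and `drain` are hypothetical names, and `std::atomic` stands in for HotSpot's `Atomic` class. The point is that each worker takes a whole range of indices with one atomic add instead of contending on a per-thread claim token.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a claimer over an array of `length` items.
// Workers call claim() concurrently; each successful call hands out a
// disjoint index range [begin, begin + count).
class ChunkClaimer {
  std::atomic<std::size_t> _cur_claim{0};
  const std::size_t _length;
  const std::size_t _claim_step;

public:
  ChunkClaimer(std::size_t length, std::size_t claim_step)
    : _length(length), _claim_step(claim_step) {
    assert(claim_step > 0);
  }

  // Returns true and fills begin/count if a chunk was claimed,
  // false once the whole array has been handed out.
  bool claim(std::size_t& begin, std::size_t& count) {
    count = 0;
    // Cheap early exit so late callers do not keep bumping the counter.
    if (_cur_claim.load(std::memory_order_relaxed) >= _length) {
      return false;
    }
    std::size_t claim = _cur_claim.fetch_add(_claim_step, std::memory_order_relaxed);
    if (claim >= _length) {
      return false;
    }
    begin = claim;
    count = std::min(_length - claim, _claim_step);
    return true;
  }
};

// Single-threaded helper for illustration: claims until exhausted and
// returns the total number of items handed out.
inline std::size_t drain(ChunkClaimer& claimer) {
  std::size_t total = 0, begin = 0, count = 0;
  while (claimer.claim(begin, count)) {
    total += count;
  }
  return total;
}
```

With this shape a worker never walks the array from the start to find its position, and contention is limited to one shared counter that is touched once per chunk.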
------------- PR: https://git.openjdk.org/jdk/pull/12360 From jsjolen at openjdk.org Fri Feb 3 11:56:03 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Feb 2023 11:56:03 GMT Subject: RFR: JDK-8301479: Replace NULL with nullptr in os/linux [v2] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 15:14:52 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/linux. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Missed one mistake All tests passing, integrating. Thanks for review.
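
As an aside on why this mechanical replacement needs the manual review described above, here is a minimal sketch, not taken from the PR. `nullptr` has type `std::nullptr_t`, which converts implicitly to pointer types but not to integer types, so code that used `NULL` as an integer placeholder has to be rewritten by hand. The `which` overloads and `demo` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical overload pair: one takes an integer, one takes a pointer.
inline const char* which(unsigned long) { return "integer"; }
inline const char* which(void*)         { return "pointer"; }

// nullptr converts implicitly to pointers, but not to integers.
static_assert(std::is_convertible<std::nullptr_t, void*>::value,
              "nullptr_t converts to pointer types");
static_assert(!std::is_convertible<std::nullptr_t, unsigned long>::value,
              "nullptr_t does not implicitly convert to integer types");

inline void demo() {
  unsigned long id = 0;      // an integer field must be given an integer value
  void* handle = nullptr;    // a pointer field can take nullptr
  assert(std::strcmp(which(id), "integer") == 0);
  assert(std::strcmp(which(handle), "pointer") == 0);
  // Only the pointer overload is viable for nullptr, so this is unambiguous.
  assert(std::strcmp(which(nullptr), "pointer") == 0);
}
```

The practical consequence is that a blind `NULL` to `nullptr` substitution stops compiling wherever the old `NULL` was really being used as an integer, which is useful, but each such site needs a human to choose the right integer value.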
------------- PR: https://git.openjdk.org/jdk/pull/12315 From jsjolen at openjdk.org Fri Feb 3 11:56:04 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Feb 2023 11:56:04 GMT Subject: Integrated: JDK-8301479: Replace NULL with nullptr in os/linux In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 10:15:48 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/linux. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated.
Changeset: ac9e0467 Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/ac9e046748a9bb6ee065dc473d82135ce36043b7 Stats: 367 lines in 17 files changed: 0 ins; 0 del; 367 mod 8301479: Replace NULL with nullptr in os/linux Reviewed-by: coleenp, sgehwolf ------------- PR: https://git.openjdk.org/jdk/pull/12315 From jsjolen at openjdk.org Fri Feb 3 12:20:25 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Feb 2023 12:20:25 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v2] In-Reply-To: References: Message-ID: <6thW4VFZvy37BXKjJ8DTgOJew8Uhsn4AlSyblY91ZZ8=.0481520d-eef1-4e47-8e95-2f8c1595cb6d@github.com> > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks!
Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: - The 2nd argument is a DWORD not a pointer - Thread ID is an unsigned long on Windows ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12317/files - new: https://git.openjdk.org/jdk/pull/12317/files/27e135de..3fb6b8f6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12317.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12317/head:pull/12317 PR: https://git.openjdk.org/jdk/pull/12317 From jsjolen at openjdk.org Fri Feb 3 12:20:25 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 3 Feb 2023 12:20:25 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 10:16:14 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
> > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Found a couple of bugs: 1. thread_id was assigned NULL, but thread_id is an `unsigned long` on Windows. Maybe this should actually be `uintptr_t`? 2. An argument (ignored, according to the MS docs) had type `DWORD`, not a pointer type. ------------- PR: https://git.openjdk.org/jdk/pull/12317 From fyang at openjdk.org Fri Feb 3 12:29:52 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 3 Feb 2023 12:29:52 GMT Subject: RFR: 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops In-Reply-To: References: Message-ID: <-FYbgRBONLkylnwdv6QiuP3RF3sxisf_eyo4FYWramo=.d4406b99-701c-4b46-850d-4fa279cc4bc3@github.com> On Fri, 3 Feb 2023 08:19:08 GMT, Xiaolin Zheng wrote: > RISC-V part of [JDK-8300002](https://bugs.openjdk.org/browse/JDK-8300002), one line only. > > (Did not observe performance impact on JMH mentioned in JDK-8300002.) > > Tested hotspot tier1 (fastdebug). Marked as reviewed by fyang (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12402 From duke at openjdk.org Fri Feb 3 14:08:55 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 3 Feb 2023 14:08:55 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 05:23:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Simplified relocation by using buffered address instead of requested address Marked as reviewed by ashu-mehra at github.com (no known OpenJDK username). @iklam thank you for addressing the comments. It looks good to me. 
------------- PR: https://git.openjdk.org/jdk/pull/12304 From duke at openjdk.org Fri Feb 3 14:08:59 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 3 Feb 2023 14:08:59 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v2] In-Reply-To: References: Message-ID: <8tXHG43YsqH0wIHOTTGSt7eOFMkZeJu0_NwoeWBwMgo=.d3087c8f-9119-4dcb-b73c-055ceba2b666@github.com> On Thu, 2 Feb 2023 21:41:47 GMT, Ashutosh Mehra wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed windows and 32-bit builds which do not support archive heap > > src/hotspot/share/cds/archiveHeapWriter.cpp line 559: > >> 557: for (int i = 0; i < length; i++) { >> 558: if (UseCompressedOops) { >> 559: relocate_root_at(requested_roots, i); > > This can also follow similar pattern as code in my comment on`do_oop_work` above, as in: > > > if (UseCompressedOops) { > offset = (size_t)((objArrayOop)requested_roots)->obj_at_offset(i); > narrowOop* buffered_addr = (narrowOop*)(cast_from_oop
(buffered_roots) + offset); > oop request_referent = source_obj_to_requested_obj(CompressedOops::decode(*buffered_addr)); > *buffered_addr = CompressedOops::encode(request_referent); > mark_oop_pointer(requested_roots, offset, heap_info->oopmap()); > } In light of the recent change, this has also been taken care of. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From jcking at openjdk.org Fri Feb 3 15:11:52 2023 From: jcking at openjdk.org (Justin King) Date: Fri, 3 Feb 2023 15:11:52 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v2] In-Reply-To: References: <9Q5Yb8KEUtCRCVYWSAD0WxGEfOCxAmov9_g1zQ8i7fM=.fa6eb056-13b9-4057-9adf-c5d185f4895b@github.com> Message-ID: <_NnlbvduQz7iO5RIVJtALHW_ZmksdiH9XWgpz94s4_A=.54301297-1c6c-4722-9aee-348dbb0957db@github.com> On Fri, 3 Feb 2023 06:16:06 GMT, Thomas Stuefe wrote: > Thanks for the explanation. Do I understand correctly: You register address ranges with LSAN, and at certain points (program exit?) the memory is examined for pointers that point still to unreleased memory? Does LSAN instrument malloc and free? LSan intercepts all malloc-related functions and is its own allocator. It normally performs leak checking at program exit, but we disable this behavior. We do not particularly care about leaks that happen as a result of the shutdown process. Instead we trigger leak checking extremely early in the JVM shutdown process, immediately after the synchronization lock but before the shutdown logic actually occurs. When LSan performs leak checking, the entire process is paused except the thread performing the leak checking. LSan is aware of all memory that has been allocated up until that point, let's call it set `A`, its goal is to prove that all pointers (the ones returned by malloc and friends) are present. It considers other memory allocated by malloc, the mutable data section of the executable, our registered root regions, and some others. 
During scanning if it finds the pointer to an allocation in set `A`, besides from an allocation itself, it marks it. It must be the pointer returned by malloc and friends itself, not a pointer to a subregion of the allocation. After the scanning process, leaks are all the allocations in set `A` that have not been marked. > Metaspace objects hold pointers to malloced memory. Metaspace Objects can leak, that's accepted. So these pointers would leak too. Would that not give us tons of false positives? No. LSan only cares that it can find pointers to malloc'ed memory, not that it can find pointers to Metaspace objects. The Metaspace is registered as a root region. It does however take into account ASan poisoning, which we added earlier to Metaspace. LSan will not scan poisoned memory regions that are part of root regions. We only poison in Metaspace when deallocation occurs and unpoison when allocation occurs. So there are no false positives there. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jsjolen at openjdk.org Fri Feb 3 15:18:34 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 3 Feb 2023 15:18:34 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v3] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. 
An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Bug due to null string conversion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12317/files - new: https://git.openjdk.org/jdk/pull/12317/files/3fb6b8f6..de13f1ce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12317.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12317/head:pull/12317 PR: https://git.openjdk.org/jdk/pull/12317 From gcao at openjdk.org Sat Feb 4 06:51:36 2023 From: gcao at openjdk.org (Gui Cao) Date: Sat, 4 Feb 2023 06:51:36 GMT Subject: RFR: 8301818: RISC-V: Small Optimizing for use mv to replace mvw Message-ID: HI, The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. 
## Testing: - all tier1 on unmatched board without new failures ------------- Commit messages: - Small Optimizing for use mv to replace mvw Changes: https://git.openjdk.org/jdk/pull/12425/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12425&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301818 Stats: 31 lines in 9 files changed: 0 ins; 2 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/12425.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12425/head:pull/12425 PR: https://git.openjdk.org/jdk/pull/12425 From kbarrett at openjdk.org Sat Feb 4 07:52:57 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Feb 2023 07:52:57 GMT Subject: RFR: 8274400: HotSpot Style Guide should permit use of alignof [v7] In-Reply-To: References: <3IdeZGaGcFB1-DbVugn0P9v8vRQMLURzHFRSpAm46NA=.09bf4dd8-b652-49c3-b9a5-ad3f58dfbcc3@github.com> Message-ID: On Tue, 24 Jan 2023 09:21:37 GMT, Julian Waters wrote: >> The alignof operator was added by C++11. It returns the alignment for the given type. Various metaprogramming usages exist, in particular when using std::aligned_storage. Use of this operator should be permitted in HotSpot code. > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Oof Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.org/jdk/pull/11761 From ysuenaga at openjdk.org Sat Feb 4 09:49:28 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Sat, 4 Feb 2023 09:49:28 GMT Subject: RFR: 8301820: C4819 warnings were reported on Windows Message-ID: C4819 warnings were reported when I tried to build JDK on Windows with VS2022. This PR contains changes both HotSpot and client libraries. Let me know if they should be separated. 
* HotSpot * stubGenerator_x86_64_poly.cpp * elfFile.hpp * libfontmanager * hb.hh * libfreetype * afblue.c I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. ------------- Commit messages: - 8301820: C4819 warnings were reported on Windows Changes: https://git.openjdk.org/jdk/pull/12427/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12427&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301820 Stats: 56 lines in 3 files changed: 0 ins; 0 del; 56 mod Patch: https://git.openjdk.org/jdk/pull/12427.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12427/head:pull/12427 PR: https://git.openjdk.org/jdk/pull/12427 From luhenry at openjdk.org Sat Feb 4 10:55:51 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Sat, 4 Feb 2023 10:55:51 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 06:42:33 GMT, Gui Cao wrote: > HI, > > The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. > ## Testing: > - all tier1 on unmatched board without new failures src/hotspot/cpu/riscv/vtableStubs_riscv.cpp line 92: > 90: > 91: // check offset vs vtable length > 92: __ lwu(t0, Address(t2, Klass::vtable_length_offset())); There is an extra `lwu` -> `lw` which doesn't happen anywhere else. ------------- PR: https://git.openjdk.org/jdk/pull/12425 From luhenry at openjdk.org Sat Feb 4 10:58:48 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Sat, 4 Feb 2023 10:58:48 GMT Subject: RFR: 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops In-Reply-To: References: Message-ID: <3TYySwZ-wfgkUQYQyzuktYKP1rCmAH8wwmS9sNYmOCY=.dee213d2-013b-4fe5-966a-d64cf122eac5@github.com> On Fri, 3 Feb 2023 08:19:08 GMT, Xiaolin Zheng wrote: > RISC-V part of [JDK-8300002](https://bugs.openjdk.org/browse/JDK-8300002), one line only. 
> > (Did not observe performance impact on JMH mentioned in JDK-8300002.) > > Tested hotspot tier1 (fastdebug). Marked as reviewed by luhenry (Committer). ------------- PR: https://git.openjdk.org/jdk/pull/12402 From luhenry at openjdk.org Sat Feb 4 10:59:51 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Sat, 4 Feb 2023 10:59:51 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> References: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> Message-ID: <63b6AafFXzSHmamyYMEoKz9PySDfxNId_txvtz5lvhk=.0f3a4fcc-f082-443b-a6f5-115de6b936b7@github.com> On Fri, 3 Feb 2023 08:16:24 GMT, Xiaolin Zheng wrote: >> After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. >> >> >> [Stub Code] >> ... >> >> __ align() // emit nothing or a 4-byte padding >> <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> >> Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. 
>> >> So originally: >> The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: >> >> We are at (A) -> >> do an align() -> >> We are at (B) -> >> emit lots of relocations bound to this shared trampoline at (B) -> >> do an emit_trampoline_stub() -> >> We are at (C) >> >> >> After this patch: >> >> We are at (A) -> >> do an emit_trampoline_stub(), which contains an align() already -> >> We are at (C) directly -> >> reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> >> emit lots of relocations bound to this shared trampoline at (B) >> >> >> Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. >> >> Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. >> >> >> Thanks, >> Xiaolin > > Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - license > - Merge branch 'master' into shared-trampoline > - __ > - Refactor and cleanup for shared trampolines Marked as reviewed by luhenry (Committer). ------------- PR: https://git.openjdk.org/jdk/pull/11749 From gcao at openjdk.org Sat Feb 4 11:30:03 2023 From: gcao at openjdk.org (Gui Cao) Date: Sat, 4 Feb 2023 11:30:03 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: > HI, > > The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. 
> ## Testing: > - all tier1 on unmatched board without new failures Gui Cao has updated the pull request incrementally with one additional commit since the last revision: Revert use lwu ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12425/files - new: https://git.openjdk.org/jdk/pull/12425/files/e468629e..a4a05c55 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12425&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12425&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12425.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12425/head:pull/12425 PR: https://git.openjdk.org/jdk/pull/12425 From gcao at openjdk.org Sat Feb 4 11:30:07 2023 From: gcao at openjdk.org (Gui Cao) Date: Sat, 4 Feb 2023 11:30:07 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 10:52:50 GMT, Ludovic Henry wrote: >> Gui Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert use lwu > > src/hotspot/cpu/riscv/vtableStubs_riscv.cpp line 92: > >> 90: >> 91: // check offset vs vtable length >> 92: __ lwu(t0, Address(t2, Klass::vtable_length_offset())); > > There is an extra `lwu` -> `lw` which doesn't happen anywhere else. Thanks for your review. fixed. 
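For readers following the `lwu` versus `lw` remark above: on RV64, `lw` sign-extends the loaded 32-bit word to 64 bits while `lwu` zero-extends it, so the two only agree when the top bit of the word is clear. The sketch below models just that ISA-level difference in plain C++. The helper names are made up for illustration and nothing here is HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Models RISC-V `lw` on RV64: the 32-bit word is sign-extended to 64 bits.
inline std::uint64_t load_word_signed(std::uint32_t word) {
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(word)));
}

// Models RISC-V `lwu` on RV64: the 32-bit word is zero-extended to 64 bits.
inline std::uint64_t load_word_unsigned(std::uint32_t word) {
  return static_cast<std::uint64_t>(word);
}
```

For a small non-negative value such as 42 both helpers return the same result; for `0x80000000` the signed variant fills the upper 32 bits with ones and the unsigned variant leaves them zero.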
------------- PR: https://git.openjdk.org/jdk/pull/12425 From jwaters at openjdk.org Sat Feb 4 15:05:06 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 4 Feb 2023 15:05:06 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v15] In-Reply-To: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: > C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). > > We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. > > Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right > > This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 15 additional commits since the last revision: - Merge branch 'openjdk:master' into alignas - alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - ... and 5 more: https://git.openjdk.org/jdk/compare/efea724e...a621bb62 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11431/files - new: https://git.openjdk.org/jdk/pull/11431/files/b36f835c..a621bb62 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11431&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11431&range=13-14 Stats: 12084 lines in 309 files changed: 5119 ins; 3210 del; 3755 mod Patch: https://git.openjdk.org/jdk/pull/11431.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11431/head:pull/11431 PR: https://git.openjdk.org/jdk/pull/11431 From luhenry at openjdk.org Sat Feb 4 21:09:50 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Sat, 4 Feb 2023 21:09:50 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 11:30:03 GMT, Gui Cao wrote: >> HI, >> >> The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. >> ## Testing: >> - all tier1 on unmatched board without new failures > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revert use lwu Marked as reviewed by luhenry (Committer). 
------------- PR: https://git.openjdk.org/jdk/pull/12425 From kbarrett at openjdk.org Sat Feb 4 22:52:52 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Feb 2023 22:52:52 GMT Subject: RFR: JDK-8299688: Adopt C++14 compatible std::bit_cast introduced in C++20 [v8] In-Reply-To: References: Message-ID: On Mon, 23 Jan 2023 19:10:50 GMT, Justin King wrote: >> Adopts a C++14 compatible implementation of [std::bit_cast](https://en.cppreference.com/w/cpp/numeric/bit_cast). >> >> `PrimitiveConversions::cast` is refactored into `bit_cast` and extended to support all compatible types. Additionally it makes use of `__builtin_bit_cast` when available, which is strictly well defined compared to fallback approaches which are sometimes lurking in undefined behavior territory. >> >> Lastly some legacy casting is updated to use `bit_cast` in `utilities/globalDefinitions.hpp`. > > Justin King has updated the pull request incrementally with two additional commits since the last revision: > > - Documentation updates > > Signed-off-by: Justin King > - Restore comment regarding integral casting > > Signed-off-by: Justin King Changes requested by kbarrett (Reviewer). .github/workflows/main.yml line 238: > 236: with: > 237: platform: macos-x64 > 238: xcode-toolset-version: '12.5.1' This and the other toolchain update proposed here are to avoid the mentioned bug in Apple Clang, so that `__builtin_bit_cast` can be used with that compiler. Since I don't think we should be using the builtin, I don't think this is needed either. There may be other reasons to move the GHA toolchain version forward, but that should be a separate discussion. src/hotspot/share/utilities/bitCast.hpp line 35: > 33: // Casts from type From to type To without changing the underlying bit representation. This is > 34: // partially compatible with std::bit_cast introduced in C++20, but is more restrictive on the type > 35: // of conversions allowed. 
If it's not compatible (or at least trying to be, in so far as possible within the restrictions we operate under) with `std::bit_cast` (and I've suggested there are reasons why we shouldn't be compatible), then using the name `bit_cast` seems like a mistake that will only cause confusion. `PrimitiveConversions::cast` might not be the best name either (John Rose certainly expressed dislike), but discussing a name change is probably better done outside the context of a PR. In the meantime, I don't think this part of the change should be made. There are various places that should be using PrimitiveConversions::cast (under whatever name is ultimately chosen) but aren't, and they can be changed to do so. Later, once we've picked the final name, we can do a global rename. ------------- PR: https://git.openjdk.org/jdk/pull/11865 From kbarrett at openjdk.org Sat Feb 4 22:52:54 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 4 Feb 2023 22:52:54 GMT Subject: RFR: JDK-8299688: Adopt C++14 compatible std::bit_cast introduced in C++20 [v4] In-Reply-To: References: Message-ID: On Mon, 23 Jan 2023 19:00:09 GMT, Justin King wrote: >> I also don't like the direct call to __builtins either, gcc to my knowledge automatically replaces the appropriate builtins when optimization is specified > > Apple Clang seemed to have a bug with the version we are using in GHA. I bumped them to 12.5.1 and the error disappeared. > > `__builtin_bit_cast` is used to implement std::bit_cast for Clang/GCC. It is purely a compiler construct IIRC and has no runtime function equivalent. It's always replaced with machine code for performing a byte-wise copy and ensures no shenanigans occur. Additionally std::bit_cast is the only way to convert between floating point and integral without invoking undefined behavior, though the union trick does currently work. The other option is std::memcpy. I know what `__builtin_bit_cast` is. 
I don't think we need to use it, so shouldn't uglify the code to do so. There is also a risk of different semantics depending whether the builtin is used or not. For example, `std::bit_cast` between a floating point type and an integral type is constexpr, so presumably the builtin is similarly constexpr. But the non-builtin implementation isn't. ------------- PR: https://git.openjdk.org/jdk/pull/11865 From mikael at openjdk.org Sat Feb 4 23:11:48 2023 From: mikael at openjdk.org (Mikael Vidstedt) Date: Sat, 4 Feb 2023 23:11:48 GMT Subject: RFR: 8301820: C4819 warnings were reported on Windows In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 09:42:57 GMT, Yasumasa Suenaga wrote: > C4819 warnings were reported when I tried to build JDK on Windows with VS2022. > This PR contains changes both HotSpot and client libraries. Let me know if they should be separated. > > * HotSpot > * stubGenerator_x86_64_poly.cpp > * elfFile.hpp > * libfontmanager > * hb.hh > * libfreetype > * afblue.c > > I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. Out of curiosity, which version of VS 2022 is this? ------------- PR: https://git.openjdk.org/jdk/pull/12427 From ysuenaga at openjdk.org Sun Feb 5 00:42:52 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Sun, 5 Feb 2023 00:42:52 GMT Subject: RFR: 8301820: C4819 warnings were reported on Windows In-Reply-To: References: Message-ID: <3QCTFOqlvgfDS3Q4ZTGt75I4ot6DQKQQ7XBEItYkKKY=.d5778a31-13dc-4499-9b9b-fc64ed071bb6@github.com> On Sat, 4 Feb 2023 09:42:57 GMT, Yasumasa Suenaga wrote: > C4819 warnings were reported when I tried to build JDK on Windows with VS2022. > This PR contains changes both HotSpot and client libraries. Let me know if they should be separated. 
> > * HotSpot > * stubGenerator_x86_64_poly.cpp > * elfFile.hpp > * libfontmanager > * hb.hh > * libfreetype > * afblue.c > > I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. I uses 17.3.6 (VS Community) on Windows 11 Pro (Japanese locale (CP932)) I saw following message on the console (It is Japanese, sorry...) d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? In case of stubGenerator_x86_64_poly.cpp , `?` in comments cannot be handled in cl.exe. I confirmed that this character cannot be handled `iconv -f CP932 -t UTF8`, so I replaced it to `x` (alphabet). ------------- PR: https://git.openjdk.org/jdk/pull/12427 From redestad at openjdk.org Sun Feb 5 23:37:49 2023 From: redestad at openjdk.org (Claes Redestad) Date: Sun, 5 Feb 2023 23:37:49 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Thu, 2 Feb 2023 15:33:29 GMT, Scott Gibbons wrote: >> Names are important, but always hard to get right. At the very least they need to be correct. Maybe call it something like `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? > > I prefer leaving them the way they are. I don't think the names, along with the associated comments within the tables, causes any undue confusion as to their function. However I will implement the name change if that's all it takes to procure a review approval. Please provide the specific names you'd like me to use and I'll change them. Or just approve as-is :-). I meant no disrespect here. 
By your own words `base64_AVX2_decode_URL_tables_addr` is "essentially incorrect", so I suggested some alternatives that I thought would make the code slightly more approachable. Regardless of whether you fix this detail I am not ready to approve this PR since I would need time to digest the latest changes. I'll ask a more senior engineer to review and give final approval (these changes need 2 reviewers approval anyhow). ------------- PR: https://git.openjdk.org/jdk/pull/12126 From dholmes at openjdk.org Mon Feb 6 00:52:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 00:52:50 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v3] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 09:32:25 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue. >> >> Testing: jdk_jfr >> >> Thanks >> Markus > > Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: > > revert uncalled for changes Marking this not ready while details are discussed. ------------- Changes requested by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12388 From dholmes at openjdk.org Mon Feb 6 00:52:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 00:52:52 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: <6Ej-RlKCJbvtXd2kt6D52Ne7QIQLkvKU6K3UItRRYUc=.e0457d0a-f45c-4904-97b6-561d05a18744@github.com> References: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> <6Ej-RlKCJbvtXd2kt6D52Ne7QIQLkvKU6K3UItRRYUc=.e0457d0a-f45c-4904-97b6-561d05a18744@github.com> Message-ID: On Fri, 3 Feb 2023 09:28:05 GMT, Markus Gr?nlund wrote: >> src/hotspot/share/prims/jni.cpp line 3841: >> >>> 3839: } >>> 3840: >>> 3841: // Please keep this inside the _attaching_via_jni section. >> >> Suggestion: >> >> // Ensure the thread_id is set whilst still `_attaching_via_jni` so that it can be seen >> // once the thread is included by the JFR thread iterator. >> >> ? > > Thanks David. I realized there is no need to move anything in jni.cpp, so I rolled them back. The change to the iterator will suffice. It will? Don't you still have the problem that the thread can be seen after ` thread->set_done_attaching_via_jni();` but before `post_thread_start_event(thread);` where the id will be set? ------------- PR: https://git.openjdk.org/jdk/pull/12388 From dholmes at openjdk.org Mon Feb 6 01:08:49 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 01:08:49 GMT Subject: RFR: 8301820: C4819 warnings were reported on Windows In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 09:42:57 GMT, Yasumasa Suenaga wrote: > C4819 warnings were reported when I tried to build JDK on Windows with VS2022. > This PR contains changes both HotSpot and client libraries. Let me know if they should be separated. 
> > * HotSpot > * stubGenerator_x86_64_poly.cpp > * elfFile.hpp > * libfontmanager > * hb.hh > * libfreetype > * afblue.c > > I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. Sorry Yasumasa, but I'd much prefer to see this split into three sub-tasks (or issues) just so we can keep the issues more clearly separated. IIUC for stubGenerator_x86_64_poly.cpp we have a multiplication symbol instead of an ascii 'x' (likely pasted from a word processing document). For elfFile.hpp there are smart-quotes instead of ascii " Thanks. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12427 From dholmes at openjdk.org Mon Feb 6 01:18:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 01:18:53 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v3] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 15:18:34 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Bug due to null string conversion Updates look good. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12317 From fyang at openjdk.org Mon Feb 6 02:25:51 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 6 Feb 2023 02:25:51 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 11:30:03 GMT, Gui Cao wrote: >> HI, >> >> The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. >> ## Testing: >> - all tier1 on unmatched board without new failures > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revert use lwu Marked as reviewed by fyang (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12425 From xlinzheng at openjdk.org Mon Feb 6 03:08:00 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 03:08:00 GMT Subject: RFR: 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops In-Reply-To: <-FYbgRBONLkylnwdv6QiuP3RF3sxisf_eyo4FYWramo=.d4406b99-701c-4b46-850d-4fa279cc4bc3@github.com> References: <-FYbgRBONLkylnwdv6QiuP3RF3sxisf_eyo4FYWramo=.d4406b99-701c-4b46-850d-4fa279cc4bc3@github.com> Message-ID: On Fri, 3 Feb 2023 12:27:13 GMT, Fei Yang wrote: >> RISC-V part of [JDK-8300002](https://bugs.openjdk.org/browse/JDK-8300002), one line only. >> >> (Did not observe performance impact on JMH mentioned in JDK-8300002.) 
>> >> Tested hotspot tier1 (fastdebug). > > Marked as reviewed by fyang (Reviewer). Thanks for reviewing! @RealFYang @luhenry ------------- PR: https://git.openjdk.org/jdk/pull/12402 From xlinzheng at openjdk.org Mon Feb 6 03:08:02 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 03:08:02 GMT Subject: Integrated: 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 08:19:08 GMT, Xiaolin Zheng wrote: > RISC-V part of [JDK-8300002](https://bugs.openjdk.org/browse/JDK-8300002), one line only. > > (Did not observe performance impact on JMH mentioned in JDK-8300002.) > > Tested hotspot tier1 (fastdebug). This pull request has now been integrated. Changeset: b4cb6c8e Author: Xiaolin Zheng Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/b4cb6c8e8b4bb10d47fd4839c7abf13a552323f6 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8301743: RISC-V: Add InlineSkippedInstructionsCounter to post-call nops Reviewed-by: fyang, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/12402 From xlinzheng at openjdk.org Mon Feb 6 04:15:10 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 04:15:10 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v4] In-Reply-To: References: Message-ID: <3OBigr9oEmWC6KwJ6sOzQawTqbwld06_EyOm-Ga4Oow=.9d57542e-941d-4bb2-b5d6-9c3008bd1c62@github.com> > After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. > > > [Stub Code] > ... 
> > __ align() // emit nothing or a 4-byte padding > <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > > Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. > > So originally: > The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: > > We are at (A) -> > do an align() -> > We are at (B) -> > emit lots of relocations bound to this shared trampoline at (B) -> > do an emit_trampoline_stub() -> > We are at (C) > > > After this patch: > > We are at (A) -> > do an emit_trampoline_stub(), which contains an align() already -> > We are at (C) directly -> > reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> > emit lots of relocations bound to this shared trampoline at (B) > > > Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. > > Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. 
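The address arithmetic described above (emit the stub first, then recover (B) from (C)) can be sketched as follows. This is only an illustration under stated assumptions, not the HotSpot code: `kStubSize` (16 bytes, i.e. `ldr` + `br` + a 64-bit literal) and `kAlignment` (8 bytes) are assumed values, and `align_up`, `emit_stub` and `stub_start` are hypothetical helper names.

```cpp
#include <cstdint>

// Assumed sizes for illustration only: ldr (4) + br (4) + 64-bit literal (8).
constexpr std::uintptr_t kStubSize  = 16;
// Assumed alignment of the stub start; with a 4-byte aligned pc this
// yields either 0 or 4 bytes of padding, as described in the text.
constexpr std::uintptr_t kAlignment = 8;

// Round pc up to the next multiple of kAlignment: (A) -> (B).
constexpr std::uintptr_t align_up(std::uintptr_t pc) {
  return (pc + kAlignment - 1) & ~(kAlignment - 1);
}

// Model "emit a trampoline stub starting at (A)": returns the pc at (C).
constexpr std::uintptr_t emit_stub(std::uintptr_t pc_a) {
  return align_up(pc_a) + kStubSize;
}

// Recover (B) from (C); valid because (C) - (B) is a fixed length,
// whereas (B) - (A) is not.
constexpr std::uintptr_t stub_start(std::uintptr_t pc_c) {
  return pc_c - kStubSize;
}
```

Whether (A) was already aligned or not, `stub_start(emit_stub(a))` gives the aligned stub address, which is where the relocations would be bound.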
> > > Thanks, > Xiaolin Xiaolin Zheng has updated the pull request incrementally with one additional commit since the last revision: trampoline_stub_size() -> max_trampoline_stub_size() to reflect its semantics ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11749/files - new: https://git.openjdk.org/jdk/pull/11749/files/6159a93f..8a091924 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=02-03 Stats: 12 lines in 8 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/11749.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11749/head:pull/11749 PR: https://git.openjdk.org/jdk/pull/11749 From xlinzheng at openjdk.org Mon Feb 6 04:45:55 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 04:45:55 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: References: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> Message-ID: On Fri, 3 Feb 2023 10:44:21 GMT, Andrew Dinn wrote: >> Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - license >> - Merge branch 'master' into shared-trampoline >> - __ >> - Refactor and cleanup for shared trampolines > > src/hotspot/cpu/aarch64/codeBuffer_aarch64.cpp line 71: > >> 69: >> 70: assert(requests->number_of_entries() >= 1, "at least one"); >> 71: const int total_requested_size = MacroAssembler::trampoline_stub_size() * requests->number_of_entries(); > > There's a detail here that is worth mentioning. At first I was concerned that this calculation might overestimate the required size by one 32 bit word -- that happens where we are able generate the first stub without the need for any alignment padding. 
However, that won't cause any unnecessary failure because the rounding up will be from an odd to an even multiple of 32 bit words i.e. adding this extra half-word will not push the request over to consume an extra 64 bit word. Thanks for your review, Andrew - Yes, indeed a detail here. Although the real emitted code does not get affected, when allocating the requested trampolines, the pre-allocated size the code buffer is applying for could be slightly bigger than the previous real-time processing. I am not sure how useful it is to calculate the precise allocated request size, (though doable, seems we will lose the code simplicity) for in the context the current CodeBuffer is only a temporary one, which would be finally poured into the real nmethod; and also when the code buffer is expanding[1], a quite big and heuristic size could be applied for. I dumped the `requests->number_of_entries()` here, the trampoline count: seems the value is a small number, often less than `5`, sometimes `10`, if the `ReservedCodeCacheSize > branch_range` condition is true so that trampolines are generated. I realized maybe I should rename the `trampoline_stub_size` to something like `max_trampoline_stub_size` to reflect its semantics. I was wondering if the final version still looks okay to you? [1] https://github.com/openjdk/jdk/blob/b4cb6c8e8b4bb10d47fd4839c7abf13a552323f6/src/hotspot/share/asm/codeBuffer.cpp#L833-L845 ------------- PR: https://git.openjdk.org/jdk/pull/11749 From xlinzheng at openjdk.org Mon Feb 6 06:08:36 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 06:08:36 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v5] In-Reply-To: References: Message-ID: > After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. 
If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. > > > [Stub Code] > ... > > __ align() // emit nothing or a 4-byte padding > <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > > Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. > > So originally: > The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: > > We are at (A) -> > do an align() -> > We are at (B) -> > emit lots of relocations bound to this shared trampoline at (B) -> > do an emit_trampoline_stub() -> > We are at (C) > > > After this patch: > > We are at (A) -> > do an emit_trampoline_stub(), which contains an align() already -> > We are at (C) directly -> > reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> > emit lots of relocations bound to this shared trampoline at (B) > > > Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. > > Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. > > > Thanks, > Xiaolin Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge remote-tracking branch 'github-openjdk/master' into shared-trampoline - trampoline_stub_size() -> max_trampoline_stub_size() to reflect its semantics - license - Merge branch 'master' into shared-trampoline - __ - Refactor and cleanup for shared trampolines ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11749/files - new: https://git.openjdk.org/jdk/pull/11749/files/8a091924..0c279f66 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11749&range=03-04 Stats: 9599 lines in 227 files changed: 5101 ins; 3178 del; 1320 mod Patch: https://git.openjdk.org/jdk/pull/11749.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11749/head:pull/11749 PR: https://git.openjdk.org/jdk/pull/11749 From jbhateja at openjdk.org Mon Feb 6 06:39:51 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 6 Feb 2023 06:39:51 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v8] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Wed, 1 Feb 2023 19:07:17 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ±
93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Change break-even buffer size for AVX512 src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2693: > 2691: __ vpshufb(xmm0, xmm0, xmm13, Assembler::AVX_256bit); > 2692: __ vpermd(xmm0, xmm12, xmm0, Assembler::AVX_256bit); > 2693: __ subl(length, 0x20); The subtraction affects EFLAGS, so we can save one redundant compare per iteration at [#L2697](https://github.com/openjdk/jdk/pull/12126/files#diff-b938ab8a7bd9f57eb02271e2dd24a305bca30f06e9f8b028e18a139c4908ec92R2697) by subtracting 0x2c (44) up front in the pre-loop and adding the same amount back after the loop. The same applies to the encode main loop. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From haosun at openjdk.org Mon Feb 6 07:09:15 2023 From: haosun at openjdk.org (Hao Sun) Date: Mon, 6 Feb 2023 07:09:15 GMT Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64 [v25] In-Reply-To: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> References: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> Message-ID: <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> On Tue, 31 Jan 2023 14:59:31 GMT, Doug Simon wrote: >> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 34 commits: >> >> - Merge master >> - Merge master >> - Merge master >> - Error on -XX:-PreserveFramePointer -XX:UseBranchProtection=pac-ret >> - Add comments to enter calls >> - Set PreserveFramePointer if use_rop_protection is set >> - Merge enter_subframe into enter >> - Review fixups >> - Documentation updates >> - Update copyrights to 2022 >> - ... and 24 more: https://git.openjdk.org/jdk/compare/022d8070...c4e0ee31 > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 413: > >> 411: } >> 412: >> 413: if (UseBranchProtection == nullptr || strcmp(UseBranchProtection, "none") == 0) { > > My understanding is that `UseBranchProtection` requires JIT compiler integration. This is currently missing in Graal. The logic that decides if it should be enabled by default needs to take JVMCI into consideration, much like the call to `JVMCIGlobals::check_jvmci_supported_gc` determines if JVMCI supports the configured GC. > > Thanks to @gilles-duboscq for alerting me to this change. Thanks for reporting this. @dougxc Yes. I encountered several jtreg failures with Graal, which is built with PAC-RET enabled JDK. As I see it, the straightforward fix is to disable JVMCI if `_rop_protection` is parsed as "true" finally, since Graal doesn't supports PAC-RET currently. That is similar to the way `JVMCIGlobals::check_jvmci_supported_gc` does, i.e. if one GC except `G1/Serial/Parallel` is configured, we disable JVMCI. 
WDYT ------------- PR: https://git.openjdk.org/jdk/pull/6334 From ioi.lam at oracle.com Mon Feb 6 07:26:10 2023 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 5 Feb 2023 23:26:10 -0800 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> Message-ID: <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> On 2/3/2023 12:40 AM, David Holmes wrote: > On 3/02/2023 5:12 pm, Ioi Lam wrote: >> >> >> On 2/1/2023 10:37 PM, David Holmes wrote: >>> >>> >>> On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >>>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes wrote: >>>> >>>> That would have to be determined case-by-case. For each ergo flag we would have to decide what value it is calculating and based on what, and determine whether an out-of-range value is an error or something to be wary of. >>>> >>>> Sure, what I meant was how to control it? Two sets of FLAG_SET_... >>>> macros, one that asserts on constraint violation, one that caps? >>> >>> No I expect to just see the places where FLAG_SET_ERGO is called do >>> the right thing by checking the return value and proceeding >>> accordingly. >> >> David, are you proposing this? >> >> int value = calculate_ergo(); >> if (FLAG_SET_ERGO (TheFlag, value) == error) { >> value = do_it_again(); >> if (FLAG_SET_ERGO (TheFlag, value) == error) { >> huh? >> } >> } >> >> In the above code, all FLAG_SET_ERGO does is to check the range for >> you. It doesn't tell you why the range check fails. So the caller >> would have to redo the calculation (with no more information, other >> than that the last value was no good). What if the new value is also >> no good? > > To me that indicates we don't know what the heck we are calculating!
> How can you calculate a setting for a flag if you don't even > understand the valid range of that flag. Which may mean the max, for > example, needs to be exposed programmatically. > >> With this style, you almost have to wrap every FLAG_SET_ERGO? in a >> loop, keep trying (in the dark) until you succeed. >> >> ================== >> I think it's better to just assert inside FLAG_SET_ERGO if the value >> is out of range. The caller should make sure it has calculated a >> correct value. > > If the calculation uses some externally settable value then an assert > won't help if we're in a product build. > We have lots of set_xxx() APIs in various classes that assert when given an inappropriate value -- this indicates a programming error. So FLAG_SET_ERGO would be no different. It's the responsibility of the caller of FLAG_SET_ERGO to pass a correct value to FLAG_SET_ERGO . How the caller comes up with that value (whether it depends on some external input) is irrelevant. ======== To properly catch the programming errors with ERGO setting, we should: - Change FLAG_SET_ERGO to return "void". The current API gives the false impression that the caller needs to check the returned value and do something about it. - Assert if the value is out of range, so we can catch the programming error during testing (with debug builds) It's up to the caller of FLAG_SET_ERGO to determine a value that's within the allowed range. If the caller wants to query the constraints of the flag that it wants to set, that can be done via the JVMFlagLimit API. Using JVMFlagLimit is much better than blindly setting a value and getting an error code that doesn't tell you why it failed. > As I said it depends on exactly what we are using to do the > calculation. If it is "closed world" then it is a fatal error to get > it wrong and should be detected during testing. Otherwise the code has > to deal with the possibility of error. 
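To make the two positions in this thread concrete, here is a minimal sketch of an asserting setter next to an opt-in capping setter. It does not use the real JVMFlag machinery: `RangedFlag`, `set_ergo` and `set_ergo_capped` are hypothetical names chosen for illustration, and the range is simply stored alongside the value.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a flag with a declared [min, max] constraint.
struct RangedFlag {
  int value;
  int min;
  int max;
};

// Strict variant: an out-of-range value is treated as a programming error.
// Note that assert() compiles away when NDEBUG is defined, so a product
// build would store the bad value silently; that is the concern raised
// about relying on an assert alone.
inline void set_ergo(RangedFlag& flag, int v) {
  assert(v >= flag.min && v <= flag.max && "ergonomic value out of range");
  flag.value = v;
}

// Opt-in variant: clamp the calculated value into the allowed range.
inline void set_ergo_capped(RangedFlag& flag, int v) {
  flag.value = std::min(std::max(v, flag.min), flag.max);
}
```

A caller that computes its value from external input would either validate against the range before calling the strict variant or deliberately choose the capped one.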
> >> >> Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the >> value, but that should be an opt-in. FLAG_SET_ERGO() should not >> automatically cap the value. > > That might be suitable for some flags/calculations. > > I don't like trying to talk about this in generalities. Each specific > situation needs to be examined to understand what is being calculated > and how. But it is the callers of FLAG_SET_ERGO that need to know what > they are doing. > > Cheers, > David > >> >> Thanks >> - Ioi >> From david.holmes at oracle.com Mon Feb 6 07:45:50 2023 From: david.holmes at oracle.com (David Holmes) Date: Mon, 6 Feb 2023 17:45:50 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> Message-ID: <41d86ae6-f13b-c030-b1e4-3f77ef2752e4@oracle.com> On 6/02/2023 5:26 pm, Ioi Lam wrote: > On 2/3/2023 12:40 AM, David Holmes wrote: >> On 3/02/2023 5:12 pm, Ioi Lam wrote: >>> >>> >>> On 2/1/2023 10:37 PM, David Holmes wrote: >>>> >>>> >>>> On 2/02/2023 3:25 pm, Thomas St?fe wrote: >>>>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes >>>>> > wrote: >>>>> >>>>> ??? That would have to be determined case-by-case. For each ergo >>>>> flag we >>>>> ??? would have to decide what value it is calculating and based on >>>>> what, >>>>> ??? and >>>>> ??? determine whether an out-of-range value is an error or >>>>> something to be >>>>> ??? wary of. >>>>> >>>>> Sure, what I meant was how to control it? Two sets of FLAT_SET_... >>>>> macros, one that asserts on constraint violation, one that caps? >>>> >>>> No I expect to just see the places where FLAG_SET_ERGO is called do >>>> the right thing by checking the return value and proceeding >>>> accordingly. >>> >>> David, are you proposing this? >>> >>> ??? 
int value = calculate_ergo(); >>> ??? if (FLAG_SET_ERGO (TheFlag, value) == error) { >>> ????? value = do_it_again() >>> ????? if (FLAG_SET_ERGO (TheFlag, value) == error) { >>> ???????? huh? >>> ????? } >>> ??? } >>> >>> In the above code, all FLAG_SET_ERGO does is to check the range for >>> you. It doesn't tell you why the range check fails. So the caller >>> would have to redo the calculation (with no more information, other >>> than that the last value was no good). What if the new value is also >>> no good? >> >> To me that indicates we don't know what the heck we are calculating! >> How can you calculate a setting for a flag if you don't even >> understand the valid range of that flag. Which may mean the max, for >> example, needs to be exposed programmatically. >> >>> With this style, you almost have to wrap every FLAG_SET_ERGO? in a >>> loop, keep trying (in the dark) until you succeed. >>> >>> ================== >>> I think it's better to just assert inside FLAG_SET_ERGO if the value >>> is out of range. The caller should make sure it has calculated a >>> correct value. >> >> If the calculation uses some externally settable value then an assert >> won't help if we're in a product build. >> > > We have lots of set_xxx() APIs in various classes that assert when given > an inappropriate value -- this indicates a programming error. So > FLAG_SET_ERGO would be no different. > > It's the responsibility of the caller of FLAG_SET_ERGO to pass a correct > value to FLAG_SET_ERGO . How the caller comes up with that value > (whether it depends on some external input) is irrelevant. > ======== > > To properly catch the programming errors with ERGO setting, we should: > > - Change FLAG_SET_ERGO to return "void". The current API gives the false > impression that the caller needs to check the returned value and do > something about it. That's because they do! 
You still need to be able to report an error for product builds if the value is being calculated based on some passed in runtime value. Putting in an assert is fine but not sufficient. David ----- > - Assert if the value is out of range, so we can catch the programming > error during testing (with debug builds) > > It's up to the caller of FLAG_SET_ERGO to determine a value that's > within the allowed range. > > If the caller wants to query the constraints of the flag that it wants > to set, that can be done via the JVMFlagLimit API. Using JVMFlagLimit is > much better than blindly setting a value and getting an error code that > doesn't tell you why it failed. > > >> As I said it depends on exactly what we are using to do the >> calculation. If it is "closed world" then it is a fatal error to get >> it wrong and should be detected during testing. Otherwise the code has >> to deal with the possibility of error. >> >>> >>> Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the >>> value, but that should be an opt-in. FLAG_SET_ERGO() should not >>> automatically cap the value. >> >> That might be suitable for some flags/calculations. >> >> I don't like trying to talk about this in generalities. Each specific >> situation needs to be examined to understand what is being calculated >> and how. But it is the callers of FLAG_SET_ERGO that need to know what >> they are doing. >> >> Cheers, >> David >> >>> >>> Thanks >>> - Ioi >>> > From ayang at openjdk.org Mon Feb 6 08:45:01 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 6 Feb 2023 08:45:01 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: References: Message-ID: <-hZgE7WaPCxIh-8bLb6zKWvj5wwpU1tU-P2WSyMq4q8=.f2ed8976-5a44-4e14-acfd-f189cbe9f19a@github.com> On Thu, 2 Feb 2023 10:38:02 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? 
>> >> The change is fairly straightforward except maybe for the following: >> * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. >> * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). >> >> Testing: gha, local perf testing, tier1 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with two additional commits since the last revision: > > - fix indentation :( > - daholmes review, remove unnecessary change Two minor suggestions: 1. I think "inlining" the logic of `G1JavaThreadsListClaimer` makes the flow in the caller more transparent. 2. `ResizeThreadLABs` -> `ResizeTLABs`, just to be consistent with other *TLAB occurrences. ------------- Marked as reviewed by ayang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12360 From iwalulya at openjdk.org Mon Feb 6 08:50:52 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 6 Feb 2023 08:50:52 GMT Subject: RFR: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel [v2] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 09:50:27 GMT, Albert Mingkun Yang wrote: >> Simple removing unnecessary code and adding API documentation. >> >> Test: hotspot_gc > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by iwalulya (Reviewer). 
------------- PR: https://git.openjdk.org/jdk/pull/12311 From jsjolen at openjdk.org Mon Feb 6 09:39:17 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 6 Feb 2023 09:39:17 GMT Subject: RFR: JDK-8301480: Replace NULL with nullptr in os/posix [v2] In-Reply-To: References: Message-ID: <5A0_x2-25DljN6avzpJhMvCQdL5TfVDD4A9dJ0w8VvM=.3c49b7f0-4c37-468f-8d7d-386283738dc6@github.com> > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/posix. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Fix dholmes' bug ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12316/files - new: https://git.openjdk.org/jdk/pull/12316/files/a8f990bb..a54c0664 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12316&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12316&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12316.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12316/head:pull/12316 PR: https://git.openjdk.org/jdk/pull/12316 From mgronlun at openjdk.org Mon Feb 6 09:42:54 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 6 Feb 2023 09:42:54 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: References: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> <6Ej-RlKCJbvtXd2kt6D52Ne7QIQLkvKU6K3UItRRYUc=.e0457d0a-f45c-4904-97b6-561d05a18744@github.com> Message-ID: On Mon, 6 Feb 2023 00:49:47 GMT, David Holmes wrote: >> Thanks David. I realized there is no need to move anything in jni.cpp, so I rolled them back. The change to the iterator will suffice. > > It will? Don't you still have the problem that the thread can be seen after `thread->set_done_attaching_via_jni();` but before `post_thread_start_event(thread);` where the id will be set? Yes. The JFR_JVM_THREAD_ID(thread) is a lazy assignment of the thread id. It can be done by the thread itself, or by the periodic task thread. The premise for a JavaThread is that there exists a threadObj.
------------- PR: https://git.openjdk.org/jdk/pull/12388 From jsjolen at openjdk.org Mon Feb 6 09:44:27 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 6 Feb 2023 09:44:27 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v4] In-Reply-To: References: Message-ID: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Can't directly compare uintptr_t, need reinterpret_cast ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12317/files - new: https://git.openjdk.org/jdk/pull/12317/files/de13f1ce..5f33229b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12317.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12317/head:pull/12317 PR: https://git.openjdk.org/jdk/pull/12317 From tschatzl at openjdk.org Mon Feb 6 09:58:01 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 6 Feb 2023 09:58:01 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: <69qqMeVrgCWBHALBnDUMllkaE7wKyZ61zfAugbhLGcY=.7295c1ab-c74c-4896-b170-679bdbbc3f64@github.com> References: <4STcU7c1bgEwc6q5FeqqMSkkRSftG62O1_tiqGBlLXc=.01e51631-360c-4a39-884b-5f5ca3efcba4@github.com> <7Y1f2wfiabz1aO5qFODUi8gEJU_VB2EQKNmMOH4oY1I=.baf20aec-b37d-4a5b-a45a-fae9faaf1089@github.com> <5ceqoRFEZguGfrxpAmInTaTTSFYbReYk6eonXmQJOwg=.6fc16c96-6a7b-4c2e-8a3b-c81b5beb5903@github.com> <58lEutsk9hNEy1-seo_pkH0jO8Fr0mz74XqA-JwOix8=.11ff0722-adf8-4770-b2ac-898460bb43b8@github.com> <69qqMeVrgCWBHALBnDUMllkaE7wKyZ61zfAugbhLGcY=.7295c1ab-c74c-4896-b170-679bdbbc3f64@github.com> Message-ID: On Fri, 3 Feb 2023 11:25:49 GMT, Thomas Schatzl wrote: >>> Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. >>> >>> One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens for it.
That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticable, need to check). >>> >>> The other is that with little work per thread threads like in this situation they seem to be contending heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured ~4x slower with 18 threads than using the single-threaded version (on ~21k JavaThreads) >> >> Looking at `Threads::possibly_parallel_threads_do`, I'm thinking it could use a redesign to use the technique >> being used in `G1JavaThreadsListClaimer`. The existing design and implementation dates from when the >> threads-list was an actual linked list, rather than the newer ThreadsList. But maybe that's future work. > >> > Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token. >> > One issue is that all threads need to traverse the array from the beginning to get to the current claim position [...] >> > The other is that with little work per thread threads like in this situation they seem to be contending heavily on the JavaThread claim tokens [...] >> >> Looking at `Threads::possibly_parallel_threads_do`, I'm thinking it could use a redesign to use the technique being used in `G1JavaThreadsListClaimer`. The existing design and implementation dates from when the threads-list was an actual linked list, rather than the newer ThreadsList. But maybe that's future work. > > I considered that, but the non-java threads still use the linked list design, so I moved that to future work :) I can investigate that a bit more though, but would like to keep it as is for this change. > I think a better solution might be to declare the `Thread::_tlab` member `mutable` and `Thread::tlab() const`. The code works as is without the removed `const` after all, so I'm not sure I should do this suggested change. 
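As a rough illustration of the claiming issue discussed in this message, the sketch below hands out chunks of an array through a single atomic counter, so a worker never has to walk the array from the beginning or test a per-element claim token. It is not `G1JavaThreadsListClaimer` itself; `ChunkClaimer`, `claim` and the chunk size parameter are hypothetical names for this sketch.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>

// Hypothetical chunk claimer over an array of `length` elements.
class ChunkClaimer {
  std::atomic<std::size_t> _next{0};  // first index not yet handed out
  const std::size_t _length;
  const std::size_t _chunk_size;

public:
  ChunkClaimer(std::size_t length, std::size_t chunk_size)
    : _length(length), _chunk_size(chunk_size) {}

  // Claims the half-open index range [begin, end). Returns false once
  // the whole array has been handed out. One atomic add per chunk is the
  // only shared-memory operation, regardless of the array length.
  bool claim(std::size_t& begin, std::size_t& end) {
    const std::size_t start =
        _next.fetch_add(_chunk_size, std::memory_order_relaxed);
    if (start >= _length) {
      return false;
    }
    begin = start;
    end = std::min(start + _chunk_size, _length);
    return true;
  }
};
```

Each worker would loop on `claim()` and process the elements in its range; a larger chunk size means fewer atomic operations when the work per element is small.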
------------- PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Mon Feb 6 10:19:40 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 6 Feb 2023 10:19:40 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v3] In-Reply-To: References: Message-ID: > Hi all, > > can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? > > The change is fairly straightforward except maybe for the following: > * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. > * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). 
> > Testing: gha, local perf testing, tier1 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Improve claimer ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12360/files - new: https://git.openjdk.org/jdk/pull/12360/files/6bff3cc1..82a0cc57 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12360&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12360&range=01-02 Stats: 27 lines in 3 files changed: 17 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/12360.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12360/head:pull/12360 PR: https://git.openjdk.org/jdk/pull/12360 From jsjolen at openjdk.org Mon Feb 6 10:22:36 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 6 Feb 2023 10:22:36 GMT Subject: RFR: JDK-8301493: Replace NULL with nullptr in cpu/aarch64 Message-ID: Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/aarch64. Unfortunately the script that does the change isn't perfect, and so we need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. Here are some typical things to look out for: 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. An example of this: ```c++ // This function returns null void* ret_null(); // This function returns true if *x == nullptr bool is_nullptr(void** x); Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. Thanks! 
------------- Commit messages: - Fixes - Replace NULL with nullptr in cpu/aarch64 Changes: https://git.openjdk.org/jdk/pull/12321/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12321&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301493 Stats: 458 lines in 43 files changed: 0 ins; 0 del; 458 mod Patch: https://git.openjdk.org/jdk/pull/12321.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12321/head:pull/12321 PR: https://git.openjdk.org/jdk/pull/12321 From jsjolen at openjdk.org Mon Feb 6 10:22:53 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 6 Feb 2023 10:22:53 GMT Subject: RFR: JDK-8301493: Replace NULL with nullptr in cpu/aarch64 In-Reply-To: References: Message-ID: <93w5_gLPwyrao5XtNgUVyj5ntkIZh2Oz4y2f51xnbos=.7caf423d-dd48-4152-8110-55def437182b@github.com> On Tue, 31 Jan 2023 11:39:27 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/aarch64. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Some simple fixes here src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 174: > 172: // Then we could use the assert below. However this assert is of somewhat dubious > 173: // value. > 174: // assert(_pc != null, "no pc?"); nullptr src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp line 162: > 160: // Calling the runtime using the regular call_VM_leaf mechanism generates > 161: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 162: // that checks that the *(rfp+frame::interpreter_frame_last_sp) == null. nullptr src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 160: > 158: // Calling the runtime using the regular call_VM_leaf mechanism generates > 159: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 160: // that checks that the *(rfp+frame::interpreter_frame_last_sp) == null. nullptr src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 50: > 48: // (1) the value is old (i.e., doesn't matter for scavenges) > 49: // (2) these ICStubs are removed *before* a GC happens, so the roots disappear > 50: // assert(cached_value == null || cached_oop->is_perm(), "must be perm oop"); nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 143: > 141: Label L; > 142: ldr(rscratch1, Address(rthread, JavaThread::jvmti_thread_state_offset())); > 143: cbz(rscratch1, L); // if (thread->jvmti_thread_state() == null) exit; nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1313: > 1311: // if (row[2].rec == rec) { row[2].incr(); goto done; } > 1312: // if (row[2].rec != null) { count.incr(); goto done; } // overflow > 1313: // row[2].init(rec); goto done; nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1586: > 1584: cbz(rscratch1, L); > 1585: stop("InterpreterMacroAssembler::call_VM_leaf_base:" > 1586: " last_sp != null"); nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1614: > 1612: cbz(rscratch1, L); > 1613: stop("InterpreterMacroAssembler::call_VM_base:" 
> 1614: " last_sp != null"); nullptr src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1158: > 1156: } > 1157: > 1158: // for (scan = klass->itable(); scan->interface() != null; scan += scan_step) { nullptr src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4791: > 4789: > 4790: if (UseSimpleArrayEquals) { > 4791: Label NEXT_WORD, SHORT, TAIL03, TAIL01, A_MIGHT_BE_nullptr, A_IS_NOT_NULL; Fix this macro src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 2059: > 2057: __ tbnz(src_pos, 31, L_failed); // i.e. sign bit set > 2058: > 2059: // if (dst == null) return -1; nullptr src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 2077: > 2075: #ifdef ASSERT > 2076: // assert(src->klass() != null); > 2077: { nullptr src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 3698: > 3696: __ bind(done); > 3697: // r0 = 0: obj == NULL or obj is not an instanceof the specified klass > 3698: // r0 = 1: obj != NULL and obj is an instanceof the specified klass nullptr ------------- PR: https://git.openjdk.org/jdk/pull/12321 From tschatzl at openjdk.org Mon Feb 6 10:28:04 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 6 Feb 2023 10:28:04 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v2] In-Reply-To: <-hZgE7WaPCxIh-8bLb6zKWvj5wwpU1tU-P2WSyMq4q8=.f2ed8976-5a44-4e14-acfd-f189cbe9f19a@github.com> References: <-hZgE7WaPCxIh-8bLb6zKWvj5wwpU1tU-P2WSyMq4q8=.f2ed8976-5a44-4e14-acfd-f189cbe9f19a@github.com> Message-ID: On Mon, 6 Feb 2023 08:41:48 GMT, Albert Mingkun Yang wrote: > Two minor suggestions: > > 1. I think "inlining" the logic of `G1JavaThreadsListClaimer` makes the flow in the caller more transparent. > The `G1JavaThreadsListClaimer` will be used in the next PR about parallelizing TLAB retiring; however we came up with a nicer version of the claimer that hides the claiming implementation detail. > 2. `ResizeThreadLABs` -> `ResizeTLABs`, just to be consistent with other *TLAB occurrences. 
Unfortunately there is already the global option `ResizeTLAB`, and I did not want to have two (fairly global) constants to have almost the same name. ------------- PR: https://git.openjdk.org/jdk/pull/12360 From duke at openjdk.org Mon Feb 6 10:31:21 2023 From: duke at openjdk.org (Afshin Zafari) Date: Mon, 6 Feb 2023 10:31:21 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick [v2] In-Reply-To: References: Message-ID: > ### Discussion > > There are different ways of casting/converting integral types to floating point types and vice versa in the code. > These conversions are changed to invocation of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. > > ### Patch > > - All instances of using the `union` technique for int<->float and long<->double type conversions are replaced by the `PrimitiveConversions::cast(From)` method. > - Instances of, for example, converting `int`<->`intptr_t` are not changed. > - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. > - Templates in the `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make searching and reviewing easier. > - The new 'cast(From)' templates are numbered 6-9 for some instances where existing templates (1-5) did not match. > - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. > - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedefs are added to it to compensate for the missing types.
> - The following are not changed: > - `jni.hpp` > - `ciConstant.hpp` is not changed, since union is not used for converting `int` <->`float` or `long` <-> `double`. There are some asserts that allow the setters/getters to be used only for the same types. > - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. > - unions with bit fields are not changed. > - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. > > ### Test > mach5 tiers 1-5 passed, tiers 6-8 in progress Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: 8297539: Consolidate the uses of the int<->float union conversion trick ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12384/files - new: https://git.openjdk.org/jdk/pull/12384/files/33d1c8ec..653f93b1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12384&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12384&range=00-01 Stats: 388 lines in 18 files changed: 211 ins; 139 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/12384.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12384/head:pull/12384 PR: https://git.openjdk.org/jdk/pull/12384 From duke at openjdk.org Mon Feb 6 10:32:51 2023 From: duke at openjdk.org (Afshin Zafari) Date: Mon, 6 Feb 2023 10:32:51 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick [v2] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 22:27:33 GMT, Justin King wrote: >> src/hotspot/share/utilities/globalDefinitions.hpp line 28: >> >>> 26: #define SHARE_UTILITIES_GLOBALDEFINITIONS_HPP >>> 27: >>> 28: #include "metaprogramming/primitiveConversions.hpp" >> >> I think this should not be done, for the same reasons as not including the suggested bit_cast >> (https://git.openjdk.org/jdk/pull/11865).
There is substantial overlap between this proposed >> change and that one, though I think the bit_cast proposal is more problematic. >> >> Rather than adding a new dependency here, separate out the code that uses the conversions >> into new files and update the (much smaller number than use globalDefinitions.hpp) users >> of those functions to include the new headers. > > I previously updated the bit_cast PR to just be `PrimitiveConversions::cast` implementation, but using __builtin_bit_cast when available. So this and that are basically the same. Dependencies on `globalDefinitions` and `primitiveConversions` have been broken down into smaller header files, and all the dependencies in other headers have been updated accordingly. ------------- PR: https://git.openjdk.org/jdk/pull/12384 From duke at openjdk.org Mon Feb 6 10:35:51 2023 From: duke at openjdk.org (Afshin Zafari) Date: Mon, 6 Feb 2023 10:35:51 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick [v2] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 22:37:29 GMT, John R Rose wrote: > > bit_cast is the chosen term for C++ starting in C++20, which reinterprets > > Good. I would be very content with bit_cast, or some use of the older word "reinterpret". The problem with plain "cast" is the overloading of meanings. (And then there's the extra push in the wrong direction from the name "PrimitiveConversions".) We are not just C++ programmers here; many of us are Java programmers as well. I need this conversation to reach a conclusion before I can apply the required changes.
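As background for the naming discussion, here is a minimal sketch of what a "bit cast" between int and float means in portable pre-C++20 code: copying the object representation with `memcpy` rather than reading an inactive union member. This is a generic illustration and not the HotSpot `PrimitiveConversions` implementation; the name `bit_cast_sketch` is hypothetical, and the example assumes IEEE-754 floating point.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

// Hypothetical helper: reinterprets the bits of `from` as a value of type To.
// Both types must have the same size and be trivially copyable.
template <typename To, typename From>
To bit_cast_sketch(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "source and target sizes must match");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
  To to;
  // memcpy copies the object representation; compilers typically turn this
  // into a plain register move.
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// The expected bit patterns in the usage note assume IEEE-754 floats and doubles.
static_assert(std::numeric_limits<float>::is_iec559, "example assumes IEEE-754 float");
static_assert(std::numeric_limits<double>::is_iec559, "example assumes IEEE-754 double");
```

Under those assumptions, `bit_cast_sketch<std::uint32_t>(1.0f)` yields `0x3f800000`, and casting that value back yields `1.0f`. No numeric conversion takes place, which is the distinction the word "cast" alone does not convey.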
------------- PR: https://git.openjdk.org/jdk/pull/12384 From ayang at openjdk.org Mon Feb 6 11:40:59 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 6 Feb 2023 11:40:59 GMT Subject: RFR: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel [v2] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 09:50:27 GMT, Albert Mingkun Yang wrote: >> Simple removing unnecessary code and adding API documentation. >> >> Test: hotspot_gc > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for the review. ------------- PR: https://git.openjdk.org/jdk/pull/12311 From ayang at openjdk.org Mon Feb 6 11:41:02 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 6 Feb 2023 11:41:02 GMT Subject: Integrated: 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 09:18:08 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary code and adding API documentation. > > Test: hotspot_gc This pull request has now been integrated. 
Changeset: 371a0c4f Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/371a0c4f885856b4820870fe9e523ea8694e3997 Stats: 9 lines in 3 files changed: 7 ins; 2 del; 0 mod 8301465: Remove unnecessary nmethod arming in Full GC of Serial/Parallel Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/12311 From dnsimon at openjdk.org Mon Feb 6 11:54:18 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 6 Feb 2023 11:54:18 GMT Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64 [v25] In-Reply-To: <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> References: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> Message-ID: On Mon, 6 Feb 2023 07:05:39 GMT, Hao Sun wrote: >> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 413: >> >>> 411: } >>> 412: >>> 413: if (UseBranchProtection == nullptr || strcmp(UseBranchProtection, "none") == 0) { >> >> My understanding is that `UseBranchProtection` requires JIT compiler integration. This is currently missing in Graal. The logic that decides if it should be enabled by default needs to take JVMCI into consideration, much like the call to `JVMCIGlobals::check_jvmci_supported_gc` determines if JVMCI supports the configured GC. >> >> Thanks to @gilles-duboscq for alerting me to this change. > > Thanks for reporting this. @dougxc > > Yes. I encountered several jtreg failures with Graal, which is built with PAC-RET enabled JDK. > > As I see it, the straightforward fix is to disable JVMCI if `_rop_protection` is parsed as "true" finally, since Graal doesn't supports PAC-RET currently. > That is similar to the way `JVMCIGlobals::check_jvmci_supported_gc` does, i.e. if one GC except `G1/Serial/Parallel` is configured, we disable JVMCI. 
> WDYT The problem with `JVMCIGlobals::check_jvmci_supported_gc` is that it does not give the compiler a chance to specify whether it supports a given GC. Instead, we want a JVMCI upcall like `jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.isGCSupported(int)`. ------------- PR: https://git.openjdk.org/jdk/pull/6334 From adinn at openjdk.org Mon Feb 6 12:18:52 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Mon, 6 Feb 2023 12:18:52 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v3] In-Reply-To: References: <7Aj4v0D-CFSg2vt7gQf7HAJ6X55MhFX5o8jSJNvdT7s=.ee880799-68df-48ee-942b-0ba91b363b77@github.com> Message-ID: On Mon, 6 Feb 2023 04:42:58 GMT, Xiaolin Zheng wrote: >> src/hotspot/cpu/aarch64/codeBuffer_aarch64.cpp line 71: >> >>> 69: >>> 70: assert(requests->number_of_entries() >= 1, "at least one"); >>> 71: const int total_requested_size = MacroAssembler::trampoline_stub_size() * requests->number_of_entries(); >> >> There's a detail here that is worth mentioning. At first I was concerned that this calculation might overestimate the required size by one 32 bit word -- that happens where we are able generate the first stub without the need for any alignment padding. However, that won't cause any unnecessary failure because the rounding up will be from an odd to an even multiple of 32 bit words i.e. adding this extra half-word will not push the request over to consume an extra 64 bit word. > > Thanks for your review, Andrew - Yes, indeed a detail here. Although the real emitted code does not get affected, when allocating the requested trampolines, the pre-allocated size the code buffer is applying for could be slightly bigger than the previous real-time processing. 
I am not sure how useful it is to calculate the precise allocated request size, (though doable, seems we will lose the code simplicity) for in the context the current CodeBuffer is only a temporary one, which would be finally poured into the real nmethod; and also when the code buffer is expanding[1], a quite big and heuristic size could be applied for. I dumped the `requests->number_of_entries()` here, the trampoline count: seems the value is a small number, often less than `5`, sometimes `10`, if the `ReservedCodeCacheSize > branch_range` condition is true so that trampolines are generated. > > I realized maybe I should rename the `trampoline_stub_size` to something like `max_trampoline_stub_size` to reflect its semantics. I was wondering if the final version still looks okay to you? > > [1] https://github.com/openjdk/jdk/blob/b4cb6c8e8b4bb10d47fd4839c7abf13a552323f6/src/hotspot/share/asm/codeBuffer.cpp#L833-L845 @zhengxiaolinX > I realized maybe I should rename the `trampoline_stub_size` to something like `max_trampoline_stub_size`to reflect its semantics. I was wondering if the final version still looks okay to you? There is no need for any name change but it does no harm. My comment was only to record a detail on the PR that might help anyone who looks at this patch in future. I agree there is no reason to complicate the size estimation code given that it is unlikely to save a significant amount of space. So, yes, this version looks good. Ship it! 
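A small sketch of the address arithmetic behind this patch may help future readers. The PR description labels the position before the alignment padding as (A), the start of the stub as (B), and the end of the stub as (C); it states that (C) - (B) is a fixed length while (B) - (A) is either 0 or 4 bytes. The model below is an assumption-laden illustration, not the HotSpot code: the 16-byte stub size (4-byte `ldr`, 4-byte `br`, 8-byte literal), the 8-byte alignment, and all names are chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values for the sketch: ldr (4) + br (4) + 64-bit literal (8).
const std::size_t kStubSize = 16;
// Assumed alignment; it explains why the padding is either 0 or 4 bytes
// when the incoming pc is 4-byte aligned.
const std::size_t kStubAlignment = 8;

std::size_t align_up(std::size_t pc, std::size_t alignment) {
  return (pc + alignment - 1) / alignment * alignment;
}

struct EmittedStub {
  std::size_t start;  // position (B): first byte of the stub
  std::size_t end;    // position (C): first byte after the stub
};

// Models an emit step that pads to the alignment and then emits the stub,
// starting from position (A).
EmittedStub emit_stub(std::size_t pc_a) {
  EmittedStub stub;
  stub.start = align_up(pc_a, kStubAlignment);
  stub.end = stub.start + kStubSize;
  return stub;
}

// After emission only (C) is known to the caller. Because the stub body has a
// fixed length, (B) can be recovered without knowing how much padding was added.
std::size_t stub_start_from_end(std::size_t pc_c) {
  return pc_c - kStubSize;
}
```

For example, starting at pc 100 (padding needed) or at pc 104 (no padding) both end at 120, and subtracting the fixed stub size recovers 104 in either case. That is why the relocations can be attached at (B) after the stub has already been emitted.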
------------- PR: https://git.openjdk.org/jdk/pull/11749 From xlinzheng at openjdk.org Mon Feb 6 12:30:51 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 12:30:51 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v2] In-Reply-To: References: <2AQOqhItBwarWAAf3o64HQnJqGLN7hRd96krydvJxeM=.22d31e24-c43d-44fb-b54b-d1ea86893883@github.com> Message-ID: On Tue, 3 Jan 2023 03:38:55 GMT, Fei Yang wrote: >> Xiaolin Zheng has updated the pull request incrementally with one additional commit since the last revision: >> >> __ > > Looks good to me. Better to have another reviewer. Thanks for taking the time to review this patch! @RealFYang @adinn @luhenry My local tests look okay on both AArch64 and RISC-V platforms. ------------- PR: https://git.openjdk.org/jdk/pull/11749 From luhenry at openjdk.org Mon Feb 6 12:37:53 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 6 Feb 2023 12:37:53 GMT Subject: RFR: 8299162: Refactor shared trampoline emission logic [v5] In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 06:08:36 GMT, Xiaolin Zheng wrote: >> After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. >> >> >> [Stub Code] >> ... 
>> >> __ align() // emit nothing or a 4-byte padding >> <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> __ align() // emit nothing or a 4-byte padding >> <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) >> __ ldr() >> __ br() >> __ emit_int64() >> >> >> Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. >> >> So originally: >> The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: >> >> We are at (A) -> >> do an align() -> >> We are at (B) -> >> emit lots of relocations bound to this shared trampoline at (B) -> >> do an emit_trampoline_stub() -> >> We are at (C) >> >> >> After this patch: >> >> We are at (A) -> >> do an emit_trampoline_stub(), which contains an align() already -> >> We are at (C) directly -> >> reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> >> emit lots of relocations bound to this shared trampoline at (B) >> >> >> Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. >> >> Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. >> >> >> Thanks, >> Xiaolin > > Xiaolin Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: > > - Merge remote-tracking branch 'github-openjdk/master' into shared-trampoline > - trampoline_stub_size() -> max_trampoline_stub_size() to reflect its semantics > - license > - Merge branch 'master' into shared-trampoline > - __ > - Refactor and cleanup for shared trampolines Marked as reviewed by luhenry (Committer). ------------- PR: https://git.openjdk.org/jdk/pull/11749 From ysuenaga at openjdk.org Mon Feb 6 12:40:04 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Mon, 6 Feb 2023 12:40:04 GMT Subject: RFR: 8301820: C4819 warnings were reported on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 01:06:32 GMT, David Holmes wrote: >> C4819 warnings were reported when I tried to build JDK on Windows with VS2022. >> This PR contains changes to both HotSpot and client libraries. Let me know if they should be separated. >> >> * HotSpot >> * stubGenerator_x86_64_poly.cpp >> * elfFile.hpp >> * libfontmanager >> * hb.hh >> * libfreetype >> * afblue.c >> >> I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. > > Sorry Yasumasa, but I'd much prefer to see this split into three sub-tasks (or issues) just so we can keep the issues more clearly separated. > > IIUC for stubGenerator_x86_64_poly.cpp we have a multiplication symbol instead of an ascii 'x' (likely pasted from a word processing document). > > For elfFile.hpp there are smart-quotes instead of ascii " > > Thanks. @dholmes-ora I split this into three subtasks: #12435, #12436, and #12437. I hope they will be reviewed and merged, so I am closing this PR (but I will not close the issue on JBS until all subtasks are resolved).
------------- PR: https://git.openjdk.org/jdk/pull/12427 From ysuenaga at openjdk.org Mon Feb 6 12:40:05 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Mon, 6 Feb 2023 12:40:05 GMT Subject: Withdrawn: 8301820: C4819 warnings were reported on Windows In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 09:42:57 GMT, Yasumasa Suenaga wrote: > C4819 warnings were reported when I tried to build JDK on Windows with VS2022. > This PR contains changes to both HotSpot and client libraries. Let me know if they should be separated. > > * HotSpot > * stubGenerator_x86_64_poly.cpp > * elfFile.hpp > * libfontmanager > * hb.hh > * libfreetype > * afblue.c > > I added C4819 to `DISABLED_WARNINGS_microsoft` for libfontmanager and for libfreetype because they are 3rd-party libraries. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12427 From ysuenaga at openjdk.org Mon Feb 6 12:40:44 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Mon, 6 Feb 2023 12:40:44 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows Message-ID: This is a subtask of #12427. I have seen C4819 warnings in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: [...] d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: [...] (932) [...] Unicode [...]
------------- Commit messages: - 8301853: C4819 warnings were reported in HotSpot on Windows Changes: https://git.openjdk.org/jdk/pull/12435/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12435&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301853 Stats: 54 lines in 2 files changed: 0 ins; 0 del; 54 mod Patch: https://git.openjdk.org/jdk/pull/12435.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12435/head:pull/12435 PR: https://git.openjdk.org/jdk/pull/12435 From xlinzheng at openjdk.org Mon Feb 6 12:41:59 2023 From: xlinzheng at openjdk.org (Xiaolin Zheng) Date: Mon, 6 Feb 2023 12:41:59 GMT Subject: Integrated: 8299162: Refactor shared trampoline emission logic In-Reply-To: References: Message-ID: On Wed, 21 Dec 2022 03:39:15 GMT, Xiaolin Zheng wrote: > After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), shared trampoline logic gets a bit verbose. If we can turn to batch emission of trampoline stubs, pre-calculating the total size, and pre-allocating them, then we can remove the CodeBuffer expansion checks each time and clean up the code around. > > > [Stub Code] > ... > > __ align() // emit nothing or a 4-byte padding > <-- (B) emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > __ align() // emit nothing or a 4-byte padding > <-- emit multiple relocations at the pc: __ relocate(, trampoline_stub_Relocation::spec()) > __ ldr() > __ br() > __ emit_int64() > > > Here, the `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`; but the `pc_at_(B) - pc_at_(A)` may be a 0 or 4, which is not a fixed-length value. 
> > So originally: > The logic of the lambda `emit()` inside the `emit_shared_trampolines()` when emitting a shared trampoline: > > We are at (A) -> > do an align() -> > We are at (B) -> > emit lots of relocations bound to this shared trampoline at (B) -> > do an emit_trampoline_stub() -> > We are at (C) > > > After this patch: > > We are at (A) -> > do an emit_trampoline_stub(), which contains an align() already -> > We are at (C) directly -> > reversely calculate the (B) address, for `pc_at_(C) - pc_at_(B)` is a fixed-length value -> > emit lots of relocations bound to this shared trampoline at (B) > > > Theoretically the same. Just a code refactoring and we can remove some checks inside and make the code clean. > > Tested AArch64 hotspot tier1~4 with fastdebug build twice; Tested RISC-V hotspot tier1~4 with fastdebug build on hardware once. > > > Thanks, > Xiaolin This pull request has now been integrated. Changeset: 77305064 Author: Xiaolin Zheng Committer: Ludovic Henry URL: https://git.openjdk.org/jdk/commit/773050647ea49cc4f23bd27b18012dece9f0faa2 Stats: 112 lines in 9 files changed: 50 ins; 29 del; 33 mod 8299162: Refactor shared trampoline emission logic Reviewed-by: fyang, adinn, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/11749 From eosterlund at openjdk.org Mon Feb 6 15:19:25 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 6 Feb 2023 15:19:25 GMT Subject: RFR: 8301874: BarrierSetC2 should assign barrier data to stores Message-ID: BarrierSetC2 should assign barrier data to stores, like it does for loads and atomics. That way, a GC that uses barrier data for stores, can call the super class to generate the loads/stores, instead of copying that code. 
------------- Commit messages: - 8301874: BarrierSetC2 should assign barrier data to stores Changes: https://git.openjdk.org/jdk/pull/12442/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12442&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301874 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12442.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12442/head:pull/12442 PR: https://git.openjdk.org/jdk/pull/12442 From eosterlund at openjdk.org Mon Feb 6 15:45:30 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 6 Feb 2023 15:45:30 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler Message-ID: In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. 
------------- Commit messages: - 8300255: Introduce interface for GC oop verification in the assembler Changes: https://git.openjdk.org/jdk/pull/12443/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12443&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8300255 Stats: 183 lines in 18 files changed: 115 ins; 58 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/12443.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12443/head:pull/12443 PR: https://git.openjdk.org/jdk/pull/12443 From eosterlund at openjdk.org Mon Feb 6 15:45:31 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 6 Feb 2023 15:45:31 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: Message-ID: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> On Mon, 6 Feb 2023 15:32:50 GMT, Erik Österlund wrote: > In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines?
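To illustrate the general shape of such a change (replacing a hard-coded `if (UseZGC)` with a per-collector hook), here is a deliberately simplified sketch. Everything in it is hypothetical: the class names, the 48-bit address limit, and the idea of a collector that keeps metadata in the upper pointer bits are assumptions for the example and do not describe the actual interface added by this PR or the actual ZGC pointer layout.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limits for the sketch only.
const std::uint64_t kAddressBits = 48;
const std::uint64_t kAddressMask = (std::uint64_t{1} << kAddressBits) - 1;

// Hypothetical base policy: the generic check that fits most collectors.
class OopVerificationPolicy {
public:
  virtual ~OopVerificationPolicy() = default;

  // Generic check: null, or an 8-byte aligned value within the address range.
  virtual bool looks_like_valid_oop(std::uint64_t bits) const {
    return bits == 0 || ((bits % 8 == 0) && (bits <= kAddressMask));
  }
};

// Hypothetical collector that stores metadata in the bits above the address.
// It strips that metadata first and then reuses the generic check.
class TaggedPointerPolicy : public OopVerificationPolicy {
public:
  bool looks_like_valid_oop(std::uint64_t bits) const override {
    return OopVerificationPolicy::looks_like_valid_oop(bits & kAddressMask);
  }
};

// Shared code asks the policy instead of branching on a collector flag.
bool verify_oop_bits(const OopVerificationPolicy& policy, std::uint64_t bits) {
  return policy.looks_like_valid_oop(bits);
}
```

With this structure the shared code path contains no collector-specific branches; a collector whose pointers do not pass the generic check supplies its own policy instead.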
------------- PR: https://git.openjdk.org/jdk/pull/12443 From ioi.lam at oracle.com Mon Feb 6 17:19:35 2023 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 6 Feb 2023 09:19:35 -0800 Subject: JVM Flag ergonomics and constraints In-Reply-To: <41d86ae6-f13b-c030-b1e4-3f77ef2752e4@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> <41d86ae6-f13b-c030-b1e4-3f77ef2752e4@oracle.com> Message-ID: On 2/5/2023 11:45 PM, David Holmes wrote: > On 6/02/2023 5:26 pm, Ioi Lam wrote: >> On 2/3/2023 12:40 AM, David Holmes wrote: >>> On 3/02/2023 5:12 pm, Ioi Lam wrote: >>>> >>>> >>>> On 2/1/2023 10:37 PM, David Holmes wrote: >>>>> >>>>> >>>>> On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >>>>>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes >>>>>> > wrote: >>>>>> >>>>>>     That would have to be determined case-by-case. For each ergo >>>>>> flag we >>>>>>     would have to decide what value it is calculating and based >>>>>> on what, >>>>>>     and >>>>>>     determine whether an out-of-range value is an error or >>>>>> something to be >>>>>>     wary of. >>>>>> >>>>>> Sure, what I meant was how to control it? Two sets of >>>>>> FLAG_SET_... macros, one that asserts on constraint violation, >>>>>> one that caps? >>>>> >>>>> No I expect to just see the places where FLAG_SET_ERGO is called >>>>> do the right thing by checking the return value and proceeding >>>>> accordingly. >>>> >>>> David, are you proposing this? >>>> >>>>     int value = calculate_ergo(); >>>>     if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>       value = do_it_again(); >>>>       if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>          huh? >>>>       } >>>>     } >>>> >>>> In the above code, all FLAG_SET_ERGO does is to check the range for >>>> you. It doesn't tell you why the range check fails. 
So the caller >>>> would have to redo the calculation (with no more information, other >>>> than that the last value was no good). What if the new value is >>>> also no good? >>> >>> To me that indicates we don't know what the heck we are calculating! >>> How can you calculate a setting for a flag if you don't even >>> understand the valid range of that flag. Which may mean the max, for >>> example, needs to be exposed programmatically. >>> >>>> With this style, you almost have to wrap every FLAG_SET_ERGO in a >>>> loop, keep trying (in the dark) until you succeed. >>>> >>>> ================== >>>> I think it's better to just assert inside FLAG_SET_ERGO if the >>>> value is out of range. The caller should make sure it has >>>> calculated a correct value. >>> >>> If the calculation uses some externally settable value then an >>> assert won't help if we're in a product build. >>> >> >> We have lots of set_xxx() APIs in various classes that assert when >> given an inappropriate value -- this indicates a programming error. >> So FLAG_SET_ERGO would be no different. >> >> It's the responsibility of the caller of FLAG_SET_ERGO to pass a >> correct value to FLAG_SET_ERGO. How the caller comes up with that >> value (whether it depends on some external input) is irrelevant. >> ======== >> >> To properly catch the programming errors with ERGO setting, we should: >> >> - Change FLAG_SET_ERGO to return "void". The current API gives the >> false impression that the caller needs to check the returned value >> and do something about it. > > That's because they do! You still need to be able to report an error > for product builds if the value is being calculated based on some > passed in runtime value. > That should be done by the code that calls FLAG_SET_ERGO, not by FLAG_SET_ERGO itself. E.g., if an environment variable would cause an invalid ERGO setting, the code that handles this variable should report an error and should NOT call FLAG_SET_ERGO with an invalid value. 
FLAG_SET_ERGO knows nothing about such env variables and cannot print out a meaningful error message. > Putting in an assert is fine but not sufficient. > > David > ----- > >> - Assert if the value is out of range, so we can catch the >> programming error during testing (with debug builds) >> >> It's up to the caller of FLAG_SET_ERGO to determine a value that's >> within the allowed range. >> >> If the caller wants to query the constraints of the flag that it >> wants to set, that can be done via the JVMFlagLimit API. Using >> JVMFlagLimit is much better than blindly setting a value and getting >> an error code that doesn't tell you why it failed. >> >> >>> As I said it depends on exactly what we are using to do the >>> calculation. If it is "closed world" then it is a fatal error to get >>> it wrong and should be detected during testing. Otherwise the code >>> has to deal with the possibility of error. >>> >>>> >>>> Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the >>>> value, but that should be an opt-in. FLAG_SET_ERGO() should not >>>> automatically cap the value. >>> >>> That might be suitable for some flags/calculations. >>> >>> I don't like trying to talk about this in generalities. Each >>> specific situation needs to be examined to understand what is being >>> calculated and how. But it is the callers of FLAG_SET_ERGO that need >>> to know what they are doing. >>> >>> Cheers, >>> David >>> >>>> >>>> Thanks >>>> - Ioi >>>> >> From duke at openjdk.org Mon Feb 6 17:35:18 2023 From: duke at openjdk.org (Scott Gibbons) Date: Mon, 6 Feb 2023 17:35:18 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v9] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. 
> > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Removal of a redundant cmp in inner loop. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/80fbf4ef..a8ecc15a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=07-08 Stats: 7 lines in 1 file changed: 2 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From bkilambi at openjdk.org Mon Feb 6 17:36:55 2023 From: bkilambi at openjdk.org (Bhavana Kilambi) Date: Mon, 6 Feb 2023 17:36:55 GMT Subject: RFR: 8301012: [vectorapi]: Intrinsify CompressBitsV/ExpandBitsV and add the AArch64 SVE backend implementation Message-ID: This patch adds mid-end compiler vector IR nodes for the scalar CompressBits and ExpandBits nodes - CompressBitsV and ExpandBitsV and also adds aarch64 backend support for these nodes using SVE2 instructions (included in the svebitperm feature). 
As there are direct instructions in SVE2 that map to these operations, a huge speed up in performance can be observed and it might significantly benefit all those workloads that extensively run these operations on an SVE2 (with svebitperm feature) supporting machine. All the JTREG tests under "test/jdk/jdk/incubator/vector" pass successfully with this patch on an SVE2 machine. The JMH tests - COMPRESS_BITS and EXPAND_BITS from [1] and [2] were run on a 128-bit vector length, SVE2 and svebitperm supporting aarch64 machine. Following are the gains observed with this patch - Benchmark (length) Mode Cnt Gain IntMaxVector.COMPRESS_BITS 1024 thrpt 15 81.68x IntMaxVector.EXPAND_BITS 1024 thrpt 15 85.65x LongMaxVector.COMPRESS_BITS 1024 thrpt 15 70.78x LongMaxVector.EXPAND_BITS 1024 thrpt 15 76.31x The "Gain" column is the ratio between the throughput of benchmark runs with this patch and that of benchmark runs without this patch. This patch does not change the performance of these operations for all other machines that do not support these instructions or when run on a different architecture. With this patch, vectorization of CompressBits and ExpandBits operations happens only through vectorapi for aarch64. Autovectorization does not take place as the current JDK source does not contain aarch64 backend implementation for scalar CompressBits and ExpandBits. However, this PR - https://github.com/openjdk/jdk/pull/10537 adds aarch64 backend implementation for CompressBits and ExpandBits and may lead to autovectorization of these nodes as well eventually but this PR is a standalone one and not dependent on the scalar implementation. 
[1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/IntMaxVector.java [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/LongMaxVector.java ------------- Commit messages: - 8301012: [vectorapi]: Intrinsify CompressBitsV/ExpandBitsV and add the AArch64 SVE backend implementation Changes: https://git.openjdk.org/jdk/pull/12446/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12446&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301012 Stats: 259 lines in 9 files changed: 254 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12446.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12446/head:pull/12446 PR: https://git.openjdk.org/jdk/pull/12446 From bkilambi at openjdk.org Mon Feb 6 17:39:50 2023 From: bkilambi at openjdk.org (Bhavana Kilambi) Date: Mon, 6 Feb 2023 17:39:50 GMT Subject: RFR: 8301012: [vectorapi]: Intrinsify CompressBitsV/ExpandBitsV and add the AArch64 SVE backend implementation In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 17:23:20 GMT, Bhavana Kilambi wrote: > This patch adds mid-end compiler vector IR nodes for the scalar CompressBits and ExpandBits nodes - CompressBitsV and ExpandBitsV and also adds aarch64 backend support for these nodes using SVE2 instructions (included in the svebitperm feature). As there are direct instructions in SVE2 that map to these operations, a huge speed up in performance can be observed and it might significantly benefit all those workloads that extensively run these operations on an SVE2(with svebitperm feature) supporting machine. > > All the JTREG tests under "test/jdk/jdk/incubator/vector" pass successfully with this patch on an SVE2 machine. > The JMH tests - COMPRESS_BITS and EXPAND_BITS from [1] and [2] were run on a 128-bit vector length, SVE2 and svebitperm supporting aarch64 machine. 
Following are the gains observed with this patch - > > > Benchmark (length) Mode Cnt Gain > IntMaxVector.COMPRESS_BITS 1024 thrpt 15 81.68x > IntMaxVector.EXPAND_BITS 1024 thrpt 15 85.65x > LongMaxVector.COMPRESS_BITS 1024 thrpt 15 70.78x > LongMaxVector.EXPAND_BITS 1024 thrpt 15 76.31x > > > The "Gain" column is the ratio between the throughput of benchmark runs with this patch and that of benchmark runs without this patch. This patch does not change the performance of these operations for all other machines that do not support these instructions or when run on a different architecture. > With this patch, vectorization of CompressBits and ExpandBits operations happens only through vectorapi for aarch64. Autovectorization does not take place as the current JDK source does not contain aarch64 backend implementation for scalar CompressBits and ExpandBits. However, this PR - https://github.com/openjdk/jdk/pull/10537 adds aarch64 backend implementation for CompressBits and ExpandBits and may lead to autovectorization of these nodes as well eventually but this PR is a standalone one and not dependent on the scalar implementation. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/IntMaxVector.java > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/LongMaxVector.java Hi @jatin-bhateja, just in case you may wonder why I added you as a contributor, I used some of your code from an older PR on the panama-vector repo - https://github.com/openjdk/panama-vector/pull/195 in this patch. Thanks! 
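For readers unfamiliar with the two operations named in this PR, the following is a plain scalar reference model, written as a sketch rather than taken from the patch. It assumes the semantics match Java's `Integer.compress` and `Integer.expand`; the function names `compress_bits`, `expand_bits`, and `compress_lanes` are made up for this illustration, and the lane-wise helper only models what a vector node would compute per element.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Compress: collect the bits of src selected by mask and pack them,
// in order, into the low-order bits of the result.
std::uint32_t compress_bits(std::uint32_t src, std::uint32_t mask) {
  std::uint32_t result = 0;
  int out = 0;  // next free bit position in the result
  for (int i = 0; i < 32; ++i) {
    if ((mask >> i) & 1u) {
      result |= ((src >> i) & 1u) << out;
      ++out;
    }
  }
  return result;
}

// Expand: the inverse scatter. Take the low-order bits of src, in order,
// and place them at the positions where mask has a 1 bit.
std::uint32_t expand_bits(std::uint32_t src, std::uint32_t mask) {
  std::uint32_t result = 0;
  int in = 0;  // next bit of src to consume
  for (int i = 0; i < 32; ++i) {
    if ((mask >> i) & 1u) {
      result |= ((src >> in) & 1u) << i;
      ++in;
    }
  }
  return result;
}

// Lane-wise model of the vector form: each lane is compressed with the
// mask held in the corresponding lane of the second operand.
template <std::size_t N>
std::array<std::uint32_t, N> compress_lanes(
    const std::array<std::uint32_t, N>& src,
    const std::array<std::uint32_t, N>& mask) {
  std::array<std::uint32_t, N> result{};
  for (std::size_t lane = 0; lane < N; ++lane) {
    result[lane] = compress_bits(src[lane], mask[lane]);
  }
  return result;
}
```

The bit-by-bit loops make clear why a single hardware instruction per lane can produce gains of the size reported above: the software fallback touches every bit position of every element.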
------------- PR: https://git.openjdk.org/jdk/pull/12446 From duke at openjdk.org Mon Feb 6 19:29:08 2023 From: duke at openjdk.org (Scott Gibbons) Date: Mon, 6 Feb 2023 19:29:08 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v10] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 
24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Remove redundant cmp from inner loop of encode ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/a8ecc15a..cb271c00 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=08-09 Stats: 7 lines in 1 file changed: 1 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From dholmes at openjdk.org Mon Feb 6 22:12:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 22:12:50 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is subtask of #12427 . > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? Looks good and trivial. Thanks for fixing. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12435 From dholmes at openjdk.org Mon Feb 6 23:32:47 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 6 Feb 2023 23:32:47 GMT Subject: RFR: JDK-8301480: Replace NULL with nullptr in os/posix [v2] In-Reply-To: <5A0_x2-25DljN6avzpJhMvCQdL5TfVDD4A9dJ0w8VvM=.3c49b7f0-4c37-468f-8d7d-386283738dc6@github.com> References: <5A0_x2-25DljN6avzpJhMvCQdL5TfVDD4A9dJ0w8VvM=.3c49b7f0-4c37-468f-8d7d-386283738dc6@github.com> Message-ID: On Mon, 6 Feb 2023 09:39:17 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/posix. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Fix dholmes' bug Update looks good. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12316 From duke at openjdk.org Tue Feb 7 00:12:21 2023 From: duke at openjdk.org (Scott Gibbons) Date: Tue, 7 Feb 2023 00:12:21 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 
24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add algorithm comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/cb271c00..0a274e5a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=09-10 Stats: 16 lines in 1 file changed: 14 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From david.holmes at oracle.com Tue Feb 7 00:48:30 2023 From: david.holmes at oracle.com (David Holmes) Date: Tue, 7 Feb 2023 10:48:30 +1000 Subject: JVM Flag ergonomics and constraints In-Reply-To: References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> <41d86ae6-f13b-c030-b1e4-3f77ef2752e4@oracle.com> Message-ID: <471c6789-cef8-988c-967d-ff044ccb3a89@oracle.com> On 7/02/2023 3:19 am, Ioi Lam wrote: > > > On 2/5/2023 11:45 PM, David Holmes wrote: >> On 6/02/2023 5:26 pm, Ioi Lam wrote: >>> On 2/3/2023 12:40 AM, David Holmes wrote: >>>> On 3/02/2023 5:12 pm, Ioi Lam wrote: >>>>> >>>>> >>>>> On 2/1/2023 10:37 PM, David Holmes wrote: >>>>>> >>>>>> >>>>>> On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >>>>>>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes >>>>>>> > wrote: >>>>>>> >>>>>>>     That would have to be determined case-by-case. For each ergo >>>>>>> flag we >>>>>>>     would have to decide what value it is calculating and based >>>>>>> on what, >>>>>>>     and >>>>>>>     determine whether an out-of-range value is an error or >>>>>>> something to be >>>>>>>     wary of. >>>>>>> >>>>>>> Sure, what I meant was how to control it? Two sets of >>>>>>> FLAG_SET_... 
macros, one that asserts on constraint violation, >>>>>>> one that caps? >>>>>> >>>>>> No I expect to just see the places where FLAG_SET_ERGO is called >>>>>> do the right thing by checking the return value and proceeding >>>>>> accordingly. >>>>> >>>>> David, are you proposing this? >>>>> >>>>>     int value = calculate_ergo(); >>>>>     if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>>       value = do_it_again(); >>>>>       if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>>          huh? >>>>>       } >>>>>     } >>>>> >>>>> In the above code, all FLAG_SET_ERGO does is to check the range for >>>>> you. It doesn't tell you why the range check fails. So the caller >>>>> would have to redo the calculation (with no more information, other >>>>> than that the last value was no good). What if the new value is >>>>> also no good? >>>> >>>> To me that indicates we don't know what the heck we are calculating! >>>> How can you calculate a setting for a flag if you don't even >>>> understand the valid range of that flag. Which may mean the max, for >>>> example, needs to be exposed programmatically. >>>> >>>>> With this style, you almost have to wrap every FLAG_SET_ERGO in a >>>>> loop, keep trying (in the dark) until you succeed. >>>>> >>>>> ================== >>>>> I think it's better to just assert inside FLAG_SET_ERGO if the >>>>> value is out of range. The caller should make sure it has >>>>> calculated a correct value. >>>> >>>> If the calculation uses some externally settable value then an >>>> assert won't help if we're in a product build. >>>> >>> >>> We have lots of set_xxx() APIs in various classes that assert when >>> given an inappropriate value -- this indicates a programming error. >>> So FLAG_SET_ERGO would be no different. >>> >>> It's the responsibility of the caller of FLAG_SET_ERGO to pass a >>> correct value to FLAG_SET_ERGO. How the caller comes up with that >>> value (whether it depends on some external input) is irrelevant. 
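The calling styles debated in this thread can be contrasted with a toy model. This is only a sketch under stated assumptions: `BoundedFlag`, `SetResult`, `set_checked`, `set_asserting`, and `set_capped` are hypothetical names invented here and do not correspond to the real FLAG_SET_ERGO macro or any HotSpot flag API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical flag with a range constraint (not a HotSpot type).
struct BoundedFlag {
  std::int64_t value;
  std::int64_t min;
  std::int64_t max;
};

enum class SetResult { Ok, OutOfRange };

// Style 1: the setter reports a status and leaves the flag untouched on
// failure. The caller is expected to check the result and recover.
SetResult set_checked(BoundedFlag& flag, std::int64_t v) {
  if (v < flag.min || v > flag.max) {
    return SetResult::OutOfRange;
  }
  flag.value = v;
  return SetResult::Ok;
}

// Style 2: an out-of-range value is treated as a programming error. The
// assert only fires in builds where assertions are enabled, which is the
// product-build concern raised in the thread.
void set_asserting(BoundedFlag& flag, std::int64_t v) {
  assert(v >= flag.min && v <= flag.max && "ergonomic value out of range");
  flag.value = v;
}

// Opt-in variant: the caller explicitly asks for the value to be capped
// to the flag's range instead of failing.
void set_capped(BoundedFlag& flag, std::int64_t v) {
  flag.value = std::max(flag.min, std::min(v, flag.max));
}
```

In this model the disagreement is about who owns the range check: with `set_checked` the setter validates and the caller reacts, while with `set_asserting` the caller must already know `min` and `max` before calling.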
>>> ======== >>> >>> To properly catch the programming errors with ERGO setting, we should: >>> >>> - Change FLAG_SET_ERGO to return "void". The current API gives the >>> false impression that the caller needs to check the returned value >>> and do something about it. >> >> That's because they do! You still need to be able to report an error >> for product builds if the value is being calculated based on some >> passed in runtime value. >> > > That should be done by the code that calls FLAG_SET_ERGO, not by > FLAG_SET_ERGO itself. E.g., if an environment variable would cause an > invalid ERGO setting, the code that handles this variable should report > an error and should NOT call FLAG_SET_ERGO with an invalid value. I'm not asking FLAG_SET_ERGO to report a specific error, just that the supplied value was out-of-range. Your position, as stated, is that the caller should validate everything before calling FLAG_SET_ERGO. My position was that we could still support a usage pattern where we assume everything will be fine and we recover if it is not e.g. newVal = simpleCalc(x, y, z); // works 99% of time if (!FLAG_SET_ERGO(flag, newVal)) { newVal = complexCalc(x, y, z); // guarantees in range FLAG_SET_ERGO(flag, newVal); } With what you propose the caller has to range-check newVal, and then FLAG_SET_ERGO will range-check it again. > FLAG_SET_ERGO knows nothing about such env variables and cannot print > out a meaningful error message. I'm not wanting it to print anything. David ----- > > >> Putting in an assert is fine but not sufficient. >> >> David >> ----- >> >>> - Assert if the value is out of range, so we can catch the >>> programming error during testing (with debug builds) >>> >>> It's up to the caller of FLAG_SET_ERGO to determine a value that's >>> within the allowed range. >>> >>> If the caller wants to query the constraints of the flag that it >>> wants to set, that can be done via the JVMFlagLimit API. 
Using >>> JVMFlagLimit is much better than blindly setting a value and getting >>> an error code that doesn't tell you why it failed. >>> >>> >>>> As I said it depends on exactly what we are using to do the >>>> calculation. If it is "closed world" then it is a fatal error to get >>>> it wrong and should be detected during testing. Otherwise the code >>>> has to deal with the possibility of error. >>>> >>>>> >>>>> Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the >>>>> value, but that should be an opt-in. FLAG_SET_ERGO() should not >>>>> automatically cap the value. >>>> >>>> That might be suitable for some flags/calculations. >>>> >>>> I don't like trying to talk about this in generalities. Each >>>> specific situation needs to be examined to understand what is being >>>> calculated and how. But it is the callers of FLAG_SET_ERGO that need >>>> to know what they are doing. >>>> >>>> Cheers, >>>> David >>>> >>>>> >>>>> Thanks >>>>> - Ioi >>>>> >>> > From sviswanathan at openjdk.org Tue Feb 7 00:51:46 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 7 Feb 2023 00:51:46 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 7 Feb 2023 00:12:21 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 
93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add algorithm comments src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2720: > 2718: __ vpshufb(xmm5, xmm9, xmm1, Assembler::AVX_256bit); > 2719: // If the and of the two is non-zero, we have an invalid input character > 2720: __ vptest(xmm3, xmm5); For isURL, it looks to me that the vptest will fail for URL valid input 0x5F ("_"): upper_nibble = 0x5; lower_nibble = 0xF; lut_lo_URL = 0x1B; (corresponding to 0xF) lut_hi = 0x8; (corresponding to 0x5) lut_lo_URL & lut_hi = 0x8; (not zero, taken as not allowable and so exit from loop) Could you please verify on your end and fix this? My understanding is that this is happening because 5 and 7 upper nibble get the same encoding 0x8. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jwaters at openjdk.org Tue Feb 7 01:21:55 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 7 Feb 2023 01:21:55 GMT Subject: RFR: 8297912: HotSpot Style Guide should permit alignas (Second Proposal Attempt) [v2] In-Reply-To: References: <58rvZjSeJKCKUkZd4xEsQxaQjgGWs-V4IPBpSNutc0g=.883e96aa-b385-4019-80f5-69b56e732e20@github.com> Message-ID: On Thu, 8 Dec 2022 16:37:35 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into style >> - HotSpot Style Guide > > The MSVC implementation of `ATTRIBUTE_ALIGNED` using `__declspec(align(N))` has the same > limitation of needing the alignment value, and not being usable for types. 
> > I happened to run into both the limitation to alignment values problem and > the lack of an XLC definition for ATTRIBUTE_ALIGNED today. @kimbarrett Now that alignas has officially been approved, is there anything I need to do with [8250269: Replace ATTRIBUTE_ALIGNED with alignas](https://github.com/openjdk/jdk/pull/11431)? I'm not sure if there's anything else needed on my part for that issue ------------- PR: https://git.openjdk.org/jdk/pull/11446 From dholmes at openjdk.org Tue Feb 7 02:12:43 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Feb 2023 02:12:43 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v3] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 09:32:25 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue. >> >> Testing: jdk_jfr >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > revert uncalled for changes Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12388 From dholmes at openjdk.org Tue Feb 7 02:12:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Feb 2023 02:12:46 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v2] In-Reply-To: References: <7oSkKJNaPRjItariUnPQItEvDu1JU72EXaOvYU79qKA=.136ef5be-741d-44ce-b04e-19048e094af6@github.com> <6Ej-RlKCJbvtXd2kt6D52Ne7QIQLkvKU6K3UItRRYUc=.e0457d0a-f45c-4904-97b6-561d05a18744@github.com> Message-ID: On Mon, 6 Feb 2023 09:40:00 GMT, Markus Grönlund wrote: >> It will? Don't you still have the problem that the thread can be seen after ` thread->set_done_attaching_via_jni();` but before `post_thread_start_event(thread);` where the id will be set? > > Yes. 
The JFR_JVM_THREAD_ID(thread) is a lazy assignment of the thread id. It can be done by the thread itself, or the periodic task thread. The premise for a JavaThread is that there exist a threadObj. Now I get it. The zero thread id only arises when the `threadObj` is null and that is no longer possible when we skip JNI-attaching threads. ------------- PR: https://git.openjdk.org/jdk/pull/12388 From dholmes at openjdk.org Tue Feb 7 02:33:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Feb 2023 02:33:46 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is subtask of #12427 . > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? I note from the related PRs that they are not being fixed due to "building with the 'wrong' locale", but I would still want to see this change integrated for hotspot because the problematic characters in src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp do not display correctly in text editors (the multiplication sign displays for me as a capital A with a tilde over it). 
------------- PR: https://git.openjdk.org/jdk/pull/12435 From dholmes at openjdk.org Tue Feb 7 02:37:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Feb 2023 02:37:56 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v4] In-Reply-To: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> References: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> Message-ID: On Mon, 6 Feb 2023 09:44:27 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Can't directly compare uintptr_t, need reinterpret_cast Changes requested by dholmes (Reviewer).
src/hotspot/os/windows/gc/z/zVirtualMemory_windows.cpp line 142: > 140: const uintptr_t res = ZMapper::reserve(addr, size); > 141: > 142: assert(res == addr || res == reinterpret_cast<uintptr_t>(nullptr), "Should not reserve other memory than requested"); Seems to me if the returned value is uintptr_t then this should be a check against zero not NULL/nullptr. ------------- PR: https://git.openjdk.org/jdk/pull/12317 From ysuenaga at openjdk.org Tue Feb 7 02:50:45 2023 From: ysuenaga at openjdk.org (Yasumasa Suenaga) Date: Tue, 7 Feb 2023 02:50:45 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: <-MoCrf76vTgM4KMF5NPeaNXzZhSvfLkAUt7e88cbBD4=.11de7587-8524-464e-820f-f22fc94d914f@github.com> On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is a subtask of #12427. > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? We continue to discuss the encoding of source files in #12436, but I want to merge this to the master branch because this PR changes comments only. ------------- PR: https://git.openjdk.org/jdk/pull/12435 From sviswanathan at openjdk.org Tue Feb 7 02:52:46 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 7 Feb 2023 02:52:46 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 7 Feb 2023 00:12:21 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add algorithm comments src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2227: > 2225: > 2226: // lut_roll URL > 2227: __ emit_data64(0xb9b9bfbf04111000, relocInfo::none); The lut_roll URL doesn't seem to be correct: 0x5F (URL base64 ASCII for "/") would need an offset of -20H i.e. 0xEC. However the others with upper nibble as 5 need an offset of -65H i.e. 0xBF. It looks to me that the adjustment for 5F should be -4 instead of -1 at line 2722. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From fjiang at openjdk.org Tue Feb 7 02:53:42 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 02:53:42 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo Message-ID: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), we should find a stable way to detect and auto-enable these supported ISA extensions in JVM.
[1] has proposed a way to parse supported extensions through /proc/cpuinfo or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions in the same way. Here is an example of /proc/cpuinfo from Ubuntu 20.04 in QEMU-SYSTEM: ubuntu at ubuntu:~$ uname -a Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux ubuntu at ubuntu:~$ cat /proc/cpuinfo processor : 0 hart : 2 isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc mmu : sv48 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html Testing: - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build ------------- Commit messages: - code adjustment - adjust comments - update copyright - add zihintpuase - fix calling of strdup - Detect isa extensions from cpuinfo isa string Changes: https://git.openjdk.org/jdk/pull/12343/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8284196 Stats: 210 lines in 3 files changed: 130 ins; 65 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/12343.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12343/head:pull/12343 PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Tue Feb 7 03:07:16 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 03:07:16 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: > Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). > As many standard multi-letter ISA extensions are ratified (e.g. 
Zba/Zbb/Zbc/Zbs), > we should find a stable way to detect and auto-enable these supported ISA extensions > in JVM. [1] has proposed a way to parse supported extensions through /proc/cpuinfo > or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions > in the same way. > > Here is an example of /proc/cpuinfo from Ubuntu 20.04 in QEMU-SYSTEM: > > > ubuntu at ubuntu:~$ uname -a > Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux > ubuntu at ubuntu:~$ cat /proc/cpuinfo > processor : 0 > hart : 2 > isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc > mmu : sv48 > > > 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html > > > Testing: > - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: fix typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12343/files - new: https://git.openjdk.org/jdk/pull/12343/files/7bf90b7c..59c33e2f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12343.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12343/head:pull/12343 PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Tue Feb 7 04:09:45 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 04:09:45 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 11:30:03 GMT, Gui Cao wrote: >> HI, >> >> The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. 
and format some code. >> ## Testing: >> - all tier1 on unmatched board without new failures > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revert use lwu Marked as reviewed by fjiang (Author). ------------- PR: https://git.openjdk.org/jdk/pull/12425 From duke at openjdk.org Tue Feb 7 04:32:43 2023 From: duke at openjdk.org (Amit Kumar) Date: Tue, 7 Feb 2023 04:32:43 GMT Subject: RFR: 8298413: [s390] CPUInfoTest fails due to uppercase feature string In-Reply-To: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> References: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> Message-ID: On Mon, 30 Jan 2023 09:34:18 GMT, Amit Kumar wrote: > CPUInfoTest.java is failing on s390 due to an incorrect feature string. Feature Vector Enhancement is shown as VEnh2 instead of venh2. Beside that out-of-support date/(tbd) was not being included even if it was present there so this issue is also addressed by this fix. Hi @RealLucy and @TheRealMDoerr, would you like to review this PR as well. ------------- PR: https://git.openjdk.org/jdk/pull/12286 From gcao at openjdk.org Tue Feb 7 04:40:45 2023 From: gcao at openjdk.org (Gui Cao) Date: Tue, 7 Feb 2023 04:40:45 GMT Subject: RFR: 8301818: RISC-V: Factor out function mvw from MacroAssembler [v2] In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 11:30:03 GMT, Gui Cao wrote: >> HI, >> >> The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. >> ## Testing: >> - all tier1 on unmatched board without new failures > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revert use lwu Thanks all for the review. 
------------- PR: https://git.openjdk.org/jdk/pull/12425 From gcao at openjdk.org Tue Feb 7 04:51:51 2023 From: gcao at openjdk.org (Gui Cao) Date: Tue, 7 Feb 2023 04:51:51 GMT Subject: Integrated: 8301818: RISC-V: Factor out function mvw from MacroAssembler In-Reply-To: References: Message-ID: On Sat, 4 Feb 2023 06:42:33 GMT, Gui Cao wrote: > HI, > > The two functions of mvw and mv overlap, we can use mv to replace mvw, so the mvw function can be removed. and format some code. > ## Testing: > - all tier1 on unmatched board without new failures This pull request has now been integrated. Changeset: c04a982e Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae Stats: 30 lines in 9 files changed: 0 ins; 2 del; 28 mod 8301818: RISC-V: Factor out function mvw from MacroAssembler Reviewed-by: luhenry, fyang, fjiang ------------- PR: https://git.openjdk.org/jdk/pull/12425 From mdoerr at openjdk.org Tue Feb 7 04:54:45 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 7 Feb 2023 04:54:45 GMT Subject: RFR: 8298413: [s390] CPUInfoTest fails due to uppercase feature string In-Reply-To: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> References: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> Message-ID: On Mon, 30 Jan 2023 09:34:18 GMT, Amit Kumar wrote: > CPUInfoTest.java is failing on s390 due to an incorrect feature string. Feature Vector Enhancement is shown as VEnh2 instead of venh2. Beside that out-of-support date/(tbd) was not being included even if it was present there so this issue is also addressed by this fix. LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12286 From jcking at openjdk.org Tue Feb 7 04:54:45 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 04:54:45 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King Poke. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From dholmes at openjdk.org Tue Feb 7 05:05:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 7 Feb 2023 05:05:44 GMT Subject: RFR: 8298091: Dump native instruction along with nmethod name when using Compiler.codelist In-Reply-To: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> References: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> Message-ID: On Thu, 2 Feb 2023 02:12:06 GMT, Yi Yang wrote: > Dump native instruction along with nmethod name when using Compiler.codelist. > > e.g. > jcmd Compiler.codelist decode= > > Output: > nmethod_name [addr_begin, code_begin, code_end] > Encouraging the compiler folk to take a look at this. 
------------- PR: https://git.openjdk.org/jdk/pull/12381 From mdoerr at openjdk.org Tue Feb 7 05:16:42 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 7 Feb 2023 05:16:42 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> References: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> Message-ID: On Mon, 6 Feb 2023 15:36:13 GMT, Erik Österlund wrote: > PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? The default version of check_oop doesn't work for ZGC. I suggest to exclude the PPC64 parts here and ship them with the generational ZGC PR. That will reduce effort. Does that make sense? ------------- PR: https://git.openjdk.org/jdk/pull/12443 From fyang at openjdk.org Tue Feb 7 05:38:45 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 7 Feb 2023 05:38:45 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: On Tue, 7 Feb 2023 03:07:16 GMT, Feilong Jiang wrote: >> Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). >> As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), >> we should find a stable way to detect and auto-enable these supported ISA extensions >> in JVM. [1] has proposed a way to parse supported extensions through /proc/cpuinfo >> or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions >> in the same way.
>> >> Here is an example of /proc/cpuinfo from Ubuntu 20.04 in QEMU-SYSTEM: >> >> >> ubuntu at ubuntu:~$ uname -a >> Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux >> ubuntu at ubuntu:~$ cat /proc/cpuinfo >> processor : 0 >> hart : 2 >> isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc >> mmu : sv48 >> >> >> 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html >> >> >> Testing: >> - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build > > Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: > > fix typo Changes requested by fyang (Reviewer). src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 100: > 98: } > 99: } > 100: Can you add a comment here about how the 'isa' substring with those extensions look like? src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 102: > 100: > 101: void VM_Version::get_isa() { > 102: char isa_buf[500]; Better to use 512 instead of 500 to be consistent with other places. 
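For illustration only, here is a minimal sketch of how an 'isa' value like the one quoted above could be split into its parts. It is not the code from this PR: `IsaInfo` and `parse_isa_string` are hypothetical names, and the sketch assumes the lower-case, underscore-separated layout shown in the /proc/cpuinfo example (an rv32/rv64 prefix, then single-letter extensions, then multi-letter extensions separated by '_').

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder for the parsed result; not a HotSpot type.
struct IsaInfo {
  int xlen = 0;                           // 32 or 64, from the "rv32"/"rv64" prefix
  std::string base_letters;               // single-letter extensions, e.g. "imafdch"
  std::vector<std::string> multi_letter;  // e.g. "zicsr", "zba", "sstc"
};

// Splits the value of the "isa" field (assumed layout: rv64<letters>_<ext>_<ext>...).
// Returns false if the string does not start with "rv32" or "rv64".
bool parse_isa_string(const std::string& isa, IsaInfo& out) {
  if (isa.compare(0, 4, "rv32") == 0) {
    out.xlen = 32;
  } else if (isa.compare(0, 4, "rv64") == 0) {
    out.xlen = 64;
  } else {
    return false;
  }

  // Everything between the prefix and the first '_' is the single-letter part.
  std::size_t pos = isa.find('_', 4);
  out.base_letters = (pos == std::string::npos) ? isa.substr(4)
                                                : isa.substr(4, pos - 4);

  // Each '_' introduces one multi-letter extension name.
  while (pos != std::string::npos) {
    std::size_t next = isa.find('_', pos + 1);
    std::string ext = (next == std::string::npos)
                          ? isa.substr(pos + 1)
                          : isa.substr(pos + 1, next - pos - 1);
    if (!ext.empty()) {
      out.multi_letter.push_back(ext);
    }
    pos = next;
  }
  return true;
}
```

A caller could then look for names such as "zba" or "zbb" in `multi_letter` before enabling the corresponding flag; how the real patch maps names to flags is not shown here.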
------------- PR: https://git.openjdk.org/jdk/pull/12343 From ioi.lam at oracle.com Tue Feb 7 05:53:08 2023 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 6 Feb 2023 21:53:08 -0800 Subject: JVM Flag ergonomics and constraints In-Reply-To: <471c6789-cef8-988c-967d-ff044ccb3a89@oracle.com> References: <1e05de2b-5f03-7ff2-6a47-176df9046588@oracle.com> <637c4388-4c02-c654-6166-23491d797fce@oracle.com> <678fc49c-d85c-3948-0a7d-a3802b909dd1@oracle.com> <9a4fb118-17ba-8e04-8ce5-737f221cb5f4@oracle.com> <41d86ae6-f13b-c030-b1e4-3f77ef2752e4@oracle.com> <471c6789-cef8-988c-967d-ff044ccb3a89@oracle.com> Message-ID: On 2/6/2023 4:48 PM, David Holmes wrote: > On 7/02/2023 3:19 am, Ioi Lam wrote: >> >> >> On 2/5/2023 11:45 PM, David Holmes wrote: >>> On 6/02/2023 5:26 pm, Ioi Lam wrote: >>>> On 2/3/2023 12:40 AM, David Holmes wrote: >>>>> On 3/02/2023 5:12 pm, Ioi Lam wrote: >>>>>> >>>>>> >>>>>> On 2/1/2023 10:37 PM, David Holmes wrote: >>>>>>> >>>>>>> >>>>>>> On 2/02/2023 3:25 pm, Thomas Stüfe wrote: >>>>>>>> On Thu, Feb 2, 2023 at 5:30 AM David Holmes >>>>>>>> > wrote: >>>>>>>> >>>>>>>> That would have to be determined case-by-case. For each >>>>>>>> ergo flag we >>>>>>>> would have to decide what value it is calculating and based >>>>>>>> on what, >>>>>>>> and >>>>>>>> determine whether an out-of-range value is an error or >>>>>>>> something to be >>>>>>>> wary of. >>>>>>>> >>>>>>>> Sure, what I meant was how to control it? Two sets of >>>>>>>> FLAG_SET_... macros, one that asserts on constraint violation, >>>>>>>> one that caps? >>>>>>> >>>>>>> No I expect to just see the places where FLAG_SET_ERGO is called >>>>>>> do the right thing by checking the return value and proceeding >>>>>>> accordingly. >>>>>> >>>>>> David, are you proposing this? >>>>>> >>>>>> int value = calculate_ergo(); >>>>>> if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>>> value = do_it_again() >>>>>>
if (FLAG_SET_ERGO (TheFlag, value) == error) { >>>>>> huh? >>>>>> } >>>>>> } >>>>>> >>>>>> In the above code, all FLAG_SET_ERGO does is to check the range >>>>>> for you. It doesn't tell you why the range check fails. So the >>>>>> caller would have to redo the calculation (with no more >>>>>> information, other than that the last value was no good). What if >>>>>> the new value is also no good? >>>>> >>>>> To me that indicates we don't know what the heck we are >>>>> calculating! How can you calculate a setting for a flag if you >>>>> don't even understand the valid range of that flag. Which may mean >>>>> the max, for example, needs to be exposed programmatically. >>>>> >>>>>> With this style, you almost have to wrap every FLAG_SET_ERGO in >>>>>> a loop, keep trying (in the dark) until you succeed. >>>>>> >>>>>> ================== >>>>>> I think it's better to just assert inside FLAG_SET_ERGO if the >>>>>> value is out of range. The caller should make sure it has >>>>>> calculated a correct value. >>>>> >>>>> If the calculation uses some externally settable value then an >>>>> assert won't help if we're in a product build. >>>>> >>>> >>>> We have lots of set_xxx() APIs in various classes that assert when >>>> given an inappropriate value -- this indicates a programming error. >>>> So FLAG_SET_ERGO would be no different. >>>> >>>> It's the responsibility of the caller of FLAG_SET_ERGO to pass a >>>> correct value to FLAG_SET_ERGO . How the caller comes up with that >>>> value (whether it depends on some external input) is irrelevant. >>>> ======== >>>> >>>> To properly catch the programming errors with ERGO setting, we should: >>>> >>>> - Change FLAG_SET_ERGO to return "void". The current API gives the >>>> false impression that the caller needs to check the returned value >>>> and do something about it. >>> >>> That's because they do!
You still need to be able to report an error >>> for product builds if the value is being calculated based on some >>> passed in runtime value. >>> >> >> That should be done by the code that calls FLAG_SET_ERGO, not by >> FLAG_SET_ERGO itself. E.g., if an environment variable would cause an >> invalid ERGO setting, the code that handles this variable should >> report an error and should NOT call FLAG_SET_ERGO with an invalid value. > > I'm not asking FLAG_SET_ERGO to report a specific error, just that the > supplied value was out-of-range. Your position, as stated, is that the > caller should validate everything before calling FLAG_SET_ERGO. My > position was that we could still support a usage pattern where we > assume everything will be fine and we recover if it is not e.g. > > newVal = simpleCalc(x, y, z); // works 99% of time > if (!FLAG_SET_ERGO(flag, newVal)) { > newVal = complexCalc(x, y, z); // guarantees in range > FLAG_SET_ERGO(flag, newVal); > } > > I don't think this coding style is useful. None of the existing code is structured this way. Instead of arguing abstractly, let's say we are calling FLAG_SET_ERGO to set the InitialHeapSize. The code that ergonomically sets InitialHeapSize should know the valid range for this flag. It should be in the same module of code that implements the range constraint function for the InitialHeapSize. This module should use its own knowledge to choose a valid value for FLAG_SET_ERGO(InitialHeapSize). > With what you propose the caller has to range-check newVal, and then > FLAG_SET_ERGO will range-check it again. That's not true. The ERGO setter of InitialHeapSize will use an algorithm that calculates a value that would be guaranteed to fit in the allowed range. The algorithm itself doesn't necessarily have to call the range check function: size_t n = calculate_init_heap(); FLAG_SET_ERGO(InitialHeapSize, n);
// assert(InitialHeapSizeConstraintFunc(n, false) == JVMFlag::SUCCESS, ...); The assert inside FLAG_SET_ERGO is exactly what the caller wants -- to ensure that its algorithm has produced a correct value. Even if the ERGO setter really wants to call the range checker function, it can already do so: size_t n = calculate_init_heap(); if (InitialHeapSizeConstraintFunc(n, false) != JVMFlag::SUCCESS) { n = .....; } FLAG_SET_ERGO(InitialHeapSize, n); But we shouldn't dictate this style through the FLAG_SET_ERGO API. Also, your proposal will not allow FLAG_SET_ERGO to assert. This means that all existing calls of FLAG_SET_ERGO (over 150 of them) need to be rewritten to catch the problem reported in JDK-8224980: JVMFlag::Error err = FLAG_SET_ERGO(flag, newVal); assert(err == JVMFlag::SUCCESS, "sanity"); >> FLAG_SET_ERGO knows nothing about such env variables and cannot print >> out a meaningful error message. > > I'm not wanting it to print anything. > > David > ----- > >> >> >>> Putting in an assert is fine but not sufficient. >>> >>> David >>> ----- >>> >>>> - Assert if the value is out of range, so we can catch the >>>> programming error during testing (with debug builds) >>>> >>>> It's up to the caller of FLAG_SET_ERGO to determine a value that's >>>> within the allowed range. >>>> >>>> If the caller wants to query the constraints of the flag that it >>>> wants to set, that can be done via the JVMFlagLimit API. Using >>>> JVMFlagLimit is much better than blindly setting a value and >>>> getting an error code that doesn't tell you why it failed. >>>> >>>> >>>>> As I said it depends on exactly what we are using to do the >>>>> calculation. If it is "closed world" then it is a fatal error to >>>>> get it wrong and should be detected during testing. Otherwise the >>>>> code has to deal with the possibility of error. >>>>> >>>>>> >>>>>> Maybe we can add a new FLAG_SET_ERGO_CAPPED() that would cap the >>>>>> value, but that should be an opt-in.
FLAG_SET_ERGO() should not >>>>>> automatically cap the value. >>>>> >>>>> That might be suitable for some flags/calculations. >>>>> >>>>> I don't like trying to talk about this in generalities. Each >>>>> specific situation needs to be examined to understand what is >>>>> being calculated and how. But it is the callers of FLAG_SET_ERGO >>>>> that need to know what they are doing. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> >>>>>> Thanks >>>>>> - Ioi >>>>>> >>>> >> From haosun at openjdk.org Tue Feb 7 06:17:14 2023 From: haosun at openjdk.org (Hao Sun) Date: Tue, 7 Feb 2023 06:17:14 GMT Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64 [v25] In-Reply-To: References: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> Message-ID: On Mon, 6 Feb 2023 11:50:57 GMT, Doug Simon wrote: >> Thanks for reporting this. @dougxc >> >> Yes. I encountered several jtreg failures with Graal, which is built with PAC-RET enabled JDK. >> >> As I see it, the straightforward fix is to disable JVMCI if `_rop_protection` is parsed as "true" finally, since Graal doesn't supports PAC-RET currently. >> That is similar to the way `JVMCIGlobals::check_jvmci_supported_gc` does, i.e. if one GC except `G1/Serial/Parallel` is configured, we disable JVMCI. >> WDYT > > The problem with `JVMCIGlobals::check_jvmci_supported_gc` is that it does not give the compiler a chance to specify whether it supports a given GC. > Instead, we want a JVMCI upcall like `jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.isGCSupported(int)`. Thanks for your reply. But I'm afraid I'm still a bit confused. 1) flag `_rop_protection` is set inside function `VM_Version::initialize()`. The execution flow is `create_vm -> init_globals -> VM_Version_init -> VM_Version::initialize()`. 
2) I suppose compilers (including JVMCI compiler) are initialized after `init_global()` function. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/threads.cpp#L685~L707. Hence I suppose we cannot call JVMCI methods before this site, right? If so, I'm afraid it's a bit late for VM to ask the compiler whether it supports PAC-RET after the compiler initialization, and to disable PAC-RET if the JVMCI compiler doesn't support it, mainly because some stubs are generated already between `init_global()` and compiler initialization, and in these stubs, flag `_rop_protection` is used to generate PAC-related code, e.g., `SharedRuntime::generate_stubs() -> generate_deopt_blob() -> protect_return_address()`. Please correct me if there is something I misunderstood. Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/6334 From mdegtyarev at gmail.com Tue Feb 7 07:25:23 2023 From: mdegtyarev at gmail.com (Maxim Degtyarev) Date: Tue, 7 Feb 2023 10:25:23 +0300 Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: <-MoCrf76vTgM4KMF5NPeaNXzZhSvfLkAUt7e88cbBD4=.11de7587-8524-464e-820f-f22fc94d914f@github.com> References: <-MoCrf76vTgM4KMF5NPeaNXzZhSvfLkAUt7e88cbBD4=.11de7587-8524-464e-820f-f22fc94d914f@github.com> Message-ID: Isn't it better to replace the multiplication sign with an asterisk (*) instead of the letter 'x'? The 'a2xr0 ' looks like a cryptic identifier and not the "multiply a2 to r0". Same for all other changes. On Tue, 7 Feb 2023 at 5:51, Yasumasa Suenaga () wrote: > > On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > > > This is subtask of #12427 . > > > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ?????????????????
> > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? > > We continue to discuss for the encoding of source files in #12436, but I want to merge this to master branch because this PR changes comments only. > > ------------- > > PR: https://git.openjdk.org/jdk/pull/12435 From kbarrett at openjdk.org Tue Feb 7 07:38:44 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Feb 2023 07:38:44 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 31 Jan 2023 23:29:41 GMT, Justin King wrote: > I am not 100% sure of XLC, it should generate a single store instruction. Getting a copy of the old XLC on AIX (not the clang-based one) doesn't seem easy. Why do we care about the old XLC? It doesn't support C++11/14, so only the newer clang-based version is usable with our current code base. Unless you were going to propose backporting this, which seems dubious to me but not my call. > Poke. Working on it... ------------- PR: https://git.openjdk.org/jdk/pull/12114 From dnsimon at openjdk.org Tue Feb 7 07:59:14 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 7 Feb 2023 07:59:14 GMT Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64 [v25] In-Reply-To: References: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> Message-ID: On Tue, 7 Feb 2023 06:14:24 GMT, Hao Sun wrote: >> The problem with `JVMCIGlobals::check_jvmci_supported_gc` is that it does not give the compiler a chance to specify whether it supports a given GC. >> Instead, we want a JVMCI upcall like `jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.isGCSupported(int)`. 
> > Thanks for your reply. But I'm afraid I'm still a bit confused. > > 1) flag `_rop_protection` is set inside function `VM_Version::initialize()`. The execution flow is `create_vm -> init_globals -> VM_Version_init -> VM_Version::initialize()`. > 2) I suppose compilers (including JVMCI compiler) are initialized after `init_global()` function. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/threads.cpp#L685~L707. Hence I suppose we cannot call JVMCI methods before this site, right? > > If so, I'm afraid it's a bit late for VM to ask the compiler whether it supports PAC-RET after the compiler initialization, and to disable PAC-RET if the JVMCI compiler doesn't support it, mainly because some stubs are generated already between `init_global()` and compiler initialization, and in these stubs, flag `_rop_protection` is used to generate PAC-related code, e.g., `SharedRuntime::generate_stubs() -> generate_deopt_blob() -> protect_return_address()`. > > Please correct me if there is something I misunderstood. Thanks. Ah ok, I did not pay attention to the VM lifecycle constraints in setting this flag. We could do the JVMCI check in `CompileBroker::compilation_init_phase2` but that would mean forcing eager initialization of JVMCI which, in turn, would impact VM startup. For now, we will just add a check into Graal that `UseBranchProtection` is not set, exiting the VM if it is. At the same time, we will work on adding support for this. 
------------- PR: https://git.openjdk.org/jdk/pull/6334 From lucy at openjdk.org Tue Feb 7 08:21:41 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Tue, 7 Feb 2023 08:21:41 GMT Subject: RFR: 8298413: [s390] CPUInfoTest fails due to uppercase feature string In-Reply-To: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> References: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> Message-ID: On Mon, 30 Jan 2023 09:34:18 GMT, Amit Kumar wrote: > CPUInfoTest.java is failing on s390 due to an incorrect feature string. Feature Vector Enhancement is shown as VEnh2 instead of venh2. Beside that out-of-support date/(tbd) was not being included even if it was present there so this issue is also addressed by this fix. LGTM ------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.org/jdk/pull/12286 From duke at openjdk.org Tue Feb 7 08:21:42 2023 From: duke at openjdk.org (Amit Kumar) Date: Tue, 7 Feb 2023 08:21:42 GMT Subject: RFR: 8298413: [s390] CPUInfoTest fails due to uppercase feature string In-Reply-To: References: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> Message-ID: On Tue, 7 Feb 2023 08:18:58 GMT, Lutz Schmidt wrote: >> CPUInfoTest.java is failing on s390 due to an incorrect feature string. Feature Vector Enhancement is shown as VEnh2 instead of venh2. Beside that out-of-support date/(tbd) was not being included even if it was present there so this issue is also addressed by this fix. 
> > LGTM thanks @RealLucy and @TheRealMDoerr ------------- PR: https://git.openjdk.org/jdk/pull/12286 From luhenry at openjdk.org Tue Feb 7 08:22:47 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 7 Feb 2023 08:22:47 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: <9mhhiQTdXtlYKNoyiGi5bKzk5CvhM20qjMU817bBnFc=.207cd537-a73c-4c9a-aa41-dc03ed3159bc@github.com> On Tue, 7 Feb 2023 03:07:16 GMT, Feilong Jiang wrote: >> Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). >> As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), >> we should find a stable way to detect these supported ISA extensions in JVM. >> [1] has proposed a way to parse supported extensions through /proc/cpuinfo >> or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions >> in the same way. 
>> >> Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: >> >> >> ubuntu@ubuntu:~$ uname -a >> Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux >> ubuntu@ubuntu:~$ cat /proc/cpuinfo >> processor : 0 >> hart : 2 >> isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc >> mmu : sv48 >> >> >> 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html >> >> >> Testing: >> - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build > > Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: > > fix typo The device tree is generally not trusted even by Linux, so it can be quite tricky to rely on it to reliably detect features (there is hardware out there, not necessarily RISC-V, that "lies" in its device tree for some reason). Relying on `/proc/cpuinfo`, which comes from the device tree, is I think a good first step, but we will want to make sure to have a better solution in the future (I wish there was a cpuid-like feature on RISC-V...). There are also other proposals in Linux, like the `riscv_hwprobe` syscall [1] which should also be supported in glibc. [1] https://lkml.org/lkml/2023/2/6/1147 ------------- PR: https://git.openjdk.org/jdk/pull/12343 From luhenry at openjdk.org Tue Feb 7 08:37:50 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 7 Feb 2023 08:37:50 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: On Tue, 7 Feb 2023 03:07:16 GMT, Feilong Jiang wrote: >> Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD).
>> As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), >> we should find a stable way to detect these supported ISA extensions in JVM. >> [1] has proposed a way to parse supported extensions through /proc/cpuinfo >> or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions >> in the same way. >> >> Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: >> >> >> ubuntu at ubuntu:~$ uname -a >> Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux >> ubuntu at ubuntu:~$ cat /proc/cpuinfo >> processor : 0 >> hart : 2 >> isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc >> mmu : sv48 >> >> >> 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html >> >> >> Testing: >> - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build > > Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: > > fix typo Changes requested by luhenry (Committer). src/hotspot/cpu/riscv/vm_version_riscv.cpp line 186: > 184: } > 185: > 186: if (UseZba && !_cpu_features.ext_zba) { We are missing where we are implicitly enabling `UseZ*` based on their corresponding values in `_cpu_feature.ext_z*`. I would assume it to be the main motivator for this change. src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 103: > 101: void VM_Version::get_isa() { > 102: char isa_buf[500]; > 103: strcpy(isa_buf, _isa); You should use `snprintf(isa_buf, sizeof(isa_buf), "%s", _isa)` to make sure it fits in `isa_buf` and it's `\0` terminated. 
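The bounded-copy suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `copy_isa_string` and its demo function are hypothetical names, and only `snprintf` itself is a real library call. It assumes the destination is a fixed-size array, as `isa_buf[500]` is in the quoted hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not HotSpot code): copy `src` into the fixed-size
// array `dst`. snprintf writes at most N bytes including the terminating
// '\0', so an over-long source is truncated instead of overflowing.
// Returns true only if the whole string fit.
template <std::size_t N>
bool copy_isa_string(char (&dst)[N], const char* src) {
  const int needed = std::snprintf(dst, N, "%s", src);
  return needed >= 0 && static_cast<std::size_t>(needed) < N;
}

// Small self-check showing both outcomes.
inline void copy_isa_string_demo() {
  char big[500];
  assert(copy_isa_string(big, "rv64imafdc_zba_zbb"));
  assert(std::strcmp(big, "rv64imafdc_zba_zbb") == 0);

  char tiny[8];
  assert(!copy_isa_string(tiny, "rv64imafdc_zba_zbb"));  // did not fit
  assert(std::strcmp(tiny, "rv64ima") == 0);              // 7 chars + '\0'
}
```

Unlike `strcpy`, the return value lets the caller notice truncation and decide whether to ignore the tail or bail out.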
src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 113: > 111: while (base_ext[i] != '\0') { > 112: const char ch = base_ext[i++]; > 113: if (ch == 'i') { Nit: You can use a `switch` case here, easier to read IMO. ------------- PR: https://git.openjdk.org/jdk/pull/12343 From duke at openjdk.org Tue Feb 7 08:40:57 2023 From: duke at openjdk.org (Amit Kumar) Date: Tue, 7 Feb 2023 08:40:57 GMT Subject: Integrated: 8298413: [s390] CPUInfoTest fails due to uppercase feature string In-Reply-To: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> References: <4jGO8wNNTB7LMAY7Hob6Pbkei8t4S37-pr8lBsbJkc8=.4d1c0296-bcc9-41ca-8275-e5d247c78e53@github.com> Message-ID: <34K80AKkrMdJsrgCbppfNnRodkHGfguRCgd7QgBMZv4=.ac58349f-e038-4adb-a95d-84d6199446af@github.com> On Mon, 30 Jan 2023 09:34:18 GMT, Amit Kumar wrote: > CPUInfoTest.java is failing on s390 due to an incorrect feature string. Feature Vector Enhancement is shown as VEnh2 instead of venh2. Beside that out-of-support date/(tbd) was not being included even if it was present there so this issue is also addressed by this fix. This pull request has now been integrated. 
Changeset: 9dad874f Author: Amit Kumar Committer: Lutz Schmidt URL: https://git.openjdk.org/jdk/commit/9dad874ff9f03f5891aa8b37e7826a67c851f06d Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod 8298413: [s390] CPUInfoTest fails due to uppercase feature string Reviewed-by: mdoerr, lucy ------------- PR: https://git.openjdk.org/jdk/pull/12286 From fjiang at openjdk.org Tue Feb 7 09:23:46 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 09:23:46 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: <9mhhiQTdXtlYKNoyiGi5bKzk5CvhM20qjMU817bBnFc=.207cd537-a73c-4c9a-aa41-dc03ed3159bc@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <9mhhiQTdXtlYKNoyiGi5bKzk5CvhM20qjMU817bBnFc=.207cd537-a73c-4c9a-aa41-dc03ed3159bc@github.com> Message-ID: On Tue, 7 Feb 2023 08:20:27 GMT, Ludovic Henry wrote: > There are also other proposals in Linux, like the `riscv_hwprobe` syscall [1] which should also be supported in glibc. > [1] https://lkml.org/lkml/2023/2/6/1147 Thanks for the information, great to see some progress being made. > Relying on `/proc/cpuinfo`, which comes from the device tree, is I think a good first step. Yes, it would be better than hwcap, which only represents single-letter CPU features, and that's why I created this PR. When I first implemented this part of the code, I searched for the proper way to detect features but only found hwcap, so I left a TODO in `get_os_cpu_info`. I also found these slides [1], which discuss how to represent all RISC-V ISA features (I think hwcap2 looks promising). There is further progress on hwcap2 in [2], so when this feature is ready, we can switch to a better solution in the future.
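To make the cpuinfo-based detection discussed here a little more concrete, below is a rough sketch of splitting an `isa` string such as the one quoted in this thread. It is not the PR's implementation: `IsaInfo`, `parse_isa` and the demo function are hypothetical names, and the layout it assumes (an `rv32`/`rv64` prefix, then single-letter extensions, then `_`-separated multi-letter extensions) is inferred from the quoted example only.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type; the field names are illustrative only.
struct IsaInfo {
  std::set<char> base;          // single-letter extensions, e.g. 'i', 'm'
  std::set<std::string> multi;  // multi-letter extensions, e.g. "zba"
};

// Assumed layout: "rv" + width digits, single letters up to the first '_',
// then multi-letter extension names separated by '_'.
IsaInfo parse_isa(const std::string& isa) {
  IsaInfo info;
  std::size_t pos = 0;
  if (isa.compare(0, 2, "rv") == 0) {
    pos = 2;
    while (pos < isa.size() && isa[pos] >= '0' && isa[pos] <= '9') {
      ++pos;  // skip "32" / "64"
    }
  }
  // Single-letter extensions run until the first underscore.
  while (pos < isa.size() && isa[pos] != '_') {
    info.base.insert(isa[pos++]);
  }
  // Each remaining token between underscores is one multi-letter extension.
  while (pos < isa.size()) {
    ++pos;  // step over '_'
    std::size_t end = isa.find('_', pos);
    if (end == std::string::npos) {
      end = isa.size();
    }
    if (end > pos) {
      info.multi.insert(isa.substr(pos, end - pos));
    }
    pos = end;
  }
  return info;
}

// Self-check against the isa line quoted in this thread.
inline void parse_isa_demo() {
  const IsaInfo info =
      parse_isa("rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc");
  assert(info.base.count('c') == 1);
  assert(info.multi.count("zbb") == 1);
}
```

A VM would then map the recognised names onto its own feature bits; which names to honour, and whether to enable anything by default, is exactly the policy question debated in the review.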
[1] https://lpc.events/event/11/contributions/1103/attachments/832/1576/Puzzle%20for%20RISC-V%20ifunc.pdf [2] https://lpc.events/event/16/contributions/1168/attachments/1055/2075/The%20Odyssey%20of%20HWCAP%20on%20RISC-V%20platforms%20with%20Palmer%27s%20Porosal%20in%20Backups.pdf ------------- PR: https://git.openjdk.org/jdk/pull/12343 From stefank at openjdk.org Tue Feb 7 09:26:40 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 7 Feb 2023 09:26:40 GMT Subject: RFR: 8301641: NativeMemoryUsageTotal event uses reserved value for committed field In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 22:00:47 GMT, Stefan Johansson wrote: > Please review this small fix for the NativeMemoryUsageTotal JFR event. > > **Summary** > The NativeMemoryUsageTotal event uses the reserved value for both reserved and committed. > > **Testing** > Added new test and verified that it works locally. Now running it in mach5 to make sure it doesn't see any unexpected issues there. Marked as reviewed by stefank (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12377 From sjohanss at openjdk.org Tue Feb 7 09:32:59 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 7 Feb 2023 09:32:59 GMT Subject: RFR: 8301641: NativeMemoryUsageTotal event uses reserved value for committed field In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 07:01:44 GMT, Erik ?sterlund wrote: >> Please review this small fix for the NativeMemoryUsageTotal JFR event. >> >> **Summary** >> The NativeMemoryUsageTotal event uses the reserved value for both reserved and committed. >> >> **Testing** >> Added new test and verified that it works locally. Now running it in mach5 to make sure it doesn't see any unexpected issues there. > > Marked as reviewed by eosterlund (Reviewer). 
Thanks for the reviews @fisk and @stefank ------------- PR: https://git.openjdk.org/jdk/pull/12377 From sjohanss at openjdk.org Tue Feb 7 09:33:00 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 7 Feb 2023 09:33:00 GMT Subject: Integrated: 8301641: NativeMemoryUsageTotal event uses reserved value for committed field In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 22:00:47 GMT, Stefan Johansson wrote: > Please review this small fix for the NativeMemoryUsageTotal JFR event. > > **Summary** > The NativeMemoryUsageTotal event uses the reserved value for both reserved and committed. > > **Testing** > Added new test and verified that it works locally. Now running it in mach5 to make sure it doesn't see any unexpected issues there. This pull request has now been integrated. Changeset: 77dbcd85 Author: Stefan Johansson URL: https://git.openjdk.org/jdk/commit/77dbcd85695b2b35ce10526d37a51e7e5fb656d7 Stats: 13 lines in 2 files changed: 12 ins; 0 del; 1 mod 8301641: NativeMemoryUsageTotal event uses reserved value for committed field Reviewed-by: eosterlund, stefank ------------- PR: https://git.openjdk.org/jdk/pull/12377 From eosterlund at openjdk.org Tue Feb 7 09:36:57 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 7 Feb 2023 09:36:57 GMT Subject: Integrated: 8301371: Interpreter store barriers on x86_64 don't have disjoint temp registers In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 08:08:14 GMT, Erik Österlund wrote: > The interpreter store barriers on x86_64 use temp registers that sometimes intersect with registers that are part of the address. Therefore, when clobbering temp registers, the address gets destroyed. This has happened to be okay so far, based on how these registers are used in existing backends, but it doesn't work well for generational ZGC. This CR selects new temp registers that I have verified work for everyone on x86_64. This pull request has now been integrated.
Changeset: f5f38a82 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/f5f38a82cc357218804c2e4cab5140d23b44ee06 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod 8301371: Interpreter store barriers on x86_64 don't have disjoint temp registers Reviewed-by: kvn, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/12309 From mgronlun at openjdk.org Tue Feb 7 09:55:44 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 7 Feb 2023 09:55:44 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 02:10:20 GMT, David Holmes wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> revert uncalled for changes > > Looks good. > > Thanks Thanks @dholmes-ora, for the review! ------------- PR: https://git.openjdk.org/jdk/pull/12388 From eosterlund at openjdk.org Tue Feb 7 10:00:50 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 7 Feb 2023 10:00:50 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> Message-ID: On Tue, 7 Feb 2023 05:14:17 GMT, Martin Doerr wrote: > > PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? > > The default version of check_oop doesn't work for ZGC. I suggest to exclude the PPC64 parts here and ship them with the generational ZGC PR. That will reduce effort. Does that make sense? Hmm. Looking at the changes of this patch, we used to call __ verify_oop, but now we call check_oop on the barrier set assembler, which in turn has a single implementation which calls __ verify_oop. It seems like this patch currently does not change the behaviour at all to what happened before?
While I am perfectly fine removing the PPC changes, I don't think I understand what behaviour this patch changes for PPC. ------------- PR: https://git.openjdk.org/jdk/pull/12443 From duke at openjdk.org Tue Feb 7 10:02:20 2023 From: duke at openjdk.org (Alan Hayward) Date: Tue, 7 Feb 2023 10:02:20 GMT Subject: RFR: 8277204: Implement PAC-RET branch protection on Linux/AArch64 [v25] In-Reply-To: References: <-RReNBa4rbWMG38xuIH_yCfzDDEubd4gU2QsVCdiubM=.0e13582c-f1c8-4e35-a1ce-174a22b86a58@github.com> <4v4pdOt2xrFJx30NbLByS6jw3p_y3NSutAxIQq_I-lg=.cedfdf56-6a23-46b6-8cd3-044a467c2c61@github.com> Message-ID: <-k2uQ-TURI9crzECbTiEo2JyJxq9r2yBuo3iP_NEVt8=.5e548bc8-d8ef-41c9-8a53-e7d1d6f1109e@github.com> On Tue, 7 Feb 2023 07:55:58 GMT, Doug Simon wrote: >For now, we will just add a check into Graal that UseBranchProtection is not set, exiting the VM if it is that sounds a better solution to me instead of: > disable JVMCI if _rop_protection is parsed as "true" ------------- PR: https://git.openjdk.org/jdk/pull/6334 From fyang at openjdk.org Tue Feb 7 10:36:47 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 7 Feb 2023 10:36:47 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 15:32:50 GMT, Erik ?sterlund wrote: > In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. Hi, seems we need two extra small changes to build on linux-riscv. Thanks. src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp line 450: > 448: #define __ masm-> > 449: > 450: void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) { Should be 'ZBarrierSetAssembler::check_oop' here. 
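As a simplified illustration of why the qualifier matters in the comment above, consider the sketch below. The two classes are stand-ins that only borrow the HotSpot names; the real `check_oop` takes a `MacroAssembler*`, registers and a `Label&`, and the string return values here are invented purely so the dispatch can be observed.

```cpp
#include <cassert>
#include <string>

// Stand-in for the shared base class: provides the default policy.
class BarrierSetAssembler {
public:
  virtual ~BarrierSetAssembler() = default;
  virtual std::string check_oop() const;
};

// Stand-in for the ZGC-specific subclass: declares its own policy.
class ZBarrierSetAssembler : public BarrierSetAssembler {
public:
  std::string check_oop() const override;
};

// Definition of the default policy.
std::string BarrierSetAssembler::check_oop() const { return "generic"; }

// The out-of-class definition has to be qualified with the derived class.
// Spelling it BarrierSetAssembler::check_oop would define the base method a
// second time and leave the ZGC override declared but undefined, which is a
// build failure rather than a silent behaviour change.
std::string ZBarrierSetAssembler::check_oop() const { return "zgc"; }

// Calls made through the base type reach the collector-specific version.
inline void check_oop_dispatch_demo() {
  ZBarrierSetAssembler z;
  const BarrierSetAssembler& as_base = z;
  assert(as_base.check_oop() == "zgc");
}
```

How the declaration in the header should be spelled is a separate style question that this sketch does not try to settle.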
src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.hpp line 103: > 101: #endif // COMPILER2 > 102: > 103: virtual void check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error); Need to remove 'virtual' here. ------------- Changes requested by fyang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12443 From ayang at openjdk.org Tue Feb 7 11:30:40 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 7 Feb 2023 11:30:40 GMT Subject: RFR: 8225409: Investigate removal of the Hot Card Cache Message-ID: Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). Test: tier1-6 ------------- Commit messages: - g1-hcc Changes: https://git.openjdk.org/jdk/pull/12452/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12452&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8225409 Stats: 1259 lines in 34 files changed: 50 ins; 1176 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/12452.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12452/head:pull/12452 PR: https://git.openjdk.org/jdk/pull/12452 From ihse at openjdk.org Tue Feb 7 11:47:46 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 7 Feb 2023 11:47:46 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is subtask of #12427 . > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? Please hold on and do not merge this PR. 
The problem here is that the encoding of the file is not properly conveyed. This is code written by Intel where the original author clearly put in the time to use a separate mathematical sign to clarify the discussion. I think this should not just be trivially removed. Let's fix this properly instead. ------------- Changes requested by ihse (Reviewer). PR: https://git.openjdk.org/jdk/pull/12435 From mdoerr at openjdk.org Tue Feb 7 11:55:41 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 7 Feb 2023 11:55:41 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> Message-ID: On Tue, 7 Feb 2023 05:14:17 GMT, Martin Doerr wrote: >> PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? > >> PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? > > The default version of check_oop doesn't work for ZGC. I suggest to exclude the PPC64 parts here and ship them with the generational ZGC PR. That will reduce effort. Does that make sense? > > > PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? > > > > > > The default version of check_oop doesn't work for ZGC. I suggest to exclude the PPC64 parts here and ship them with the generational ZGC PR. That will reduce effort. Does that make sense? > > Hmm. Looking at the changes of this patch, we used to call __ verify_oop, but now we call check_oop on the barrier set assembler, which in turn has a single implementation which calls __ verify_oop. It seems like this patch currently does not change the behaviour at all to what happened before? 
While I am perfectly fine removing the PPC changes, I don't think I understand what behaviour this patch changes for PPC. You're right. VerifyOops is already broken with ZGC on PPC64. You can keep the PPC64 parts as they are. The generational ZGC code will fix the problem. So, I'm fine with this PR. ------------- PR: https://git.openjdk.org/jdk/pull/12443 From jcking at openjdk.org Tue Feb 7 12:43:45 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 12:43:45 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 7 Feb 2023 07:36:12 GMT, Kim Barrett wrote: > > Poke. > > Working on it... Oh, no problem! Ignore my impatience. > > I am not 100% sure of XLC, it should generate a single store instruction. Getting a copy of the old XLC on AIX (not the clang-based one) doesn't seem easy. > > Why do we care about the old XLC? It doesn't support C++11/14, so only the newer clang-based version is usable with our current code base. Unless you were going to propose backporting this, which seems dubious to me but not my call. Oh, if we do not care about old XLC shall we drop it elsewhere? I assume `TARGET_COMPILER_xlc` is the old XLC? ------------- PR: https://git.openjdk.org/jdk/pull/12114 From ihse at openjdk.org Tue Feb 7 13:52:50 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 7 Feb 2023 13:52:50 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v3] In-Reply-To: References: Message-ID: <9AqbW76scQl4YzQ30Q4yzBHl3HH9R4dltSFYGP0QCTU=.e24c95da-0a04-47b1-9cc7-9353a45665eb@github.com> On Thu, 2 Feb 2023 15:52:01 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. 
It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress known JLI leaks; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > > Justin King has updated the pull request incrementally with two additional commits since the last revision: > > - Fix grammatical error > > Signed-off-by: Justin King > - Update based on review and remove restriction on compressed oops and class pointers > > Signed-off-by: Justin King Marked as reviewed by ihse (Reviewer). make/data/asan/asan_default_options.c line 53: > 51: ATTRIBUTE_DEFAULT_VISIBILITY ATTRIBUTE_USED const char* __asan_default_options() { > 52: return > 53: #ifndef LEAK_SANITIZER I'm trying to understand this part. The test for `LEAK_SANITIZER` was put in place in this file, before it was ever generated by the build system..? Was this in anticipation of this PR, or intended for the user to add an extra `-DLEAK_SANITIZER` to the compiler flags manually? Also, this patch changes the values for asan, regardless of the state of lsan. Is this required to get the lsan patch working, or are these additional, separate fixes to asan being "smuggled" into this PR?
(I don't mind a certain amount of "smuggling", since sometimes it is too tedious to open a separate JBS issue for minor, separate fixes one found out during development), but it is nice to be explicit about it. And finally, now that the if statement also has an else, maybe reverse it to an `#ifdef` instead of `#ifndef`..? Apart from this, the build changes look good. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From fjiang at openjdk.org Tue Feb 7 14:10:26 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 14:10:26 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v3] In-Reply-To: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: > Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). > As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), > we should find a stable way to detect these supported ISA extensions in JVM. > [1] has proposed a way to parse supported extensions through /proc/cpuinfo > or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions > in the same way. 
> > Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: > > > ubuntu at ubuntu:~$ uname -a > Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux > ubuntu at ubuntu:~$ cat /proc/cpuinfo > processor : 0 > hart : 2 > isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc > mmu : sv48 > > > 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html > > > Testing: > - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: fix comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12343/files - new: https://git.openjdk.org/jdk/pull/12343/files/59c33e2f..8def2029 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=01-02 Stats: 18 lines in 1 file changed: 2 ins; 5 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/12343.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12343/head:pull/12343 PR: https://git.openjdk.org/jdk/pull/12343 From duke at openjdk.org Tue Feb 7 14:11:01 2023 From: duke at openjdk.org (Afshin Zafari) Date: Tue, 7 Feb 2023 14:11:01 GMT Subject: Integrated: 8151413: os::allocation_granularity/page_size and friends return signed values In-Reply-To: References: Message-ID: On Thu, 19 Jan 2023 10:59:02 GMT, Afshin Zafari wrote: > ### Description > os::allocation_granularity/page_size and friends return signed values > > ### Patch > - Type of `vm_page_size` and `vm_allocation_granularity` members of `OSInfo` class and their wrappers in `os` class changed to `size_t` > - Initial value of them changed from -1 to 0. 
> - In setters, checking for *set only once* condition is updated accordingly (comparing with 0 instead of -1). Also, checking the argument be positive is removed. > - Equal to 0 (instead of `<= 0` ) is used to check if calling setters failed. > - All `(size_t)` casting of getters removed. > - In arithmetic and negation operations, the operand related to the getters casted to `(int)`. Otherwise, the Windows builds complain. > - Explicitly casted to `(int)` where `jint` needed. > - In ` align_up(T size, A alignment)`, assignment of variables of type `A` to type `T` (i.e., `T t = (A) a;`) should be safe. `T : size_t` and `A : int` won't compile. Fixed appropriately. > - `"%d"` format-flags replaced with `SIZE_FORMAT`. > - Type of `CompilerToVM::Data::vm_page_size` changed to `size_t`. > > ### Test > tier1-5: all green, except an unrelated fail for whom a bug is already created. > job-id: afshin-8151413-20230117-1255-40910454 This pull request has now been integrated. Changeset: 4fe99da7 Author: Afshin Zafari Committer: Jesper Wilhelmsson URL: https://git.openjdk.org/jdk/commit/4fe99da74f557461c31293cdc48af1199dd2b85c Stats: 170 lines in 66 files changed: 7 ins; 5 del; 158 mod 8151413: os::allocation_granularity/page_size and friends return signed values Reviewed-by: stefank, ccheung, ysr ------------- PR: https://git.openjdk.org/jdk/pull/12091 From fjiang at openjdk.org Tue Feb 7 14:13:52 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 14:13:52 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: On Tue, 7 Feb 2023 05:34:32 GMT, Fei Yang wrote: >> Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 100: > >> 98: } >> 99: } >> 100: > > Can you add 
a comment here about how the 'isa' substring with those extensions look like? Fixed. Add an example of isa string and explain how multi-letter extensions are separated. ------------- PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Tue Feb 7 14:13:56 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 14:13:56 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: On Tue, 7 Feb 2023 08:28:26 GMT, Ludovic Henry wrote: >> Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 113: > >> 111: while (base_ext[i] != '\0') { >> 112: const char ch = base_ext[i++]; >> 113: if (ch == 'i') { > > Nit: You can use a `switch` case here, easier to read IMO. switch/case looks better, fixed ------------- PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Tue Feb 7 14:20:48 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 14:20:48 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> On Tue, 7 Feb 2023 08:30:12 GMT, Ludovic Henry wrote: >> Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > src/hotspot/cpu/riscv/vm_version_riscv.cpp line 186: > >> 184: } >> 185: >> 186: if (UseZba && !_cpu_features.ext_zba) { > > We are missing where we are implicitly enabling `UseZ*` based on their corresponding values in `_cpu_feature.ext_z*`. 
I would assume it to be the main motivator for this change. Unlike RVC, there is still no hardware that supports those extensions. I think we should do more testing and evaluation on real hardware before enabling them by default. ------------- PR: https://git.openjdk.org/jdk/pull/12343 From jcking at openjdk.org Tue Feb 7 14:23:40 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:23:40 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy Message-ID: - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. - Create aliases for old `mtXXX` names. - Remove `mt_number_of_types` from the enumeration. - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. - `NMTUtil` references are not updated to avoid increasing patch size. - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. This change does not update variable verbiage. That should be something done over time. 
------------- Commit messages: - Fix undefined variable reference - Cleanup memory types and allocation failure strategies Changes: https://git.openjdk.org/jdk/pull/12454/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301983 Stats: 794 lines in 80 files changed: 213 ins; 94 del; 487 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From fjiang at openjdk.org Tue Feb 7 14:29:14 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 7 Feb 2023 14:29:14 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v4] In-Reply-To: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: > Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). > As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), > we should find a stable way to detect these supported ISA extensions in JVM. > [1] has proposed a way to parse supported extensions through /proc/cpuinfo > or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions > in the same way. 
> > Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: > > > ubuntu at ubuntu:~$ uname -a > Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux > ubuntu at ubuntu:~$ cat /proc/cpuinfo > processor : 0 > hart : 2 > isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc > mmu : sv48 > > > 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html > > > Testing: > - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: adjust comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12343/files - new: https://git.openjdk.org/jdk/pull/12343/files/8def2029..66b62159 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12343.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12343/head:pull/12343 PR: https://git.openjdk.org/jdk/pull/12343 From jcking at openjdk.org Tue Feb 7 14:29:57 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:29:57 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. 
Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King I filed JDK-8301987 to remove `TARGET_COMPILER_xlc` and any legacy XL C/C++ specific stuff. We should be able to treat Open XL C/C++ as Clang. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From duke at openjdk.org Tue Feb 7 14:30:53 2023 From: duke at openjdk.org (Scott Gibbons) Date: Tue, 7 Feb 2023 14:30:53 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 7 Feb 2023 00:12:21 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add algorithm comments @sviswa7 Can you please share the test(s) you used to determine the stated failures? I've run all of the tier1-3 tests in the suite with no failures.
Plus, if what you say is true, then the same logic would apply for the non-URL case as well as it applies to 0x2F ('/') in the same manner as 0x5F ('_'). If your test(s) do indeed show the failure, we should probably add them to the test suite. Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jcking at openjdk.org Tue Feb 7 14:33:34 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:33:34 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v2] In-Reply-To: References: Message-ID: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix printf formatting error Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/be6c2307..c2f229ba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 14:40:49 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:40:49 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix refactor mistake Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/c2f229ba..fe293c42 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From coleenp at openjdk.org Tue Feb 7 14:45:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Feb 2023 14:45:56 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:40:49 GMT, Justin King wrote: >> - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. >> - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. >> - Create aliases for old `mtXXX` names. >> - Remove `mt_number_of_types` from the enumeration. >> - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. >> - `NMTUtil` references are not updated to avoid increasing patch size. >> - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. >> - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. >> >> This change does not update variable verbiage. That should be something done over time. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix refactor mistake > > Signed-off-by: Justin King I don't like MemoryType. 
MEMFLAGS makes it indicative that it's a flag for some purpose and sticks out to my eyes as a template parameter, which is part of our coding style. mtWhatever is easy to spot in the code where used. I don't agree with the reasoning for this change. Moving the header file out of allocation.hpp seems good though although we still need to include allocation.hpp to get CHeapObj so not sure how much inclusion that saves. I think also having a PR with a bullet list is a good sign that you're doing too much in one change. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From kbarrett at openjdk.org Tue Feb 7 15:05:20 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 7 Feb 2023 15:05:20 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: <1xR0Xv1SvZPokjaNhZ02xuMfBP2MWrW6W0_-CnmSlmQ=.0709d04b-ac52-4b72-84b5-12eee25597c7@github.com> On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King > Oh, if we do not care about old XLC shall we drop it elsewhere? I assume TARGET_COMPILER_xlc is the old XLC? > I filed JDK-8301987 to remove `TARGET_COMPILER_xlc` and any legacy XL C/C++ specific stuff. We should be able to treat Open XL C/C++ as Clang. TARGET_COMPILER_xlc now designates the new Clang-based compiler. It has various options compatible with the old non-Clang compiler, and things like that. 
Remember that it's Clang-*based*, and so far as I can tell from afar, is its own dialect. I think it's up to the aix-ppc port maintainers (the only OpenJDK port using it) to decide / address those kinds of issues. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From jcking at openjdk.org Tue Feb 7 15:07:26 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:07:26 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:42:14 GMT, Coleen Phillimore wrote: > I don't like MemoryType. MEMFLAGS makes it indicative that it's a flag for some purpose and sticks out to my eyes as a template parameter, which is part of our coding style. mtWhatever is easy to spot in the code where used. I don't agree with the reasoning for this change. Moving the header file out of allocation.hpp seems good though although we still need to include allocation.hpp to get CHeapObj so not sure how much inclusion that saves. MEMFLAGS implies it's flags. In C/C++ that almost always means they can be combined together and passed as a single argument. The only other usage I've seen it with is command line options. That is not the case here, it's just an enumeration of types/categories/kinds. It's not a combination of bits or a pattern. The mt prefix is preserved, you can still use mtWhatever, it's just an alias. The prefix mt, AFAIK, quite literally means memory type, hence the chosen name. MEMFLAGS isn't always a template parameter either. 
------------- PR: https://git.openjdk.org/jdk/pull/12454 From tschatzl at openjdk.org Tue Feb 7 15:10:21 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 7 Feb 2023 15:10:21 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap Message-ID: Hi all, can I have reviews for this change to liveness verification that fixes some unwanted asserts because - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions - in the innermost if lots of code is duplicated in both cases The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll fix that separately). Testing: local compilation/testing, gha Thanks, Thomas ------------- Commit messages: - initial version Changes: https://git.openjdk.org/jdk/pull/12456/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301988 Stats: 29 lines in 3 files changed: 13 ins; 12 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12456.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12456/head:pull/12456 PR: https://git.openjdk.org/jdk/pull/12456 From eosterlund at openjdk.org Tue Feb 7 15:12:24 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 7 Feb 2023 15:12:24 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v2] In-Reply-To: References: Message-ID: > In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). 
This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: RISC-V build fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12443/files - new: https://git.openjdk.org/jdk/pull/12443/files/0779af3e..c5617992 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12443&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12443&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12443.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12443/head:pull/12443 PR: https://git.openjdk.org/jdk/pull/12443 From eosterlund at openjdk.org Tue Feb 7 15:12:27 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 7 Feb 2023 15:12:27 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: <5Ts_s04vfL_X-B1d9uM2QWeeHXZ99zntV0W_kI6KKNs=.1dfadaa4-cd3d-4b80-a523-6fdcd5f404c5@github.com> Message-ID: On Tue, 7 Feb 2023 11:52:40 GMT, Martin Doerr wrote: > > > > PPC code contributed by @TheRealMDoerr and RISC-V code contributed by @yadongw. Could you please check that it works on your machines? > > > > > > > > > The default version of check_oop doesn't work for ZGC. I suggest to exclude the PPC64 parts here and ship them with the generational ZGC PR. That will reduce effort. Does that make sense? > > > > > > Hmm. Looking at the changes of this patch, we used to call __ verify_oop, but now we call check_oop on the barrier set assembler, which in turn has a single implementation which calls __ verify_oop. It seems like this patch currently does not change the behaviour at all to what happened before? While I am perfectly fine removing the PPC changes, I don't think I understand what behaviour this patch changes for PPC. > > You're right.
VerifyOops is already broken with ZGC on PPC64. You can keep the PPC64 parts as they are. The generational ZGC code will fix the problem. So, I'm fine with this PR. Okay great - thanks for checking! ------------- PR: https://git.openjdk.org/jdk/pull/12443 From eosterlund at openjdk.org Tue Feb 7 15:12:30 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 7 Feb 2023 15:12:30 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v2] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 10:34:05 GMT, Fei Yang wrote: > Hi, seems we need two extra small changes to build on linux-riscv. Thanks. Thanks for having a look! Hopefully fixed in the latest commit. ------------- PR: https://git.openjdk.org/jdk/pull/12443 From jcking at openjdk.org Tue Feb 7 15:28:05 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:28:05 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: <-_iaIJpjg8JvzYn9RRU26K3Ou53hqr_VeSL6C-2lzNc=.668b7e8b-85f0-4f90-829b-48907d172552@github.com> On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King > > Oh, if we do not care about old XLC shall we drop it elsewhere? I assume TARGET_COMPILER_xlc is the old XLC? > > I filed JDK-8301987 to remove `TARGET_COMPILER_xlc` and any legacy XL C/C++ specific stuff. 
We should be able to treat Open XL C/C++ as Clang. > > TARGET_COMPILER_xlc now designates the new Clang-based compiler. It has various options compatible with the old non-Clang compiler, and things like that. Remember that it's Clang-_based_, and so far as I can tell from afar, is its own dialect. I think it's up to the aix-ppc port maintainers (the only OpenJDK port using it) to decide / address those kinds of issues. I do not particularly care about old XLC, but new XLC is fine. It looks like it handles the builtins the same way, so it should generate the single instruction to store with reversal. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From jcking at openjdk.org Tue Feb 7 15:32:11 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:32:11 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v4] In-Reply-To: References: Message-ID: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. > > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently.
Justin King has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge branch 'master' into lsan - Fix grammatical error Signed-off-by: Justin King - Update based on review and remove restriction on compressed oops and class pointers Signed-off-by: Justin King - Support CDS Signed-off-by: Justin King - Merge remote-tracking branch 'upstream/master' into lsan - Merge remote-tracking branch 'upstream/master' into lsan - Merge remote-tracking branch 'upstream/master' into lsan - Implement initial LSAN support v2 Signed-off-by: Justin King - Implement initial LSAN support ------------- Changes: https://git.openjdk.org/jdk/pull/12229/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=03 Stats: 316 lines in 23 files changed: 307 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/12229.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12229/head:pull/12229 PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Tue Feb 7 15:32:16 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:32:16 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v3] In-Reply-To: <9AqbW76scQl4YzQ30Q4yzBHl3HH9R4dltSFYGP0QCTU=.e24c95da-0a04-47b1-9cc7-9353a45665eb@github.com> References: <9AqbW76scQl4YzQ30Q4yzBHl3HH9R4dltSFYGP0QCTU=.e24c95da-0a04-47b1-9cc7-9353a45665eb@github.com> Message-ID: <0yjUgBOQkBxUw36VJricraVo_2D23j1M1tyR6NJQzF0=.3bd3aff0-f093-4011-b9db-f7e2ef87b6a2@github.com> On Tue, 7 Feb 2023 13:49:55 GMT, Magnus Ihse Bursie wrote: >> Justin King has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fix grammatical error >> >> Signed-off-by: Justin King >> - Update based on review and remove restriction on compressed oops and class pointers >> >> Signed-off-by: Justin King > > make/data/asan/asan_default_options.c line 53: > >> 51: ATTRIBUTE_DEFAULT_VISIBILITY ATTRIBUTE_USED const char* 
__asan_default_options() { >> 52: return >> 53: #ifndef LEAK_SANITIZER > > I'm trying to understand this part. The test for `LEAK_SANITIZER` was put in place in this file, before it was ever generated by the build system..? Was this in anticipation of this PR, or intended for the user to add an extra `-DLEAK_SANITIZER` to the compiler flags manually? > > Also, this patch changes the values for asan, regardless of the state of lsan. Is this required to get the lsan patch working, or are these additional, separate fixes to asan being "smuggled" into this PR? (I don't mind a certain amount of "smuggling", since sometimes it is too tedious to open a separate JBS issue for minor, separate fixes one found out during development, but it is nice to be explicit about it.) > > And finally, now that the if statement also has an else, maybe reverse it to an `#ifdef` instead of `#ifndef`..? > > Apart from this, the build changes look good. It was in anticipation of LSan, yes. ASan by default, in the compilers, enables LSan. I always intended to do LSan after ASan, so I pre-emptively added the macro check before defining it in the build system. Since ASan always enables LSan, we change options based on whether it was enabled at configuration to match. I'll switch to ifdef. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Tue Feb 7 15:35:45 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:35:45 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v5] In-Reply-To: References: Message-ID: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster.
> > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. Justin King has updated the pull request incrementally with one additional commit since the last revision: Update based on review Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12229/files - new: https://git.openjdk.org/jdk/pull/12229/files/f3d054b5..55b8381a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=03-04 Stats: 7 lines in 1 file changed: 4 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12229.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12229/head:pull/12229 PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Tue Feb 7 15:40:34 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:40:34 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster.
> > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. Justin King has updated the pull request incrementally with one additional commit since the last revision: Revert changes to JDK Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12229/files - new: https://git.openjdk.org/jdk/pull/12229/files/55b8381a..5c79bef5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12229&range=04-05 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12229.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12229/head:pull/12229 PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Tue Feb 7 15:40:36 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:40:36 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v4] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:32:11 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately.
It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > > Justin King has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - Merge branch 'master' into lsan > - Fix grammatical error > > Signed-off-by: Justin King > - Update based on review and remove restriction on compressed oops and class pointers > > Signed-off-by: Justin King > - Support CDS > > Signed-off-by: Justin King > - Merge remote-tracking branch 'upstream/master' into lsan > - Merge remote-tracking branch 'upstream/master' into lsan > - Merge remote-tracking branch 'upstream/master' into lsan > - Implement initial LSAN support v2 > > Signed-off-by: Justin King > - Implement initial LSAN support Reverted the change to the JDK test leak; that can be done in another PR as mentioned.
------------- PR: https://git.openjdk.org/jdk/pull/12229 From tschatzl at openjdk.org Tue Feb 7 15:41:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 7 Feb 2023 15:41:28 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change to liveness verification that fixes some unwanted asserts because > > - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case > - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions > - in the innermost if lots of code is duplicated in both cases > > The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). > > Testing: local compilation/testing, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Missing changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12456/files - new: https://git.openjdk.org/jdk/pull/12456/files/3066f5f2..b2c41af7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=00-01 Stats: 11 lines in 3 files changed: 10 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12456.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12456/head:pull/12456 PR: https://git.openjdk.org/jdk/pull/12456 From jcking at openjdk.org Tue Feb 7 15:44:36 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:44:36 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v4] In-Reply-To: References: Message-ID: <9KjUQXJRYFeVtfzbRh-OY9CaQaAjwguxJJpqR92xFAo=.409e3276-9c9d-4d82-9bcc-27fad82e7074@github.com> > 
- Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix printf formatting error Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/fe293c42..61596ca6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From sviswanathan at openjdk.org Tue Feb 7 15:57:00 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 7 Feb 2023 15:57:00 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 7 Feb 2023 02:49:44 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add algorithm comments > > 
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2227: > >> 2225: >> 2226: // lut_roll URL >> 2227: __ emit_data64(0xb9b9bfbf04111000, relocInfo::none); > > The lut_roll URL doesn't seem to be correct: > 0x5F (URL base64 ASCII for "/") would need an offset of -20H i.e. 0xEC. > However the others with upper nibble as 5 need an offset of -65H i.e. 0xBF. > > It looks to me that the adjustment for 5F should be -4 instead of -1 at line 2722. @asgibbons Currently the base64 string with "_" is not getting the vectorized implementation. as per my previous comment. It should show up in the jmh test. Once you fix that you will come across this bug. The logic is wrong here for 5F case. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From duke at openjdk.org Tue Feb 7 16:17:05 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 7 Feb 2023 16:17:05 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: References: Message-ID: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> On Fri, 3 Feb 2023 05:23:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Simplified relocation by using buffered address instead of requested address src/hotspot/share/cds/archiveHeapWriter.cpp line 388: > 386: assert(is_aligned(_open_bottom, HeapRegion::GrainBytes), "sanity"); > 387: > 388: _requested_closed_region_bottom = align_down(heap_end - closed_region_byte_size, HeapRegion::GrainBytes); Instead of aligning to `HeapRegion::GrainBytes`, shouldn't it be aligned to `MIN_GC_REGION_ALIGNMENT`? With `HeapRegion::GrainBytes`, if the cds archive is created with large region size (say 32M) and later mapped with smaller region size (say 1M), then the archive heap would need to be relocated. Using `MIN_GC_REGION_ALIGNMENT` would avoid the relocation in this case. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From jcking at openjdk.org Tue Feb 7 17:25:02 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 17:25:02 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: References: Message-ID: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> > - Rename `MEMFLAGS` to `MemoryType`. 
`MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. Justin King has updated the pull request incrementally with one additional commit since the last revision: Add precompiled header Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/61596ca6..eaa44d06 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 17:33:09 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 17:33:09 GMT Subject: Withdrawn: JDK-8299688: Adopt C++14 compatible std::bit_cast introduced in C++20 In-Reply-To: References: Message-ID: <2PNq1D3M-oKauuWsXv4kSCPg-Gq4JEVQeqcIMFnmnnk=.71d5baa2-2cde-44eb-a0c6-f80fc58a0195@github.com> On Thu, 5 Jan 2023 18:04:29 GMT, Justin King wrote: > Adopts a C++14 compatible implementation of [std::bit_cast](https://en.cppreference.com/w/cpp/numeric/bit_cast). 
> > `PrimitiveConversions::cast` is refactored into `bit_cast` and extended to support all compatible types. Additionally it makes use of `__builtin_bit_cast` when available, which is strictly well defined compared to fallback approaches which are sometimes lurking in undefined behavior territory. > > Lastly some legacy casting is updated to use `bit_cast` in `utilities/globalDefinitions.hpp`. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/11865 From stuefe at openjdk.org Tue Feb 7 17:43:35 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 7 Feb 2023 17:43:35 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:43:01 GMT, Coleen Phillimore wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix refactor mistake >> >> Signed-off-by: Justin King > > I think also having a PR with a bullet list is a good sign that you're doing too much in one change. I agree with @coleenp, sorry. I think this PR is premature. My advice would be to discuss such invasive changes on the ML first before opening a PR. Sometimes we do broad changes like this to keep the code base fresh. But it's always a tradeoff. Often the benefits of correcting the code base outweigh the cost, so we do it. But sometimes they don't. And I argue that in this case, they don't. Mental load cost: Not every name in the hotspot adheres to our naming guides, but may nevertheless be burnt into the collective brain of developers who have worked with the code base for many years. We talked about them, argued about them, there are tons of ML discussions and private emails and documentation surrounding them, they appear in scripts and test documentation... I'd hate to talk about the flag previously known as MEMFLAGS. It is a dividing line, and necessarily an arbitrary one. 
Not everyone sees this line at the same point. For a broad change that lies on this side of the line see the recent NULL->nullptr changes. Invasive, sure, but useful enough to do. Backporting cost: three LTS releases are still maintained by vendors, soon to be joined by a fourth one. Such a broad change makes backporting fixes difficult. The NULL->nullptr change was simpler in that regard because even though it spoils automatic merges, merging manually is straightforward. But renaming is a different matter. We have seen such problems in the past with Metaspace class renames, and that is just an isolated subsystem. Moreover, with MEMFLAGS, things are in flux; we may change the implementation in the future, and there are different plans for it. So I'd leave this as it is. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 17:44:49 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 17:44:49 GMT Subject: Integrated: JDK-8298445: Add LeakSanitizer support in HotSpot In-Reply-To: References: Message-ID: <5dEG5OrxWtBykLpbMa6lTXYFeriLC5_JjgKj_1xoApo=.bd397254-1fbd-4898-9b1b-5b24cda4d715@github.com> On Thu, 26 Jan 2023 17:33:28 GMT, Justin King wrote: > Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. > > **Suppressing:** > Currently the suppression list is only used to suppress JLI leaks that are known; the rest are done in code. Suppressing needs to identify the source of the leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. 
Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. > > **Caveats:** > - By default ASan enables LSan; however, we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. This pull request has now been integrated. Changeset: 27126157 Author: Justin King Committer: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/27126157d927c5ec4354cde8f31076899691996b Stats: 316 lines in 22 files changed: 307 ins; 0 del; 9 mod 8298445: Add LeakSanitizer support in HotSpot Reviewed-by: erikj, ihse ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Tue Feb 7 18:02:52 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 18:02:52 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:43:01 GMT, Coleen Phillimore wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix refactor mistake >> >> Signed-off-by: Justin King > > I think also having a PR with a bullet list is a good sign that you're doing too much in one change. > I agree with @coleenp, sorry. I think this PR is premature. My advice would be to discuss such invasive changes on the ML first before opening a PR. > > Sometimes we do broad changes like this to keep the code base fresh. But it's always a tradeoff. Often the benefits of correcting the code base outweigh the cost, so we do it. But sometimes they don't. And I argue that in this case, they don't. > > Mental load cost: Not every name in the hotspot adheres to our naming guides, but may nevertheless be burnt into the collective brain of developers who have worked with the code base for many years. 
We talked about them, argued about them, there are tons of ML discussions and private emails and documentation surrounding them, they appear in scripts and test documentation... I'd hate to talk about the flag previously known as MEMFLAGS. > > It is a dividing line, and necessarily an arbitrary one. Not everyone sees this line at the same point. For a broad change that lies on this side of the line see the recent NULL->nullptr changes. Invasive, sure, but useful enough to do. > > Backporting cost: three LTS releases are still maintained by vendors, soon to be joined by a fourth one. Such a broad change makes backporting fixes difficult. The NULL->nullptr change was simpler in that regard because even though it spoils automatic merges, merging manually is straightforward. But renaming is a different matter. We have seen such problems in the past with Metaspace class renames, and that is just an isolated subsystem. > > Moreover, with MEMFLAGS, things are in flux; we may change the implementation in the future, and there are different plans for it. So I'd leave this as it is. I can agree with the reasoning (except backporting; that argument can be applied to any change touching an existing file); it is arbitrary and depends on where the line is drawn. It didn't take very long to do this PR, so it doesn't particularly matter to me. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 18:02:54 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 18:02:54 GMT Subject: Withdrawn: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:14:07 GMT, Justin King wrote: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. 
> - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From duke at openjdk.org Tue Feb 7 18:15:22 2023 From: duke at openjdk.org (Scott Gibbons) Date: Tue, 7 Feb 2023 18:15:22 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v12] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 
24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Back out change to encode ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/0a274e5a..d23206db Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=10-11 Stats: 6 lines in 1 file changed: 1 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From egahlin at openjdk.org Tue Feb 7 18:35:27 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 7 Feb 2023 18:35:27 GMT Subject: RFR: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java [v3] In-Reply-To: References: Message-ID: <-BeFPzhN2k9aM-LTHDRtpWPH7k7KU84JtigSsNAZvhg=.179fee09-767c-48e0-8a8c-037497fcc4d8@github.com> On Fri, 3 Feb 2023 09:32:25 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. For details, please see the JIRA issue. >> >> Testing: jdk_jfr >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > revert uncalled for changes Marked as reviewed by egahlin (Reviewer). 
------------- PR: https://git.openjdk.org/jdk/pull/12388 From luhenry at openjdk.org Tue Feb 7 18:39:21 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 7 Feb 2023 18:39:21 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> Message-ID: <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> On Tue, 7 Feb 2023 14:17:49 GMT, Feilong Jiang wrote: >> src/hotspot/cpu/riscv/vm_version_riscv.cpp line 186: >> >>> 184: } >>> 185: >>> 186: if (UseZba && !_cpu_features.ext_zba) { >> >> We are missing where we are implicitly enabling `UseZ*` based on their corresponding values in `_cpu_feature.ext_z*`. I would assume it to be the main motivator for this change. > > Unlike RVC, there is still no hardware that supports those extensions. I think we should do more testing and evaluation on real hardware before enabling them by default. > The old code may crash with SIGILL when users enable unsupported extensions through the command line, since there is no mechanism to detect multi-letter extensions. With this patch, we can emit a `not supported on this CPU` warning and continue running the program. QEMU has good enough support to do functional testing. For performance testing, I agree it would be ideal to have proper hardware to test, but even then it's not guaranteed that a performance-minded extension would _actually_ be faster on all hardware. For example, the original AVX-512 led to thermal throttling. So IMO as long as it's functional, we should want to enable it. We otherwise run the risk of "what about"-ing it until we can never enable it. 
Happy to hear others' opinions as well! ------------- PR: https://git.openjdk.org/jdk/pull/12343 From kvn at openjdk.org Tue Feb 7 20:09:48 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 7 Feb 2023 20:09:48 GMT Subject: RFR: 8298091: Dump native instruction along with nmethod name when using Compiler.codelist In-Reply-To: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> References: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> Message-ID: On Thu, 2 Feb 2023 02:12:06 GMT, Yi Yang wrote: > Dump native instruction along with nmethod name when using Compiler.codelist. > > e.g. > jcmd Compiler.codelist decode= > > Output: > nmethod_name [addr_begin, code_begin, code_end] > Please provide a short/reduced version of the output example in JBS or here in a separate comment. ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12381 From rehn at openjdk.org Tue Feb 7 20:34:23 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 7 Feb 2023 20:34:23 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v9] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Thu, 2 Feb 2023 13:19:06 GMT, Thomas Stuefe wrote: >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore it took longer than I thought. >> >> ### Motivation >> >> The Glibc allocator is reluctant to return memory to the OS, much more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. >> >> Note that C-heap retention is difficult to observe. 
Since it is freed memory, it won't show up in NMT, it is just a part of private RSS. >> >> Theoretically, retained memory is not lost since it will be reused by future mallocs. Retaining memory is therefore a bet on the future behavior of the app. The allocator bets on the application needing memory in the near future, and to satisfy that need via malloc. >> >> But an app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. And if the app rolls its own custom allocators atop of mmap, as hotspot does, a lot of that memory cannot be reused even though it counts toward its memory footprint. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Do we have high malloc spikes in the JVM process? We assume that malloc load from hotspot is usually low since hotspot typically clusters allocations into custom areas - metaspace, code heap, arenas. >> >> But arenas are subject to Glibc mem retention too. I was surprised by that since I assumed 32k arena chunks were too big to be subject of Glibc retention. But I saw in experiments that high arena peaks often cause lasting RSS increase. >> >> And of course, both hotspot and JDK do a lot of finer-granular mallocs outside of custom allocators. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> Therefore I think while high malloc spikes are atypical for a JVM process, they can happen. Having a way to auto-trim the native heap makes sense. >> >> ### When should we trim? 
>> >> We want to trim when we know there is a lull in malloc activity coming. But we have no knowledge of the future. >> >> We could build a heuristic based on malloc frequency. But on closer inspection that is difficult. We cannot use NMT, since NMT has no complete picture (only knows hotspot) and is usually disabled in production anyway. The only way to get *all* mallocs would be to use Glibc malloc hooks. We have done so in desperate cases at SAP, but Glibc removed malloc hooks in 2.35. It would be a messy solution anyway; best to avoid it. >> >> The next best thing is synchronizing with the larger C-heap users in the VM: compiler and GC. But compiler turns out not to be such a problem, since the compiler uses arenas, and arena chunks are buffered in a free pool with a five-second delay. That means compiler activity that happens in bursts, like at VM startup, will just shuffle arena chunks around from/to the arena free pool, never bothering to call malloc or free. >> >> That leaves the GC, which was also the experts' recommendation in last year's discussion [1]. Most GCs do uncommit, and trimming the native heap fits well into this. And we want to time the trim to not get into the way of a GC. Plus, integrating trims into the GC cycle lets us reuse GC logging and timing, thereby making RSS changes caused by trim-native visible to the analyst. >> >> >> ### How it works: >> >> Patch adds new options (experimental for now, and shared among all GCs): >> >> >> -XX:+GCTrimNativeHeap >> -XX:GCTrimNativeHeapInterval= (defaults to 60) >> >> >> `GCTrimNativeHeap` is off by default. If enabled, it will cause the VM to trim the native heap on full GCs as well as periodically. The period is defined by `GCTrimNativeHeapInterval`. Periodic trimming can be completely switched off with `GCTrimNativeHeapInterval=0`; in that case, we will only trim on full GCs. >> >> ### Examples: >> >> This is an artificial test that causes two high malloc spikes with long idle periods. 
Observe how RSS recovers with trim but stays up without trim. The trim interval was set to 15 seconds for the test, and no GC was invoked here; this is periodic trimming. >> >> ![alloc-test](http://cr.openjdk.java.net/~stuefe/other/autotrim/rss-all-collectors.png) >> >> (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-all.sh) ) >> >> Spring pet clinic boots up, then idles. Once with, once without trim, with the trim interval at 60 seconds default. Of course, if it were actually doing something instead of idling, trim effects would be smaller. But the point of trimming is to recover memory in idle periods. >> >> ![petclinic bootup](http://cr.openjdk.java.net/~stuefe/other/autotrim/spring-petclinic-rss-with-and-without-trim.png)) >> >> (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-petclinic-boot.sh) ) >> >> >> >> ### Implementation >> >> One problem I faced when implementing this was that trimming was non-interruptable. GCs usually split the uncommit work into smaller portions, which is impossible for `malloc_trim()`. >> >> So very slow trims could introduce longer GC pauses. I did not want this, therefore I implemented two ways to trim: >> 1) GCs can opt to trim asynchronously. In that case, a `NativeTrimmer` thread runs on behalf of the GC and takes care of all trimming. The GC just nudges the `NativeTrimmer` at the end of its GC cycle, but the trim itself runs concurrently. >> 2) GCs can do the trim inside their own thread, synchronously. It will have to wait until the trim is done. >> >> (1) has the advantage of giving us periodic trims even without GC activity (Shenandoah does this out of the box). >> >> #### Serial >> >> Serial does the trimming synchronously as part of a full GC, and only then. I did not want to spawn a separate thread for the SerialGC. Therefore Serial is the only GC that does not offer periodic trimming, it just trims on full GC. 
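
As a rough illustration of the synchronous variant described above (the GC thread calls the trim itself and waits for it), here is a minimal sketch. It is not the `os::trim_native_heap()` implementation from the patch; `TrimResult` and `trim_native_heap_sync` are hypothetical names. The only platform call used is glibc's `malloc_trim(3)`, which returns 1 if memory was released to the system and 0 otherwise; on non-glibc platforms the sketch does nothing.

```cpp
#include <cassert>
#include <chrono>
#if defined(__GLIBC__)
#include <malloc.h>
#endif

// Hypothetical result record for this sketch.
struct TrimResult {
  bool attempted;    // false on platforms without malloc_trim
  bool released;     // true if glibc reported that memory was returned to the OS
  long long micros;  // wall-clock time spent inside the trim call
};

// Trims the C-heap on the calling thread and reports how long it took.
// The call cannot be interrupted or split into smaller steps, which is why a
// caller that cares about pause times may prefer to run it on a helper thread.
inline TrimResult trim_native_heap_sync() {
  TrimResult r{false, false, 0};
#if defined(__GLIBC__)
  const auto start = std::chrono::steady_clock::now();
  const int rc = ::malloc_trim(0);  // pad = 0: release as much as possible
  const auto end = std::chrono::steady_clock::now();
  r.attempted = true;
  r.released = (rc == 1);
  r.micros = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
#endif
  return r;
}
```

Measuring the duration at the call site is what would let a caller log trim cost alongside GC timing, as the description suggests.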
>> >> #### Parallel, G1, Z >> >> All of them do the trimming asynchronously via `NativeTrimmer`. They schedule the native trim at the end of a full collection. They also pause the trimming at the beginning of a cycle to not trim during GCs. >> >> #### Shenandoah >> >> Shenandoah does the trimming synchronously in its service thread, similar to how it handles uncommits. Since the service thread already runs concurrently and continuously, it can do periodic trimming; no need to spin a new thread. And this way we can reuse the Shenandoah timing classes. >> >> ### Patch details >> >> - adds three new functions to the `os` namespace: >> - `os::trim_native_heap()` implementing trim >> - `os::can_trim_native_heap()` and `os::should_trim_native_heap()` to return whether platform supports trimming resp. whether the platform considers trimming to be useful. >> - replaces implementation of the cmd "System.trim_native_heap" with the new `os::trim_native_heap` >> - provides a new wrapper function wrapping the tedious `mallinfo()` vs `mallinfo2()` business: `os::Linux::get_mallinfo()` >> - adds a GC-shared utility class, `GCTrimNative`, that takes care of trimming and GC-logging and houses the `NativeTrimmer` thread class. >> - adds a regression test >> >> >> ### Tests >> >> Tested older Glibc (2.31), and newer Glibc (2.35) (`mallinfo()` vs` mallinfo2()`), on Linux x64. >> >> The rest of the tests will be done by GHA and in our SAP nightlies. >> >> >> ### Remarks >> >> #### How about other allocators? >> >> I have seen this retention problem mainly with the Glibc and the AIX libc. Muslc returns memory more eagerly to the OS. I also tested with jemalloc and found it also reclaims more aggressively, therefore I don't think MacOS or BSD are affected that much by retention either. >> >> #### Trim costs? >> >> Trim-native is a tradeoff between memory and performance. We pay >> - The cost to do the trim depends on how much is trimmed. 
Time ranges on my machine between < 1ms for no-op trims, to ~800ms for 32GB trims. >> - The cost for re-acquiring the memory, should the memory be needed again, is the second cost factor. >> >> #### Predicting malloc_trim effects? >> >> `ShenandoahUncommit` avoids uncommits if they are not necessary, thus avoiding work and gc log spamming. I liked that and tried to follow that example. Tried to devise a way to predict the effect trim could have based on allocator info from mallinfo(3). That was quite frustrating since the documentation was confusing and I had to do a lot of experimenting. In the end, I came up with a heuristic to prevent obviously pointless trim attempts; see `os::should_trim_native_heap()`. I am not completely happy with it. >> >> #### glibc.malloc.trim_threshold? >> >> glibc has a tunable that looks like it could influence the willingness of Glibc to return memory to the OS, the "trim_threshold". In practice, I could not get it to do anything useful. Regardless of the setting, it never seemed to influence the trimming behavior. Even if it would work, I'm not sure we'd want to use that, since by doing malloc_trim manually we can space out the trims as we see fit, instead of paying the trim price for free(3). >> >> >> - [1] https://mail.openjdk.org/pipermail/hotspot-dev/2021-August/054323.html >> - [2] https://bugs.openjdk.org/browse/JDK-8269345 > > Thomas Stuefe has updated the pull request incrementally with four additional commits since the last revision: > > - wip > - rename GCTrimNative TrimNative > - rename NativeTrimmer > - rename Thanks for working on this. I'm still thinking about it. A question, could we also have a option for trimming on/after GC iff RSS is above settable threshold? (with some nice logic that pushes threshold upwards if RSS is still higher after trim) Secondly do you have any performance numbers regarding throughput/latency when trimming is on vs off? 
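A rough illustration of the threshold idea asked about above: a minimal sketch only, not code from the patch. The class name `RssTrimPolicy`, its members, and the fixed `step` by which the threshold is raised are all hypothetical; the RSS values are assumed to be sampled by the caller.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy object (not part of the PR): trim only when RSS
// exceeds a settable threshold, and raise the threshold when a trim
// did not bring RSS back below it.
class RssTrimPolicy {
  std::size_t _threshold;   // trim only when RSS is above this many bytes
  const std::size_t _step;  // headroom added when a trim turned out futile

public:
  RssTrimPolicy(std::size_t threshold, std::size_t step)
    : _threshold(threshold), _step(step) {
    assert(step > 0 && "step must be positive");
  }

  // Ask before trimming: is the current RSS above the threshold?
  bool should_trim(std::size_t rss) const { return rss > _threshold; }

  // Report the RSS measured after a trim. If it is still above the
  // threshold, push the threshold upwards so we stop trimming in vain.
  void record_result(std::size_t rss_after_trim) {
    if (rss_after_trim > _threshold) {
      _threshold = rss_after_trim + _step;
    }
  }

  std::size_t threshold() const { return _threshold; }
};
```

A caller would check `should_trim()` at the end of a GC cycle, trim if it returns true, and then feed the RSS measured afterwards to `record_result()`. Whether the threshold should ever come back down is left open here.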
------------- PR: https://git.openjdk.org/jdk/pull/10085 From duke at openjdk.org Wed Feb 8 00:03:12 2023 From: duke at openjdk.org (Scott Gibbons) Date: Wed, 8 Feb 2023 00:03:12 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v13] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ±
24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/d23206db..d174c43f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=11-12 Stats: 20 lines in 1 file changed: 9 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From dholmes at openjdk.org Wed Feb 8 00:54:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Feb 2023 00:54:42 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 00:18:28 GMT, Maxim Degtyarev wrote: > Isn't it better to replace the multiplication sign with an asterisk (*) instead of the letter 'x'? No. Asterisk is already used to mean something else // * used for poly1305_multiply_scalar // x used for poly1305_multiply8_avx512 > The problem here is that the encoding of the file is not properly conveyed. @magicus That may be the problem with the warning, but I'm more concerned that the special characters won't display properly in typical editing environments. So as a change to hotspot code I am more than happy to see this go in. ------------- PR: https://git.openjdk.org/jdk/pull/12435 From dholmes at openjdk.org Wed Feb 8 01:13:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Feb 2023 01:13:45 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is subtask of #12427 . 
> > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? Visually the difference between ? and x is minimal. Whereas seeing ? in place of x is very disruptive. Sure you could argue this is an environment issue and the multiplication symbol should display properly with the right settings, but it is trivial fix, with zero harm, to just replace it with plain x. ------------- PR: https://git.openjdk.org/jdk/pull/12435 From fyang at openjdk.org Wed Feb 8 01:25:48 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 8 Feb 2023 01:25:48 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> Message-ID: On Tue, 7 Feb 2023 18:36:53 GMT, Ludovic Henry wrote: >> Unlike RVC, there is still no hardware that supports those extensions. I think we should do more testing and evaluation on real hardware before enabling them by default. >> The old code may crash with sigill when users enable unsupported extensions through command line since there is no mechanism to detect multi-letter extensions. With this patch, we could complain `not supported on this CPU` warning and continue running the program. > > QEMU has good enough support to do functional testing. 
For performance testing, I agree it would be ideal to have proper hardware to test, but even then it's not guaranteed that a performance-minded extension would _actually_ be faster on all given hardware. For example the original AVX-512 which led to thermal throttling. > > So IMO as long as it's functional, we should want to enable it. We otherwise run the risk of "what about"-ing it until we can never enable it. > > Happy to hear others' opinions as well! That's a good question. I am thinking that at least the user could enable use of specific ISA extensions on the JVM command line if it turns out to be profitable on their hardware platforms. It should be safe for them to do so, as this PR adds the necessary detection of availability of those ISA extensions. Previous working experience on other CPU platforms shows that it might be risky with respect to performance for the JVM to auto-enable all of those [1][2]: the performance numbers might vary across vendors. So I guess one possibility is that we might want some vendor-specific control like we do on other CPU platforms [3]. But I hope some ISA extensions like bitmanip could be generally profitable across all vendors so that the JVM could auto-enable those extensions. But that needs more testing on real hardware to prove. That said, we could at least move JVM options corresponding to those extensions out of the EXPERIMENTAL group for now once they are fully tested in QEMU system, but that's another new issue.
[1] https://bugs.openjdk.org/browse/JDK-8292894 [2] https://bugs.openjdk.org/browse/JDK-8295698 [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp#L132 ------------- PR: https://git.openjdk.org/jdk/pull/12343 From dholmes at openjdk.org Wed Feb 8 02:29:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Feb 2023 02:29:44 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: References: Message-ID: <7nMjwERfYZcjO3Q1_o5g4Jb5VhHVc0Y4xCynPI9S8uU=.c49cd6de-5486-46a3-9a7f-73631cbe38b6@github.com> On Tue, 7 Feb 2023 15:41:28 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change to liveness verification that fixes some unwanted asserts because >> >> - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case >> - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions >> - in the innermost if lots of code is duplicated in both cases >> >> The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). >> >> Testing: local compilation/testing, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Missing changes Mostly seems quite reasonable based on the description, but one query below. Thanks. src/hotspot/share/gc/g1/g1CollectedHeap.inline.hpp line 217: > 215: > 216: inline bool G1CollectedHeap::is_obj_filler(const oop obj) { > 217: Klass* k = obj->klass_raw(); Not clear how you can get here from ` HeapRegion::is_obj_dead` with a bad oop, such that you need the raw variant. ?? 
------------- PR: https://git.openjdk.org/jdk/pull/12456 From dholmes at openjdk.org Wed Feb 8 02:36:55 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Feb 2023 02:36:55 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v4] In-Reply-To: References: Message-ID: <__4azmkEG1NJZTlUN3aQYxNvuzRMKBBEmbSoT93wB2k=.41c15e09-0fb2-4c8b-80ff-9f742df2ecf0@github.com> On Tue, 7 Feb 2023 15:34:50 GMT, Justin King wrote: >> Justin King has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: >> >> - Merge branch 'master' into lsan >> - Fix grammatical error >> >> Signed-off-by: Justin King >> - Update based on review and remove restriction on compressed oops and class pointers >> >> Signed-off-by: Justin King >> - Support CDS >> >> Signed-off-by: Justin King >> - Merge remote-tracking branch 'upstream/master' into lsan >> - Merge remote-tracking branch 'upstream/master' into lsan >> - Merge remote-tracking branch 'upstream/master' into lsan >> - Implement initial LSAN support v2 >> >> Signed-off-by: Justin King >> - Implement initial LSAN support > > Reverted the change to the JDK test leak, that can be done in another PR as mentioned. @jcking and @magicus this change was not approved by any Hotspot Reviewer prior to integration. AFAICS the discussion with @tstuefe was still ongoing. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Wed Feb 8 02:42:58 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 8 Feb 2023 02:42:58 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:40:34 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. 
It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to JDK > > Signed-off-by: Justin King I did integrate assuming somebody from Hotspot would sponsor. Does the bot/tooling not ensure reviews from applicable reviewers? If not, that seems non-ideal itself. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From dholmes at openjdk.org Wed Feb 8 02:56:55 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 8 Feb 2023 02:56:55 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:40:34 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. 
>> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to JDK > > Signed-off-by: Justin King The bot can only check the required number of Reviewers (strictly 1 per OpenJDK rules but changeable as here via `/reviewers` command) but it doesn't know about the informal rules such as having reviewers from each affected area (there is no mapping from people to areas). Regardless until your conversation with @tstuefe was complete this was not ready for integration. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Wed Feb 8 03:14:54 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 8 Feb 2023 03:14:54 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: <3KMh7CoSK-qyddwQgjcQAubNFT_ThIZ3lSy-hW-0oTE=.ece75ba9-8bc6-427c-83bf-e1d7bf95fe0e@github.com> On Wed, 8 Feb 2023 02:53:51 GMT, David Holmes wrote: > The bot can only check the required number of Reviewers (strictly 1 per OpenJDK rules but changeable as here via `/reviewers` command) but it doesn't know about the informal rules such as having reviewers from each affected area (there is no mapping from people to areas). 
> > Regardless until your conversation with @tstuefe was complete this was not ready for integration. From my point of view it was more questions to clarify how it worked at this point to get to an understanding, rather than specific objections, since the flags were no longer there. Requiring contributors to adhere to informal rules such as who can and cannot approve, and who you should and do not have to wait for, is not ideal. It relies heavily on determining who is part of what group and what group they are reviewing for. Wouldn't it make more sense to tie approval to labels and have reviewers explicitly approve via a command? Or allow reviewers to explicitly require their approval via a command? That way there is no guessing? It would at least make it explicit and avoid miscommunication. The reviewers themselves are in the best position to know and bump required reviewers, not contributors. Just my two cents. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From iklam at openjdk.org Wed Feb 8 03:22:44 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 8 Feb 2023 03:22:44 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> References: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> Message-ID: <9wofGO5FZW7h3OUJPoK2lDbPN3BvD6v9AAGjW0jfXCo=.7b78f322-191f-4996-8af5-09cd50884432@github.com> On Tue, 7 Feb 2023 16:13:48 GMT, Ashutosh Mehra wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplified relocation by using buffered address instead of requested address > > src/hotspot/share/cds/archiveHeapWriter.cpp line 388: > >> 386: assert(is_aligned(_open_bottom, HeapRegion::GrainBytes), "sanity"); >> 387: >> 388: _requested_closed_region_bottom = align_down(heap_end - closed_region_byte_size,
HeapRegion::GrainBytes); > > Instead of aligning to `HeapRegion::GrainBytes`, shouldn't it be aligned to `MIN_GC_REGION_ALIGNMENT`? > With `HeapRegion::GrainBytes`, if the cds archive is created with large region size (say 32M) and later mapped with smaller region size (say 1M), then the archive heap would need to be relocated. Using `MIN_GC_REGION_ALIGNMENT` would avoid the relocation in this case. When we dump with a large heap size (e.g., 16GB) but run with a small heap size (e.g., 256M), we are likely to need to relocate anyway, because the narrowOop encoding would be different (3-bit shift vs no shift). The design decision we made in the past was: the layout of the archived objects should be optimized for the heap size specified at dump time. Hence, the default CDS archive in the JDK is dumped with -Xmx128M so that it will be optimized for small workloads such as microservices. See https://github.com/openjdk/jdk/blob/2a579ab8392d30a35f044954178c788d16d4b800/make/Images.gmk#L117-L121 ------------- PR: https://git.openjdk.org/jdk/pull/12304 From stuefe at openjdk.org Wed Feb 8 06:49:59 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 8 Feb 2023 06:49:59 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: <3KMh7CoSK-qyddwQgjcQAubNFT_ThIZ3lSy-hW-0oTE=.ece75ba9-8bc6-427c-83bf-e1d7bf95fe0e@github.com> References: <3KMh7CoSK-qyddwQgjcQAubNFT_ThIZ3lSy-hW-0oTE=.ece75ba9-8bc6-427c-83bf-e1d7bf95fe0e@github.com> Message-ID: On Wed, 8 Feb 2023 03:11:32 GMT, Justin King wrote: > > The bot can only check the required number of Reviewers (strictly 1 per OpenJDK rules but changeable as here via `/reviewers` command) but it doesn't know about the informal rules such as having reviewers from each affected area (there is no mapping from people to areas). > > Regardless until your conversation with @tstuefe was complete this was not ready for integration. Yep, that was a bit quick. But misunderstandings happen.
> > From my point of view it was more questions to clarify how it worked at this point to get to an understanding, rather than specific objections, since the flags were no longer there. Hi Justin, No, the discussion was still ongoing. The informal rule is that ongoing discussions should be closed and that nobody strongly objects to a change. Two reviewers are easy to come by. The point of this rule is that you have a reasonable chance to block changes you strongly oppose. And that every change in the code base has the buy-in from the majority. Note that a patch is seldom blocked by one person alone for a prolonged amount of time. But sometimes, overworked reviewers may block-then-ghost without meaning to. Requires patience from authors and responsibility from reviewers. As an author, it is okay to ping after a while and, getting no answer, to integrate with a clear warning. This all requires more time than faster paced projects, which is accepted for most RFEs. Requires more discussion and convincing, but gives us a better community. Still faster than getting in kernel patches :) > Requiring contributors to adhere to informal rules such as who can and cannot approve, and who you should and do not have to wait for, is not ideal. It relies heavily in determining who is part of what group and what group they are reviewing for. The rule is simply to resolve all discussions and opposition. In fact, even if you are an unknown, come from the outside, and oppose a patch with good reasons, people will usually listen. If the points made were good, the author has to address those concerns (or, should). > Wouldn't it make more sense to tie approval to labels and have reviewers explicitly approve via a command? Or allow reviewers to explicitly require their approval via a command? That way there is no guessing? It would at least make it explicit and avoid miscommunication. The reviewers themselves are in the best position to know and bump required reviewers, not contributors. 
Our rules come from pre-Github times, where discussions happened on MLs. We may adapt over time. But you cannot codify all interactions into software-enforced rules, there is always a human element. (I will continue the technical discussion separately) Cheers, Thomas > > Just my two cents. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From stuefe at openjdk.org Wed Feb 8 07:01:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 8 Feb 2023 07:01:54 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:40:34 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but its still possible currently. 
> > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to JDK > > Signed-off-by: Justin King I read through your explanation, and through the [design docs](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument), but my question remains unanswered. See below. > > Metaspace objects hold pointers to malloced memory. Metaspace Objects can leak, that's accepted. So these pointers would leak too. Would that not give us tons of false positives? > > No. LSan only cares that it can find pointers to malloc'ed memory, not that it can find pointers to Metaspace objects. The Metaspace is registered as a root region. It does however take into account ASan poisoning, which we added earlier to Metaspace. LSan will not scan poisoned memory regions that are part of root regions. We only poison in Metaspace when deallocation occurs and unpoison when allocation occurs. But they are not poisoned. Upon JVM end, the heap still exists, holding class loaders, which hold references to CLDs. Those hold metaspace arenas (living in Metaspace), which hold Metaspace chunks, which are registered LSAN root areas, which hold C-allocated sub structures. These chunks are not poisoned since they have never been returned to the chunk pool. They are still alive. To put it in other words: even I as a human cannot determine that this is not a C-heap leak. Technically, it is. But it is accepted since we accept these things leaking. Since we don't support unloading and re-loading the JVM, we accept that to have a faster shutdown.
------------- PR: https://git.openjdk.org/jdk/pull/12229 From stuefe at openjdk.org Wed Feb 8 08:13:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 8 Feb 2023 08:13:44 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v9] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Tue, 7 Feb 2023 20:30:45 GMT, Robbin Ehn wrote: > Thanks for working on this. I'm still thinking about it. Thanks :) The PR is currently in Draft, I did not expect people to care. It's almost finished, but the description is outdated. I really don't like that Skara sends out mails for draft-state PRs. I consider closing this PR for the moment to avoid this. > > A question, could we also have a option for trimming on/after GC iff RSS is above settable threshold? (with some nice logic that pushes threshold upwards if RSS is still higher after trim) Sure, that makes sense. In our VM, we have a warning system that triggers automatically at certain RSS thresholds. You tie arbitrary jcmds to those thresholds, and customers use that to do a System.trim_native_heap jcmd. So, it's certainly possible. I only dislike that this is tied to GC. Depending on the GC, if none runs, you may never hit the threshold. But I'll add this to my run-alongside-GC trigger loop. > > Secondly do you have any performance numbers regarding throughput/latency when trimming is on vs off? I tried before and got nothing rising above background noise; I guess the trim cycle is just too long (I had a 30 second cycle). I will repeat the tests. What benchmark would you recommend? I also plan to do a regular benchmark but with an agent lib injected that does concurrent malloc spikes, to mimic third party code doing mallocs on the side.
------------- PR: https://git.openjdk.org/jdk/pull/10085 From fyang at openjdk.org Wed Feb 8 09:05:43 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 8 Feb 2023 09:05:43 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v2] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:08:34 GMT, Erik Österlund wrote: >> Hi, seems we need two extra small changes to build on linux-riscv. Thanks. > >> Hi, seems we need two extra small changes to build on linux-riscv. Thanks. > > Thanks for having a look! Hopefully fixed in the latest commit. @fisk : Hi, I think we should also add following changes for riscv to make it work:

diff --git a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
index ed53fbf22c8..662de44de29 100644
--- a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
@@ -316,8 +316,7 @@ void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register
   __ mv(tmp2, (intptr_t) Universe::verify_oop_bits());
 
   // Compare tmp1 and tmp2.
-  __ xorr(tmp1, tmp1, tmp2);
-  __ bnez(tmp1, error);
+  __ bne(tmp1, tmp2, error);
 
   // Make sure klass is 'reasonable', which is not zero.
   __ load_klass(obj, obj, tmp1); // get klass
diff --git a/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp
index 0741869ddaf..1ebfb8f9e23 100644
--- a/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp
@@ -449,10 +449,10 @@ void ZBarrierSetAssembler::generate_c1_load_barrier_runtime_stub(StubAssembler*
 
 void ZBarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) {
   // Check if mask is good.
-  // verifies that ZAddressBadMask & x10 == 0
-  __ ld(c_rarg3, Address(xthread, ZThreadLocalData::address_bad_mask_offset()));
-  __ andr(c_rarg2, x10, c_rarg3);
-  __ bnez(c_rarg2, error);
+  // verifies that ZAddressBadMask & obj == 0
+  __ ld(tmp2, Address(xthread, ZThreadLocalData::address_bad_mask_offset()));
+  __ andr(tmp1, obj, tmp2);
+  __ bnez(tmp1, error);
 
   BarrierSetAssembler::check_oop(masm, obj, tmp1, tmp2, error);
 }

------------- PR: https://git.openjdk.org/jdk/pull/12443 From stefank at openjdk.org Wed Feb 8 09:46:56 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 8 Feb 2023 09:46:56 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v4] In-Reply-To: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> References: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> Message-ID: <7zZV10CLteZJ9H93rFij1KrLkPQjxzAe1MHNRgH8QRI=.79d06665-a02d-4026-a42e-3d5989feee67@github.com> On Mon, 6 Feb 2023 09:44:27 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Can't directly compare uintptr_t, need reinterpret_cast Changes requested by stefank (Reviewer). src/hotspot/os/windows/gc/z/zVirtualMemory_windows.cpp line 165: > 163: const uintptr_t res = ZMapper::reserve_for_shared_awe(ZAWESection, addr, size); > 164: > 165: assert(res == addr || res == reinterpret_cast(nullptr), "Should not reserve other memory than requested"); `res == 0` ------------- PR: https://git.openjdk.org/jdk/pull/12317 From stefank at openjdk.org Wed Feb 8 09:46:59 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 8 Feb 2023 09:46:59 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v4] In-Reply-To: References: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> Message-ID: On Tue, 7 Feb 2023 02:35:03 GMT, David Holmes wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Can't directly compare uintptr_t, need reinterpret_cast > > src/hotspot/os/windows/gc/z/zVirtualMemory_windows.cpp line 142: > >> 140: const uintptr_t res = ZMapper::reserve(addr, size); >> 141: >> 142: assert(res == addr || res == reinterpret_cast(nullptr), "Should not reserve other memory than requested"); > > Seems to me if the returned value is uintptr_t then this should be a check against zero not NULL/nullptr. Yes, please change this to `res == 0`. 
------------- PR: https://git.openjdk.org/jdk/pull/12317 From rehn at openjdk.org Wed Feb 8 10:13:44 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 8 Feb 2023 10:13:44 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v9] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: <-JTjkWIjvzpoIbAT9bUyVlgHWf35XAeqQ4Yj38FmBTU=.32cb1bfb-1b10-4089-a663-9526665c3458@github.com> On Wed, 8 Feb 2023 08:11:20 GMT, Thomas Stuefe wrote: > I will repeat the tests. What benchmark would you recommend? I also plan to do a regular benchmark but with an agent lib injected that does concurrent malloc spikes, to mimic third party code doing mallocs on the side. Internally we have seen workloads that involve classloading as a builder of RSS. (thought to be Symbols (we are looking into that now)) But I don't know which benchmark, if there is one, does classloading during the "performance cycle". Another observation is that if we get unlucky, it seems like we could be trimming (NativeTrimmerThread) while cleaning e.g. stringtable (via ServiceThread). Thus calling free a lot, not sure if that is problematic. Another case is if the user only has very few logical cores: if ServiceThread and NativeTrimmerThread run at the same time, the user may experience some additional hiccups. So maybe we want to use ServiceThread at the cost of sometimes delaying the trimming due to other work? ------------- PR: https://git.openjdk.org/jdk/pull/10085 From eosterlund at openjdk.org Wed Feb 8 10:19:20 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 8 Feb 2023 10:19:20 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3] In-Reply-To: References: Message-ID: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com> > In the assembly code, there is some generic oop verification code.
Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: RISC-V fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12443/files - new: https://git.openjdk.org/jdk/pull/12443/files/c5617992..37018550 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12443&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12443&range=01-02 Stats: 6 lines in 2 files changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/12443.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12443/head:pull/12443 PR: https://git.openjdk.org/jdk/pull/12443 From tschatzl at openjdk.org Wed Feb 8 10:26:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 8 Feb 2023 10:26:49 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: <7nMjwERfYZcjO3Q1_o5g4Jb5VhHVc0Y4xCynPI9S8uU=.c49cd6de-5486-46a3-9a7f-73631cbe38b6@github.com> References: <7nMjwERfYZcjO3Q1_o5g4Jb5VhHVc0Y4xCynPI9S8uU=.c49cd6de-5486-46a3-9a7f-73631cbe38b6@github.com> Message-ID: On Wed, 8 Feb 2023 02:26:26 GMT, David Holmes wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Missing changes > > src/hotspot/share/gc/g1/g1CollectedHeap.inline.hpp line 217: > >> 215: >> 216: inline bool G1CollectedHeap::is_obj_filler(const oop obj) { >> 217: Klass* k = obj->klass_raw(); > > Not clear how you can get here from `HeapRegion::is_obj_dead` with a bad oop, such that you need the raw variant. ?? The object is in the heap, but the occupying memory has already been zapped (in debug mode); i.e.
the call in `heapRegion.cpp:518` could read `badHeapWordVal` as (compressed) klass value in the header. In that case the current code asserts in this call because in `oopDesc::klass()`, the call to `CompressedKlassPointers::decode_not_null` will assert in `compressedOops.inline.hpp:135` due to the `check_alignment` condition not satisfied. This makes this verification code assert before printing out any useful information to diagnose the problem quickly (in my case this has been a change that wrongly managed remembered sets). If I had had the remembered set verification printout, I would have found the problem immediately in this case (because the message would have told me that there is a problem with remembered sets). So it took a while to diagnose the issue, having to go into the debugger to painfully find the exact same information. I.e. this makes the verification code more robust. Imo the suggested solution to just continue execution is fine, because `is_obj_filler` will always return false (i.e. object is dead) for garbage objects and do the right thing here. There is the concern that now other non-verification code might not immediately trigger now, but most of it just fails the VM anyway if it finds a bad reference (after printing some information about it), for all other cases this is the right choice. ------------- PR: https://git.openjdk.org/jdk/pull/12456 From mgronlun at openjdk.org Wed Feb 8 10:29:52 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 8 Feb 2023 10:29:52 GMT Subject: Integrated: 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java In-Reply-To: References: Message-ID: <4lZV-8VSJVLfE0QDSD8-Gzz6jzsQlErywFy3k3mkaI0=.a5277907-6128-4669-9606-41eec7fae915@github.com> On Thu, 2 Feb 2023 15:07:21 GMT, Markus Gr?nlund wrote: > Greetings, > > please help review this small adjustment to close the gap where iterated threads that attach via jni are left without a valid JFR thread id. 
For details, please see the JIRA issue. > > Testing: jdk_jfr > > Thanks > Markus This pull request has now been integrated. Changeset: c92a7deb Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/c92a7deba50cbf5e283d1bd0ef5f2d6f8a4fc947 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java Reviewed-by: dholmes, egahlin ------------- PR: https://git.openjdk.org/jdk/pull/12388 From eosterlund at openjdk.org Wed Feb 8 10:45:45 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 8 Feb 2023 10:45:45 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:08:34 GMT, Erik Österlund wrote: >> Hi, seems we need two extra small changes to build on linux-riscv. Thanks. > >> Hi, seems we need two extra small changes to build on linux-riscv. Thanks. > > Thanks for having a look! Hopefully fixed in the latest commit. > @fisk : Hi, I think we should also add following changes for riscv to make it work: > > ``` > diff --git a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp > index ed53fbf22c8..662de44de29 100644 > --- a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp > +++ b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp > @@ -316,8 +316,7 @@ void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register > __ mv(tmp2, (intptr_t) Universe::verify_oop_bits()); > > // Compare tmp1 and tmp2. > - __ xorr(tmp1, tmp1, tmp2); > - __ bnez(tmp1, error); > + __ bne(tmp1, tmp2, error); > > // Make sure klass is 'reasonable', which is not zero.
> __ load_klass(obj, obj, tmp1); // get klass > diff --git a/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp > index 0741869ddaf..1ebfb8f9e23 100644 > --- a/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp > +++ b/src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp > @@ -449,10 +449,10 @@ void ZBarrierSetAssembler::generate_c1_load_barrier_runtime_stub(StubAssembler* > > void ZBarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) { > // Check if mask is good. > - // verifies that ZAddressBadMask & x10 == 0 > - __ ld(c_rarg3, Address(xthread, ZThreadLocalData::address_bad_mask_offset())); > - __ andr(c_rarg2, x10, c_rarg3); > - __ bnez(c_rarg2, error); > + // verifies that ZAddressBadMask & obj == 0 > + __ ld(tmp2, Address(xthread, ZThreadLocalData::address_bad_mask_offset())); > + __ andr(tmp1, obj, tmp2); > + __ bnez(tmp1, error); > > BarrierSetAssembler::check_oop(masm, obj, tmp1, tmp2, error); > } > ``` Fixed! ------------- PR: https://git.openjdk.org/jdk/pull/12443 From stuefe at openjdk.org Wed Feb 8 11:05:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 8 Feb 2023 11:05:44 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v9] In-Reply-To: <-JTjkWIjvzpoIbAT9bUyVlgHWf35XAeqQ4Yj38FmBTU=.32cb1bfb-1b10-4089-a663-9526665c3458@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> <-JTjkWIjvzpoIbAT9bUyVlgHWf35XAeqQ4Yj38FmBTU=.32cb1bfb-1b10-4089-a663-9526665c3458@github.com> Message-ID: On Wed, 8 Feb 2023 10:10:47 GMT, Robbin Ehn wrote: > > I will repeat the tests. What benchmark would you recommend? I also plan to do a regular benchmark but with an agent lib injected that does concurrent malloc spikes, to mimic third party code doing mallocs on the side. 
> > Internally we have seen workloads that involve classloading as a builder of RSS. (thought to be Symbols (we are looking into that now)) But I don't know which benchmark, if there is one, does classloading during the "performance cycle". I have mass classloading tests for JEP387 around. I can test them. That ties in to another question I wondered about recently, which is why so many sub structures of MetaspaceObj are C-heap allocated. Ideally, one would not need something like the various release_C_heap_structures() - since the lifetime of the sub structures is identical to that of their parent in Metaspace, e.g. InstanceKlass, they could live in Metaspace too and not need explicit deallocation. I understand that you need explicit release for Symbol etc. > > Another observation is that if we get unlucky, it seems like we could be trimming (NativeTrimmerThread) while cleaning e.g. stringtable (via ServiceThread). Thus calling free a lot, not sure if that is problematic. Another case is if the user only has very few logical cores: if ServiceThread and NativeTrimmerThread run at the same time, the user may experience some additional hiccups. Interesting. One possibility would be to use os::malloc/free calls to, e.g., reset the timer. I did not do this because hotspot mallocs/frees are normally not that hot; I expected more malloc traffic from outside JNI libraries in our process, and those I cannot control or monitor anyway, so why bother. Note that if allocations are done via Arenas (RA or CompileArenas), they cluster allocations and have an inbuilt free delay, so they already filter high malloc/free jitter. That is why I disregarded the Compiler for the moment instead of pausing the trimmer thread during compilation. Before making it more complex I'd like to do more benchmarks though. Maybe it's not even needed. After all, it's an experimental opt-in switch for now and can be refined later. Let's see.
> > So we maybe want to use ServiceThread at the cost of sometimes delaying the trimming due to other work? The problem with this - why I did not integrate it directly into e.g. the Shenandoah Control Thread - is that trim time cannot be predicted and trimming cannot be interrupted once it goes. Its a surprise and depends on the retention size and how fast the trim can proceed. Note that concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena. With GC-induced java heap uncommitting, it is different. We know much more. E.g., G1 uncommits piece by piece and can interrupt this process if time's up. So, using ServiceThread would be fine only if prolonged blocking is acceptable. ------------- PR: https://git.openjdk.org/jdk/pull/10085 From ihse at openjdk.org Wed Feb 8 13:17:48 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 8 Feb 2023 13:17:48 GMT Subject: RFR: 8301853: C4819 warnings were reported in HotSpot on Windows In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote: > This is subtask of #12427 . > > I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale) > > > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: ????????????????? > d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: ???????????? ??? (932) ??????????????????????????????????? Unicode ???????????? Ok, it's your code, so if you want an `x` instead, go ahead. I fully agree that some special non-ASCII character has no need in the source code, like typographic quotes, and should be replaced by normal ASCII character. I would not have counted this multiplication symbol as unnecessary, but I understand and respect that you think otherwise. ------------- Marked as reviewed by ihse (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12435 From fjiang at openjdk.org Wed Feb 8 13:21:24 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Wed, 8 Feb 2023 13:21:24 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v5] In-Reply-To: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: <_5TH9GjFLpvbfQvoo1yl1iZBNtiIiY87sX8WIDZQAPc=.40c50b7b-00d9-4c40-a867-9ea27e258eca@github.com> > Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). > As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), > we should find a stable way to detect these supported ISA extensions in JVM. > [1] has proposed a way to parse supported extensions through /proc/cpuinfo > or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions > in the same way. > > Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: > > > ubuntu at ubuntu:~$ uname -a > Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux > ubuntu at ubuntu:~$ cat /proc/cpuinfo > processor : 0 > hart : 2 > isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc > mmu : sv48 > > > 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html > > > Testing: > - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build Feilong Jiang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 11 additional commits since the last revision: - Merge branch 'master' of https://github.com/openjdk/jdk into detect_isa_string - fix - adjust comments - fix comment - fix typo - code adjustment - adjust comments - update copyright - add zihintpuase - fix calling of strdup - ... and 1 more: https://git.openjdk.org/jdk/compare/ab0f8c1f...53ea14e6 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12343/files - new: https://git.openjdk.org/jdk/pull/12343/files/66b62159..53ea14e6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12343&range=03-04 Stats: 4834 lines in 182 files changed: 4221 ins; 227 del; 386 mod Patch: https://git.openjdk.org/jdk/pull/12343.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12343/head:pull/12343 PR: https://git.openjdk.org/jdk/pull/12343 From ihse at openjdk.org Wed Feb 8 13:39:58 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 8 Feb 2023 13:39:58 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 15:40:34 GMT, Justin King wrote: >> Adds initial LSan (LeakSanitizer) support to Hotspot. This setup has been used to identify multiple leaks so far. It can run most of the test suite except those that also fail with ASan, which is being looked at separately. It is especially useful when combined with ASan, as LSan can use poisoning information to determine what memory to scan or not to scan, making leak detection more accurate and faster. >> >> **Suppressing:** >> Currently the suppression list is only used to suppress JLI leaks that are known, the rest are done in code. Suppressing needs to identify the source of thet leak. Due to Hotspot's code organization, we would need to suppress `os::malloc` and friends, which would suppress everything. 
Suppressing in code has the added benefit of being explicit and surviving refactors if methods change. >> >> **Caveats:** >> - By default ASan enables LSan, however we explicitly disable it unless `--enable-lsan` is given. It is useful to be able to use ASan without LSan. Using LSan by itself is less likely to be useful and will probably not work, but it's still possible currently. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to JDK > > Signed-off-by: Justin King I apologize, the fault lies entirely with me. Justin should have no blame in this -- he is not a committer and is not expected to fully know all rules for integration. That responsibility lies with the sponsor, in this case, me. I read @dholmes-ora's comment: > Updates look good - glad to see the flag changes go away! > I suggest factoring out the change to test/jdk/jni/nullCaller/exeNullCallerTest.cpp as it is a JDK test and not part of hotspot. Thanks. as an approval from Hotspot, given that the JDK test change was removed, which a later commit indeed did. And, like Justin, I interpreted Thomas's comments as questions that were now answered, rather than an ongoing discussion. I did not realize there was still an ongoing discussion, and was too eager to sponsor this PR. @tstuefe @dholmes-ora Do you want me to revert this change? Or should we continue the discussion in a follow-up bug that can address the remaining problems?
------------- PR: https://git.openjdk.org/jdk/pull/12229 From duke at openjdk.org Wed Feb 8 14:33:51 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 8 Feb 2023 14:33:51 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: <9wofGO5FZW7h3OUJPoK2lDbPN3BvD6v9AAGjW0jfXCo=.7b78f322-191f-4996-8af5-09cd50884432@github.com> References: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> <9wofGO5FZW7h3OUJPoK2lDbPN3BvD6v9AAGjW0jfXCo=.7b78f322-191f-4996-8af5-09cd50884432@github.com> Message-ID: <1J2_7DqspIFPl0uHtgAyiEYhWEDpefiuHsjEljl3Bzw=.fbbee099-ae9d-412c-8c28-6138628d8ba7@github.com> On Wed, 8 Feb 2023 03:20:04 GMT, Ioi Lam wrote: >> src/hotspot/share/cds/archiveHeapWriter.cpp line 388: >> >>> 386: assert(is_aligned(_open_bottom, HeapRegion::GrainBytes), "sanity"); >>> 387: >>> 388: _requested_closed_region_bottom = align_down(heap_end - closed_region_byte_size, HeapRegion::GrainBytes); >> >> Instead of aligning to `HeapRegion::GrainBytes`, shouldn't it be aligned to `MIN_GC_REGION_ALIGNMENT`? >> With `HeapRegion::GrainBytes`, if the cds archive is created with large region size (say 32M) and later mapped with smaller region size (say 1M), then the archive heap would need to be relocated. Using `MIN_GC_REGION_ALIGNMENT` would avoid the relocation in this case. > > When we dump with a large heap size (e.g.,16GB) but with with a small heap size (e.g., 256M), we are likely to need to relocate anyway, because the narrowOop encoding would be different (3-bit shift vs no shift). > > The design decision we made in the past was: the layout of the archived objects should be optimized for the heap size specified at dump time. Hence, the default CDS archive in the JDK is dumped with -Xmx128M so that it will be optimized for small workloads such as microservices. 
See https://github.com/openjdk/jdk/blob/2a579ab8392d30a35f044954178c788d16d4b800/make/Images.gmk#L117-L121 I am not referring to heap size, but the G1 region size. A user can create a base static archive with 128M heap and explicitly ask for larger region size than 1M and then try to use that image with a smaller region size. This will cause the archive heap to be relocated. Although highly unlikely scenario, but it can easily be handled by changing the alignment to always use `MIN_GC_REGION_ALIGNMENT` instead of using the current region size. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From alanb at openjdk.org Wed Feb 8 15:17:24 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 8 Feb 2023 15:17:24 GMT Subject: RFR: 8301819: Enable continuations code by default Message-ID: The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. 
Testing: tier1-tier6 ------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/12459/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12459&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301819 Stats: 4 lines in 2 files changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12459.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12459/head:pull/12459 PR: https://git.openjdk.org/jdk/pull/12459 From jcking at openjdk.org Wed Feb 8 15:32:01 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 8 Feb 2023 15:32:01 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: <3KMh7CoSK-qyddwQgjcQAubNFT_ThIZ3lSy-hW-0oTE=.ece75ba9-8bc6-427c-83bf-e1d7bf95fe0e@github.com> Message-ID: On Wed, 8 Feb 2023 06:45:28 GMT, Thomas Stuefe wrote: > The informal rule is that ongoing discussions should be closed and that nobody strongly objects to a change. Two reviewers are easy to come by. The point of this rule is that you have a reasonable chance to block changes you strongly oppose. And that every change in the code base has the buy-in from the majority. > > Note that a patch is seldom blocked by one person alone for a prolonged amount of time. But sometimes, overworked reviewers may block-then-ghost without meaning to. Requires patience from authors and responsibility from reviewers. As an author, it is okay to ping after a while and, getting no answer, to integrate with a clear warning. > > This all requires more time than faster paced projects, which is accepted for most RFEs. Requires more discussion and convincing, but gives us a better community. Still faster than getting in kernel patches :) Okay, fair. I will keep all this in mind. Thank you! Let's continue the discussion, will respond in a second. 
------------- PR: https://git.openjdk.org/jdk/pull/12229 From jcking at openjdk.org Wed Feb 8 15:41:57 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 8 Feb 2023 15:41:57 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 06:58:52 GMT, Thomas Stuefe wrote: > I read through your explanation, and through the [design docs](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument), but my question remains unanswered. See below. > > > > Metaspace objects hold pointers to malloced memory. Metaspace Objects can leak, that's accepted. So these pointers would leak too. Would that not give us tons of false positives? > > > > > > No. LSan only cares that it can find pointers to malloc'ed memory, not that it can find pointers to Metaspace objects. The Metaspace is registered as a root region. It does however take into account ASan poisoning, which we added earlier to Metaspace. LSan will not scan poisoned memory regions that are part of root regions. We only poison in Metaspace when deallocation occurs and unpoison when allocation occurs. > > But they are not poisened. > > Upon JVM end, the heap still exists, holding class loaders, which hold references to CLDs. Those hold metaspace arenas (living in Metaspace), which hold Metaspace chunks, which are registered LSAN root areas, which hold C-allocated sub structures. These chunks are not poisened since they have never been returned to the chunk pool. They are still alive. > > To put it in other words: even I as a human cannot determine that this is not a C-heap leak. Technically, it is. But it is accepted since we accept these things leaking. Since we don't support unloading and re-loading the JVM, we accept that to have a faster shutdown. Let's ignore ASan/poisoning for now, its an optimization. I shouldn't have brought it up. 
LSan leak checking, as implemented for Hotspot, happens before the JVM has actually shutdown. It happens immediately after one thread wins the right to begin shutting down and other threads are either waiting or doing other work. Everything is still alive, including CollectedHeap, Metaspace, and CodeHeap. We ignore leaks as a result of shutdown this way, because as you mentioned we are not that concerned. Outside of Hotspot, LSan leak checking is typically done at program exit. If you do that with Hotspot you will definitely get false positives. Note: If you want, feel free to run a build with `--enable-asan` and `--enable-lsan` and run some of the test tiers. Some tests will fail due to some existing incompatibilities with ASan (working on fixing those separately), but some will fail with real leaks. I need to report these, just have not gotten around to it yet. Plan was to do so after this integrated. Most of the remaining ones are minor. The only time false positives can happen is some of the tests which do some interesting things and use a custom launcher, these don't inherit our options and will attempt to leak check at exit. I am slowly going through and correcting these. They should be rare at this point. On a side note, once this discussion is complete and in favor of this change, I should probably add some documentation related to sanitizers in Hotspot including ASan, LSan, and UBSan. 
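The arrangement described above (a non-malloc region registered as a root region, and one explicit leak check before shutdown instead of at process exit) might be sketched like this. It is a sketch under assumptions, not HotSpot's code: `DEMO_USE_LSAN`, the `DEMO_LSAN_*` macros, `DemoRootRegion` and `demo_before_shutdown` are hypothetical names. Only the three `__lsan_*` functions come from the sanitizer interface header, and they are used only when the hypothetical macro is defined and the program is built with a leak-checking sanitizer runtime.

```cpp
#include <cassert>
#include <cstddef>

// DEMO_USE_LSAN is a hypothetical opt-in macro for this sketch. Without it,
// every hook compiles to a no-op so the code builds with any compiler.
#if defined(DEMO_USE_LSAN)
#include <sanitizer/lsan_interface.h>
#define DEMO_LSAN_REGISTER_ROOT_REGION(p, n)   __lsan_register_root_region((p), (n))
#define DEMO_LSAN_UNREGISTER_ROOT_REGION(p, n) __lsan_unregister_root_region((p), (n))
#define DEMO_LSAN_DO_LEAK_CHECK()              __lsan_do_leak_check()
#else
#define DEMO_LSAN_REGISTER_ROOT_REGION(p, n)   ((void)(p), (void)(n))
#define DEMO_LSAN_UNREGISTER_ROOT_REGION(p, n) ((void)(p), (void)(n))
#define DEMO_LSAN_DO_LEAK_CHECK()              ((void)0)
#endif

// Stand-in for a region (such as a metaspace chunk) that is not malloc'ed
// itself but may hold pointers to malloc'ed memory. While registered, the
// leak checker scans it for such pointers.
struct DemoRootRegion {
  void* base;
  std::size_t size;

  DemoRootRegion(void* b, std::size_t n) : base(b), size(n) {
    assert(base != nullptr);
    DEMO_LSAN_REGISTER_ROOT_REGION(base, size);
  }
  ~DemoRootRegion() {
    DEMO_LSAN_UNREGISTER_ROOT_REGION(base, size);
  }
  DemoRootRegion(const DemoRootRegion&) = delete;
  DemoRootRegion& operator=(const DemoRootRegion&) = delete;
};

// Called once when shutdown begins, while all regions are still alive.
// Returns true only for the first caller. Single-threaded sketch: real code
// would need proper synchronization to pick the winning thread.
inline bool demo_before_shutdown() {
  static bool checked = false;
  if (checked) {
    return false;
  }
  checked = true;
  DEMO_LSAN_DO_LEAK_CHECK();
  return true;
}
```

Running the check while everything is still reachable is what keeps memory that is deliberately left allocated at exit from being reported.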
------------- PR: https://git.openjdk.org/jdk/pull/12229 From luhenry at openjdk.org Wed Feb 8 16:03:43 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Wed, 8 Feb 2023 16:03:43 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> Message-ID: <4L69mem1grLwna22oqYGfJ_FRFECsYeYZZPP9kQNoSc=.ee060a06-1478-4126-9b55-c9222b06f0a3@github.com> On Wed, 8 Feb 2023 01:23:02 GMT, Fei Yang wrote: > It should be safe for them to do as this PR adds the necessary detection of availability of those ISA extensions. Ok. That also aligns us with how it's done on other platforms (SHA on x86 or AArch64 for example). I've a slight concern with the reliability of detecting these features based solely on the device tree + being able to force enable said-features in case we know for sure the extension is available but it's not communicated by the kernel. FYI, the kernel doesn't transfer verbatim the device-tree "isa" provided by the hardware, but it reconstruct it based on the extensions the kernel knows about. So if the hardware support a cutting edge extension (Zvknh and Zvkb for example), but the kernel is older and doesn't know about it, the hardware is going to provide a device-tree with `_Zvknh_Zvkb` to the kernel, but it's not going to show up in `/proc/cpuinfo`. (You can find the list of extensions the kernel currently knows about at [1], and the logic to build the `isa` string at [2]). > So I guess one possibility is that we might want some vendor-specific control like we do other CPU platforms I fully agree we will want some of these. Good thing there are the vendorid as part of the spec so we can easily detect which one is which. 
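Splitting an `isa` string of the kind shown in `/proc/cpuinfo` into extension names could look roughly like the sketch below. It makes simplifying assumptions: the string starts with a four-character `rv32`/`rv64` prefix, single-letter extensions follow directly, and multi-letter extensions are separated by underscores. `demo_parse_isa` and `demo_has_ext` are hypothetical helpers, not the code in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits e.g. "rv64imafdc_zba_zbb" into {"i","m","a","f","d","c","zba","zbb"}.
// Returns an empty list if the string does not start with "rv" plus two
// more characters (other prefixes such as "rv128" are not handled here).
inline std::vector<std::string> demo_parse_isa(const std::string& isa) {
  std::vector<std::string> exts;
  if (isa.size() < 4 || isa.compare(0, 2, "rv") != 0) {
    return exts;
  }
  std::size_t pos = 4;  // skip "rv32" / "rv64"

  // Single-letter extensions run up to the first underscore.
  while (pos < isa.size() && isa[pos] != '_') {
    exts.push_back(std::string(1, isa[pos]));
    ++pos;
  }

  // Multi-letter extensions are underscore-separated.
  while (pos < isa.size()) {
    ++pos;  // skip '_'
    std::size_t end = isa.find('_', pos);
    if (end == std::string::npos) {
      end = isa.size();
    }
    if (end > pos) {
      exts.push_back(isa.substr(pos, end - pos));
    }
    pos = end;
  }
  return exts;
}

// True if the named extension appears in the parsed list.
inline bool demo_has_ext(const std::vector<std::string>& exts, const std::string& name) {
  for (const std::string& e : exts) {
    if (e == name) {
      return true;
    }
  }
  return false;
}
```

As noted above, an extension the kernel does not know about will not appear in this string at all, so a parser like this can only confirm what the kernel reports. That is one argument for keeping a manual override.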
> But I hope some ISA extensions like bitmanip could be generally profitable across all venders so that the JVM could auto-enable those extension. But that needs more testing on real hardwares to prove. Agreed. As long as we have a way to manually enable them so that the user still has the ability to test out some features on their hardware. [1] https://github.com/torvalds/linux/blob/0983f6bf2bfc0789b51ddf7315f644ff4da50acb/arch/riscv/kernel/cpu.c#L164-L172 [2] https://github.com/torvalds/linux/blob/0983f6bf2bfc0789b51ddf7315f644ff4da50acb/arch/riscv/kernel/cpu.c#L174-L191 ------------- PR: https://git.openjdk.org/jdk/pull/12343 From iklam at openjdk.org Wed Feb 8 17:34:43 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 8 Feb 2023 17:34:43 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: <1J2_7DqspIFPl0uHtgAyiEYhWEDpefiuHsjEljl3Bzw=.fbbee099-ae9d-412c-8c28-6138628d8ba7@github.com> References: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> <9wofGO5FZW7h3OUJPoK2lDbPN3BvD6v9AAGjW0jfXCo=.7b78f322-191f-4996-8af5-09cd50884432@github.com> <1J2_7DqspIFPl0uHtgAyiEYhWEDpefiuHsjEljl3Bzw=.fbbee099-ae9d-412c-8c28-6138628d8ba7@github.com> Message-ID: <7oVwJglwZelFf3L3YvHJT5EiODuoctL9jaRVyOmk9bs=.26648be7-bd4a-441c-a06c-7b9ef0768ced@github.com> On Wed, 8 Feb 2023 14:31:03 GMT, Ashutosh Mehra wrote: >> When we dump with a large heap size (e.g.,16GB) but with with a small heap size (e.g., 256M), we are likely to need to relocate anyway, because the narrowOop encoding would be different (3-bit shift vs no shift). >> >> The design decision we made in the past was: the layout of the archived objects should be optimized for the heap size specified at dump time. Hence, the default CDS archive in the JDK is dumped with -Xmx128M so that it will be optimized for small workloads such as microservices. 
See https://github.com/openjdk/jdk/blob/2a579ab8392d30a35f044954178c788d16d4b800/make/Images.gmk#L117-L121 > > I am not referring to heap size, but the G1 region size. A user can create a base static archive with 128M heap and explicitly ask for larger region size than 1M and then try to use that image with a smaller region size. This will cause the archive heap to be relocated. > Although highly unlikely scenario, but it can easily be handled by changing the alignment to always use `MIN_GC_REGION_ALIGNMENT` instead of using the current region size. Since we have a small amount of archived objects, if we always use MIN_GC_REGION_ALIGNMENT to decide the requested location, it will always be 1MB below the top of the heap. As a result, we will always relocate when running with larger heaps (or larger G1HeapRegionSize): $ java -Xshare:dump -Xmx128m -XX:G1HeapRegionSize=1m $ java -Xmx128m -XX:G1HeapRegionSize=4m -Xlog:cds --version [.....] [0.002s][info][cds] CDS archive was created with max heap size = 128M, and the following configuration: [0.002s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0 [0.002s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0 [0.002s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000] [0.002s][info][cds] The current max heap size = 128M, HeapRegion::GrainBytes = 4194304 [0.002s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0 [0.002s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0 [0.002s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000] [0.002s][info][cds] Heap region ca0 = 0x00000000fff00000 - 0x00000000fff83000 = 536576 bytes [0.002s][info][cds] Heap region oa0 = 0x00000000ffe00000 - 0x00000000ffe79000 = 495616 bytes [0.002s][info][cds] CDS heap data relocation delta = 0 bytes [0.002s][info][cds] CDS heap data needs to be relocated lower by a further 3145728 bytes to 
-3145728 to be aligned with HeapRegion::GrainBytes Because the archive regions are pinned, we want to always put them at the absolute top end of the heap to avoid fragmentation. With this PR, I think we should keep the current behavior (which has not been changed by this PR) -- the requested location of the archive objects should be optimized for the heap parameters specified during dump time. However, after [JDK-8298048](https://bugs.openjdk.org/browse/JDK-8298048), we will use G1's "old" regions which aren't pinned. So we can afford to have a few empty regions above the archive regions. In that case, for G1, the requested base should be (heap_top - max_region_size), where max_region_size should be 8M. This will allow all heap sizes up to 16GB to map the archive regions without relocating the pointers. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From mseledtsov at openjdk.org Wed Feb 8 17:59:45 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 8 Feb 2023 17:59:45 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v6] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 1 Feb 2023 23:59:00 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Removed logging condition as per review feedback I will then make the logging conditional on the property mentioned above. Any test run that wants additional logging could add/enable this property in the launching script.
------------- PR: https://git.openjdk.org/jdk/pull/12239 From duke at openjdk.org Wed Feb 8 18:27:44 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 8 Feb 2023 18:27:44 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: <7oVwJglwZelFf3L3YvHJT5EiODuoctL9jaRVyOmk9bs=.26648be7-bd4a-441c-a06c-7b9ef0768ced@github.com> References: <94Jg1T0G_ZtP-EgPfOsCusWpm4xA6K4ms_fNKKa7wLM=.c2756e61-c490-466b-a9b4-abf215fbcb41@github.com> <9wofGO5FZW7h3OUJPoK2lDbPN3BvD6v9AAGjW0jfXCo=.7b78f322-191f-4996-8af5-09cd50884432@github.com> <1J2_7DqspIFPl0uHtgAyiEYhWEDpefiuHsjEljl3Bzw=.fbbee099-ae9d-412c-8c28-6138628d8ba7@github.com> <7oVwJglwZelFf3L3YvHJT5EiODuoctL9jaRVyOmk9bs=.26648be7-bd4a-441c-a06c-7b9ef0768ced@github.com> Message-ID: On Wed, 8 Feb 2023 17:31:59 GMT, Ioi Lam wrote: >> I am not referring to heap size, but the G1 region size. A user can create a base static archive with 128M heap and explicitly ask for larger region size than 1M and then try to use that image with a smaller region size. This will cause the archive heap to be relocated. >> Although highly unlikely scenario, but it can easily be handled by changing the alignment to always use `MIN_GC_REGION_ALIGNMENT` instead of using the current region size. > > Since we have a small amount of archived objects, if we always use MIN_GC_REGION_ALIGNMENT to decide the requested location, it will always be 1MB below the top of the heap. As a result, we will always relocate when running with larger heaps (or larger G1HeapRegionSize): > > > $ java -Xshare:dump -Xmx128m -XX:G1HeapRegionSize=1m > $ java -Xmx128m -XX:G1HeapRegionSize=4m -Xlog:cds --version > [.....] 
> [0.002s][info][cds] CDS archive was created with max heap size = 128M, and the following configuration: > [0.002s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0 > [0.002s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0 > [0.002s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000] > [0.002s][info][cds] The current max heap size = 128M, HeapRegion::GrainBytes = 4194304 > [0.002s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0 > [0.002s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0 > [0.002s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000] > [0.002s][info][cds] Heap region ca0 = 0x00000000fff00000 - 0x00000000fff83000 = 536576 bytes > [0.002s][info][cds] Heap region oa0 = 0x00000000ffe00000 - 0x00000000ffe79000 = 495616 bytes > [0.002s][info][cds] CDS heap data relocation delta = 0 bytes > [0.002s][info][cds] CDS heap data needs to be relocated lower by a further 3145728 bytes to -3145728 to be aligned with HeapRegion::GrainBytes > > > Because the archive regions are pinned, we want to always put them at the absolute top end of the heap to avoid fragmentation. > > With this PR, I think we should keep the current behavior (which has not been changed by this PR) -- the requested location of the archive objects should be optimized for the heap parameters specified during dump time. > > However, after [JDK-8298048](https://bugs.openjdk.org/browse/JDK-8298048), we will use G1's "old" regions which aren't pinned. So we can afford to have a few empty regions above the archive regions. > > In that case, for G1, the requested base should be (heap_top - max_region_size), where max_region_size should be 8M. This will allow all heap sizes up to 16GB to map the archive regions without relocating the pointers. 
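The 3145728-byte figure in the quoted log can be reproduced with plain alignment arithmetic. The following is only a sketch: it assumes the value comes from aligning the start of region ca0 (0x00000000fff00000 in the log) down to HeapRegion::GrainBytes, and the helper names `align_down`, `relocation_needed` and `requested_base` are hypothetical, not HotSpot functions. The last helper illustrates the proposed (heap_top - max_region_size) base, using the heap end from the log and the 8M value mentioned in the message.

```cpp
#include <cassert>
#include <cstdint>

// Round addr down to a multiple of alignment (alignment must be a power of two).
constexpr uint64_t align_down(uint64_t addr, uint64_t alignment) {
  return addr & ~(alignment - 1);
}

// Bytes by which addr has to move lower to become aligned.
constexpr uint64_t relocation_needed(uint64_t addr, uint64_t alignment) {
  return addr - align_down(addr, alignment);
}

// Sketch of the proposal in the message: place the archive one maximum-size
// region below the top of the heap.
constexpr uint64_t requested_base(uint64_t heap_top, uint64_t max_region_size) {
  return heap_top - max_region_size;
}

constexpr uint64_t M = 1024 * 1024;
constexpr uint64_t kCa0Start = 0x00000000fff00000ULL;  // "Heap region ca0" start from the log
constexpr uint64_t kHeapTop  = 0x0000000100000000ULL;  // end of "heap range" from the log

// Dumped with 1M regions: the requested address is already aligned.
static_assert(relocation_needed(kCa0Start, 1 * M) == 0, "no relocation with 1M regions");
// Run with 4M regions: matches the 3145728 bytes reported in the log.
static_assert(relocation_needed(kCa0Start, 4 * M) == 3145728, "relocation with 4M regions");

// With an 8M-below-top base, region sizes of 1M, 2M, 4M and 8M need no extra shift
// (this holds here because kHeapTop itself is 8M-aligned).
static_assert(relocation_needed(requested_base(kHeapTop, 8 * M), 1 * M) == 0, "1M");
static_assert(relocation_needed(requested_base(kHeapTop, 8 * M), 2 * M) == 0, "2M");
static_assert(relocation_needed(requested_base(kHeapTop, 8 * M), 4 * M) == 0, "4M");
static_assert(relocation_needed(requested_base(kHeapTop, 8 * M), 8 * M) == 0, "8M");
```

The sketch only covers region alignment; it says nothing about the narrowOop encoding differences that can also force relocation.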
Yeah, I realized that what I am suggesting wouldn't work, given that the two different archive regions are required to be in different G1 heap regions. Relocation cannot be avoided in that case. Hopefully things will be simpler to reason about after JDK-8298048. Thanks for explaining the current situation. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From rkennke at openjdk.org Wed Feb 8 19:45:41 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 8 Feb 2023 19:45:41 GMT Subject: RFR: 8302070: Factor null-check into load_klass() calls Message-ID: In several asm routines, a pattern like the following can be found: __ null_check(obj, oopDesc::klass_offset_in_bytes()); __ load_klass(klass, obj, tmp); The null-check logically belongs in load_klass(), though. This becomes apparent in Lilliput, where the klass-offset is different, depending on whether or not Lilliput is enabled. Testing: - [x] tier1 x86_64 - [x] tier1 aarch64 ------------- Commit messages: - ARM fix - S390 fixes - PPC fixes - ARM fixes - S390 parts - RISCV parts - PPC parts - ARM parts - AArch64 - 8302070: Factor null-check into load_klass() calls Changes: https://git.openjdk.org/jdk/pull/12473/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12473&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302070 Stats: 89 lines in 28 files changed: 41 ins; 25 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/12473.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12473/head:pull/12473 PR: https://git.openjdk.org/jdk/pull/12473 From kvn at openjdk.org Wed Feb 8 20:03:44 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 8 Feb 2023 20:03:44 GMT Subject: RFR: 8301819: Enable continuations code by default In-Reply-To: References: Message-ID: <4jO5IBlzZ78eO75ecxQuqWQpdW2QDmCS3cESaoA9aCU=.6fdd7cf0-7d75-4dbb-83a0-d41a2c1158e2@github.com> On Tue, 7 Feb 2023 18:37:03 GMT, Alan Bateman wrote: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is
only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 @AlanBateman Do you have performance data for our regular benchmarks? ------------- PR: https://git.openjdk.org/jdk/pull/12459 From kvn at openjdk.org Wed Feb 8 20:34:42 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 8 Feb 2023 20:34:42 GMT Subject: RFR: 8301819: Enable continuations code by default In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 18:37:03 GMT, Alan Bateman wrote: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 Alan showed me performance data with nothing alarming (at least from benchmarks we use). I agree with these changes. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12459 From dcubed at openjdk.org Wed Feb 8 20:46:44 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 8 Feb 2023 20:46:44 GMT Subject: RFR: 8301819: Enable continuations code by default In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 18:37:03 GMT, Alan Bateman wrote: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 @AlanBateman - Thanks for documenting that Tier[1-6] testing was done on this change. ------------- PR: https://git.openjdk.org/jdk/pull/12459 From pchilanomate at openjdk.org Wed Feb 8 22:39:51 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 8 Feb 2023 22:39:51 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v5] In-Reply-To: References: Message-ID: On Sat, 24 Dec 2022 04:14:07 GMT, Serguei Spitsyn wrote: >> Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. 
>> New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. >> >> Testing: >> mach5 jobs are TBD (preliminary testing was completed): >> - all JVMTI, JDWP, JDI and JDB tests have to be run >> - Kitchensink >> - tier5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix race between VTMS_transition_disable_for_one and start_VTMS_transition Hi Serguei, I looked at the latest changes and it all looks good to me. Thanks! Patricio src/hotspot/share/prims/jvmtiThreadState.cpp line 376: > 374: Atomic::dec(&_VTMS_transition_disable_for_one_count); > 375: if (_VTMS_transition_disable_for_one_count == 0 || _is_SR) { > 376: ml.notify_all(); I was going to say that we don't really need the global _VTMS_transition_disable_for_one_count since we can notify here if java_lang_Thread::VTMS_transition_disable_count(vth()) is zero, which is what the target checks. But I realized everybody uses the same lock so we would be notifying everybody. Using _VTMS_transition_disable_for_one_count as it is at least we notify everybody at once when there are no more single disablers. Not sure if that was the purpose, otherwise I think we could remove it. src/hotspot/share/prims/jvmtiThreadState.cpp line 479: > 477: > 478: // Unblock waiting VTMS transition disablers. > 479: if (_VTMS_transition_disable_for_one_count > 0 || In here it would actually be the other way. If we would check java_lang_Thread::VTMS_transition_disable_count(vth()) > 0 instead of the global _VTMS_transition_disable_for_one_count we would avoid the notify in case there are singler disablers waiting but not for this thread. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/11690 From mseledtsov at openjdk.org Wed Feb 8 23:18:20 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 8 Feb 2023 23:18:20 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: > Add log() and logToFile() methods to VMProps for diagnostic purposes. Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Re-added conditional logging per review discussion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/1eef0f6f..b85636da Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=05-06 Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Thu Feb 9 02:45:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Feb 2023 02:45:50 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: <0-kJqmwCN63s-7ekTyfK_xVjV5-N-j2AWZ6OH_wvQwc=.5608a423-92eb-4703-9a00-efdc83393de6@github.com> On Wed, 8 Feb 2023 23:18:20 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Re-added conditional logging per review discussion Looks okay to me. 
Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12239 From dholmes at openjdk.org Thu Feb 9 03:32:47 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Feb 2023 03:32:47 GMT Subject: RFR: 8301819: Enable continuations code by default In-Reply-To: References: Message-ID: <1FPt6K7a7E4fgcVMi6HnNcYrzAXs5t38-zyKVkiGxtE=.931145af-0143-4542-8db8-6866237369c1@github.com> On Tue, 7 Feb 2023 18:37:03 GMT, Alan Bateman wrote: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 @AlanBateman I agree with enabling this and giving it time to bake. Vladimir has already checked there are no red flags on the performance front, so that is fine. If anything does arise we have, as you note, time to address it. Copyright year needs updating in continuation.cpp. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12459 From dholmes at openjdk.org Thu Feb 9 03:36:43 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Feb 2023 03:36:43 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: References: Message-ID: <8R1JPibjyJ6BWOxhPcFBytCHSEkHbf3E-K4ar3MAkhg=.e2ea20d8-5c5e-49ec-b1e1-0816a9cb54cc@github.com> On Tue, 7 Feb 2023 15:41:28 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change to liveness verification that fixes some unwanted asserts because >> >> - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case >> - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions >> - in the innermost if lots of code is duplicated in both cases >> >> The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). >> >> Testing: local compilation/testing, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Missing changes LGTM. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12456 From dholmes at openjdk.org Thu Feb 9 03:36:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 9 Feb 2023 03:36:45 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: References: <7nMjwERfYZcjO3Q1_o5g4Jb5VhHVc0Y4xCynPI9S8uU=.c49cd6de-5486-46a3-9a7f-73631cbe38b6@github.com> Message-ID: <3r80q1tmqKdti0sBjLA7KY47NEFEk-saboD7WQ1qSiY=.578de102-363a-40a9-b7b0-0aa138ea90bc@github.com> On Wed, 8 Feb 2023 10:24:00 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1CollectedHeap.inline.hpp line 217: >> >>> 215: >>> 216: inline bool G1CollectedHeap::is_obj_filler(const oop obj) { >>> 217: Klass* k = obj->klass_raw(); >> >> Not clear how you can get here from ` HeapRegion::is_obj_dead` with a bad oop, such that you need the raw variant. ?? > > The object is in the heap, but the occupying memory has already been zapped (in debug mode); i.e. the call in `heapRegion.cpp:518` could read `badHeapWordVal` as (compressed) klass value in the header. > > In that case the current code asserts in this call because in `oopDesc::klass()`, the call to `CompressedKlassPointers::decode_not_null` will assert in `compressedOops.inline.hpp:135` due to the `check_alignment` condition not satisfied. > > This makes this verification code assert before printing out any useful information to diagnose the problem quickly (in my case this has been a change that wrongly managed remembered sets). > > If I had had the remembered set verification printout, I would have found the problem immediately in this case (because the message would have told me that there is a problem with remembered sets). So it took a while to diagnose the issue, having to go into the debugger to painfully find the exact same information. > > I.e. this makes the verification code more robust. 
> > Imo the suggested solution to just continue execution is fine, because `is_obj_filler` will always return false (i.e. object is dead) for garbage objects and do the right thing here. > There is the concern that other non-verification code might not immediately trigger now, but most of it just fails the VM anyway if it finds a bad reference (after printing some information about it); for all other cases this is the right choice. Thanks for the explanation. ------------- PR: https://git.openjdk.org/jdk/pull/12456 From jbhateja at openjdk.org Thu Feb 9 05:16:46 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 9 Feb 2023 05:16:46 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 7 Feb 2023 14:27:49 GMT, Scott Gibbons wrote: > @sviswa7 Can you please share the test(s) you used to determine the stated failures? I've run all of the tier1-3 tests in the suite with no failures. Plus, if what you say is true, then the same logic would apply for the non-URL case as well as it applies to 0x2F ('/') in the same manner as 0x5F ('_'). > > If your test(s) do indeed show the failure, we should probably add them to the test suite. Thanks. The stub processes the input up to the last legal byte and falls back to the Java implementation for error handling, so the URL decoder will not find any anomaly on encountering the characters ( _ or - ) in the input sequence; the behavior will still be functionally correct. The problem will only show up in the JMH micro as a performance degradation. Our micro handles the MIME and Base64 decoders; can you kindly extend it to the URL decoder with special characters in the input sequence?
------------- PR: https://git.openjdk.org/jdk/pull/12126 From fyang at openjdk.org Thu Feb 9 05:33:44 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 9 Feb 2023 05:33:44 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: <4L69mem1grLwna22oqYGfJ_FRFECsYeYZZPP9kQNoSc=.ee060a06-1478-4126-9b55-c9222b06f0a3@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> <4L69mem1grLwna22oqYGfJ_FRFECsYeYZZPP9kQNoSc=.ee060a06-1478-4126-9b55-c9222b06f0a3@github.com> Message-ID: On Wed, 8 Feb 2023 16:01:08 GMT, Ludovic Henry wrote: > > It should be safe for them to do as this PR adds the necessary detection of availability of those ISA extensions. > > Ok. That also aligns RISC-V with how it's done on other platforms (SHA on x86 or AArch64 for example). > > I've a slight concern with the reliability of detecting these features based solely on the device tree + being able to force enable said-features in case we know for sure the extension is available but it's not communicated by the kernel. > > FYI, the kernel doesn't transfer verbatim the device-tree "isa" provided by the hardware, but it reconstruct it based on the extensions the kernel knows about. So if the hardware support a cutting edge extension (Zvknh and Zvkb for example), but the kernel is older and doesn't know about it, the hardware is going to provide a device-tree with `_Zvknh_Zvkb` to the kernel, but it's not going to show up in `/proc/cpuinfo`. (You can find the list of extensions the kernel currently knows about at [1], and the logic to build the `isa` string at [2]). That make sense to me. 
My initial thought was to have some sort of checking like this PR does when users try to enable use of ISA extensions on the command line. It will be better than simply crashing the JVM when their hardware is not equipped with those features. But as you mentioned, this might also bear the risk of losing the ability for them to do so. In the hope of having some stable user interface for probing riscv hardware features like [1], I guess we might need to stay with the current approach and put this PR off for a while and see. @feilongjiang : Are you OK with this? [1] https://lkml.org/lkml/2023/2/6/1147 ------------- PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Thu Feb 9 06:09:44 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Thu, 9 Feb 2023 06:09:44 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> <4L69mem1grLwna22oqYGfJ_FRFECsYeYZZPP9kQNoSc=.ee060a06-1478-4126-9b55-c9222b06f0a3@github.com> Message-ID: On Thu, 9 Feb 2023 05:30:23 GMT, Fei Yang wrote: > I guess we might need to stay with the current approach and put this PR off for a while and see Agree. Looks like it's still too early to do that. I'll convert this PR to draft then. ------------- PR: https://git.openjdk.org/jdk/pull/12343 From fjiang at openjdk.org Thu Feb 9 06:26:26 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Thu, 9 Feb 2023 06:26:26 GMT Subject: RFR: 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 Message-ID: This problem only triggers when using debug build to run foreign jtreg tests.
[JDK-8301818](https://bugs.openjdk.org/browse/JDK-8301818) made following code changes in function generate_generic_copy [1]:

     // At this point, it is known to be a typeArray (array_tag 0x3).
 #ifdef ASSERT
     {
       BLOCK_COMMENT("assert primitive array {");
       Label L;
-      __ mvw(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift);
+      __ mv(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift);
       __ bge(lh, t1, L);
       __ stop("must be a primitive array");
       __ bind(L);
       BLOCK_COMMENT("} assert primitive array done");
     }
 #endif

Notably, `Klass::_lh_array_tag_type_value` is of type `unsigned int`:

    static const unsigned int _lh_array_tag_type_value = 0Xffffffff;
    static const int _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits;
    static const int _lh_array_tag_bits = 2;

So `Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift` would be `0xc0000000` of type `unsigned int`.

Previously, `mvw` function would sign-extend this value into 64-bit `0xffffffffc0000000` since we want to do 64-bit signed compare and branch with `bge` instruction next.

    template<typename T, ENABLE_IF(std::is_integral<T>::value)>
    inline void mv(Register Rd, T o) { li(Rd, (int64_t)o); }

    inline void mvw(Register Rd, int32_t imm32) { mv(Rd, imm32); }

But this is not the case when changed to use `mv` with type `unsigned int`. So I think this changes the behaviour.
A simple fix would be adding an explicit int32_t type conversion here for this value, like:

    __ mv(t1, (int32_t)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift));

P.S.: We also checked other places where moving an immediate into registers and some type conversions were added/removed.
[1] https://github.com/openjdk/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae Testing: - [x] jdk_foreign all passed on Unmatched board (fastdebug build) - [ ] tier1-3 on Unmatched board (release build) ------------- Commit messages: - fix move with type conversion Changes: https://git.openjdk.org/jdk/pull/12481/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12481&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302114 Stats: 11 lines in 4 files changed: 0 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/12481.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12481/head:pull/12481 PR: https://git.openjdk.org/jdk/pull/12481 From fyang at openjdk.org Thu Feb 9 06:33:42 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 9 Feb 2023 06:33:42 GMT Subject: RFR: 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 06:14:18 GMT, Feilong Jiang wrote: > This problem only triggers when using debug build to run foreign jtreg tests. > [JDK-8301818](https://bugs.openjdk.org/browse/JDK-8301818) made following code changes in function generate_generic_copy [1]: > > > // At this point, it is known to be a typeArray (array_tag 0x3). 
>  #ifdef ASSERT
>      {
>        BLOCK_COMMENT("assert primitive array {");
>        Label L;
> -      __ mvw(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift);
> +      __ mv(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift);
>        __ bge(lh, t1, L);
>        __ stop("must be a primitive array");
>        __ bind(L);
>        BLOCK_COMMENT("} assert primitive array done");
>      }
>  #endif
>
> Notably, `Klass::_lh_array_tag_type_value` is of type `unsigned int`:
>
>     static const unsigned int _lh_array_tag_type_value = 0Xffffffff;
>     static const int _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits;
>     static const int _lh_array_tag_bits = 2;
>
> So `Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift` would be `0xc0000000` of type `unsigned int`.
>
> Previously, `mvw` function would sign-extend this value into 64-bit `0xffffffffc0000000` since we want to do 64-bit signed compare and branch with `bge` instruction next.
>
>     template<typename T, ENABLE_IF(std::is_integral<T>::value)>
>     inline void mv(Register Rd, T o) { li(Rd, (int64_t)o); }
>
>     inline void mvw(Register Rd, int32_t imm32) { mv(Rd, imm32); }
>
> But this is not the case when changed to use `mv` with type `unsigned int`. So I think this changes the behaviour.
> A simple fix would be adding an explicit `int32_t` type conversion here for this value, like:
>
>     __ mv(t1, (int32_t)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift));
>
> P.S.: We also checked other places where moving an immediate into registers and some type conversions were added/removed.
>
> [1] https://github.com/openjdk/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae
>
> Testing:
> - [x] jdk_foreign all passed on Unmatched board (fastdebug build)
> - [ ] tier1-3 on Unmatched board (release build)

Marked as reviewed by fyang (Reviewer).
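The widening difference described in the quoted message can be shown in isolation. This is a minimal sketch, not HotSpot code: `widen_like_mv`, `widen_like_mvw` and `bge_taken` are hypothetical stand-ins for the assembler routines, `BitsPerInt` is assumed to be 32, and `kExampleLh` is a made-up layout helper value with array tag 0x3. The conversion from `unsigned int` to `int32_t` is implementation-defined before C++20, but it wraps on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

// Constants mirrored from the message above; BitsPerInt is assumed to be 32.
constexpr unsigned int kArrayTagTypeValue = 0xffffffffu;
constexpr int kArrayTagBits = 2;
constexpr int kArrayTagShift = 32 - kArrayTagBits;
constexpr unsigned int kTagImmediate = kArrayTagTypeValue << kArrayTagShift;  // 0xc0000000

// Hypothetical model of mv(): widen the immediate from whatever type it already has.
template <typename T>
constexpr int64_t widen_like_mv(T imm) { return static_cast<int64_t>(imm); }

// Hypothetical model of mvw(): the int32_t parameter forces a 32-bit signed view first.
constexpr int64_t widen_like_mvw(int32_t imm32) { return widen_like_mv(imm32); }

// Hypothetical model of "bge lh, t1": a signed 64-bit >= comparison.
constexpr bool bge_taken(int64_t lh, int64_t t1) { return lh >= t1; }

// An unsigned 32-bit value is zero-extended to 64 bits.
static_assert(widen_like_mv(kTagImmediate) == 0x00000000c0000000LL, "zero-extended");
// Going through int32_t sign-extends it instead (bit pattern 0xffffffffc0000000).
static_assert(widen_like_mvw(static_cast<int32_t>(kTagImmediate)) == -0x40000000LL,
              "sign-extended");

// Made-up primitive-array layout helper (array tag 0x3), sign-extended like a 32-bit load.
constexpr int64_t kExampleLh = static_cast<int32_t>(0xc0100a02u);

// With the sign-extended immediate the assertion branch is taken ...
static_assert(bge_taken(kExampleLh, widen_like_mvw(static_cast<int32_t>(kTagImmediate))),
              "branch taken, no stop()");
// ... with the zero-extended one it is not, so the debug build would hit stop().
static_assert(!bge_taken(kExampleLh, widen_like_mv(kTagImmediate)),
              "branch not taken");
```

The explicit `(int32_t)` cast proposed in the message corresponds to calling `widen_like_mv` with an `int32_t` argument, which restores the sign-extended result.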
------------- PR: https://git.openjdk.org/jdk/pull/12481 From stuefe at openjdk.org Thu Feb 9 06:35:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 9 Feb 2023 06:35:57 GMT Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6] In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 15:39:17 GMT, Justin King wrote: > > I read through your explanation, and through the [design docs](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument), but my question remains unanswered. See below. > > > > Metaspace objects hold pointers to malloced memory. Metaspace Objects can leak, that's accepted. So these pointers would leak too. Would that not give us tons of false positives? > > > > > > > > > No. LSan only cares that it can find pointers to malloc'ed memory, not that it can find pointers to Metaspace objects. The Metaspace is registered as a root region. It does however take into account ASan poisoning, which we added earlier to Metaspace. LSan will not scan poisoned memory regions that are part of root regions. We only poison in Metaspace when deallocation occurs and unpoison when allocation occurs. > > > > > > But they are not poisened. > > Upon JVM end, the heap still exists, holding class loaders, which hold references to CLDs. Those hold metaspace arenas (living in Metaspace), which hold Metaspace chunks, which are registered LSAN root areas, which hold C-allocated sub structures. These chunks are not poisened since they have never been returned to the chunk pool. They are still alive. > > To put it in other words: even I as a human cannot determine that this is not a C-heap leak. Technically, it is. But it is accepted since we accept these things leaking. Since we don't support unloading and re-loading the JVM, we accept that to have a faster shutdown. > > Let's ignore ASan/poisoning for now, its an optimization. I shouldn't have brought it up. 
> > LSan leak checking, as implemented for Hotspot, happens before the JVM has actually shutdown. It happens immediately after one thread wins the right to begin shutting down and other threads are either waiting or doing other work. Everything is still alive, including CollectedHeap, Metaspace, and CodeHeap. We ignore leaks as a result of shutdown this way, because as you mentioned we are not that concerned. Outside of Hotspot, LSan leak checking is typically done at program exit. If you do that with Hotspot you will definitely get false positives. > > Note: If you want, feel free to run a build with `--enable-asan` and `--enable-lsan` and run some of the test tiers. Some tests will fail due to some existing incompatibilities with ASan (working on fixing those separately), but some will fail with real leaks. I need to report these, just have not gotten around to it yet. Plan was to do so after this integrated. Most of the remaining ones are minor. > > The only time false positives can happen is some of the tests which do some interesting things and use a custom launcher, these don't inherit our options and will attempt to leak check at exit. I am slowly going through and correcting these. They should be rare at this point. > > On a side note, once this discussion is complete and in favor of this change, I should probably add some documentation related to sanitizers in Hotspot including ASan, LSan, and UBSan. Thanks for the explanation, @jcking . I'm out of time for this one, but since this PR is already integrated, you don't need my immediate feedback. I'll play around with LSAN when I have time. Finding leaked C-heap substructures from dead metaspace objects should certainly be useful. 

Cheers, Thomas

-------------

PR: https://git.openjdk.org/jdk/pull/12229

From ysuenaga at openjdk.org Thu Feb 9 06:52:52 2023
From: ysuenaga at openjdk.org (Yasumasa Suenaga)
Date: Thu, 9 Feb 2023 06:52:52 GMT
Subject: Integrated: 8301853: C4819 warnings were reported in HotSpot on Windows
In-Reply-To:
References:
Message-ID: <0wdaI3hpD5NddCgB5SNu02w_0BZUo0S1J-QShBl2YGk=.1ffed11d-69e7-4aee-bfb6-3255786b88ba@github.com>

On Mon, 6 Feb 2023 12:33:44 GMT, Yasumasa Suenaga wrote:

> This is subtask of #12427 .
>
> I have seen C4819 warning in stubGenerator_x86_64_poly.cpp and elfFile.hpp in HotSpot on Windows (CP932: Japanese locale)
>
>
> d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): error C2220: the following warning is treated as an error
> d:\github-forked\jdk\src\hotspot\cpu\x86\stubGenerator_x86_64_poly.cpp(1): warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss

This pull request has now been integrated.

Changeset: c72f9515
Author: Yasumasa Suenaga
URL: https://git.openjdk.org/jdk/commit/c72f9515299b0c59bd1a5e1987982812d79e9ace
Stats: 54 lines in 2 files changed: 0 ins; 0 del; 54 mod

8301853: C4819 warnings were reported in HotSpot on Windows

Reviewed-by: dholmes, ihse

-------------

PR: https://git.openjdk.org/jdk/pull/12435

From dholmes at openjdk.org Thu Feb 9 07:23:58 2023
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 9 Feb 2023 07:23:58 GMT
Subject: RFR: JDK-8298445: Add LeakSanitizer support in HotSpot [v6]
In-Reply-To:
References:
Message-ID:

On Wed, 8 Feb 2023 13:37:04 GMT, Magnus Ihse Bursie wrote:

>> Justin King has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Revert changes to JDK
>>
>> Signed-off-by: Justin King
>
> I apologize, the fault lies entirely with me. Justin should have no blame in this -- he is not a committer and is not expected to fully know all rules for integration. That responsibility lies with the sponsor, in this case, me.
> > I read @dholmes-ora's comment: >> Updates look good - glad to see the flag changes go away! > >> I suggest factoring out the change to test/jdk/jni/nullCaller/exeNullCallerTest.cpp as it is a JDK test and not part of hotspot. Thanks. > > as an approval from Hotspot, given that the JDK test was removed, which a later commit did indeed remove. And, like Justin, I interpreted Thomas comments as questions that were now answered, rather than an ongoing discussion about the > > I did not realize there were still an ongoing discussion, and was too eager to sponsor this PR. > > @tstuefe @dholmes-ora Do you want me to revert this change? Or should we continue the discussion in a follow-up bug that can address the remaining problems? @magicus no need to backout. Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/12229 From iwalulya at openjdk.org Thu Feb 9 09:13:49 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 9 Feb 2023 09:13:49 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v3] In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 10:19:40 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? >> >> The change is fairly straightforward except maybe for the following: >> * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. >> * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). 
>> >> Testing: gha, local perf testing, tier1 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Improve claimer Lgtm! ------------- Marked as reviewed by iwalulya (Reviewer). PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Thu Feb 9 09:20:02 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 09:20:02 GMT Subject: RFR: 8301116: Parallelize TLAB resizing in G1 [v3] In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 10:19:40 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? >> >> The change is fairly straightforward except maybe for the following: >> * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. >> * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). >> >> Testing: gha, local perf testing, tier1 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Improve claimer I'm pushing this because I consider @dholmes-ora and @kimbarrett 's question resolved - the code in question has been reverted after all. Thanks everyone for your reviews. 
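
The claimer idea from the PR description quoted above (handing out `JavaThreads` in batches so that the per-thread work stays cheap relative to the claiming cost) can be pictured with a small C++ sketch. This is only an illustration, not the code from the PR: `ChunkClaimer`, `drain`, and the chunk size are hypothetical names and values, and the sketch assumes the work items can be addressed by index.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not HotSpot code): hands out disjoint [begin, end)
// index ranges, so a worker pays one atomic operation per chunk rather
// than one per item.
class ChunkClaimer {
  std::atomic<std::size_t> _next{0};
  const std::size_t _total;
  const std::size_t _chunk;

public:
  ChunkClaimer(std::size_t total, std::size_t chunk)
      : _total(total), _chunk(chunk) {
    assert(chunk > 0);
  }

  // Claims the next unclaimed range; returns false once everything is taken.
  // Intended to be callable from several workers at once.
  bool claim(std::size_t& begin, std::size_t& end) {
    std::size_t start = _next.fetch_add(_chunk, std::memory_order_relaxed);
    if (start >= _total) {
      return false;
    }
    begin = start;
    end = std::min(start + _chunk, _total);
    return true;
  }
};

// The loop a single worker would run: keep claiming until nothing is left.
// Here the claimed ranges are only recorded; a real worker would process
// items begin..end-1 inside the loop.
std::vector<std::pair<std::size_t, std::size_t>> drain(ChunkClaimer& claimer) {
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  std::size_t begin = 0;
  std::size_t end = 0;
  while (claimer.claim(begin, end)) {
    ranges.emplace_back(begin, end);
  }
  return ranges;
}
```

With 10 items and a chunk size of 4, a single worker would see the ranges [0, 4), [4, 8) and [8, 10). A larger chunk size lowers contention on the shared counter at the cost of coarser load balancing, which is presumably the trade-off behind the performance remark in the description.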
------------- PR: https://git.openjdk.org/jdk/pull/12360 From tschatzl at openjdk.org Thu Feb 9 09:20:04 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 09:20:04 GMT Subject: Integrated: 8301116: Parallelize TLAB resizing in G1 In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 10:34:36 GMT, Thomas Schatzl wrote: > Hi all, > > can I get reviews for this change that moves TLAB resizing into the second parallel post evacuation phase? > > The change is fairly straightforward except maybe for the following: > * there is a new claimer for `JavaThreads` because the `Threads::possibly_parallel_threads_do` method to parallelize does not work well with very little per-thread work. Actually with a moderate amount of threads (~20) performance would already be many times slower than doing things serially. > * moving the TLAB resizing out of `G1CollectedHeap::prepare_tlabs_for_mutator` made that method and the enclosing timing a bit obsolete, so I repurposed it as a "do preparatory work for the mutator during young gc" phase. That resulted in minor cleanup and regularization of what is done during that phase (wrt to what the corresponding method for full gc does). > > Testing: gha, local perf testing, tier1 > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 83e2db6b Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/83e2db6ba32e5004d2863e77c9eee91d1b65bd22 Stats: 129 lines in 11 files changed: 94 ins; 20 del; 15 mod 8301116: Parallelize TLAB resizing in G1 Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/12360 From luhenry at openjdk.org Thu Feb 9 09:56:45 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 9 Feb 2023 09:56:45 GMT Subject: RFR: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo [v2] In-Reply-To: References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> <_9crD-_k0hyO59IGouVx1vUJDJAmNUb1eLFa5EIqvxI=.e230cd38-5cb4-451f-817d-768aa512615d@github.com> <5Vfja25tJ0cURNIlqACQ4oTbGKgYMaLuIpnI_h3boo8=.062169e0-7beb-4d5c-9cad-f8380e045a5e@github.com> <4L69mem1grLwna22oqYGfJ_FRFECsYeYZZPP9kQNoSc=.ee060a06-1478-4126-9b55-c9222b06f0a3@github.com> Message-ID: On Thu, 9 Feb 2023 06:06:22 GMT, Feilong Jiang wrote: >>> > It should be safe for them to do as this PR adds the necessary detection of availability of those ISA extensions. >>> >>> Ok. That also aligns RISC-V with how it's done on other platforms (SHA on x86 or AArch64 for example). >>> >>> I've a slight concern with the reliability of detecting these features based solely on the device tree + being able to force enable said-features in case we know for sure the extension is available but it's not communicated by the kernel. >>> >>> FYI, the kernel doesn't transfer verbatim the device-tree "isa" provided by the hardware, but it reconstruct it based on the extensions the kernel knows about. So if the hardware support a cutting edge extension (Zvknh and Zvkb for example), but the kernel is older and doesn't know about it, the hardware is going to provide a device-tree with `_Zvknh_Zvkb` to the kernel, but it's not going to show up in `/proc/cpuinfo`. 
(You can find the list of extensions the kernel currently knows about at [1], and the logic to build the `isa` string at [2]). >> >> That make sense to me. My initial thought was to have some sort of checking like this PR does when users try to enable use of isa extensions on the command line. It will be better than simply crashing the JVM when their hardware are not equipped with those features. But as you mentioned, this might also bear the risk of losing the ability for them to do so. In the hope of having some stable user interface for probing riscv hardware features like [1], I guess we might need to stay with the current approach and put off this PR for some while to see. @feilongjiang : Are you OK with this? >> >> [1] https://lkml.org/lkml/2023/2/6/1147 > >> I guess we might need to stay with the current approach and put off this PR for some while to see > > Agreed. Looks like it's still too early to do that. I'll convert this PR to draft then. An alternative approach that doesn't rely on a kernel API is to generate some stub that probes for some instructions, and trigger a SIGILL in case the instruction isn't supported. That SIGILL can be caught by the runtime (very similar to how we do it with SIGSEGV and SafeFetch for example) and we know based on the instruction executed that it isn't available. That would work as long as there is no conflicts in the encoding of instructions, and no bug/performance penalty in the actual implementation of the instructions (but that's where vendor-specific settings is helpful). ------------- PR: https://git.openjdk.org/jdk/pull/12343 From ayang at openjdk.org Thu Feb 9 10:51:31 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 9 Feb 2023 10:51:31 GMT Subject: RFR: 8225409: Investigate removal of the Hot Card Cache [v2] In-Reply-To: References: Message-ID: > Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). 
> > Test: tier1-6 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - merge - g1-hcc ------------- Changes: https://git.openjdk.org/jdk/pull/12452/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12452&range=01 Stats: 1259 lines in 34 files changed: 50 ins; 1176 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/12452.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12452/head:pull/12452 PR: https://git.openjdk.org/jdk/pull/12452 From ayang at openjdk.org Thu Feb 9 11:03:15 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 9 Feb 2023 11:03:15 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post Message-ID: Trivial removing dead code. ------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/12489/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12489&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302127 Stats: 8 lines in 6 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12489.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12489/head:pull/12489 PR: https://git.openjdk.org/jdk/pull/12489 From rkennke at openjdk.org Thu Feb 9 12:48:02 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 9 Feb 2023 12:48:02 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v8] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. 
And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is grown when needed. This means that we need to check for potential overflow before attempting locking. When that is the case, locking fast-paths would call into the runtime to grow the stack and handle the locking.
Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots) of such checks at locking sites. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change enables to simplify (and speed-up!) 
a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Renaissance
>
> |   | x86_64 |   |   |   | aarch64 |   |   |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> |   | stack-locking | fast-locking |   |   | stack-locking | fast-locking |   |
> | AkkaUct | 841.884 | 836.948 | 0.59% |   | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11444.511 | 11606.66 | -1.40% |   | 11382.594 | 11638.036 | -2.19% |
> | Als | 1367.183 | 1359.358 | 0.58% |   | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% |   | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% |   | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% |   | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% |   | 9898.1 | 10097.867 | -1.98% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% |   | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% |   | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% |   | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% |   | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2527.228 | 2564.667 | -1.46% |   | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% |   | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% |   | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% |   | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% |   | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% |   | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% |   | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% |   | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% |   | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% |   | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Fix anon owner in fast-path, avoid runtime call (x86_64)

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/bbe8c186..74f3974d

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=07
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=06-07

Stats: 34 lines in 3 files changed: 34 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/10907.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rkennke at openjdk.org Thu Feb 9 13:04:11 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 9 Feb 2023 13:04:11 GMT
Subject: RFR: 8291555: Implement alternative fast-locking scheme [v9]
In-Reply-To:
References:
Message-ID:

> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation.
It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity).
>
> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock.
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is grown when needed.
This means that we need to check for potential overflow before attempting locking. When that is the case, locking fast-paths would call into the runtime to grow the stack and handle the locking. Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots) of such checks at locking sites. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. 
IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change enables to simplify (and speed-up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Renaissance
>
> |   | x86_64 |   |   |   | aarch64 |   |   |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> |   | stack-locking | fast-locking |   |   | stack-locking | fast-locking |   |
> | AkkaUct | 841.884 | 836.948 | 0.59% |   | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11444.511 | 11606.66 | -1.40% |   | 11382.594 | 11638.036 | -2.19% |
> | Als | 1367.183 | 1359.358 | 0.58% |   | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% |   | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% |   | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% |   | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% |   | 9898.1 | 10097.867 | -1.98% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% |   | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% |   | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% |   | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% |   | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2527.228 | 2564.667 | -1.46% |   | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% |   | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% |   | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% |   | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% |   | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% |   | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% |   | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% |   | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% |   | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% |   | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Fix anon owner in fast-path, avoid runtime call (aarch64)

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10907/files
  - new: https://git.openjdk.org/jdk/pull/10907/files/74f3974d..38dae002

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=08
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=07-08

Stats: 29 lines in 3 files changed: 27 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/10907.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From aboldtch at openjdk.org Thu Feb 9 13:13:46 2023
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Thu, 9 Feb 2023 13:13:46 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID:

On Wed, 8 Feb 2023 10:19:20 GMT, Erik Österlund wrote:

>> In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   RISC-V fixes

lgtm.

src/hotspot/cpu/aarch64/gc/z/zBarrierSetAssembler_aarch64.cpp line 455:

> 453: void ZBarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) {
> 454: // Check if mask is good.
> 455: // verifies that ZAddressBadMask & r0 == 0

Update comment `s/r0/obj`

-------------

Marked as reviewed by aboldtch (Committer).

PR: https://git.openjdk.org/jdk/pull/12443

From eosterlund at openjdk.org Thu Feb 9 15:13:23 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Thu, 9 Feb 2023 15:13:23 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To:
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID:

On Thu, 9 Feb 2023 13:10:40 GMT, Axel Boldt-Christmas wrote:

>> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   RISC-V fixes
>
> lgtm.

Thanks for the review @xmas92! Will fix the comment.

-------------

PR: https://git.openjdk.org/jdk/pull/12443

From duke at openjdk.org Thu Feb 9 15:30:51 2023
From: duke at openjdk.org (duke)
Date: Thu, 9 Feb 2023 15:30:51 GMT
Subject: Withdrawn: JDK-8294902: Undefined Behavior in C2 regalloc with null references
In-Reply-To:
References:
Message-ID:

On Mon, 31 Oct 2022 14:40:32 GMT, Andrew Haley wrote:

> This patch fixes the remaining null pointer dereference bugs that I know of.
>
> For the main bug, C2 was using a null reference to indicate an uninitialized `Node_List`. I replaced the null reference with a static sentinel.
>
> I also turned on `-fsanitize=null` and found and fixed a bunch of other null pointer dereferences. With this, I have run a full bootstrap and tier1 tests with `-fsanitize=null` enabled.
>
> I have checked that the code generated by GCC is not worse in any significant way, so I don't expect to see any performance regressions.
>
> I'd like to enable `-fsanitize=null` in debug builds to prevent regressions in this area. What do you think?

This pull request has been closed without being integrated.
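
The sentinel approach mentioned in the description above can be sketched in a few lines of C++. This is an illustration only, not the withdrawn patch: `NodeList` is a simplified stand-in for C2's `Node_List`, and `Block`, `sentinel()`, and `is_initialized()` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical stand-in for C2's Node_List.
class NodeList {
  std::vector<int> _nodes;

public:
  void push(int n) { _nodes.push_back(n); }
  std::size_t size() const { return _nodes.size(); }

  // One shared object that means "not initialized yet". Only its address
  // matters; binding a reference to it is well defined, unlike binding a
  // reference to a dereferenced null pointer.
  static NodeList& sentinel() {
    static NodeList s;
    return s;
  }
};

// Hypothetical owner that holds its list by reference.
class Block {
  NodeList& _list;  // always refers to a real object

public:
  Block() : _list(NodeList::sentinel()) {}
  explicit Block(NodeList& list) : _list(list) {}

  // Replaces the old "is the reference null?" test with an address comparison.
  bool is_initialized() const { return &_list != &NodeList::sentinel(); }
  NodeList& list() { return _list; }
};
```

The point of the design is that the "uninitialized" state stays cheap to test (a single pointer comparison) while every reference in the program refers to a live object, which is what `-fsanitize=null` checks for.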

-------------

PR: https://git.openjdk.org/jdk/pull/10920

From coleenp at openjdk.org Thu Feb 9 15:35:31 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 9 Feb 2023 15:35:31 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID:

On Wed, 8 Feb 2023 10:19:20 GMT, Erik Österlund wrote:

>> In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   RISC-V fixes

This change looks nice. Did you run tier8 or tier1 tests with -XX:+VerifyOops? I thought we ran some general testing in tier3 with -XX:+VerifyOops but we don't seem to be doing that. Thanks.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12443

From coleenp at openjdk.org Thu Feb 9 15:54:09 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 9 Feb 2023 15:54:09 GMT
Subject: RFR: 8302108: Clean up placeholder supername code
Message-ID: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com>

Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry.
Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. Tested with tier1-4. ------------- Commit messages: - 8302108: Clean up placeholder supername code Changes: https://git.openjdk.org/jdk/pull/12495/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12495&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302108 Stats: 51 lines in 3 files changed: 15 ins; 22 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/12495.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12495/head:pull/12495 PR: https://git.openjdk.org/jdk/pull/12495 From tschatzl at openjdk.org Thu Feb 9 16:34:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 16:34:49 GMT Subject: RFR: 8225409: Investigate removal of the Hot Card Cache [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:51:31 GMT, Albert Mingkun Yang wrote: >> Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - merge > - g1-hcc Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12452 From tschatzl at openjdk.org Thu Feb 9 17:08:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 17:08:03 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v3] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change to liveness verification that fixes some unwanted asserts because > > - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case > - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions > - in the innermost if lots of code is duplicated in both cases > > The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). > > Testing: local compilation/testing, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: remove one level of indentation in do_oop_work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12456/files - new: https://git.openjdk.org/jdk/pull/12456/files/b2c41af7..b8182317 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=01-02 Stats: 50 lines in 1 file changed: 20 ins; 18 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/12456.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12456/head:pull/12456 PR: https://git.openjdk.org/jdk/pull/12456 From alanb at openjdk.org Thu Feb 9 17:09:54 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 9 Feb 2023 17:09:54 GMT Subject: RFR: 8301819: Enable continuations code by default In-Reply-To: <1FPt6K7a7E4fgcVMi6HnNcYrzAXs5t38-zyKVkiGxtE=.931145af-0143-4542-8db8-6866237369c1@github.com> References: 
<1FPt6K7a7E4fgcVMi6HnNcYrzAXs5t38-zyKVkiGxtE=.931145af-0143-4542-8db8-6866237369c1@github.com> Message-ID: On Thu, 9 Feb 2023 03:29:45 GMT, David Holmes wrote: > @AlanBateman I agree with enabling this and giving it time to bake. Vladimir has already checked there are no red flags on the performance front, so that is fine. If anything does arise we have, as you note, time to address it. @kuksenko did a lot of detailed analysis on the performance, this led to JDK-8300002 to account for the interaction between inlining and the post-call nops. We can re-visit again in a few weeks or months if something comes up. > Copyright year needs updating in continuation.cpp. Yes, will do. ------------- PR: https://git.openjdk.org/jdk/pull/12459 From duke at openjdk.org Thu Feb 9 17:29:05 2023 From: duke at openjdk.org (Scott Gibbons) Date: Thu, 9 Feb 2023 17:29:05 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v14] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: <39P258Jz047UViDIsbxOSfuWNeQqkz_d3hywAObylWc=.33deebd9-d760-4ba8-8a0d-2f59f260e4bc@github.com>
> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>
> Encode performance:
> **Old:**
>
> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
> Base64Encode.testBase64Encode           1024  thrpt    3   4309.439 ±   2.632  ops/ms
>
> **New:**
>
> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
> Base64Encode.testBase64Encode           1024  thrpt    3  24211.397 ± 102.026  ops/ms
>
> Decode performance:
> **Old:**
>
> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
> Base64Decode.testBase64Decode           144           4           1024  thrpt    3   3961.768 ± 93.409  ops/ms
>
> **New:**
>
> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
> Base64Decode.testBase64Decode           144           4           1024  thrpt    3  14738.051 ± 24.383  ops/ms

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Re-name tables; save some space ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/d174c43f..cc1bce1e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=12-13 Stats: 52 lines in 4 files changed: 13 ins; 19 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From alanb at openjdk.org Thu Feb 9 17:31:28 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 9 Feb 2023 17:31:28 GMT Subject: RFR: 8301819: Enable continuations code by default [v2] In-Reply-To: References: Message-ID: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: - Merge - Update copyright date - Initial commit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12459/files - new: https://git.openjdk.org/jdk/pull/12459/files/2ef27802..38f9ccde Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12459&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12459&range=00-01 Stats: 6633 lines in 181 files changed: 4396 ins; 683 del; 1554 mod Patch: https://git.openjdk.org/jdk/pull/12459.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12459/head:pull/12459 PR: https://git.openjdk.org/jdk/pull/12459 From ayang at openjdk.org Thu Feb 9 17:38:48 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 9 Feb 2023 17:38:48 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v3] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 17:08:03 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change to liveness verification that fixes some unwanted asserts because >> >> - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case >> - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions >> - in the innermost if lots of code is duplicated in both cases >> >> The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). >> >> Testing: local compilation/testing, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > remove one level of indentation in do_oop_work Marked as reviewed by ayang (Reviewer). 
src/hotspot/share/gc/g1/heapRegion.cpp line 518: > 516: > 517: oop obj = CompressedOops::decode_raw_not_null(heap_oop); > 518: bool failed = false; Unused. ------------- PR: https://git.openjdk.org/jdk/pull/12456 From rkennke at openjdk.org Thu Feb 9 18:05:15 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 9 Feb 2023 18:05:15 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v10] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. > > This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks.
Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths; they usually arise in code like JVMTI or deadlock detection, where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. > > The lock-stack is grown when needed. This means that we need to check for potential overflow before attempting locking. When that is the case, locking fast-paths would call into the runtime to grow the stack and handle the locking. Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots of) such checks at locking sites. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread. > > As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions.
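
As a rough illustration of the lock-stack and ANONYMOUS_OWNER hand-off described so far in this message, here is a minimal single-threaded Java simulation. It is a sketch under stated assumptions and not HotSpot code: the names `FastLockSketch`, `Header`, `Monitor`, `Obj`, `LockStack`, `tryFastLock`, `inflateAnonymously` and `unlock` are all hypothetical, the header is modelled as an enum rather than mark-word bits, and waiting, GC-root scanning and lock-stack growth are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Toy, single-threaded model of the fast-locking idea; not HotSpot code. */
final class FastLockSketch {
    /** Stand-in for the low header bits of an object (hypothetical). */
    enum Header { UNLOCKED, FAST_LOCKED, INFLATED }

    /** Hypothetical marker: the monitor is owned, but the owner is not recorded yet. */
    static final Object ANONYMOUS_OWNER = new Object();

    /** Hypothetical full monitor; only the owner field matters here. */
    static final class Monitor {
        Object owner;
    }

    /** Hypothetical lockable object with a header word and an optional monitor. */
    static final class Obj {
        final AtomicReference<Header> header = new AtomicReference<>(Header.UNLOCKED);
        Monitor monitor;
    }

    /** Per-thread lock stack: the objects this thread currently holds fast-locked. */
    static final class LockStack {
        private final List<Obj> entries = new ArrayList<>();

        void push(Obj o) { entries.add(o); }

        void remove(Obj o) { entries.removeIf(e -> e == o); }

        /** "Does the current thread own me?" answered by a linear scan. */
        boolean contains(Obj o) {
            for (Obj e : entries) {
                if (e == o) return true;
            }
            return false;
        }
    }

    /** Fast path: CAS the header and record the object on the caller's lock stack. */
    static boolean tryFastLock(Obj o, LockStack mine) {
        if (o.header.compareAndSet(Header.UNLOCKED, Header.FAST_LOCKED)) {
            mine.push(o);
            return true;
        }
        return false; // contended or recursive: the caller would have to inflate
    }

    /** A contending thread inflates without knowing who holds the fast-lock. */
    static void inflateAnonymously(Obj o) {
        Monitor m = new Monitor();
        m.owner = ANONYMOUS_OWNER;
        o.monitor = m;
        o.header.set(Header.INFLATED);
    }

    /** Exit path for the thread identified by self, which owns the lock stack mine. */
    static void unlock(Obj o, LockStack mine, Object self) {
        if (o.header.compareAndSet(Header.FAST_LOCKED, Header.UNLOCKED)) {
            mine.remove(o); // still fast-locked: plain fast-path exit
            return;
        }
        Monitor m = o.monitor;
        if (m != null && m.owner == ANONYMOUS_OWNER && mine.contains(o)) {
            m.owner = self; // the anonymous owner must be us: fix up the owner
            mine.remove(o);
        }
        if (m != null && m.owner == self) {
            m.owner = null; // regular monitor exit, handing over to any waiter
        }
    }

    public static void main(String[] args) {
        Obj o = new Obj();
        LockStack stackA = new LockStack();
        Object threadA = "thread-A";

        System.out.println("fast-locked: " + tryFastLock(o, stackA));
        inflateAnonymously(o); // pretend a second thread contended here
        System.out.println("anonymous owner: " + (o.monitor.owner == ANONYMOUS_OWNER));
        unlock(o, stackA, threadA);
        System.out.println("released: " + (o.monitor.owner == null));
    }
}
```

In this model a second `tryFastLock` on an already fast-locked object simply fails, which mirrors the statement that recursive fast-locking is not supported and falls back to inflation. The quoted description continues below.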
However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>
> This change makes it possible to simplify (and speed up!) a lot of code:
>
> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Renaissance
>
> |  | x86_64 |  |  |  | aarch64 |  |  |
> | -- | -- | -- | -- | -- | -- | -- | -- |
> |  | stack-locking | fast-locking |  |  | stack-locking | fast-locking |  |
> | AkkaUct | 841.884 | 836.948 | 0.59% |  | 1475.774 | 1465.647 | 0.69% |
> | Reactors | 11444.511 | 11606.66 | -1.40% |  | 11382.594 | 11638.036 | -2.19% |
> | Als | 1367.183 | 1359.358 | 0.58% |  | 1678.103 | 1688.067 | -0.59% |
> | ChiSquare | 577.021 | 577.398 | -0.07% |  | 986.619 | 988.063 | -0.15% |
> | GaussMix | 817.459 | 819.073 | -0.20% |  | 1154.293 | 1155.522 | -0.11% |
> | LogRegression | 598.343 | 603.371 | -0.83% |  | 638.052 | 644.306 | -0.97% |
> | MovieLens | 8248.116 | 8314.576 | -0.80% |  | 9898.1 | 10097.867 | -1.98% |
> | NaiveBayes | 587.607 | 581.608 | 1.03% |  | 541.583 | 550.059 | -1.54% |
> | PageRank | 3260.553 | 3263.472 | -0.09% |  | 4376.405 | 4381.101 | -0.11% |
> | FjKmeans | 979.978 | 976.122 | 0.40% |  | 774.312 | 771.235 | 0.40% |
> | FutureGenetic | 2187.369 | 2183.271 | 0.19% |  | 2685.722 | 2689.056 | -0.12% |
> | ParMnemonics | 2527.228 | 2564.667 | -1.46% |  | 4278.225 | 4263.863 | 0.34% |
> | Scrabble | 111.882 | 111.768 | 0.10% |  | 151.796 | 153.959 | -1.40% |
> | RxScrabble | 210.252 | 211.38 | -0.53% |  | 310.116 | 315.594 | -1.74% |
> | Dotty | 750.415 | 752.658 | -0.30% |  | 1033.636 | 1036.168 | -0.24% |
> | ScalaDoku | 3072.05 | 3051.2 | 0.68% |  | 3711.506 | 3690.04 | 0.58% |
> | ScalaKmeans | 211.427 | 209.957 | 0.70% |  | 264.38 | 265.788 | -0.53% |
> | ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% |  | 1088.182 | 1092.266 | -0.37% |
> | Philosophers | 6450.124 | 6565.705 | -1.76% |  | 12017.964 | 11902.559 | 0.97% |
> | FinagleChirper | 3953.623 | 3972.647 | -0.48% |  | 4750.751 | 4769.274 | -0.39% |
> | FinagleHttp | 3970.526 | 4005.341 | -0.87% |  | 5294.125 | 5296.224 | -0.04% |

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Small fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/10907/files - new: https://git.openjdk.org/jdk/pull/10907/files/38dae002..2af1ce38 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=08-09 Stats: 7 lines in 4 files changed: 0 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/10907.diff Fetch: git fetch https://git.openjdk.org/jdk pull/10907/head:pull/10907 PR: https://git.openjdk.org/jdk/pull/10907 From duke at openjdk.org Thu Feb 9 18:08:15 2023 From: duke at openjdk.org (Scott Gibbons) Date: Thu, 9 Feb 2023 18:08:15 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID:
> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>
> Encode performance:
> **Old:**
>
> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
> Base64Encode.testBase64Encode           1024  thrpt    3   4309.439 ±   2.632  ops/ms
>
> **New:**
>
> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
> Base64Encode.testBase64Encode           1024  thrpt    3  24211.397 ± 102.026  ops/ms
>
> Decode performance:
> **Old:**
>
> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
> Base64Decode.testBase64Decode           144           4           1024  thrpt    3   3961.768 ± 93.409  ops/ms
>
> **New:**
>
> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
> Base64Decode.testBase64Decode           144           4           1024  thrpt    3  14738.051 ± 24.383  ops/ms

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add URL to microbenchmark ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/cc1bce1e..424b40d7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=13-14 Stats: 38 lines in 1 file changed: 33 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126 From tschatzl at openjdk.org Thu Feb 9 18:14:12 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 18:14:12 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v4] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change to liveness verification that fixes some unwanted asserts because > > - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case > - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions > - in the innermost if lots of code is duplicated in both cases > > The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). > > Testing: local compilation/testing, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: removed never-read failed local variable. Formatting cleanup.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12456/files - new: https://git.openjdk.org/jdk/pull/12456/files/b8182317..96523f99 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12456&range=02-03 Stats: 16 lines in 1 file changed: 6 ins; 10 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12456.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12456/head:pull/12456 PR: https://git.openjdk.org/jdk/pull/12456 From duke at openjdk.org Thu Feb 9 18:27:33 2023 From: duke at openjdk.org (Amit Kumar) Date: Thu, 9 Feb 2023 18:27:33 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken Message-ID: This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. 
------------- Commit messages: - Merge branch 'master' into optimized_build_broken - guards asserts statements - fixes optimized build Changes: https://git.openjdk.org/jdk/pull/12400/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12400&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301697 Stats: 9 lines in 2 files changed: 3 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/12400.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12400/head:pull/12400 PR: https://git.openjdk.org/jdk/pull/12400 From duke at openjdk.org Thu Feb 9 18:27:34 2023 From: duke at openjdk.org (Amit Kumar) Date: Thu, 9 Feb 2023 18:27:34 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> On Fri, 3 Feb 2023 04:34:46 GMT, Amit Kumar wrote: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, which is causing the build failure in Optimized-Debug. Hi @shipilev , @theRealAph As the title states, the optimized build is broken with GCC-9.4.0, and after my code changes the build is successful again. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. I'm pinging you here because I've seen a patch for a build-fix for the optimized-build in the past from you. from shipilev: https://github.com/openjdk/jdk/pull/10743 (for s390) from theRealAph: https://github.com/openjdk/jdk/pull/9103 (for arch-64). As I'm new to OpenJDK, it would be helpful if you could share some insight about this fix or any other way to fix the build, before I mark this PR "ready for review". Thank you so much.
------------- PR: https://git.openjdk.org/jdk/pull/12400 From shade at openjdk.org Thu Feb 9 18:27:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 9 Feb 2023 18:27:35 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> References: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> Message-ID: On Fri, 3 Feb 2023 04:47:32 GMT, Amit Kumar wrote: > As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. I was confused a bit about this, because my own S390X optimized builds build fine: https://builds.shipilev.net/openjdk-jdk/build-logs/build-openjdk-jdk-linux-s390x-server-optimized-gcc10-glibc2.31.log So, current code is supposed to build fine. I think it is arguably correct to protect the asserts with `ASSERT`/`DEBUG` rather than `PRODUCT`, which would hide them in the optimized builds. But I look at the original issue, and the optimized build is actually failing on the assert, which is why my (cross-compiled) builds are fine! See: A fatal error has been detected by the Java Runtime Environment: Internal Error (src/hotspot/cpu/s390/macroAssembler_s390.cpp:5498), pid=3076078, tid=3076083 guarantee(false) failed: Z assembly code requires stop: killed Z_R14 So, this change only hides the issue, not fixes it? If so, I don't think this is a good way to solve the issue. I suggest you `git bisect` and identify which change introduced this failure, and work from there. 
------------- PR: https://git.openjdk.org/jdk/pull/12400 From duke at openjdk.org Thu Feb 9 18:27:36 2023 From: duke at openjdk.org (Amit Kumar) Date: Thu, 9 Feb 2023 18:27:36 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> Message-ID: On Fri, 3 Feb 2023 08:30:47 GMT, Aleksey Shipilev wrote: >> Hi @shipilev , @theRealAph >> >> As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. >> >> I'm pinging you here, because I've seen a patch for build-fix for optmized-build in past from you. >> form shipilev : https://github.com/openjdk/jdk/pull/10743 (for s390) >> from theRealAph: https://github.com/openjdk/jdk/pull/9103 (for arch-64). >> >> As I'm new to OpenJDK, so it would be helpful if you could share some insight about this fix or any other way to fix the build, before I mark this PR "ready for review". >> >> Thank you so much. > >> As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. > > I was confused a bit about this, because my own S390X optimized builds build fine: > https://builds.shipilev.net/openjdk-jdk/build-logs/build-openjdk-jdk-linux-s390x-server-optimized-gcc10-glibc2.31.log > > So, current code is supposed to build fine. I think it is arguably correct to protect the asserts with `ASSERT`/`DEBUG` rather than `PRODUCT`, which would hide them in the optimized builds. But I look at the original issue, and the optimized build is actually failing on the assert, which is why my (cross-compiled) builds are fine! 
See: > > > A fatal error has been detected by the Java Runtime Environment: > > Internal Error (src/hotspot/cpu/s390/macroAssembler_s390.cpp:5498), pid=3076078, tid=3076083 > guarantee(false) failed: Z assembly code requires stop: killed Z_R14 > > > So, this change only hides the issue, not fixes it? If so, I don't think this is a good way to solve the issue. > I suggest you `git bisect` and identify which change introduced this failure, and work from there. Hi @shipilev, I've updated my changes and updated the justification for the PR as well. Please take a look now. @RealLucy and @backwaterred , I request both of you as well for your comments/suggestions/review :) Thanks ------------- PR: https://git.openjdk.org/jdk/pull/12400 From lmesnik at openjdk.org Thu Feb 9 19:25:48 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 9 Feb 2023 19:25:48 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> On Wed, 8 Feb 2023 23:18:20 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Re-added conditional logging per review discussion Marked as reviewed by lmesnik (Reviewer). test/jtreg-ext/requires/VMProps.java line 516: > 514: > 515: // Returns comma-separated file names for stdout and stderr. > 516: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { It would be better to return List<String> fileNames rather than combine/parse the String. test/jtreg-ext/requires/VMProps.java line 517: > 515: // Returns comma-separated file names for stdout and stderr.
> 516: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { > 517: if (!Boolean.getBoolean("jtreg.log.vmprops")) { Why not dump output always? It is printed only if exitcode is not zero anyway and doesn't add noise. However not having it in the case of failure require reproducing the bug. test/jtreg-ext/requires/VMProps.java line 693: > 691: protected static void log(String msg) { > 692: if (!Boolean.getBoolean("jtreg.log.vmprops")) { > 693: return; Why not always log in to the file like vmprops.log? Also, add try/catch into the call() and redirect file to stderr if something is wrong. And redirect if the property is set for passed tests. ------------- PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Thu Feb 9 19:35:45 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Thu, 9 Feb 2023 19:35:45 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> Message-ID: <7DZ8e09Bh11t7wJFMJVwoPQzdaYtNvJd3-2njgDKlTA=.a2a8dd73-3886-4513-88a6-a595ecc3bf6a@github.com> On Thu, 9 Feb 2023 19:08:42 GMT, Leonid Mesnik wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-added conditional logging per review discussion > > test/jtreg-ext/requires/VMProps.java line 517: > >> 515: // Returns comma-separated file names for stdout and stderr. >> 516: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { >> 517: if (!Boolean.getBoolean("jtreg.log.vmprops")) { > > Why not dump output always? It is printed only if exitcode is not zero anyway and doesn't add noise. 
> However not having it in the case of failure require reproducing the bug. We have to be consistent. Since we agreed in earlier discussion to use property jtreg.log.vmprops as a guard for logging we should also use it for writing into a file. The user who runs the tests can always specify this property in the test execution script. ------------- PR: https://git.openjdk.org/jdk/pull/12239 From tsteele at openjdk.org Thu Feb 9 19:46:46 2023 From: tsteele at openjdk.org (Tyler Steele) Date: Thu, 9 Feb 2023 19:46:46 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 04:34:46 GMT, Amit Kumar wrote: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. Changes requested by tsteele (Committer). src/hotspot/cpu/s390/c1_Runtime1_s390.cpp line 3: > 1: /* > 2: * Copyright (c) 2016, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2016, 2023, SAP SE. All rights reserved. As I understand it, only SAPers should touch the SAP copyright header. I recognize that this is a bit confusing because everyone updates the Oracle header. If @RealLucy confirms this to be true, please keep the Oracle line update, and revert the SAP line. src/hotspot/cpu/s390/templateInterpreterGenerator_s390.cpp line 3: > 1: /* > 2: * Copyright (c) 2016, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2016, 2023, SAP SE. All rights reserved. Same suggestion as above. 
------------- PR: https://git.openjdk.org/jdk/pull/12400 From iwalulya at openjdk.org Thu Feb 9 20:03:46 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 9 Feb 2023 20:03:46 GMT Subject: RFR: 8225409: Investigate removal of the Hot Card Cache [v2] In-Reply-To: References: Message-ID: <-_6KvR6A9xir7bPVuWcS0IzICQkFy-LMXlk_gD61iNQ=.a0ffd759-2e42-4e26-b1da-c621d3789ff7@github.com> On Thu, 9 Feb 2023 10:51:31 GMT, Albert Mingkun Yang wrote: >> Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - merge > - g1-hcc Marked as reviewed by iwalulya (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12452 From tschatzl at openjdk.org Thu Feb 9 20:09:46 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 20:09:46 GMT Subject: RFR: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap [v2] In-Reply-To: <8R1JPibjyJ6BWOxhPcFBytCHSEkHbf3E-K4ar3MAkhg=.e2ea20d8-5c5e-49ec-b1e1-0816a9cb54cc@github.com> References: <8R1JPibjyJ6BWOxhPcFBytCHSEkHbf3E-K4ar3MAkhg=.e2ea20d8-5c5e-49ec-b1e1-0816a9cb54cc@github.com> Message-ID: On Thu, 9 Feb 2023 03:34:07 GMT, David Holmes wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Missing changes > > LGTM. 
Thanks Thanks @dholmes-ora @albertnetymk for your reviews ------------- PR: https://git.openjdk.org/jdk/pull/12456 From tschatzl at openjdk.org Thu Feb 9 20:12:53 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 9 Feb 2023 20:12:53 GMT Subject: Integrated: 8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap In-Reply-To: References: Message-ID: <4jvfuW-kksdRSy-0OeFInNwhz3bwWqknzksslC6acG0=.9c7d5e1f-de76-407a-96a7-879e7e2408a1@github.com> On Tue, 7 Feb 2023 15:03:36 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change to liveness verification that fixes some unwanted asserts because > > - it uses decode_not_null which will assert if the given oop address is not in the heap, making the remainder of the verification useless in that case > - if the referenced object is not in the heap, we try to get its heap region too when printing, which also fails some assertions > - in the innermost if lots of code is duplicated in both cases > > The first two issues are really annoying (there is another one when the `Klass` is garbage when calling `is_obj_dead_cond`, but I'll try to improve that separately). > > Testing: local compilation/testing, gha > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 0aeebee2
Author: Thomas Schatzl
URL: https://git.openjdk.org/jdk/commit/0aeebee284effe9abd0ed3cf2845430b40bb53bd
Stats: 84 lines in 6 files changed: 41 ins; 32 del; 11 mod

8301988: VerifyLiveClosure::verify_liveness asserts on bad pointers outside heap

Reviewed-by: dholmes, ayang

-------------

PR: https://git.openjdk.org/jdk/pull/12456

From duke at openjdk.org Thu Feb 9 22:18:47 2023
From: duke at openjdk.org (Scott Gibbons)
Date: Thu, 9 Feb 2023 22:18:47 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15]
In-Reply-To:
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
Message-ID: <4lZwMmKcs_TZPuWDJya9RPHxq72TaOd5VPrgM-4JjFI=.de1df5bc-e73e-4b31-be0c-a0f80ed8c874@github.com>

On Thu, 9 Feb 2023 18:08:15 GMT, Scott Gibbons wrote:

>> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>>
>> Encode performance:
>> **Old:**
>>
>> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
>> Base64Encode.testBase64Encode           1024  thrpt    3   4309.439 ±   2.632  ops/ms
>>
>> **New:**
>>
>> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
>> Base64Encode.testBase64Encode           1024  thrpt    3  24211.397 ± 102.026  ops/ms
>>
>> Decode performance:
>> **Old:**
>>
>> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
>> Base64Decode.testBase64Decode           144           4           1024  thrpt    3   3961.768 ± 93.409  ops/ms
>>
>> **New:**
>>
>> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
>> Base64Decode.testBase64Decode           144           4           1024  thrpt    3  14738.051 ± 24.383  ops/ms
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add URL to microbenchmark

@cl4es @sviswa7 @jatin-bhateja I believe I've addressed all of your concerns - thank you all for the reviews. Please review when you can.

Thanks!
------------- PR: https://git.openjdk.org/jdk/pull/12126 From tsteele at openjdk.org Thu Feb 9 23:09:43 2023 From: tsteele at openjdk.org (Tyler Steele) Date: Thu, 9 Feb 2023 23:09:43 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 04:34:46 GMT, Amit Kumar wrote: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. I think these problems exist because the assertions `asm_assert*` are firing when compiling with `debug-level=optimized`. However, their definitions should be trivial with the optimized build because PRODUCT should not be defined[1][2], and the `asm_assert*` definitions are guarded by `#if !defined(PRODUCT)`[3]. It also looks from @shipilev's comment, that s390x cross-compilation is working. Could this be a symptom of something wrong with the s390x build system on native (non-cross) builds? [1] https://github.com/openjdk/jdk/blob/master/make/hotspot/lib/JvmFlags.gmk#L69-L76 [2] https://github.com/openjdk/jdk/blob/master/make/autoconf/jdk-options.m4#L66-L76 [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/s390/macroAssembler_s390.cpp#L5315 ------------- PR: https://git.openjdk.org/jdk/pull/12400 From dholmes at openjdk.org Fri Feb 10 02:20:34 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 10 Feb 2023 02:20:34 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() Message-ID: This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. 
I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. Thanks. ------------- Commit messages: - 8280419: Remove dead code related to VerifyThread and verify_thread() Changes: https://git.openjdk.org/jdk/pull/12506/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12506&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8280419 Stats: 80 lines in 18 files changed: 0 ins; 62 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/12506.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12506/head:pull/12506 PR: https://git.openjdk.org/jdk/pull/12506 From dholmes at openjdk.org Fri Feb 10 02:55:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 10 Feb 2023 02:55:40 GMT Subject: RFR: 8302108: Clean up placeholder supername code In-Reply-To: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Thu, 9 Feb 2023 15:46:47 GMT, Coleen Phillimore wrote: > Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. > Tested with tier1-4. Took me a while to get my head around the SymbolHandle mechanics for ref counting, but once I did this looks good. :) Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12495 From haosun at openjdk.org Fri Feb 10 03:55:47 2023 From: haosun at openjdk.org (Hao Sun) Date: Fri, 10 Feb 2023 03:55:47 GMT Subject: RFR: 8301819: Enable continuations code by default [v2] In-Reply-To: References: Message-ID: <0_L0OYCbyWOsRSVyO7B5qpoCes3NlLLZ4H_dQ1tmye8=.b3a6057f-43f4-4b14-b453-9b5d01af7bc6@github.com> On Thu, 9 Feb 2023 17:31:28 GMT, Alan Bateman wrote: >> The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. >> >> Testing: tier1-tier6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge > - Update copyright date > - Initial commit src/hotspot/share/runtime/continuation.cpp line 416: > 414: // While virtual threads are in Preview, there are some VM mechanisms we disable if continuations aren't used > 415: bool Continuations::enabled() { > 416: return VMContinuations && Arguments::enable_preview(); With this change, I think header file `"runtime/arguments.hpp"` at line 29 can be excluded. 
------------- PR: https://git.openjdk.org/jdk/pull/12459 From duke at openjdk.org Fri Feb 10 04:10:46 2023 From: duke at openjdk.org (Amit Kumar) Date: Fri, 10 Feb 2023 04:10:46 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 19:42:39 GMT, Tyler Steele wrote: >> This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. > > src/hotspot/cpu/s390/c1_Runtime1_s390.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2016, 2023, Oracle and/or its affiliates. All rights reserved. >> 3: * Copyright (c) 2016, 2023, SAP SE. All rights reserved. > > As I understand it, only SAPers should touch the SAP copyright header. I recognize that this is a bit confusing because everyone updates the Oracle header. > > If @RealLucy confirms this to be true, please keep the Oracle line update, and revert the SAP line. would you suggest that we add IBM copyright header here as well ? ------------- PR: https://git.openjdk.org/jdk/pull/12400 From kbarrett at openjdk.org Fri Feb 10 05:33:16 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 10 Feb 2023 05:33:16 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute Message-ID: Please review this change to permit the use of noreturn attributes in HotSpot code. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf https://en.cppreference.com/w/cpp/language/attributes/noreturn This will permit such changes as marking failed assertions and the like as noreturn, permitting the compiler to generate better code in some cases. 
This has benefits even in product builds, since some of the relevant checks occur in product code (such as `guarantee`, `fatal`, &etc). It also provides a solution to the problem described in JDK-8294031, where the potential continued execution from a failed assertion leads to compiler warnings. The change is written in such a way that it should be easy to add the appropriate text for new attributes in the future. There have been discussions of adopting C++17, which adds several attributes. The change to the Style Guide is forward looking, toward a time when more attributes are available due to the adoption of a newer language standard than the current C++14. It is written in such a way that it should be easy to add the appropriate text for new attributes. Testing: I have a prototype of making HotSpot assertions noreturn, which has been run through mach5 tier1-8 for all Oracle-supported platforms. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Friday 24-Feb-2023 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. 
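For readers unfamiliar with the attribute, here is a minimal sketch of what the proposal enables. It is not HotSpot code: `report_failure`, `CHECK_OR_DIE`, `checked_div`, and `deref_checked` are hypothetical names made up for illustration, standing in for the real assertion and `guarantee`-style reporting paths.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure reporter, standing in for HotSpot's real error
// reporting functions (which are not reproduced here).
[[noreturn]] void report_failure(const char* file, int line, const char* msg) {
  std::fprintf(stderr, "%s:%d: %s\n", file, line, msg);
  std::abort();  // never returns, which is what the attribute promises
}

// Hypothetical check macro in the spirit of `guarantee`.
#define CHECK_OR_DIE(cond, msg)                            \
  do {                                                     \
    if (!(cond)) report_failure(__FILE__, __LINE__, msg);  \
  } while (0)

// Because report_failure is [[noreturn]], the compiler can see that the
// fall-through path is unreachable, so it has no reason to warn about a
// missing return at the end of this non-void function.
int checked_div(int num, int den) {
  if (den != 0) {
    return num / den;
  }
  report_failure(__FILE__, __LINE__, "division by zero");
}

// After the check, the compiler may assume p is non-null on the path
// that continues, since the failing path cannot come back.
int deref_checked(const int* p) {
  CHECK_OR_DIE(p != nullptr, "unexpected null pointer");
  return *p;
}
```

Calling `checked_div(6, 3)` returns normally; calling it with a zero divisor prints the message and aborts. Whether a given compiler actually produces better code or fewer warnings for such a pattern depends on the compiler and flags, so treat this as an illustration of the mechanism rather than a measured result.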
------------- Commit messages: - style guide permits noreturn attributes Changes: https://git.openjdk.org/jdk/pull/12507/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12507&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302124 Stats: 64 lines in 2 files changed: 52 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12507.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12507/head:pull/12507 PR: https://git.openjdk.org/jdk/pull/12507 From stuefe at openjdk.org Fri Feb 10 06:11:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 10 Feb 2023 06:11:43 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: <6buyk4KbjXwzeU6j7bkZgZ72S1lSe5uJgEp9eLJp12I=.3f45afa2-8e7a-4d9b-9fcc-dcf77f496212@github.com> On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. 
It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This sounds useful. I try to think of cases where the fact that the compiler optimizes away follow-up instructions could be surprising. But with noreturn the compiler should warn, right, about non-reachable instructions? ------------- PR: https://git.openjdk.org/jdk/pull/12507 From dholmes at openjdk.org Fri Feb 10 06:58:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 10 Feb 2023 06:58:44 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 02:12:59 GMT, David Holmes wrote: > This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. > > I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. > > Thanks. All cross-compilation builds successful. 
-------------

PR: https://git.openjdk.org/jdk/pull/12506

From fyang at openjdk.org Fri Feb 10 07:41:43 2023
From: fyang at openjdk.org (Fei Yang)
Date: Fri, 10 Feb 2023 07:41:43 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID:

On Wed, 8 Feb 2023 10:19:20 GMT, Erik Österlund wrote:

>> In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   RISC-V fixes

Updated RISC-V part looks good. Thanks.

-------------

Marked as reviewed by fyang (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12443

From sspitsyn at openjdk.org Fri Feb 10 08:35:55 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Fri, 10 Feb 2023 08:35:55 GMT
Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v5]
In-Reply-To:
References:
Message-ID:

On Sat, 24 Dec 2022 04:14:07 GMT, Serguei Spitsyn wrote:

>> Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different.
>> New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. >> >> Testing: >> mach5 jobs are TBD (preliminary testing was completed): >> - all JVMTI, JDWP, JDI and JDB tests have to be run >> - Kitchensink >> - tier5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix race between VTMS_transition_disable_for_one and start_VTMS_transition Thank you a lot for reviewing it, Patricio! ------------- PR: https://git.openjdk.org/jdk/pull/11690 From sspitsyn at openjdk.org Fri Feb 10 08:35:58 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 10 Feb 2023 08:35:58 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v5] In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 22:17:34 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix race between VTMS_transition_disable_for_one and start_VTMS_transition > > src/hotspot/share/prims/jvmtiThreadState.cpp line 479: > >> 477: >> 478: // Unblock waiting VTMS transition disablers. >> 479: if (_VTMS_transition_disable_for_one_count > 0 || > > In here it would actually be the other way. If we would check java_lang_Thread::VTMS_transition_disable_count(vth()) > 0 instead of the global _VTMS_transition_disable_for_one_count we would avoid the notify in case there are singler disablers waiting but not for this thread. Good comment. Yes, I considered the same as you about to suggest. It is on my plan to work on optimization of these fragments. Will keep in mind this your comment. 
-------------

PR: https://git.openjdk.org/jdk/pull/11690

From tschatzl at openjdk.org Fri Feb 10 08:41:36 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Fri, 10 Feb 2023 08:41:36 GMT
Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1
Message-ID:

Hi all,

can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests)?

* the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However, non-java threads are typically very few, and parallelization only starts making sense with low hundreds of threads anyway.

* `G1BarrierSet::write_region` gained a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is that previously, when `BarrierSet::make_parsable()` (not sure why this is called this way either) was called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non-java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works.
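The second point can be pictured with a toy model. This is only a sketch under loud assumptions: `ToyThread`, `toy_write_region`, and `toy_flush_deferred_card_mark` are hypothetical names and do not mirror the real G1 classes or signatures. It merely shows the idea of naming the owning thread explicitly instead of enqueueing on whichever thread happens to be executing.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: a "thread" that owns a queue of dirty card indices.
// This type is hypothetical and does not correspond to a HotSpot class.
struct ToyThread {
  std::vector<std::size_t> refinement_queue;
};

// Toy analogue of a write_region that takes the owning thread as an
// explicit parameter: the generated card marks go to `owner`'s queue,
// regardless of which thread is running this function.
void toy_write_region(ToyThread& owner, std::size_t first_card, std::size_t last_card) {
  for (std::size_t card = first_card; card <= last_card; ++card) {
    owner.refinement_queue.push_back(card);
  }
}

// A worker flushing a Java thread's deferred card mark passes that Java
// thread through, so the entries land in the Java thread's own queue and
// the order in which workers and Java threads are processed stops mattering.
void toy_flush_deferred_card_mark(ToyThread& java_thread,
                                  std::size_t first_card,
                                  std::size_t last_card) {
  toy_write_region(java_thread, first_card, last_card);
}
```

In this model a worker that calls `toy_flush_deferred_card_mark(java_thread, 3, 5)` leaves three entries in `java_thread.refinement_queue` and none in its own queue, which is the property the description above relies on once threads are processed in parallel.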
Testing: tier1-5 Thanks, Thomas ------------- Commit messages: - More warning workarounds - avoid inconsistent override error - cleanup - fixes - fixes - initial version, just making possibly_parallel_threads_do cover the same threads as threads_do Changes: https://git.openjdk.org/jdk/pull/12497/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302122 Stats: 398 lines in 17 files changed: 290 ins; 73 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/12497.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12497/head:pull/12497 PR: https://git.openjdk.org/jdk/pull/12497 From kbarrett at openjdk.org Fri Feb 10 08:44:41 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 10 Feb 2023 08:44:41 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: <6buyk4KbjXwzeU6j7bkZgZ72S1lSe5uJgEp9eLJp12I=.3f45afa2-8e7a-4d9b-9fcc-dcf77f496212@github.com> References: <6buyk4KbjXwzeU6j7bkZgZ72S1lSe5uJgEp9eLJp12I=.3f45afa2-8e7a-4d9b-9fcc-dcf77f496212@github.com> Message-ID: On Fri, 10 Feb 2023 06:08:45 GMT, Thomas Stuefe wrote: > This sounds useful. > > I try to think of cases where the fact that the compiler optimizes away follow-up instructions could be surprising. But with noreturn the compiler should warn, right, about non-reachable instructions? I'm sure there is code that is rendered dead by making the assertion failure reporting functions noreturn, but no warnings were given by any compiler I tried. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From duke at openjdk.org Fri Feb 10 09:13:32 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Fri, 10 Feb 2023 09:13:32 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux Message-ID: I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). I did regression test x86_64 OpenJDK-8. 
I will leave other regression testing on GHA. The patch (and former GCC performance regression) affects only x86_64+i686.

-------------

Commit messages:
 - 8302191: Performance degradation for float/double modulo on Linux

Changes: https://git.openjdk.org/jdk/pull/12508/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12508&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8302191
Stats: 31 lines in 1 file changed: 29 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/12508.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12508/head:pull/12508

PR: https://git.openjdk.org/jdk/pull/12508

From rcastanedalo at openjdk.org Fri Feb 10 09:33:42 2023
From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano)
Date: Fri, 10 Feb 2023 09:33:42 GMT
Subject: RFR: 8301874: BarrierSetC2 should assign barrier data to stores
In-Reply-To:
References:
Message-ID:

On Mon, 6 Feb 2023 15:12:03 GMT, Erik Österlund wrote:

> BarrierSetC2 should assign barrier data to stores, like it does for loads and atomics. That way, a GC that uses barrier data for stores, can call the super class to generate the loads/stores, instead of copying that code.

Looks good.

-------------

Marked as reviewed by rcastanedalo (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12442

From alanb at openjdk.org Fri Feb 10 09:42:03 2023
From: alanb at openjdk.org (Alan Bateman)
Date: Fri, 10 Feb 2023 09:42:03 GMT
Subject: RFR: 8301819: Enable continuations code by default [v3]
In-Reply-To:
References:
Message-ID:

> The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port.
Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Remove include - Merge - Merge - Update copyright date - Initial commit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12459/files - new: https://git.openjdk.org/jdk/pull/12459/files/38f9ccde..df807cb1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12459&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12459&range=01-02 Stats: 1889 lines in 90 files changed: 1665 ins; 77 del; 147 mod Patch: https://git.openjdk.org/jdk/pull/12459.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12459/head:pull/12459 PR: https://git.openjdk.org/jdk/pull/12459 From alanb at openjdk.org Fri Feb 10 09:42:04 2023 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 10 Feb 2023 09:42:04 GMT Subject: RFR: 8301819: Enable continuations code by default [v3] In-Reply-To: <0_L0OYCbyWOsRSVyO7B5qpoCes3NlLLZ4H_dQ1tmye8=.b3a6057f-43f4-4b14-b453-9b5d01af7bc6@github.com> References: <0_L0OYCbyWOsRSVyO7B5qpoCes3NlLLZ4H_dQ1tmye8=.b3a6057f-43f4-4b14-b453-9b5d01af7bc6@github.com> Message-ID: On Fri, 10 Feb 2023 03:52:30 GMT, Hao Sun wrote: > With this change, I think header file `"runtime/arguments.hpp"` at line 29 can be excluded. Indeed, that header file is no longer needs to be included - thanks! 
-------------

PR: https://git.openjdk.org/jdk/pull/12459

From jsjolen at openjdk.org Fri Feb 10 09:45:41 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 09:45:41 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
Message-ID:

Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.

Here are some typical things to look out for:

1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.

An example of this:

```c++
// This function returns null
void* ret_null();
// This function returns true if *x == nullptr
bool is_nullptr(void** x);
```

Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.

Thanks!
-------------

Commit messages:
 - Fixes
 - Replace NULL with nullptr in cpu/x86

Changes: https://git.openjdk.org/jdk/pull/12326/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8301498
Stats: 675 lines in 54 files changed: 0 ins; 0 del; 675 mod
Patch: https://git.openjdk.org/jdk/pull/12326.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12326/head:pull/12326

PR: https://git.openjdk.org/jdk/pull/12326

From jsjolen at openjdk.org Fri Feb 10 09:45:57 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 09:45:57 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID:

On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!

Some things to fix

Builds and sent out for testing.

src/hotspot/cpu/x86/frame_x86.inline.hpp line 163:

> 161: // value.
> 162: // UPDATE: this constructor is only used by trace_method_handle_stub() now.
> 163: // assert(_pc != null, "no pc?");

nullptr

src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 233:

> 231: // Calling the runtime using the regular call_VM_leaf mechanism generates
> 232: // code (generated by InterpreterMacroAssember::call_VM_leaf_base)
> 233: // that checks that the *(ebp+frame::interpreter_frame_last_sp) == null.

nullptr

src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 267:

> 265: // Calling the runtime using the regular call_VM_leaf mechanism generates
> 266: // code (generated by InterpreterMacroAssember::call_VM_leaf_base)
> 267: // that checks that the *(ebp+frame::interpreter_frame_last_sp) == null.

nullptr

src/hotspot/cpu/x86/icBuffer_x86.cpp line 60:

> 58: // (1) the value is old (i.e., doesn't matter for scavenges)
> 59: // (2) these ICStubs are removed *before* a GC happens, so the roots disappear
> 60: // assert(cached_value == null || cached_oop->is_perm(), "must be perm oop");

nullptr

src/hotspot/cpu/x86/interp_masm_x86.cpp line 303:

> 301: jcc(Assembler::equal, L);
> 302: stop("InterpreterMacroAssembler::call_VM_base:"
> 303: " last_sp != null");

nullptr

src/hotspot/cpu/x86/interp_masm_x86.cpp line 402:

> 400: movptr(tmp, Address(rthread, JavaThread::jvmti_thread_state_offset()));
> 401: testptr(tmp, tmp);
> 402: jcc(Assembler::zero, L); // if (thread->jvmti_thread_state() == null) exit;

nullptr

src/hotspot/cpu/x86/interp_masm_x86.cpp line 1779:

> 1777: // // main copy of decision tree, rooted at row[1]
> 1778: // if (row[0].rec == rec) { row[0].incr(); goto done; }
> 1779: // if (row[0].rec != null) {

nullptr

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2866:

> 2864: } else {
> 2865: // nothing to do, (later) access of M[reg + offset]
> 2866: // will provoke OS null exception if reg = null

is not =

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4265:

> 4263: }
> 4264:
> 4265: // for (scan =
klass->itable(); scan->interface() != null; scan += scan_step) { nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1553: > 1551: // one less register, because needed values are on the argument stack. > 1552: // __ check_klass_subtype_fast_path(sub_klass, *super_klass*, temp, > 1553: // L_success, L_failure, null); nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1924: > 1922: const Register length = rcx; // transfer count > 1923: > 1924: // if (src == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1934: > 1932: __ jccb(Assembler::negative, L_failed_0); > 1933: > 1934: // if (dst == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1956: > 1954: > 1955: #ifdef ASSERT > 1956: // assert(src->klass() != null); nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2298: > 2296: // > 2297: > 2298: // if (src == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2307: > 2305: __ jccb(Assembler::negative, L_failed_0); > 2306: > 2307: // if (dst == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2335: > 2333: __ load_klass(r10_src_klass, src, rklass_tmp); > 2334: #ifdef ASSERT > 2335: // assert(src->klass() != null); nullptr ------------- PR: https://git.openjdk.org/jdk/pull/12326 From rehn at openjdk.org Fri Feb 10 09:52:45 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 10 Feb 2023 09:52:45 GMT Subject: RFR: JDK-8301499: Replace NULL with nullptr in cpu/zero In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:29 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/zero. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. 
No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Marked as reviewed by rehn (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12327 From jsjolen at openjdk.org Fri Feb 10 10:01:56 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:01:56 GMT Subject: RFR: JDK-8301072: Replace NULL with nullptr in share/oops/ [v7] In-Reply-To: References: Message-ID: <7nnX7oBE2YKYwxlw2P5oBUbH87yVdDiJefpQW-ieiBY=.ef4fc682-b005-4c25-91e1-6830b3969b57@github.com> > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/oops/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
> > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Johan Sj?len has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge remote-tracking branch 'origin/master' into JDK-8301072 - Merge remote-tracking branch 'origin/master' into JDK-8301072 - Remove unnecessary casts - Regular casts - static_cast to oop - dholmes comments fixed - Do stefank's fixes - FIX! - Manual fixes - Fix nullptr - ... and 1 more: https://git.openjdk.org/jdk/compare/1c7b09bc...ac3fca48 ------------- Changes: https://git.openjdk.org/jdk/pull/12186/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12186&range=06 Stats: 1080 lines in 56 files changed: 0 ins; 0 del; 1080 mod Patch: https://git.openjdk.org/jdk/pull/12186.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12186/head:pull/12186 PR: https://git.openjdk.org/jdk/pull/12186 From jsjolen at openjdk.org Fri Feb 10 10:01:59 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:01:59 GMT Subject: RFR: JDK-8301072: Replace NULL with nullptr in share/oops/ [v6] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 14:41:55 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/oops/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. 
Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary casts Last GHA passed, integrating this as the merge conflict was trivial. ------------- PR: https://git.openjdk.org/jdk/pull/12186 From jsjolen at openjdk.org Fri Feb 10 10:02:01 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:02:01 GMT Subject: Integrated: JDK-8301072: Replace NULL with nullptr in share/oops/ In-Reply-To: References: Message-ID: On Wed, 25 Jan 2023 11:46:23 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/oops/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
> > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated. Changeset: c8ace482 Author: Johan Sj?len URL: https://git.openjdk.org/jdk/commit/c8ace482edead720c865cf996729a316025d937e Stats: 1080 lines in 56 files changed: 0 ins; 0 del; 1080 mod 8301072: Replace NULL with nullptr in share/oops/ Reviewed-by: stefank, coleenp, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/12186 From jsjolen at openjdk.org Fri Feb 10 10:06:22 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:06:22 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v5] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Changed to zero ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12317/files - new: https://git.openjdk.org/jdk/pull/12317/files/5f33229b..b01c7086 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12317&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12317.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12317/head:pull/12317 PR: https://git.openjdk.org/jdk/pull/12317 From jsjolen at openjdk.org Fri Feb 10 10:06:22 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:06:22 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v4] In-Reply-To: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> References: <6fhxlbFt5TtE7Y5OKVViHtuXV7nIz1b7y3a3JL_Nto0=.738f3e4f-2fae-49a6-9668-047ebb4abc66@github.com> Message-ID: On Mon, 6 Feb 2023 09:44:27 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Can't directly compare uintptr_t, need reinterpret_cast Fixed the change to 0, awaiting GHA results. ------------- PR: https://git.openjdk.org/jdk/pull/12317 From jsjolen at openjdk.org Fri Feb 10 10:09:05 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 10 Feb 2023 10:09:05 GMT Subject: RFR: JDK-8301494: Replace NULL with nullptr in cpu/arm [v2] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Fix dholmes' comment

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12322/files
  - new: https://git.openjdk.org/jdk/pull/12322/files/c3053e5e..1849c4c1

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=12322&range=01
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12322&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12322.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12322/head:pull/12322

PR: https://git.openjdk.org/jdk/pull/12322

From egahlin at openjdk.org Fri Feb 10 11:21:30 2023
From: egahlin at openjdk.org (Erik Gahlin)
Date: Fri, 10 Feb 2023 11:21:30 GMT
Subject: RFR: 8257967: JFR: Event for loaded agents
Message-ID:

Could I have a review of an event for native and Java agents?

Testing:
 - tier1
 - tier2
 - jdk/jdk/jfr/*
 - 100 * TestLoadedAgent.java

Rationale for event fields:
 - name: to identify problematic third party agents
 - options: to identify problematic options, such as too generic filters or conflicting port numbers, that could impact application behavior
 - dynamic: if an agent was loaded by a user using jcmd, which could explain why a problem only occurs on some server instances
 - java: if the agent is native, it could explain crashes
 - loadTime: to understand if an application problem correlates with the time the agent was loaded

Alternatives:
 - I considered making a non-periodic event that is emitted when the agent is loaded, but agents loaded at startup are then unlikely to make it into the recording.
 - If it is a JPLIS agent, the jar name is used instead of "instrument". This is likely what users expect when using the VirtualMachine::loadAgent(name, options) API or -javaagent:name=options
 - I considered making all accesses to the agentLibrary list protected by a mutex, but it could potentially lead to deadlocks when non-JFR code iterates over the list.
I think this is better fixed as a separate issue, if deemed necessary. So far so good. JFR iterates the list at every chunk rotation and not from the attach thread, so the risk for an unsafe access is real and synchronization is needed.
 - When a JPLIS agent was loaded by attach, it wasn't detected properly, so I added a name check so I could pass true to the constructor and make TestLoadedAgent pass. It would be possible to do all detection of JPLIS using the name "instrument" and not rely on a boolean in the constructor, but I deemed it outside the scope of the enhancement.
 - I considered using Ticks::now() instead of os::javaTimeMillis() as time source, but TSC is not initialized this early. It could perhaps be fixed by moving the initialization earlier, but it might have other side effects, and is better done outside this enhancement.

Thanks
Erik

-------------

Commit messages:
 - Change description
 - Change description
 - Use os::javaTimeMillis()
 - Use jio_snprintf
 - Remove duration
 - Remove printf
 - Remove test of smiley
 - Remove smiley test as it doesn't work on Windows
 - Cleanup
 - Rename test
 - ...
and 2 more: https://git.openjdk.org/jdk/compare/27126157...ec990b5d

Changes: https://git.openjdk.org/jdk/pull/12460/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12460&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8257967
  Stats: 290 lines in 13 files changed: 283 ins; 0 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/12460.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12460/head:pull/12460

PR: https://git.openjdk.org/jdk/pull/12460

From eosterlund at openjdk.org Fri Feb 10 13:25:41 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Fri, 10 Feb 2023 13:25:41 GMT
Subject: RFR: 8301874: BarrierSetC2 should assign barrier data to stores
In-Reply-To:
References:
Message-ID: <-YzZSLiZWUqb517g8xkLo9QPUNRR8N9orpW0eqjISd0=.b4c7b534-0692-4078-b23f-317f3a032a49@github.com>

On Fri, 10 Feb 2023 09:31:16 GMT, Roberto Castañeda Lozano wrote:

> Looks good.

Thanks for the review!

-------------

PR: https://git.openjdk.org/jdk/pull/12442

From stuefe at openjdk.org Fri Feb 10 13:51:44 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Fri, 10 Feb 2023 13:51:44 GMT
Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute
In-Reply-To:
References: <6buyk4KbjXwzeU6j7bkZgZ72S1lSe5uJgEp9eLJp12I=.3f45afa2-8e7a-4d9b-9fcc-dcf77f496212@github.com>
Message-ID:

On Fri, 10 Feb 2023 08:41:54 GMT, Kim Barrett wrote:

> > This sounds useful.
> > I try to think of cases where the fact that the compiler optimizes away follow-up instructions could be surprising. But with noreturn the compiler should warn, right, about non-reachable instructions?
>
> I'm sure there is code that is rendered dead by making the assertion failure reporting functions noreturn, but no warnings were given by any compiler I tried.

That is regrettable, but noreturn sounds useful nevertheless.
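As a small illustration of the attribute under discussion, here is a sketch only: `report_failure` and `checked_divide` are hypothetical names, not HotSpot functions. Marking a failure reporter `[[noreturn]]` tells the compiler that control never comes back from the call, so a non-void function may end with that call without a missing-return diagnostic.

```c++
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an assertion-failure reporter (not HotSpot's own).
// [[noreturn]] promises that this function never returns to its caller.
[[noreturn]] void report_failure(const char* msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Hypothetical caller. Without [[noreturn]] on report_failure, compilers may
// warn that control can reach the end of this non-void function.
int checked_divide(int a, int b) {
  if (b != 0) {
    return a / b;
  }
  report_failure("division by zero");
}
```

Whether a compiler also reports statements placed after such a call as unreachable varies; as noted above, none of the compilers Kim tried did.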
-------------

PR: https://git.openjdk.org/jdk/pull/12507

From jsjolen at openjdk.org Fri Feb 10 14:01:53 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 14:01:53 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID:

On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!

Passes tier1.
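The comment convention from the PR description might look like this in practice. This is a sketch with made-up names (`Node`, `last`, `which`); the two `which` overloads are an added illustration of why a typed `nullptr` is generally preferred over an integer-like `NULL`, a point the PR itself does not discuss.

```c++
// Hypothetical singly linked list node, for illustration only.
struct Node {
  int value;
  Node* next;
};

// Returns the last node of the list, or null if the list is empty.
// (Prose comment: lower-case "null".)
Node* last(Node* head) {
  // The loop exits when head == nullptr || head->next == nullptr.
  // (Code expression inside a comment: keep "nullptr".)
  while (head != nullptr && head->next != nullptr) {
    head = head->next;
  }
  return head;
}

// nullptr has its own type (std::nullptr_t), which converts implicitly to
// pointer types but not to int, so which(nullptr) picks the pointer overload.
int which(int)         { return 1; }
int which(const void*) { return 2; }
```

Calling `which(0)` selects the `int` overload, while `which(nullptr)` selects the pointer overload.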
-------------

PR: https://git.openjdk.org/jdk/pull/12326

From jsjolen at openjdk.org Fri Feb 10 14:02:51 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 14:02:51 GMT
Subject: RFR: JDK-8301480: Replace NULL with nullptr in os/posix [v2]
In-Reply-To: <5A0_x2-25DljN6avzpJhMvCQdL5TfVDD4A9dJ0w8VvM=.3c49b7f0-4c37-468f-8d7d-386283738dc6@github.com>
References: <5A0_x2-25DljN6avzpJhMvCQdL5TfVDD4A9dJ0w8VvM=.3c49b7f0-4c37-468f-8d7d-386283738dc6@github.com>
Message-ID: <4fe7OTBXCP1oMOAmzDtNwotynVMibKqOSVSYYW7DBfg=.e9801284-4448-4efa-9926-00f3d0306ee5@github.com>

On Mon, 6 Feb 2023 09:39:17 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/posix. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix dholmes' bug

Passes tier1, integrating.
-------------

PR: https://git.openjdk.org/jdk/pull/12316

From coleenp at openjdk.org Fri Feb 10 14:02:54 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 10 Feb 2023 14:02:54 GMT
Subject: RFR: JDK-8301494: Replace NULL with nullptr in cpu/arm [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 10:09:05 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix dholmes' comment

Marked as reviewed by coleenp (Reviewer).
-------------

PR: https://git.openjdk.org/jdk/pull/12322

From jsjolen at openjdk.org Fri Feb 10 14:04:08 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 14:04:08 GMT
Subject: RFR: JDK-8301224: Replace NULL with nullptr in share/gc/shared/ [v4]
In-Reply-To:
References:
Message-ID:

On Tue, 31 Jan 2023 12:26:43 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shared/. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>
>  - Merge branch 'master' into JDK-8301224
>  - Another fix
>  - Fix kimbarrett's suggestions
>  - Fix
>  - Fix
>  - Merge remote-tracking branch 'origin/master' into JDK-8301224
>  - Replace NULL with nullptr in share/gc/shared/

Passes tier1.
-------------

PR: https://git.openjdk.org/jdk/pull/12249

From jsjolen at openjdk.org Fri Feb 10 14:04:10 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 14:04:10 GMT
Subject: Integrated: JDK-8301224: Replace NULL with nullptr in share/gc/shared/
In-Reply-To:
References:
Message-ID:

On Fri, 27 Jan 2023 10:06:26 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shared/. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!

This pull request has now been integrated.
Changeset: 1428db79
Author:    Johan Sjölen
URL:       https://git.openjdk.org/jdk/commit/1428db798c8b983c23b31001ce2964f174139fea
Stats:     610 lines in 89 files changed: 0 ins; 0 del; 610 mod

8301224: Replace NULL with nullptr in share/gc/shared/

Reviewed-by: stefank, kbarrett

-------------

PR: https://git.openjdk.org/jdk/pull/12249

From jsjolen at openjdk.org Fri Feb 10 14:06:01 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 10 Feb 2023 14:06:01 GMT
Subject: Integrated: JDK-8301480: Replace NULL with nullptr in os/posix
In-Reply-To:
References:
Message-ID:

On Tue, 31 Jan 2023 10:16:00 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/posix. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!

This pull request has now been integrated.
Changeset: 4539899c
Author:    Johan Sjölen
URL:       https://git.openjdk.org/jdk/commit/4539899c55c77771b951d005c17550ef9ac94819
Stats:     208 lines in 11 files changed: 0 ins; 0 del; 208 mod

8301480: Replace NULL with nullptr in os/posix

Reviewed-by: coleenp, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/12316

From eosterlund at openjdk.org Fri Feb 10 14:52:50 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Fri, 10 Feb 2023 14:52:50 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To:
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID: <80z82wYv0tBQ01_8GW968AAc_2-nJiDWC5-A_qd88Pc=.cc2ca19c-d2e3-4e8d-8714-96d058a835b3@github.com>

On Fri, 10 Feb 2023 07:38:32 GMT, Fei Yang wrote:

> Updated RISC-V part looks good. Thanks.

Thanks!

-------------

PR: https://git.openjdk.org/jdk/pull/12443

From eosterlund at openjdk.org Fri Feb 10 14:55:47 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Fri, 10 Feb 2023 14:55:47 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To:
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID: <5ehcfAXDdQU5OyM3PRrG4JxhAIC9936jJZmYuGM0es4=.b5ca7b4a-2b64-4242-a075-bca1f8675579@github.com>

On Thu, 9 Feb 2023 15:32:39 GMT, Coleen Phillimore wrote:

> This change looks nice. Did you run tier8 or tier1 tests with -XX:+VerifyOops? I thought we ran some general testing in tier3 with -XX:+VerifyOops but we don't seem to be doing that. Thanks.

Thanks for the review! I have run gc-test-suite locally on both an AArch64 machine and x86_64 machine, with -XX:+VerifyOops with ZGC and G1, and it looks fine. I think it exercises verify oops pretty well. Do you want me to run tier8 as well?
Personally, I hesitate a bit to run off a tier8 job for this change if gc-test-suite is happy with -XX:+VerifyOops.

-------------

PR: https://git.openjdk.org/jdk/pull/12443

From coleenp at openjdk.org Fri Feb 10 15:43:47 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 10 Feb 2023 15:43:47 GMT
Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3]
In-Reply-To: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com>
Message-ID:

On Wed, 8 Feb 2023 10:19:20 GMT, Erik Österlund wrote:

>> In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC). This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   RISC-V fixes

Tier8 is unnecessary as you've done general testing with it.

-------------

PR: https://git.openjdk.org/jdk/pull/12443

From dcubed at openjdk.org Fri Feb 10 15:58:41 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 10 Feb 2023 15:58:41 GMT
Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote:

> Please review this change to permit the use of noreturn attributes in HotSpot
> code.
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf
> https://en.cppreference.com/w/cpp/language/attributes/noreturn
>
> This will permit such changes as marking failed assertions and the like as
> noreturn, permitting the compiler to generate better code in some cases.
This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Thumbs up. `appertains` is my new word the day! Thanks. ------------- Marked as reviewed by dcubed (Reviewer). 
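For readers following the noreturn proposal quoted above, here is a minimal sketch of the effect being described. It is not HotSpot code: `report_fatal` and `bucket_for` are hypothetical names made up for illustration, and the only language feature assumed is the standard C++11 `[[noreturn]]` attribute. The point is that once the failure reporter is marked noreturn, the compiler knows control cannot continue past the call, so no dummy return (and no "control reaches end of non-void function" warning) is needed.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure reporter, standing in for the kind of function that
// backs a failed assertion or guarantee. Marked [[noreturn]] because it
// always terminates the process via std::abort().
[[noreturn]] void report_fatal(const char* file, int line, const char* msg) {
  std::fprintf(stderr, "%s:%d: fatal: %s\n", file, line, msg);
  std::abort();
}

// Hypothetical caller. Every path either returns a value or calls a
// noreturn function, so the compiler does not need a trailing "return".
int bucket_for(int kind) {
  switch (kind) {
    case 0: return 10;
    case 1: return 20;
  }
  report_fatal(__FILE__, __LINE__, "unknown kind");
}
```

Without the attribute on `report_fatal`, compilers commonly warn that `bucket_for` can fall off the end without returning a value, which appears to be the same family of warning the proposal mentions in connection with JDK-8294031.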
PR: https://git.openjdk.org/jdk/pull/12507 From eosterlund at openjdk.org Fri Feb 10 16:33:45 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 10 Feb 2023 16:33:45 GMT Subject: RFR: 8300255: Introduce interface for GC oop verification in the assembler [v3] In-Reply-To: References: <4CCmelM9IsYigYOrhH308nUuL-e4MybxRpmN19zJUu8=.c4deaa3e-c721-4870-93ba-8c99089e4d97@github.com> Message-ID: On Fri, 10 Feb 2023 15:40:59 GMT, Coleen Phillimore wrote: > Tier8 is unnecessary as you've done general testing with it. Okay, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12443 From dcubed at openjdk.org Fri Feb 10 17:24:48 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 10 Feb 2023 17:24:48 GMT Subject: RFR: 8301819: Enable continuations code by default [v3] In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 09:42:03 GMT, Alan Bateman wrote: >> The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. >> >> Testing: tier1-tier6 > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: > > - Remove include > - Merge > - Merge > - Update copyright date > - Initial commit Marked as reviewed by dcubed (Reviewer). src/hotspot/share/runtime/continuation.hpp line 42: > 40: public: > 41: static void init(); > 42: static bool enabled(); Based on the comment, I was expecting this line to be removed. I'll take it that there's still a way to disable Continuations even after virtual threads come out of "Preview"? Update: I see the `VMContinuations` option is still present and experimental. When will it become non-experimental or removed? ------------- PR: https://git.openjdk.org/jdk/pull/12459 From dcubed at openjdk.org Fri Feb 10 17:24:49 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 10 Feb 2023 17:24:49 GMT Subject: RFR: 8301819: Enable continuations code by default [v3] In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 17:15:53 GMT, Daniel D. Daugherty wrote: >> Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Remove include >> - Merge >> - Merge >> - Update copyright date >> - Initial commit > > src/hotspot/share/runtime/continuation.hpp line 42: > >> 40: public: >> 41: static void init(); >> 42: static bool enabled(); > > Based on the comment, I was expecting this line to be removed. > I'll take it that there's still a way to disable Continuations even after virtual > threads come out of "Preview"? > > Update: I see the `VMContinuations` option is still present and experimental. > When will it become non-experimental or removed? And now I'm mulling why `VMContinuations` doesn't a default value... ------------- PR: https://git.openjdk.org/jdk/pull/12459 From dcubed at openjdk.org Fri Feb 10 17:24:51 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Fri, 10 Feb 2023 17:24:51 GMT Subject: RFR: 8301819: Enable continuations code by default [v3] In-Reply-To: References: Message-ID: <0ugQxcSNTSfBwJcZ6Kdp4yOlqzd4mqKG2O2WM0r8S6k=.56d9374d-0c7f-417c-a6c8-2246dc35e3bb@github.com> On Fri, 10 Feb 2023 17:18:31 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/runtime/continuation.hpp line 42: >> >>> 40: public: >>> 41: static void init(); >>> 42: static bool enabled(); >> >> Based on the comment, I was expecting this line to be removed. >> I'll take it that there's still a way to disable Continuations even after virtual >> threads come out of "Preview"? >> >> Update: I see the `VMContinuations` option is still present and experimental. >> When will it become non-experimental or removed? > > And now I'm mulling why `VMContinuations` doesn't a default value... Ohhh... that's right. It's not implemented on all platforms yet... got it... ------------- PR: https://git.openjdk.org/jdk/pull/12459 From lucy at openjdk.org Fri Feb 10 17:28:59 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 10 Feb 2023 17:28:59 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> Message-ID: On Fri, 3 Feb 2023 08:30:47 GMT, Aleksey Shipilev wrote: >> Hi @shipilev , @theRealAph >> >> As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. >> >> I'm pinging you here, because I've seen a patch for build-fix for optmized-build in past from you. >> form shipilev : https://github.com/openjdk/jdk/pull/10743 (for s390) >> from theRealAph: https://github.com/openjdk/jdk/pull/9103 (for arch-64). 
>> >> As I'm new to OpenJDK, so it would be helpful if you could share some insight about this fix or any other way to fix the build, before I mark this PR "ready for review". >> >> Thank you so much. > >> As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. > > I was confused a bit about this, because my own S390X optimized builds build fine: > https://builds.shipilev.net/openjdk-jdk/build-logs/build-openjdk-jdk-linux-s390x-server-optimized-gcc10-glibc2.31.log > > So, current code is supposed to build fine. I think it is arguably correct to protect the asserts with `ASSERT`/`DEBUG` rather than `PRODUCT`, which would hide them in the optimized builds. But I look at the original issue, and the optimized build is actually failing on the assert, which is why my (cross-compiled) builds are fine! See: > > > A fatal error has been detected by the Java Runtime Environment: > > Internal Error (src/hotspot/cpu/s390/macroAssembler_s390.cpp:5498), pid=3076078, tid=3076083 > guarantee(false) failed: Z assembly code requires stop: killed Z_R14 > > > So, this change only hides the issue, not fixes it? If so, I don't think this is a good way to solve the issue. > I suggest you `git bisect` and identify which change introduced this failure, and work from there. > It also looks from @shipilev's comment, that s390x cross-compilation is working. > The cross-compile will not use the JVM it created to continue with the build. How would you run a s390x JVM on x64? 
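As background to the ASSERT versus PRODUCT point raised earlier in this message, the following is a small sketch of how the two guards behave differently. It rests on an assumption that is not spelled out in the thread: that debug builds define `ASSERT`, product builds define `PRODUCT`, and an "optimized" build defines neither. The function names are hypothetical and exist only for this illustration.

```cpp
#include <cstring>

// Assumption (not stated in the thread): ASSERT is defined only in debug
// builds, PRODUCT only in product builds, and "optimized" defines neither.

// True when code guarded by "#ifdef ASSERT" would be compiled in.
constexpr bool guarded_by_ifdef_assert() {
#ifdef ASSERT
  return true;
#else
  return false;
#endif
}

// True when code guarded by "#ifndef PRODUCT" would be compiled in.
constexpr bool guarded_by_ifndef_product() {
#ifndef PRODUCT
  return true;
#else
  return false;
#endif
}

// Names the build flavor implied by the two macros under the assumption above.
constexpr const char* build_flavor() {
#if defined(ASSERT)
  return "debug";
#elif defined(PRODUCT)
  return "product";
#else
  return "optimized";
#endif
}
```

Compiled with neither macro defined, `build_flavor()` yields "optimized", `guarded_by_ifndef_product()` is true and `guarded_by_ifdef_assert()` is false. Under that assumption, a check wrapped in `#ifndef PRODUCT` still runs in an optimized build, while the same check wrapped in `#ifdef ASSERT` is compiled out, which is why switching guards can make a failing check disappear without fixing its cause.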
------------- PR: https://git.openjdk.org/jdk/pull/12400 From lucy at openjdk.org Fri Feb 10 17:29:03 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 10 Feb 2023 17:29:03 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 04:07:39 GMT, Amit Kumar wrote: >> src/hotspot/cpu/s390/c1_Runtime1_s390.cpp line 3: >> >>> 1: /* >>> 2: * Copyright (c) 2016, 2023, Oracle and/or its affiliates. All rights reserved. >>> 3: * Copyright (c) 2016, 2023, SAP SE. All rights reserved. >> >> As I understand it, only SAPers should touch the SAP copyright header. I recognize that this is a bit confusing because everyone updates the Oracle header. >> >> If @RealLucy confirms this to be true, please keep the Oracle line update, and revert the SAP line. > > would you suggest that we add IBM copyright header here as well ? Here is the "common practice" for updating copyright headers: - Everybody updates the Oracle copyright line if there was a change, however small, in the file. - Everybody can (but doesn't have to) update the SAP copyright line if there was a change. - Please note: the SAP copyright does not have a comma after the last year. - Adding the own company's copyright line is regarded acceptable only in case of major contributions/changes. ------------- PR: https://git.openjdk.org/jdk/pull/12400 From iklam at openjdk.org Fri Feb 10 17:32:10 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 10 Feb 2023 17:32:10 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. 
Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year Even in JDK 11, a lambda proxy classes that's referenced by the cpCache (i.e., from a resolved invokedynamic instruction associated with a lamda expression) is always kept alive. See test below. So if I understand correctly, this patch will not affect lamda expressions in Java source code. It affects only direct calls to `LambdaMetafactory.metafactory()`. Is this correct? 
public class LambdaGC { public static void main(String[] args) throws Throwable { System.out.println("Entering LambdaGC"); doit(() -> { Thread.dumpStack(); }); for (int i = 0; i < 10; i++) { System.gc(); } System.out.println("Finish LambdaGC"); } static void doit(Runnable r) { r.run(); } } $ java11 -cp . -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames -Xlog:class+load -Xlog:class+unload LambdaGC | grep LambdaGC [0.022s][info][class,load] LambdaGC source: file:/jdk3/tmp/ Entering LambdaGC [0.024s][info][class,load] LambdaGC$$Lambda$1/0x0000000840060840 source: LambdaGC java.lang.Exception: Stack trace at java.base/java.lang.Thread.dumpStack(Thread.java:1387) at LambdaGC.lambda$main$0(LambdaGC.java:5) at LambdaGC$$Lambda$1/0x0000000840060840.run(:1000000) at LambdaGC.doit(LambdaGC.java:13) at LambdaGC.main(LambdaGC.java:4) Finish LambdaGC ------------- PR: https://git.openjdk.org/jdk/pull/12493 From alanb at openjdk.org Fri Feb 10 17:34:25 2023 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 10 Feb 2023 17:34:25 GMT Subject: RFR: 8301819: Enable continuations code by default [v3] In-Reply-To: <0ugQxcSNTSfBwJcZ6Kdp4yOlqzd4mqKG2O2WM0r8S6k=.56d9374d-0c7f-417c-a6c8-2246dc35e3bb@github.com> References: <0ugQxcSNTSfBwJcZ6Kdp4yOlqzd4mqKG2O2WM0r8S6k=.56d9374d-0c7f-417c-a6c8-2246dc35e3bb@github.com> Message-ID: On Fri, 10 Feb 2023 17:19:20 GMT, Daniel D. Daugherty wrote: >> And now I'm mulling why `VMContinuations` doesn't a default value... > > Ohhh... that's right. It's not implemented on all platforms yet... got it... That's right. Additionally, the alternative implementation of virtual threads can be tested by running tests with -XX:-VMContinuations, and we have a few tests that do this. 
------------- PR: https://git.openjdk.org/jdk/pull/12459 From jbhateja at openjdk.org Fri Feb 10 18:32:46 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 10 Feb 2023 18:32:46 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Thu, 9 Feb 2023 18:08:15 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add URL to microbenchmark Thanks @asgibbons. ------------- Marked as reviewed by jbhateja (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12126 From mchung at openjdk.org Fri Feb 10 19:03:45 2023 From: mchung at openjdk.org (Mandy Chung) Date: Fri, 10 Feb 2023 19:03:45 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <6ILJ5r3w184aCx2hNc7k-khRUpF6tgzmjAr6mubrXfk=.90bda6f0-274e-4631-be22-503e5c954ff6@github.com> On Fri, 10 Feb 2023 17:29:37 GMT, Ioi Lam wrote: > Even in JDK 11, a lambda proxy classes that's referenced by the cpCache (i.e., from a resolved invokedynamic instruction associated with a lamda expression) is always kept alive. Thanks @iklam. That's exactly what I am trying to confirm. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From simonis at openjdk.org Fri Feb 10 19:03:47 2023 From: simonis at openjdk.org (Volker Simonis) Date: Fri, 10 Feb 2023 19:03:47 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> On Fri, 10 Feb 2023 17:29:37 GMT, Ioi Lam wrote: >> Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader >> - Removed unused import of STRONG und updated copyright year > > Even in JDK 11, a lambda proxy classes that's referenced by the cpCache (i.e., from a resolved invokedynamic instruction associated with a lamda expression) is always kept alive. See test below. > > So if I understand correctly, this patch will not affect lamda expressions in Java source code. It affects only direct calls to `LambdaMetafactory.metafactory()`. Is this correct? 
> > > public class LambdaGC { > public static void main(String[] args) throws Throwable { > System.out.println("Entering LambdaGC"); > doit(() -> { > Thread.dumpStack(); > }); > for (int i = 0; i < 10; i++) { > System.gc(); > } > System.out.println("Finish LambdaGC"); > } > static void doit(Runnable r) { > r.run(); > } > } > > $ java11 -cp . -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames -Xlog:class+load -Xlog:class+unload LambdaGC | grep LambdaGC > [0.022s][info][class,load] LambdaGC source: file:/jdk3/tmp/ > Entering LambdaGC > [0.024s][info][class,load] LambdaGC$$Lambda$1/0x0000000840060840 source: LambdaGC > java.lang.Exception: Stack trace > at java.base/java.lang.Thread.dumpStack(Thread.java:1387) > at LambdaGC.lambda$main$0(LambdaGC.java:5) > at LambdaGC$$Lambda$1/0x0000000840060840.run(:1000000) > at LambdaGC.doit(LambdaGC.java:13) > at LambdaGC.main(LambdaGC.java:4) > Finish LambdaGC @iklam, I think your understanding is correct. While the bootstrap methods for Java Lambdas do call `LambdaMetafactory.metafactory()`, they store the resulting call site (for Lambdas a `BoundMethodHandle`) in the appendix slot of the constant pool cache entry of the invokedynamic bytecode. The `BoundMethodHandle` contains a reference to an instance of the generated lambda form (i.e. in your example `LambdaGC$$Lambda$1`). This is enough in order to keep `LambdaGC$$Lambda$1` alive and prevent its unloading. Until now, `LambdaGC$$Lambda$1` was also strongly linked to its defining class loader, but I don't think that's necessary to keep it alive (because it is referenced from the call site which is referenced from the constant pool cache). Running your example with `-Xlog:indy+methodhandles=debug` confirms this: set_method_handle bc=186 appendix=0x000000062b821838 method=0x00000008000c7698 (local signature) {method} - this oop: 0x00000008000c7698 - method holder: 'java/lang/invoke/Invokers$Holder' ... 
appendix: java.lang.invoke.BoundMethodHandle$Species_L {0x000000062b821838} - klass: 'java/lang/invoke/BoundMethodHandle$Species_L' - ---- fields (total size 5 words): ... - final 'argL0' 'Ljava/lang/Object;' @32 a 'LambdaGC$$Lambda$1+0x0000000801000400'{0x000000062b81e080} (0xc5703c10) ------------- PR: https://git.openjdk.org/jdk/pull/12493 From coleenp at openjdk.org Fri Feb 10 19:11:17 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 10 Feb 2023 19:11:17 GMT Subject: RFR: 8302108: Clean up placeholder supername code [v2] In-Reply-To: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: > Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. > Tested with tier1-4. Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Fix test for cleanup for nulling out supername when no longer used. 
- Merge branch 'master' into placeholder-cleanup - 8302108: Clean up placeholder supername code ------------- Changes: https://git.openjdk.org/jdk/pull/12495/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12495&range=01 Stats: 53 lines in 4 files changed: 13 ins; 24 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/12495.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12495/head:pull/12495 PR: https://git.openjdk.org/jdk/pull/12495 From mchung at openjdk.org Fri Feb 10 19:29:47 2023 From: mchung at openjdk.org (Mandy Chung) Date: Fri, 10 Feb 2023 19:29:47 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> References: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> Message-ID: On Fri, 10 Feb 2023 19:00:53 GMT, Volker Simonis wrote: > LambdaGC$$Lambda$1 was also strongly linked to its defining class loader, but I don't think that's necessary to keep it alive (because it is referenced from the call site which is referenced from the constant pool cache). True but it was decided at that time for JEP 371 to make it strong so that the lambda proxy classes will share the same `ClassLoaderData` metaspace as other classes defined by the same loader. This reduces the memory footprint consumption. The question is: what are the use cases for direct call to `LambdaMetafactory` that existing applications depend on? ------------- PR: https://git.openjdk.org/jdk/pull/12493 From pchilanomate at openjdk.org Fri Feb 10 19:36:28 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 10 Feb 2023 19:36:28 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation Message-ID: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Please review the following change. 
Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime, PopFrame, ForceEarlyReturnXXX) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual threads but didn't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). I added a new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore.
I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/12512/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8300575 Stats: 560 lines in 63 files changed: 467 ins; 74 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From kbarrett at openjdk.org Fri Feb 10 19:44:43 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 10 Feb 2023 19:44:43 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: <2oyyI0MX0u8cfZIQVzQC-sX6UVUHfDl2osC6NeCW9OQ=.b95420fa-1ba8-4408-b10f-b54d8785ebc6@github.com> On Fri, 10 Feb 2023 15:56:08 GMT, Daniel D. Daugherty wrote: > `appertains` is my new word the day! Thanks. It's not a frequently used part of my vocabulary either, but it's the word used by the Standard. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From phh at openjdk.org Fri Feb 10 21:20:02 2023 From: phh at openjdk.org (Paul Hohensee) Date: Fri, 10 Feb 2023 21:20:02 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:56:10 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm. ------------- Marked as reviewed by phh (Reviewer). PR: https://git.openjdk.org/jdk/pull/12489 From duke at openjdk.org Fri Feb 10 21:20:44 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 10 Feb 2023 21:20:44 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 Message-ID: Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. 
The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. The following test has passed. test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java The throughput reported by the micro benchmark is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes. | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 | | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | | CRC32 improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% | | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 | | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | | CRC32 improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% | Baseline TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms Crypto pmull TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ±
4.583 ops/ms TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms ------------- Commit messages: - Remove CRC32-C - Support CRC32-C - Merge master - Add microbenchmark TestCRC32 - Change code alignment - Separate code paths - Enable on Neoverse V1 - Disable the optimization by default - Reduce data dependency before load - PMULL Changes: https://git.openjdk.org/jdk/pull/12480/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12480&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302113 Stats: 214 lines in 5 files changed: 213 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12480.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12480/head:pull/12480 PR: https://git.openjdk.org/jdk/pull/12480 From eastigeevich at openjdk.org Fri Feb 10 21:20:50 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 10 Feb 2023 21:20:50 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai wrote: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1.
As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by the micro benchmark is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes. > > | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 | > | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | > | CRC32 improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% | > > | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 | > | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | > | CRC32 improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% | > > > Baseline > > TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms > TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms > TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms > TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms > TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms > TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms > TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms > TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms > TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms > TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms > TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms > TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms > TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms > > > Crypto pmull > > TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms > TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms > TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ±
3.150 ops/ms > TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ? 2.110 ops/ms > TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ? 69.911 ops/ms > TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ? 21.983 ops/ms > TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ? 90.610 ops/ms > TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ? 511.416 ops/ms > TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ? 9.713 ops/ms > TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ? 6.121 ops/ms > TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ? 1.270 ops/ms > TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ? 0.119 ops/ms > TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ? 0.242 ops/ms In the description, could you please combine data for the baseline and the optimized versions into a table? It's hard to compare them as they are now. How was the correctness of the implementation tested? What a microbenchmark did you use? It is not mentioned in the description. There is `test/micro/org/openjdk/bench/java/util/TestCRC32C.java`. Should we add a similar benchmark for CRC32? > [Support CRC32-C](https://github.com/openjdk/jdk/pull/12480/commits/1cc1e55faf3bb5fcffb5fb7cc8851f25ea208aa4) Could you please create a separate PR for CRC32C? Small PRs are easy to review and to merge/revert. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From duke at openjdk.org Fri Feb 10 21:21:05 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 10 Feb 2023 21:21:05 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 11:37:16 GMT, Evgeny Astigeevich wrote: > How was the correctness of the implementation tested? What a microbenchmark did you use? It is not mentioned in the description. There is `test/micro/org/openjdk/bench/java/util/TestCRC32C.java`. Should we add a similar benchmark for CRC32? The micro benchmark has been added recently. 
The performance measurement is slightly different, so the threshold is raised to 384 bytes. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From forax at univ-mlv.fr Fri Feb 10 22:18:47 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 10 Feb 2023 23:18:47 +0100 (CET) Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> References: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> Message-ID: <1628065858.20125180.1676067527849.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Volker Simonis" > To: "core-libs-dev" , "hotspot-dev" > Sent: Friday, February 10, 2023 8:03:47 PM > Subject: Re: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] > On Fri, 10 Feb 2023 17:29:37 GMT, Ioi Lam wrote: > >>> Volker Simonis has updated the pull request incrementally with two additional >>> commits since the last revision: >>> >>> - Remove assertions which insist on Lambda proxy classes being strongly linked >>> to their class loader >>> - Removed unused import of STRONG und updated copyright year >> >> Even in JDK 11, a lambda proxy classes that's referenced by the cpCache (i.e., >> from a resolved invokedynamic instruction associated with a lamda expression) >> is always kept alive. See test below. >> >> So if I understand correctly, this patch will not affect lamda expressions in >> Java source code. It affects only direct calls to >> `LambdaMetafactory.metafactory()`. Is this correct? 
>> >> >> public class LambdaGC { >> public static void main(String[] args) throws Throwable { >> System.out.println("Entering LambdaGC"); >> doit(() -> { >> Thread.dumpStack(); >> }); >> for (int i = 0; i < 10; i++) { >> System.gc(); >> } >> System.out.println("Finish LambdaGC"); >> } >> static void doit(Runnable r) { >> r.run(); >> } >> } >> >> $ java11 -cp . -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames >> -Xlog:class+load -Xlog:class+unload LambdaGC | grep LambdaGC >> [0.022s][info][class,load] LambdaGC source: file:/jdk3/tmp/ >> Entering LambdaGC >> [0.024s][info][class,load] LambdaGC$$Lambda$1/0x0000000840060840 source: >> LambdaGC >> java.lang.Exception: Stack trace >> at java.base/java.lang.Thread.dumpStack(Thread.java:1387) >> at LambdaGC.lambda$main$0(LambdaGC.java:5) >> at LambdaGC$$Lambda$1/0x0000000840060840.run(:1000000) >> at LambdaGC.doit(LambdaGC.java:13) >> at LambdaGC.main(LambdaGC.java:4) >> Finish LambdaGC > > @iklam, I think your understanding is correct. While the bootstrap methods for > Java Lambdas do call `LambdaMetafactory.metafactory()`, they store the > resulting call site (for Lambdas a `BoundMethodHandle`) in the appendix slot of > the constant pool cache entry of the invokedynamic bytecode. The > `BoundMethodHandle` contains a reference to an instance of the generated lambda > form (i.e. in your example `LambdaGC$$Lambda$1`). This is enough in order to > keep `LambdaGC$$Lambda$1` alive and prevent its unloading. > > Until now, `LambdaGC$$Lambda$1` was also strongly linked to its defining class > loader, but I don't think that's necessary to keep it alive (because it is > referenced from the call site which is referenced from the constant pool > cache). 
> > Running your example with `-Xlog:indy+methodhandles=debug` confirms this: > > set_method_handle bc=186 appendix=0x000000062b821838 method=0x00000008000c7698 > (local signature) > {method} > - this oop: 0x00000008000c7698 > - method holder: 'java/lang/invoke/Invokers$Holder' > ... > appendix: java.lang.invoke.BoundMethodHandle$Species_L > {0x000000062b821838} - klass: 'java/lang/invoke/BoundMethodHandle$Species_L' > - ---- fields (total size 5 words): > ... > - final 'argL0' 'Ljava/lang/Object;' @32 a > 'LambdaGC$$Lambda$1+0x0000000801000400'{0x000000062b81e080} (0xc5703c10) Hi Volker, the main issue if the link is not STRONG is that the VM creates one classloader data per lambda, as Mandy said. If you have a library full of lambdas, this will consume a lot of C memory, which is always hard to diagnose. I remember people from Red Hat reverting library code that was 'lambdaified' because of that. So it's a kind of pick-your-poison situation, and I believe the current tradeoff (you need a classloader if you want to unload something, classes or lambdas, versus you may consume too much off-heap memory) is the good one. regards, Rémi > > ------------- > > PR: https://git.openjdk.org/jdk/pull/12493 Rémi From iveresov at openjdk.org Fri Feb 10 22:50:28 2023 From: iveresov at openjdk.org (Igor Veresov) Date: Fri, 10 Feb 2023 22:50:28 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases.
This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12507 From kvn at openjdk.org Fri Feb 10 22:53:29 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 10 Feb 2023 22:53:29 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86.
Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: > 301: jcc(Assembler::equal, L); > 302: stop("InterpreterMacroAssembler::call_VM_base:" > 303: " last_sp != nullptr"); `null` ------------- PR: https://git.openjdk.org/jdk/pull/12326 From redestad at openjdk.org Fri Feb 10 23:21:42 2023 From: redestad at openjdk.org (Claes Redestad) Date: Fri, 10 Feb 2023 23:21:42 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Thu, 9 Feb 2023 18:08:15 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ±
102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add URL to microbenchmark Marked as reviewed by redestad (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12126 From kbarrett at openjdk.org Sat Feb 11 00:24:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 11 Feb 2023 00:24:29 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:56:10 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp line 71: > 69: > 70: template > 71: inline void G1BarrierSet::write_ref_field_post(T* field) { So this function isn't applying the "same region" filter that's in the various other places. I wonder if that's intentional, or a bug? ------------- PR: https://git.openjdk.org/jdk/pull/12489 From kvn at openjdk.org Sat Feb 11 00:33:36 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 11 Feb 2023 00:33:36 GMT Subject: RFR: 8301874: BarrierSetC2 should assign barrier data to stores In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 15:12:03 GMT, Erik Österlund wrote: > BarrierSetC2 should assign barrier data to stores, like it does for loads and atomics. That way, a GC that uses barrier data for stores, can call the super class to generate the loads/stores, instead of copying that code. Good. ------------- Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12442 From sspitsyn at openjdk.org Sat Feb 11 01:37:34 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 11 Feb 2023 01:37:34 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v5] In-Reply-To: References: Message-ID: On Sat, 24 Dec 2022 04:14:07 GMT, Serguei Spitsyn wrote: >> Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. >> New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. >> >> Testing: >> mach5 jobs are TBD (preliminary testing was completed): >> - all JVMTI, JDWP, JDI and JDB tests have to be run >> - Kitchensink >> - tier5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix race between VTMS_transition_disable_for_one and start_VTMS_transition PING: one more review is needed! ------------- PR: https://git.openjdk.org/jdk/pull/11690 From sspitsyn at openjdk.org Sat Feb 11 01:43:07 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 11 Feb 2023 01:43:07 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v6] In-Reply-To: References: Message-ID: > Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. 
This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. > New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. > > Testing: > mach5 jobs are TBD (preliminary testing was completed): > - all JVMTI, JDWP, JDI and JDB tests have to be run > - Kitchensink > - tier5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: minor comments are resolved ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11690/files - new: https://git.openjdk.org/jdk/pull/11690/files/3effc27d..146c2a59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11690&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11690&range=04-05 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/11690.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11690/head:pull/11690 PR: https://git.openjdk.org/jdk/pull/11690 From kbarrett at openjdk.org Sat Feb 11 03:24:04 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 11 Feb 2023 03:24:04 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option Message-ID: Please review this change which removes the -XX:SuppressErrorAt develop option and the associated implementation. 
------------- Commit messages: - remove SuppressErrorAt Changes: https://git.openjdk.org/jdk/pull/12521/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12521&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302262 Stats: 224 lines in 8 files changed: 0 ins; 205 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/12521.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12521/head:pull/12521 PR: https://git.openjdk.org/jdk/pull/12521 From lmesnik at openjdk.org Sat Feb 11 04:22:32 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sat, 11 Feb 2023 04:22:32 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v6] In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 01:43:07 GMT, Serguei Spitsyn wrote: >> Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. >> New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. >> >> Testing: >> mach5 jobs are TBD (preliminary testing was completed): >> - all JVMTI, JDWP, JDI and JDB tests have to be run >> - Kitchensink >> - tier5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > minor comments are resolved Please update the copyrights. (I quickly checked a couple of files and they had an old year.) ------------- Marked as reviewed by lmesnik (Reviewer).
PR: https://git.openjdk.org/jdk/pull/11690 From stuefe at openjdk.org Sat Feb 11 05:43:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 11 Feb 2023 05:43:26 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. While I have used it rarely, I found it useful the few times I did. It's useful in situations where I analyzed a problem, ran into an assert, but obtaining a new build was difficult or just time-consuming - slow hardware, or a customer's machine. E.g., I used this switch when porting to AIX since rebuilding the VM took about an hour and some asserts could be waived. Quoting from [JDK-8302189](https://bugs.openjdk.org/browse/JDK-8302189): > "This also addresses the problems encountered in [JDK-8294031](https://bugs.openjdk.org/browse/JDK-8294031). Because the failure reporting function used by assert is not "noreturn", the compiler is analyzing the continuation from a failed assert as potentially having the values that would result in an assertion failure (in this case, a variable having a null value). Doing so, it discovers a use of that variable in a context where a null value is UB, and so issues a warning." I don't understand why this is not a problem for release builds. Relying on assert in this case is not sufficient anyway, you'd need to handle the nullptr in release builds too. > "This matters even for production, since some of the operations (such as `guarantee` and `ShouldNotReachHere`) are present in product builds, not just debug builds. " I can see that. But `guarantee` is rare, so maybe a compromise would be to make `guarantee` `noreturn` but leave `assert` as it is?
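For readers following the thread, the point quoted from JDK-8302189 can be illustrated with a minimal sketch. It assumes a failure reporter declared `[[noreturn]]`; the names `report_failure`, `SKETCH_ASSERT`, and `first_element` are hypothetical stand-ins and are not HotSpot's actual macros or functions.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure reporter. Because it is [[noreturn]], the compiler
// knows control never continues past a failed check.
[[noreturn]] void report_failure(const char* msg) {
  std::fprintf(stderr, "assertion failed: %s\n", msg);
  std::abort();
}

// Hypothetical assertion macro (not HotSpot's assert/guarantee).
#define SKETCH_ASSERT(cond, msg) \
  do { if (!(cond)) report_failure(msg); } while (0)

int first_element(const int* p) {
  SKETCH_ASSERT(p != nullptr, "p must not be null");
  // With a noreturn reporter, p is known to be non-null here. If the
  // reporter could return, the failed branch would rejoin this path with
  // p == nullptr, which is the situation the quoted text describes as
  // leading to a compiler warning.
  return p[0];
}
```

Calling `first_element` with a valid array returns its first element; calling it with `nullptr` aborts inside `report_failure` instead of reaching the dereference.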
------------- PR: https://git.openjdk.org/jdk/pull/12521 From alanb at openjdk.org Sat Feb 11 06:23:35 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 11 Feb 2023 06:23:35 GMT Subject: Integrated: 8301819: Enable continuations code by default In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 18:37:03 GMT, Alan Bateman wrote: > The continuations code added in JDK 19 with JEP 425 is currently disabled by default and is only enabled when running with preview features enabled (`--enable-preview`). Disabled by default was the pragmatic choice at that time to reduce risk and possible performance regressions. We'd like to change this so the code is enabled by default on the platforms that have a port. Enabling this code by default helps to shake out any issues early in JDK 21 and sets things up to make virtual threads a permanent feature in the future. If something comes up in the next few weeks/months then it can be changed to be disabled by default again. This change has no impact on the builds that do not have a port. > > Testing: tier1-tier6 This pull request has now been integrated. 
Changeset: 74b167b2 Author: Alan Bateman URL: https://git.openjdk.org/jdk/commit/74b167b23d1eb4b6685e03caaf2e1567525b9800 Stats: 6 lines in 2 files changed: 0 ins; 2 del; 4 mod 8301819: Enable continuations code by default Reviewed-by: kvn, dholmes, dcubed ------------- PR: https://git.openjdk.org/jdk/pull/12459 From stuefe at openjdk.org Sat Feb 11 06:34:27 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 11 Feb 2023 06:34:27 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> References: <80S5iALXtwJyXDKAqu3ecLSY3FePCHYr_Vez_uoRs4Y=.c0256514-454b-4744-98f2-9d9106e981c4@github.com> Message-ID: <6YJK-1ezod9M_aR4bHS6Mesll8QWNjpDXmFj1YA42B0=.c363a6e1-b9e1-4f9c-8711-3f8fc0cf2d14@github.com> On Fri, 10 Feb 2023 19:00:53 GMT, Volker Simonis wrote: >> Even in JDK 11, a lambda proxy classes that's referenced by the cpCache (i.e., from a resolved invokedynamic instruction associated with a lamda expression) is always kept alive. See test below. >> >> So if I understand correctly, this patch will not affect lamda expressions in Java source code. It affects only direct calls to `LambdaMetafactory.metafactory()`. Is this correct? >> >> >> public class LambdaGC { >> public static void main(String[] args) throws Throwable { >> System.out.println("Entering LambdaGC"); >> doit(() -> { >> Thread.dumpStack(); >> }); >> for (int i = 0; i < 10; i++) { >> System.gc(); >> } >> System.out.println("Finish LambdaGC"); >> } >> static void doit(Runnable r) { >> r.run(); >> } >> } >> >> $ java11 -cp . 
-XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames -Xlog:class+load -Xlog:class+unload LambdaGC | grep LambdaGC >> [0.022s][info][class,load] LambdaGC source: file:/jdk3/tmp/ >> Entering LambdaGC >> [0.024s][info][class,load] LambdaGC$$Lambda$1/0x0000000840060840 source: LambdaGC >> java.lang.Exception: Stack trace >> at java.base/java.lang.Thread.dumpStack(Thread.java:1387) >> at LambdaGC.lambda$main$0(LambdaGC.java:5) >> at LambdaGC$$Lambda$1/0x0000000840060840.run(:1000000) >> at LambdaGC.doit(LambdaGC.java:13) >> at LambdaGC.main(LambdaGC.java:4) >> Finish LambdaGC > > @iklam, I think your understanding is correct. While the bootstrap methods for Java Lambdas do call `LambdaMetafactory.metafactory()`, they store the resulting call site (for Lambdas a `BoundMethodHandle`) in the appendix slot of the constant pool cache entry of the invokedynamic bytecode. The `BoundMethodHandle` contains a reference to an instance of the generated lambda form (i.e. in your example `LambdaGC$$Lambda$1`). This is enough in order to keep `LambdaGC$$Lambda$1` alive and prevent its unloading. > > Until now, `LambdaGC$$Lambda$1` was also strongly linked to its defining class loader, but I don't think that's necessary to keep it alive (because it is referenced from the call site which is referenced from the constant pool cache). > > Running your example with `-Xlog:indy+methodhandles=debug` confirms this: > > set_method_handle bc=186 appendix=0x000000062b821838 method=0x00000008000c7698 (local signature) > {method} > - this oop: 0x00000008000c7698 > - method holder: 'java/lang/invoke/Invokers$Holder' > ... > appendix: java.lang.invoke.BoundMethodHandle$Species_L > {0x000000062b821838} - klass: 'java/lang/invoke/BoundMethodHandle$Species_L' > - ---- fields (total size 5 words): > ... 
> - final 'argL0' 'Ljava/lang/Object;' @32 a 'LambdaGC$$Lambda$1+0x0000000801000400'{0x000000062b81e080} (0xc5703c10) @simonis This may be a stupid question, but how exactly does your patch help with Metaspace consumption? I mean, how is the deallocation path with the patch now? We still have the CLD loaded by the BootClassLoader, right? So it is still living in the metaspace arena of the bootclassloader, even if the Lambda class itself is collected earlier. How is this memory freed or reused? If there were a way to prematurely collect its metadata via Metaspace::deallocate, then it could be reused at least by the bootloader itself. But I don't see it. ----- >True but it was decided at that time for JEP 371 to make it strong so that the lambda proxy classes will share the same ClassLoaderData metaspace as other classes defined by the same loader. This reduces the memory footprint consumption. Expanding on what @mlchung wrote, this has been a nice effect of JEP 371. Metaspace intra-chunk fragmentation went down since all these pesky little CLDs went away. It also saved some C-heap as a side effect for the CLDs itself. Example, committed metaspace for 100k lambdas: JDK11: 271MB JDK15: (+JEP371): 221MB JDK17: (+JEP371 +JEP387): 189MB ------------- PR: https://git.openjdk.org/jdk/pull/12493 From alanb at openjdk.org Sat Feb 11 07:33:27 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 11 Feb 2023 07:33:27 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 10 Feb 2023 14:50:36 GMT, Patricio Chilano Mateo wrote: > Please review the following change. 
Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime, PopFrame, ForceEarlyReturnXXX) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. 
I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. > > Thanks, > Patricio src/hotspot/share/classfile/vmClassMacros.hpp line 92: > 90: do_klass(Thread_Constants_klass, java_lang_Thread_Constants ) \ > 91: do_klass(ThreadGroup_klass, java_lang_ThreadGroup ) \ > 92: do_klass(BasicVirtualThread_klass, java_lang_BaseVirtualThread ) \ It should be BaseVirtualThread_klass rather than BasicVirtualThread_klass (probably my fault when adding the alt implementation). ------------- PR: https://git.openjdk.org/jdk/pull/12512 From alanb at openjdk.org Sat Feb 11 07:40:26 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 11 Feb 2023 07:40:26 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 10 Feb 2023 14:50:36 GMT, Patricio Chilano Mateo wrote: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime, PopFrame, ForceEarlyReturnXXX) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests.
Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. > > Thanks, > Patricio I see that JVM_GetAllThreads will now return the filtered list (as an array). That looks okay. There's another part to this in the ThreadMXBean implementation where the filtering is done in the Java implementation. We can look at that in a follow-up issue, to avoid adding more to this PR. On tests, there are a few other tests that can be changed to drop `@requires vm.continuations`, e.g. the jshell tests that run with --enable-preview should no longer need the `@requires`. There are also a few JDI and at least one AppCDS test.
A search will find them. The tests re-running with -XX:-VMContinuations is fine but it does mean they will run twice on the ports without an implementation. The test runtime/jni/IsVirtualThread/IsVirtualThread.java is an example where it only runs once on these ports, at the cost of an additional `@test` tag. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From kbarrett at openjdk.org Sat Feb 11 09:45:25 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 11 Feb 2023 09:45:25 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 05:40:13 GMT, Thomas Stuefe wrote: > While I have used it rarely, I found it useful the few times I did. > > It's useful in situations where I analyzed a problem, ran into an assert, but obtaining a new build was difficult or just time-consuming - slow hardware, or a customer's machine. E.g., I used this switch when porting to AIX since rebuilding the VM took about an hour and some asserts could be waived. > > Quoting from [JDK-8302189](https://bugs.openjdk.org/browse/JDK-8302189): > > > "This also addresses the problems encountered in [JDK-8294031](https://bugs.openjdk.org/browse/JDK-8294031). Because the failure reporting function used by assert is not "noreturn", the compiler is analyzing the continuation from a failed assert as potentially having the values that would result in an assertion failure (in this case, a variable having a null value). Doing so, it discovers a use of that variable in a context where a null value is UB, and so issues a warning." > > I don't understand why this is not a problem for release builds. Relying on assert in this case is not sufficient anyway, you'd need to handle the nullptr in release builds too. In the example from JDK-8294031 the value shouldn't be null. But the existence of the not-null assertion indicates that it was null on the failed assertion branch of the conditional.
That branch doesn't terminate in a noreturn/unreachable black hole. The compiler can separately treat the post-assertion rejoin point as two different continuations, one of which definitely has a null value (because of the lack of the black hole). And that leads to discovering later code that is unhappy with that null value, and warning about it. That kind of splitting and value propagation is a normal thing that can happen for all kinds of value, not just checks for non-null. For example, an index in-bounds assertion could run into exactly the same issue. In a release build there isn't a null check, so the compiler has no information either way, and _assumes_ the code doesn't invoke UB, and doesn't check for it either. > > "This matters even for production, since some of the operations (such as `guarantee` and `ShouldNotReachHere`) are present in product builds, not just debug builds. " > > I can see that. But `guarantee` is rare, so maybe a compromise would be to make `guarantee` `noreturn` but leave `assert` as it is? Also `ShouldNotReachHere`, `fatal`, &etc. which are pretty common. ------------- PR: https://git.openjdk.org/jdk/pull/12521 From stuefe at openjdk.org Sat Feb 11 11:34:24 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 11 Feb 2023 11:34:24 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 09:42:17 GMT, Kim Barrett wrote: > > While I have used it rarely, I found it useful the few times I did. > > It's useful in situations where I analyzed a problem, ran into an assert, but obtaining a new build was difficult or just time-consuming - slow hardware, or a customer's machine. E.g., I used this switch when porting to AIX since rebuilding the VM took about an hour and some asserts could be waived.
> > Quoting from [JDK-8302189](https://bugs.openjdk.org/browse/JDK-8302189): > > > "This also addresses the problems encountered in [JDK-8294031](https://bugs.openjdk.org/browse/JDK-8294031). Because the failure reporting function used by assert is not "noreturn", the compiler is analyzing the continuation from a failed assert as potentially having the values that would result in an assertion failure (in this case, a variable having a null value). Doing so, it discovers a use of that variable in a context where a null value is UB, and so issues a warning." > > > > > > I don't understand why this is not a problem for release builds. Relying on assert in this case is not sufficient anyway, you'd need to handle the nullptr in release builds too. > > In the example from JDK-8294031 the value shouldn't be null. But the existence of the not-null assertion indicates that it was null on the failed assertion branch of the conditional. That branch doesn't terminate in a noreturn/unreachable black hole. The compiler can separately treat the post-assertion rejoin point as two different continuations, one of which definitely has a null value (because of the lack of the black hole). And that leads to discovering later code that is unhappy with that null value, and warning about it. That kind of splitting and value propagation is a normal thing that can happen for all kinds of value, not just checks for non-null. For example, an index in-bounds assertion could run into exactly the same issue. > > In a release build there isn't a null check, so the compiler has no information either way, and _assumes_ the code doesn't invoke UB, and doesn't check for it either. Interesting. Thanks for explaining this. So the fact that we set the assert triggers the compiler check. > > > > "This matters even for production, since some of the operations (such as `guarantee` and `ShouldNotReachHere`) are present in product builds, not just debug builds. " > > > > > > I can see that.
But `guarantee` is rare, so maybe a compromise would be to make `guarantee` `noreturn` but leave `assert` as it is? > > Also `ShouldNotReachHere`, `fatal`, &etc. which are pretty common. In the light of your explanation, I think losing the switch could be an acceptable price. I see the C-asserts in glibc also use noreturn attributes. Less code is also nice. ------------- PR: https://git.openjdk.org/jdk/pull/12521 From stuefe at openjdk.org Sat Feb 11 11:34:25 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 11 Feb 2023 11:34:25 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. Stupid question, what happens if a function is decorated with noreturn and you return? E.g. in case of the `Debugging` switch? Or by just falling off the function, failing to call exit or abort? ------------- PR: https://git.openjdk.org/jdk/pull/12521 From aph at openjdk.org Sat Feb 11 13:47:27 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 11 Feb 2023 13:47:27 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai wrote: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by the micro benchmark is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes.
>
> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
> | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | CRC32 improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>
> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
> | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | CRC32 improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>
>
> Baseline
>
> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>
>
> Crypto pmull
>
> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ±
9.713 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms

src/hotspot/cpu/aarch64/globals_aarch64.hpp line 91:

> 89: product(bool, UseCRC32, false, \
> 90: "Use CRC32 instructions for CRC32 computation") \
> 91: product(bool, UseCryptoPmull, false, \

Suggestion: product(bool, UsePmullForCRC32, false, \

------------- PR: https://git.openjdk.org/jdk/pull/12480 From aph at openjdk.org Sat Feb 11 13:53:25 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 11 Feb 2023 13:53:25 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai wrote: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by the micro benchmark is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes.
>
> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
> | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | CRC32 improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>
> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
> | ------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | CRC32 improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>
>
> Baseline
>
> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>
>
> Crypto pmull
>
> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ±
9.713 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms

Why do we need any new code for this processor? We already have routines for CRC32 using pmull. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From jwaters at openjdk.org Sat Feb 11 14:08:27 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 11 Feb 2023 14:08:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes.
> > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Glad to see Attributes are being considered for HotSpot now, even if it's only just 1 at the moment. Wish I could give a vote, but I'm unfortunately not part of the HotSpot Group. Maybe one day I'll be competent enough to properly work on hotspot stuff, but until then... ------------- PR: https://git.openjdk.org/jdk/pull/12507 From duke at openjdk.org Sat Feb 11 16:29:27 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Sat, 11 Feb 2023 16:29:27 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 13:51:02 GMT, Andrew Haley wrote: > Why do we need any new code for this processor? We already have routines for CRC32 using pmull. Do you refer to [this pmull implementation](https://github.com/openjdk/jdk/blob/1ef9f6507ba45419f0fa896915eec064762c5153/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L3753)? It uses pmull to take in 8-bit elements and produces 16-bit results. The new code uses pmull variation in Cryptographic Extension to take in 32-bit elements and produces 64-bit results. It is more efficient along with eor3. 
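
For readers unfamiliar with the operation being discussed, here is a rough software sketch of carry-less (polynomial) multiplication, which is what pmull computes per element, and of a three-way XOR, which is what eor3 computes. It is only an illustration of the arithmetic for 8-bit operands with 16-bit results, as described above for the existing implementation; the names `clmul8` and `xor3` are hypothetical and this is not HotSpot code.

```cpp
#include <cstdint>

// Carry-less multiply of two 8-bit values: partial products are combined
// with XOR instead of addition, so no carries propagate between bits.
// Hypothetical helper for illustration only.
std::uint16_t clmul8(std::uint8_t a, std::uint8_t b) {
  std::uint16_t result = 0;
  for (int bit = 0; bit < 8; bit++) {
    if ((b >> bit) & 1u) {
      // Shifted copy of a; at most 0xFF << 7, which fits in 16 bits.
      result ^= static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) << bit);
    }
  }
  return result;
}

// Three-way exclusive OR, folding two operands into an accumulator in one step.
// Hypothetical helper for illustration only.
std::uint64_t xor3(std::uint64_t x, std::uint64_t y, std::uint64_t z) {
  return x ^ y ^ z;
}
```

For example, `clmul8(3, 3)` yields 5, because (x + 1)^2 = x^2 + 1 when coefficients are taken modulo 2. The wider crypto form performs the same kind of product on much larger operands in a single instruction.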
------------- PR: https://git.openjdk.org/jdk/pull/12480 From kbarrett at openjdk.org Sat Feb 11 19:16:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 11 Feb 2023 19:16:26 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 11:32:07 GMT, Thomas Stuefe wrote: > Stupid question, what happens if a function is decorated with noreturn and you return? E.g. in case of the `Debugging` switch? Or by just falling off the function, failing to call exit or abort? Undefined behavior. gcc warns about this (with the warning levels we currently have). I didn't experiment with others. The `Debugging` variable is another thing that will need to be addressed in order to make the failed assertion reporters noreturn. Also Windows-only `UseOSErrorReporting`. (My in-progress prototype for JDK-8302189 retains those features in a different form, rather than outright deletion as being done here for `SuppressErrorAt`.) ------------- PR: https://git.openjdk.org/jdk/pull/12521 From aph at openjdk.org Sat Feb 11 23:33:24 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 11 Feb 2023 23:33:24 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 16:26:20 GMT, Yi-Fan Tsai wrote: > > Why do we need any new code for this processor? We already have routines for CRC32 using pmull. > > Do you refer to [this pmull implementation](https://github.com/openjdk/jdk/blob/1ef9f6507ba45419f0fa896915eec064762c5153/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L3753)? It uses pmull to take in 8-bit elements and produces 16-bit results. The new code uses pmull variation in Cryptographic Extension to take in 32-bit elements and produces 64-bit results. It is more efficient along with eor3. How much more efficient is it than the old version of CRC32 using pmull? And do we today need three different hand-coded versions of CRC32? 
------------- PR: https://git.openjdk.org/jdk/pull/12480 From aph at openjdk.org Sun Feb 12 00:03:26 2023 From: aph at openjdk.org (Andrew Haley) Date: Sun, 12 Feb 2023 00:03:26 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: <7JDeUz6e3C4IJ8rIVV88dqEUOHExljnw_8pgXA78htQ=.26a4f74d-7840-47a6-b002-b21f90a70399@github.com> On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. 
Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I'd be careful. A call to an error routine marked noreturn can turn into a tail call (i.e. a jump) to it. That messes up backtraces. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From kbarrett at openjdk.org Sun Feb 12 01:14:27 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 12 Feb 2023 01:14:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: <7JDeUz6e3C4IJ8rIVV88dqEUOHExljnw_8pgXA78htQ=.26a4f74d-7840-47a6-b002-b21f90a70399@github.com> References: <7JDeUz6e3C4IJ8rIVV88dqEUOHExljnw_8pgXA78htQ=.26a4f74d-7840-47a6-b002-b21f90a70399@github.com> Message-ID: On Sun, 12 Feb 2023 00:00:53 GMT, Andrew Haley wrote: > I'd be careful. A call to an error routine marked noreturn can turn into a tail call (i.e. a jump) to it. That messes up backtraces. Explicitly not so for gcc. https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#Common-Function-Attributes Under "noreturn" "In order to preserve backtraces, GCC will never turn calls to noreturn functions into tail calls." I don't know about other compilers, but I would think they would want to do something similar. 
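
A minimal sketch of the pattern under discussion, assuming a made-up failure reporter and assertion macro (`report_failure`, `SKETCH_ASSERT` and `checked_length` are hypothetical names, not HotSpot's). Because the reporter is declared `[[noreturn]]`, the compiler knows the failing branch never rejoins the code after the assertion, so it has no null-valued continuation to analyze or warn about.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical failure reporter. The attribute promises that control never
// comes back to the caller; returning from it would be undefined behavior.
[[noreturn]] void report_failure(const char* file, int line, const char* msg) {
  std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, msg);
  std::abort();  // keeps the noreturn promise
}

// Hypothetical assertion macro built on the reporter above.
#define SKETCH_ASSERT(cond, msg) \
  do { if (!(cond)) report_failure(__FILE__, __LINE__, msg); } while (0)

// After the assertion, the only reachable continuation has s != nullptr.
std::size_t checked_length(const char* s) {
  SKETCH_ASSERT(s != nullptr, "s must not be null");
  return std::strlen(s);
}
```

Whether a given compiler keeps the call to such a reporter as a real call, rather than a jump, is compiler-specific; the GCC documentation quoted above covers GCC only.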
------------- PR: https://git.openjdk.org/jdk/pull/12507 From duke at openjdk.org Sun Feb 12 02:06:25 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Sun, 12 Feb 2023 02:06:25 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: <1RlXKAiW6VrbcI2mZumTbLdVes4jgbdXxS0wz6VF9hs=.c6ea5183-3812-491f-a5f8-d7aea24c8f99@github.com> On Sat, 11 Feb 2023 23:30:56 GMT, Andrew Haley wrote: > How much more efficient is it than the old version of CRC32 using pmull? The old version of CRC32 using pmull is inefficient compared to the version using crc32 instructions, measured on the same EC2 c7g instance. The following is the performance compared to the version using crc32 instructions.

| input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
| ---------------------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| -XX:-UseCRC32 -XX:+UseNeon (pmull) | -88.10% | -88.31% | -88.57% | -88.64% | -88.71% | -88.73% | -88.85% |
| -XX:-UseCRC32 -XX:-UseNeon | -90.45% | -92.53% | -93.62% | -93.98% | -93.73% | -94.17% | -94.51% |

| input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
| ---------------------------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| -XX:-UseCRC32 -XX:+UseNeon (pmull) | -88.92% | -88.96% | -88.97% | -88.98% | -88.96% | -88.97% |
| -XX:-UseCRC32 -XX:-UseNeon | -94.58% | -94.65% | -94.68% | -94.70% | -94.70% | -94.70% |

`-XX:-UseCRC32 -XX:+UseNeon` (pmull)

TestCRC32.testCRC32Update 64 thrpt 12 20593.817 ± 8.676 ops/ms
TestCRC32.testCRC32Update 128 thrpt 12 13197.522 ± 4.268 ops/ms
TestCRC32.testCRC32Update 256 thrpt 12 7610.119 ± 6.329 ops/ms
TestCRC32.testCRC32Update 384 thrpt 12 5365.177 ± 0.570 ops/ms
TestCRC32.testCRC32Update 511 thrpt 12 3808.462 ± 3.441 ops/ms
TestCRC32.testCRC32Update 512 thrpt 12 4121.495 ± 4.549 ops/ms
TestCRC32.testCRC32Update 1024 thrpt 12 2144.889 ±
1.610 ops/ms
TestCRC32.testCRC32Update 2048 thrpt 12 1093.756 ± 0.114 ops/ms
TestCRC32.testCRC32Update 4096 thrpt 12 552.393 ± 0.164 ops/ms
TestCRC32.testCRC32Update 8192 thrpt 12 277.963 ± 0.099 ops/ms
TestCRC32.testCRC32Update 16384 thrpt 12 139.233 ± 0.061 ops/ms
TestCRC32.testCRC32Update 32768 thrpt 12 69.752 ± 0.013 ops/ms
TestCRC32.testCRC32Update 65536 thrpt 12 34.789 ± 0.002 ops/ms

`-XX:-UseCRC32 -XX:-UseNeon`

TestCRC32.testCRC32Update 64 thrpt 12 16541.230 ± 50.847 ops/ms
TestCRC32.testCRC32Update 128 thrpt 12 8432.515 ± 6.676 ops/ms
TestCRC32.testCRC32Update 256 thrpt 12 4252.450 ± 2.897 ops/ms
TestCRC32.testCRC32Update 384 thrpt 12 2844.691 ± 0.110 ops/ms
TestCRC32.testCRC32Update 511 thrpt 12 2114.317 ± 2.670 ops/ms
TestCRC32.testCRC32Update 512 thrpt 12 2134.668 ± 0.150 ops/ms
TestCRC32.testCRC32Update 1024 thrpt 12 1055.619 ± 27.552 ops/ms
TestCRC32.testCRC32Update 2048 thrpt 12 535.514 ± 0.273 ops/ms
TestCRC32.testCRC32Update 4096 thrpt 12 267.859 ± 0.133 ops/ms
TestCRC32.testCRC32Update 8192 thrpt 12 133.916 ± 0.014 ops/ms
TestCRC32.testCRC32Update 16384 thrpt 12 67.003 ± 0.034 ops/ms
TestCRC32.testCRC32Update 32768 thrpt 12 33.471 ± 0.023 ops/ms
TestCRC32.testCRC32Update 65536 thrpt 12 16.724 ± 0.018 ops/ms

> And do we today need three different hand-coded versions of CRC32? Each version depends on different processor features. Removing the least efficient versions could degrade existing use cases. It might be still reasonable to remove them though. The crc32c intrinsic only has one version, which uses crc32c instruction.
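
As a point of reference for what every one of these variants has to compute, here is a bit-at-a-time sketch of the standard CRC-32 checksum (reflected polynomial 0xEDB88320, the one used by zlib and java.util.zip.CRC32). It is a plain reference routine under a hypothetical name, `crc32_bitwise`, not one of the HotSpot intrinsics, and it makes no attempt to be fast.

```cpp
#include <cstddef>
#include <cstdint>

// Reference CRC-32: process one input bit per inner-loop iteration.
// Hypothetical helper for illustration only.
std::uint32_t crc32_bitwise(const unsigned char* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;  // standard initial value
  for (std::size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      // mask is all ones when the low bit is set, otherwise zero.
      std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return ~crc;  // standard final inversion
}
```

The hardware-assisted versions differ only in how many input bits they fold per step, which is why they can be swapped based on available processor features without changing results.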
------------- PR: https://git.openjdk.org/jdk/pull/12480 From kbarrett at openjdk.org Sun Feb 12 03:04:28 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 12 Feb 2023 03:04:28 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 15:58:35 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? > > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. > * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. 
> > Testing: tier1-5 > > Thanks, > Thomas A few comments, but overall looks quite nice. src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp line 74: > 72: Thread* thread = Thread::current(); > 73: assert(thread->is_Java_thread(), "must be"); > 74: invalidate((JavaThread*)thread, mr); Use `JavaThread::current()` src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 82: > 80: ~JavaThreadRetireTLABAndFlushLogs() { > 81: FREE_C_HEAP_ARRAY(G1ConcurrentRefineStats, _local_refinement_stats); > 82: FREE_C_HEAP_ARRAY(ThreadLocalAllocStats, _local_tlab_stats); Consider static-asserting that the destructors for these two types are trivial. src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 104: > 102: for (uint i = 0; i < _num_workers; i++) { > 103: _local_tlab_stats[i].reset(); > 104: _local_refinement_stats[i].reset(); Since the elements haven't been constructed yet, they aren't technically objects of the appropriate type. Correct is to construct them using placement-new, i.e. ::new (&_local_tlab_stats[i]) ThreadLocalAllocStats(); And yeah, I know we have tons of code not doing that. src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 118: > 116: G1ConcurrentRefineStats refinement_stats() const { > 117: G1ConcurrentRefineStats result; > 118: for (uint i = 0; i < _num_workers; i++) { indentation src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 185: > 183: total_refinement_stats += _non_java_stats->refinement_stats(); > 184: qset.update_refinement_stats(total_refinement_stats); > 185: Shouldn't there be something here to delete `_java_stats` and `_non_java_stats`? ------------- Changes requested by kbarrett (Reviewer). 
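
Two of the review comments above (construct array elements with placement new, and static-assert that the element destructors are trivial) can be sketched as follows. This is a standalone illustration with hypothetical names (`WorkerStats`, `make_stats_array`, `free_stats_array`) and plain `malloc`/`free` standing in for HotSpot's C-heap array macros; it is not the code under review.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical stand-in for a per-worker statistics type.
struct WorkerStats {
  std::size_t count = 0;
  void reset() { count = 0; }
};

// The storage is released without running destructors, so require that
// skipping them is harmless.
static_assert(std::is_trivially_destructible<WorkerStats>::value,
              "raw storage is freed without calling destructors");

// Allocates raw storage and then starts each element's lifetime with
// placement new, instead of calling reset() on not-yet-constructed memory.
WorkerStats* make_stats_array(unsigned num_workers) {
  void* raw = std::malloc(sizeof(WorkerStats) * num_workers);
  if (raw == nullptr) return nullptr;
  WorkerStats* stats = static_cast<WorkerStats*>(raw);
  for (unsigned i = 0; i < num_workers; i++) {
    ::new (&stats[i]) WorkerStats();
  }
  return stats;
}

void free_stats_array(WorkerStats* stats) {
  std::free(stats);
}
```

The static assertion is what makes the destructor-free release defensible: if the type ever gains a non-trivial destructor, the build fails instead of silently skipping cleanup.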
PR: https://git.openjdk.org/jdk/pull/12497 From stuefe at openjdk.org Sun Feb 12 06:24:25 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sun, 12 Feb 2023 06:24:25 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. LGTM ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.org/jdk/pull/12521 From alanb at openjdk.org Sun Feb 12 08:18:25 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 12 Feb 2023 08:18:25 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: <2MCpmgUCa9wZNv4hBjb7tspJVRmx_0iEDvhwuWGsC18=.afad6f08-3ff8-4002-a29e-f17c4cd0b813@github.com> On Fri, 10 Feb 2023 14:50:36 GMT, Patricio Chilano Mateo wrote: > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). The timer functions need re-visiting to allow an implementation support these functions on virtual threads. Same for StopThread which needs a spec change to minimally support this function on a thread that is suspended at a breakpoint or single step event. So if we sort out these spec issues then it looks like JavaThread:: _is_bound_vthread could go away and there would be less coupling with the alt implementation. 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From aph at openjdk.org Sun Feb 12 12:00:27 2023 From: aph at openjdk.org (Andrew Haley) Date: Sun, 12 Feb 2023 12:00:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: <7JDeUz6e3C4IJ8rIVV88dqEUOHExljnw_8pgXA78htQ=.26a4f74d-7840-47a6-b002-b21f90a70399@github.com> Message-ID: On Sun, 12 Feb 2023 01:11:25 GMT, Kim Barrett wrote: > > I'd be careful. A call to an error routine marked noreturn can turn into a tail call (i.e. a jump) to it. That messes up backtraces. > > Explicitly not so for gcc. https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#Common-Function-Attributes Under "noreturn" "In order to preserve backtraces, GCC will never turn calls to noreturn functions into tail calls." I don't know about other compilers, but I would think they would want to do something similar. Mm, interesting. I see it's been like that for a while, but only recently has that been explicit in the docs. I think we'd want to check for other compilers. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From aph at openjdk.org Sun Feb 12 14:54:31 2023 From: aph at openjdk.org (Andrew Haley) Date: Sun, 12 Feb 2023 14:54:31 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai wrote: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by the micro benchmark is measured on an EC2 c7g instance.
The optimization shows 11 - 99% improvement when the input is at least 384 bytes. > > | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 | > | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | > | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% | > > | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 | > | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | > | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% | > > > Baseline > > TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ? 118.330 ops/ms > TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ? 47.305 ops/ms > TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ? 7.294 ops/ms > TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ? 3.949 ops/ms > TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ? 4.076 ops/ms > TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ? 4.211 ops/ms > TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ? 1.040 ops/ms > TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ? 0.435 ops/ms > TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ? 0.290 ops/ms > TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ? 0.169 ops/ms > TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ? 0.194 ops/ms > TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ? 0.053 ops/ms > TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ? 0.095 ops/ms > > > Crypto pmull > > TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ? 4.746 ops/ms > TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ? 4.583 ops/ms > TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ? 3.150 ops/ms > TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ? 2.110 ops/ms > TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ? 69.911 ops/ms > TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ? 21.983 ops/ms > TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ? 90.610 ops/ms > TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ? 
511.416 ops/ms > TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ? 9.713 ops/ms > TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ? 6.121 ops/ms > TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ? 1.270 ops/ms > TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ? 0.119 ops/ms > TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ? 0.242 ops/ms The optional CRC instructions in v8.0 become a requirement in ARMv8.1. ARMv8.0 is stuff like Cortex A53, but also including, apparently, Cortex A72. So I guess we're stuck with all three, at least for now. This is a bad business, but I guess it's something that Arm & partners have dropped on us, and there's little we can do about it. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From gcao at openjdk.org Sun Feb 12 15:36:27 2023 From: gcao at openjdk.org (Gui Cao) Date: Sun, 12 Feb 2023 15:36:27 GMT Subject: RFR: 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 In-Reply-To: References: Message-ID: <5_NV6birDqgnapzWbUuCOPGXYV1MLy73PhJ-QAFk4Vg=.71b788a5-e97b-4850-a67d-abcad6a472b1@github.com> On Thu, 9 Feb 2023 06:14:18 GMT, Feilong Jiang wrote: > This problem only triggers when using debug build to run foreign jtreg tests. > [JDK-8301818](https://bugs.openjdk.org/browse/JDK-8301818) made following code changes in function generate_generic_copy [1]: > > > // At this point, it is known to be a typeArray (array_tag 0x3). 
> #ifdef ASSERT > { > BLOCK_COMMENT("assert primitive array {"); > Label L; > - __ mvw(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); > + __ mv(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); > __ bge(lh, t1, L); > __ stop("must be a primitive array"); > __ bind(L); > BLOCK_COMMENT("} assert primitive array done"); > } > #endif > > > Notably, `Klass::_lh_array_tag_type_value` is of type `unsigned int`: > > static const unsigned int _lh_array_tag_type_value = 0Xffffffff; > static const int _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits; > static const int _lh_array_tag_bits = 2; > > So `Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift` would be `0xc0000000` of type `unsigned int`. > > Previously, `mvw` function would sign-extend this value into 64-bit `0xffffffffc0000000` since we want to do 64-bit signed compare and branch with `bge` instruction next. > > template::value)> > inline void mv(Register Rd, T o) { li(Rd, (int64_t)o); } > > inline void mvw(Register Rd, int32_t imm32) { mv(Rd, imm32); } > > > But this is not the case when changed to use `mv` with type `unsigned int`. So I think this changes the behaviour. > A simple fix would be adding an explicit `int32_t` type conversion here for this value, like: > > __ mv(t1, (int32_t)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift)); > > > > P.S.: We also checked other places where moving an immediate into registers and some type conversions were added/removed. > > > [1] https://github.com/openjdk/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae > > Testing: > - [x] jdk_foreign all passed on Unmatched board (fastdebug build) > - [x] tier1-3 on Unmatched board (release build) LGTM, Thank you for fixing this, it was my mistake. ------------- Marked as reviewed by gcao (Author). 
PR: https://git.openjdk.org/jdk/pull/12481 From dholmes at openjdk.org Mon Feb 13 00:59:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 00:59:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. 
A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. I approve of supporting attributes. Seeking some clarification on syntactic placement rules. Thanks. doc/hotspot-style.md line 1072: > 1070: * An attribute that appertains to a function is placed at the beginning of the > 1071: function's declaration, rather than between the function name and the parameter > 1072: list. Sorry to be pedantic but what exactly constitutes the "function declaration"? Does it include keywords like `virtual`? What about placement wrt. compiler-specific attributes? ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12507 From dholmes at openjdk.org Mon Feb 13 01:11:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 01:11:25 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 11:29:35 GMT, Thomas Stuefe wrote: > Interesting. Thanks for explaining this. So the fact that we set the assert triggers the compiler check. For the public record I was a supporter of the use of `SuppressErrorAt` as I have often found it useful when reproducing product failures on debug builds. I was also frustrated that the assertion triggers the extra compiler checks - a classic example, to me, of the compiler trying to be too clever/helpful and actually just causing problems. But I have been persuaded that the trade-offs come down (slightly) in favour of removing `SuppressErrorAt`, even though for me build times are slow enough that disabling the problem assertion is still an inconvenience. 
------------- PR: https://git.openjdk.org/jdk/pull/12521 From dholmes at openjdk.org Mon Feb 13 01:19:32 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 01:19:32 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: <_6GGYLa6zNo8iiVBw8wrSlebzMqiGa2jkSs54n6wjZk=.8509c09b-c63e-4405-9a3a-ed48b00e89a6@github.com> On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. Changes look good - smaller than I had envisaged. Getting rid of the archaic test comments is fine too. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12521 From fjiang at openjdk.org Mon Feb 13 01:22:26 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 13 Feb 2023 01:22:26 GMT Subject: RFR: 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 06:30:57 GMT, Fei Yang wrote: >> This problem only triggers when using debug build to run foreign jtreg tests. >> [JDK-8301818](https://bugs.openjdk.org/browse/JDK-8301818) made following code changes in function generate_generic_copy [1]: >> >> >> // At this point, it is known to be a typeArray (array_tag 0x3). 
>> #ifdef ASSERT >> { >> BLOCK_COMMENT("assert primitive array {"); >> Label L; >> - __ mvw(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); >> + __ mv(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); >> __ bge(lh, t1, L); >> __ stop("must be a primitive array"); >> __ bind(L); >> BLOCK_COMMENT("} assert primitive array done"); >> } >> #endif >> >> >> Notably, `Klass::_lh_array_tag_type_value` is of type `unsigned int`: >> >> static const unsigned int _lh_array_tag_type_value = 0Xffffffff; >> static const int _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits; >> static const int _lh_array_tag_bits = 2; >> >> So `Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift` would be `0xc0000000` of type `unsigned int`. >> >> Previously, `mvw` function would sign-extend this value into 64-bit `0xffffffffc0000000` since we want to do 64-bit signed compare and branch with `bge` instruction next. >> >> template::value)> >> inline void mv(Register Rd, T o) { li(Rd, (int64_t)o); } >> >> inline void mvw(Register Rd, int32_t imm32) { mv(Rd, imm32); } >> >> >> But this is not the case when changed to use `mv` with type `unsigned int`. So I think this changes the behaviour. >> A simple fix would be adding an explicit `int32_t` type conversion here for this value, like: >> >> __ mv(t1, (int32_t)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift)); >> >> >> >> P.S.: We also checked other places where moving an immediate into registers and some type conversions were added/removed. >> >> >> [1] https://github.com/openjdk/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae >> >> Testing: >> - [x] jdk_foreign all passed on Unmatched board (fastdebug build) >> - [x] tier1-3 on Unmatched board (release build) > > Marked as reviewed by fyang (Reviewer). @RealFYang @zifeihan -- Thanks! 
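The zero-extension versus sign-extension difference described in the quoted PR text can be reproduced outside HotSpot. The following is a minimal sketch, not the actual assembler code: `widen_like_mv`, `bge_taken`, and the lower-case constants are hypothetical stand-ins for `mv`, the `bge` comparison, and the `Klass` constants, and the layout helper value `0xc0000803` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the Klass constants quoted in the PR description.
static const unsigned int lh_array_tag_type_value = 0xffffffffu;
static const int lh_array_tag_bits = 2;
static const int lh_array_tag_shift = 32 - lh_array_tag_bits;

// Simplified model of mv(Register, T): widen the argument to 64 bits using
// whatever signedness the argument's own type has.
template <typename T>
int64_t widen_like_mv(T value) {
  return static_cast<int64_t>(value);
}

// Models the signed 64-bit comparison performed by `bge lh, t1`.
bool bge_taken(int64_t lh, int64_t t1) {
  return lh >= t1;
}

void sign_extension_demo() {
  // The shift result has type unsigned int and value 0xc0000000.
  const unsigned int tag = lh_array_tag_type_value << lh_array_tag_shift;

  const int64_t zero_extended = widen_like_mv(tag);                        // unsigned path
  const int64_t sign_extended = widen_like_mv(static_cast<int32_t>(tag));  // with the cast
  assert(zero_extended == 3221225472LL);   // 0x00000000c0000000
  assert(sign_extended == -1073741824LL);  // 0xffffffffc0000000

  // Made-up layout helper with both tag bits set, held sign-extended in 64 bits.
  const int64_t lh = static_cast<int32_t>(0xc0000803u);
  assert(bge_taken(lh, sign_extended));   // check passes with the int32_t cast
  assert(!bge_taken(lh, zero_extended));  // check fails without it
}
```

Calling `sign_extension_demo()` runs the assertions. The point of the sketch is only that the same 32-bit pattern compares differently once it has been widened without the `int32_t` cast.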
-------------

PR: https://git.openjdk.org/jdk/pull/12481

From dholmes at openjdk.org Mon Feb 13 01:22:30 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 01:22:30 GMT
Subject: RFR: JDK-8301494: Replace NULL with nullptr in cpu/arm [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 10:09:05 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix dholmes' comment

Marked as reviewed by dholmes (Reviewer).

-------------

PR: https://git.openjdk.org/jdk/pull/12322

From dholmes at openjdk.org Mon Feb 13 01:22:30 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 01:22:30 GMT
Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v5]
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 10:06:22 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
> Changed to zero

Latest updates look good. Thanks

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12317

From dholmes at openjdk.org Mon Feb 13 01:46:32 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 01:46:32 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID: <26pQRrJfLjbpGJk3cFZcKgGNl08NnuI2gSh0exe_pQg=.f3e40d0a-b337-4f02-825d-223f4405ccd3@github.com>

On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!
src/hotspot/cpu/x86/interp_masm_x86.cpp line 271:

> 269: jcc(Assembler::equal, L);
> 270: stop("InterpreterMacroAssembler::call_VM_leaf_base:"
> 271: " last_sp != null");

nullptr

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2857:

> 2855: void MacroAssembler::null_check(Register reg, int offset) {
> 2856: if (needs_explicit_null_check(offset)) {
> 2857: // provoke OS null exception if reg = null by

suggestion "reg is null"

src/hotspot/cpu/x86/macroAssembler_x86.hpp line 96:

> 94: // Support for null-checks
> 95: //
> 96: // Generates code that causes a null OS exception if the content of reg is null.

Note use of "reg is null" here :)

src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp line 1765:

> 1763: in_ByteSize(-1),
> 1764: in_ByteSize(-1),
> 1765: (OopMapSet*)nullptr);

Unnecessary cast?

src/hotspot/cpu/x86/templateTable_x86.cpp line 4188:

> 4186: __ bind(done);
> 4187: // rax = 0: obj == null or obj is not an instanceof the specified klass
> 4188: // rax = 1: obj != null and obj is an instanceof the specified klass

nullptr

-------------

PR: https://git.openjdk.org/jdk/pull/12326

From dholmes at openjdk.org Mon Feb 13 01:46:35 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 01:46:35 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 22:43:54 GMT, Vladimir Kozlov wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> src/hotspot/cpu/x86/interp_masm_x86.cpp line 303:
>
>> 301: jcc(Assembler::equal, L);
>> 302: stop("InterpreterMacroAssembler::call_VM_base:"
>> 303: " last_sp != nullptr");
>
> `null`

These are textual code fragments so I think `nullptr` is more appropriate - as per changes made to cpu/arm code.

-------------

PR: https://git.openjdk.org/jdk/pull/12326

From dholmes at openjdk.org Mon Feb 13 01:46:37 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 01:46:37 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID:

On Mon, 6 Feb 2023 10:22:09 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2866: > >> 2864: } else { >> 2865: // nothing to do, (later) access of M[reg + offset] >> 2866: // will provoke OS null exception if reg = null > > is not = suggestion: "reg is null" ------------- PR: https://git.openjdk.org/jdk/pull/12326 From fjiang at openjdk.org Mon Feb 13 02:04:33 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 13 Feb 2023 02:04:33 GMT Subject: Integrated: 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 In-Reply-To: References: Message-ID: <1r1tv9PECM6Vkm5sIB5YcmXkze1uXlZ-3ImNRW7BLho=.bbefc152-c0d0-4dcc-ba0b-e752289f57f6@github.com> On Thu, 9 Feb 2023 06:14:18 GMT, Feilong Jiang wrote: > This problem only triggers when using debug build to run foreign jtreg tests. > [JDK-8301818](https://bugs.openjdk.org/browse/JDK-8301818) made following code changes in function generate_generic_copy [1]: > > > // At this point, it is known to be a typeArray (array_tag 0x3). 
> #ifdef ASSERT > { > BLOCK_COMMENT("assert primitive array {"); > Label L; > - __ mvw(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); > + __ mv(t1, Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift); > __ bge(lh, t1, L); > __ stop("must be a primitive array"); > __ bind(L); > BLOCK_COMMENT("} assert primitive array done"); > } > #endif > > > Notably, `Klass::_lh_array_tag_type_value` is of type `unsigned int`: > > static const unsigned int _lh_array_tag_type_value = 0Xffffffff; > static const int _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits; > static const int _lh_array_tag_bits = 2; > > So `Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift` would be `0xc0000000` of type `unsigned int`. > > Previously, `mvw` function would sign-extend this value into 64-bit `0xffffffffc0000000` since we want to do 64-bit signed compare and branch with `bge` instruction next. > > template::value)> > inline void mv(Register Rd, T o) { li(Rd, (int64_t)o); } > > inline void mvw(Register Rd, int32_t imm32) { mv(Rd, imm32); } > > > But this is not the case when changed to use `mv` with type `unsigned int`. So I think this changes the behaviour. > A simple fix would be adding an explicit `int32_t` type conversion here for this value, like: > > __ mv(t1, (int32_t)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift)); > > > > P.S.: We also checked other places where moving an immediate into registers and some type conversions were added/removed. > > > [1] https://github.com/openjdk/jdk/commit/c04a982eb47170f3c613617179fca012bb4d40ae > > Testing: > - [x] jdk_foreign all passed on Unmatched board (fastdebug build) > - [x] tier1-3 on Unmatched board (release build) This pull request has now been integrated. 
Changeset: 7c233bc1 Author: Feilong Jiang Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/7c233bc1c88564b53ee3b46dbe7763de81ef5468 Stats: 11 lines in 4 files changed: 0 ins; 0 del; 11 mod 8302114: RISC-V: Several foreign jtreg tests fail with debug build after JDK-8301818 Reviewed-by: fyang, gcao ------------- PR: https://git.openjdk.org/jdk/pull/12481 From jwaters at openjdk.org Mon Feb 13 02:18:27 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 13 Feb 2023 02:18:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 00:55:46 GMT, David Holmes wrote: >> Please review this change to permit the use of noreturn attributes in HotSpot >> code. >> >> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf >> https://en.cppreference.com/w/cpp/language/attributes/noreturn >> >> This will permit such changes as marking failed assertions and the like as >> noreturn, permitting the compiler to generate better code in some cases. This >> has benefits even in product builds, since some of the relevant checks occur >> in product code (such as `guarantee`, `fatal`, &etc). It also provides a >> solution to the problem described in JDK-8294031, where the potential >> continued execution from a failed assertion leads to compiler warnings. >> >> The change is written in such a way that it should be easy to add the >> appropriate text for new attributes in the future. There have been >> discussions of adopting C++17, which adds several attributes. >> >> The change to the Style Guide is forward looking, toward a time when more >> attributes are available due to the adoption of a newer language standard than >> the current C++14. It is written in such a way that it should be easy to add >> the appropriate text for new attributes. 
>> >> Testing: >> I have a prototype of making HotSpot assertions noreturn, which has been run >> through mach5 tier1-8 for all Oracle-supported platforms. >> >> This is a modification of the Style Guide, so rough consensus among the >> HotSpot Group members is required to make this change. Only Group members >> should vote for approval (via the github PR), though reasoned objections or >> comments from anyone will be considered. A decision on this proposal will not >> be made before Friday 24-Feb-2023 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review process >> to approve (click on Review Changes > Approve), rather than sending a "vote: >> yes" email reply that would be normal for a CFV. > > doc/hotspot-style.md line 1072: > >> 1070: * An attribute that appertains to a function is placed at the beginning of the >> 1071: function's declaration, rather than between the function name and the parameter >> 1072: list. > > Sorry to be pedantic but what exactly constitutes the "function declaration"? Does it include keywords like `virtual`? What about placement wrt. compiler-specific attributes? As for the function declaration, it's pretty much the entire function, keywords and all, as the C++ Standard always has the attribute specifier sequence appear before everything else (alignas is also an attribute specifier, to that extent), and never between keywords. 
I'm in favour of adding an extra line also forbidding having the attributes placed behind the function's parameter list brackets though, since HotSpot currently has a ton of attributes that are listed behind the whole function I think Kim has stated before that compiler specific attributes don't give much benefit and isn't really interested in having them, that said it would be nice to have them in the future with perhaps the compiler's namespace required in the attribute (msvc::noinline in C++20 guarantees no inlining of a function as compared to our current __declspec(noinline) for instance). ------------- PR: https://git.openjdk.org/jdk/pull/12507 From kbarrett at openjdk.org Mon Feb 13 02:36:27 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 13 Feb 2023 02:36:27 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: <7JDeUz6e3C4IJ8rIVV88dqEUOHExljnw_8pgXA78htQ=.26a4f74d-7840-47a6-b002-b21f90a70399@github.com> Message-ID: On Sun, 12 Feb 2023 11:57:33 GMT, Andrew Haley wrote: > > > I'd be careful. A call to an error routine marked noreturn can turn into a tail call (i.e. a jump) to it. That messes up backtraces. > > > > > > Explicitly not so for gcc. https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#Common-Function-Attributes Under "noreturn" "In order to preserve backtraces, GCC will never turn calls to noreturn functions into tail calls." I don't know about other compilers, but I would think they would want to do something similar. > > Mm, interesting. I see it's been like that for a while, but only recently has that been explicit in the docs. I think we'd want to check for other compilers. I wasn't able to find anything in the Visual Studio docs about this question. 
Not authoritative, but I did find this: https://stackoverflow.com/questions/55656938/why-noreturn-builtin-unreachable-prevents-tail-call-optimization

-------------

PR: https://git.openjdk.org/jdk/pull/12507

From dholmes at openjdk.org Mon Feb 13 02:55:28 2023
From: dholmes at openjdk.org (David Holmes)
Date: Mon, 13 Feb 2023 02:55:28 GMT
Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux
In-Reply-To:
References:
Message-ID:

On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote:

> I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna).
> I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA.
> The patch (and former GCC performance regression) affects only x86_64+i686.

Thanks for raising this issue, and making the contribution.

If I understand correctly a change in gcc 4.9 (over 8 years ago) meant that our use of the `fmod` C-library implementation was no longer being compiled in an "optimal" way, and the proposal here is to replace the use of `fmod` with x86 assembler that does use that "optimal" way.

However such a change does not belong in the file that has been modified, which is nominally (though not completely) shared code. If there is to be an x86 specific implementation for this then it would be added to `src/hotspot/cpu/x86/sharedRuntime_x86*.cpp` and the shared version adapted accordingly for other architectures.

But before we would go that way we first need to confirm that:
a) the assembly code is 100% equivalent to the `fmod` being replaced
b) that the performance gain warrants the additional maintenance overhead of using cpu specific code

To ascertain (b) it would be good to see the DivisionDemo reformulated as a [JMH](https://github.com/openjdk/jmh) benchmark.

Thank you.

-------------

PR: https://git.openjdk.org/jdk/pull/12508

From kbarrett at openjdk.org Mon Feb 13 05:00:28 2023
From: kbarrett at openjdk.org (Kim Barrett)
Date: Mon, 13 Feb 2023 05:00:28 GMT
Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute
In-Reply-To:
References:
Message-ID:

On Mon, 13 Feb 2023 02:16:02 GMT, Julian Waters wrote:

>> doc/hotspot-style.md line 1072:
>>
>>> 1070: * An attribute that appertains to a function is placed at the beginning of the
>>> 1071: function's declaration, rather than between the function name and the parameter
>>> 1072: list.
>>
>> Sorry to be pedantic but what exactly constitutes the "function declaration"? Does it include keywords like `virtual`? What about placement wrt. compiler-specific attributes?
>
> As for the function declaration, it's pretty much the entire function, keywords and all, as the C++ Standard always has the attribute specifier sequence appear before everything else (alignas is also an attribute specifier, to that extent), and never between keywords.
I'm in favour of adding an extra line also forbidding having the attributes placed behind the function's parameter list brackets though, since HotSpot currently has a ton of attributes that are listed behind the whole function > > I think Kim has stated before that compiler specific attributes don't give much benefit and isn't really interested in having them, that said it would be nice to have them in the future with perhaps the compiler's namespace required in the attribute (msvc::noinline in C++20 guarantees no inlining of a function as compared to our current __declspec(noinline) for instance). Standard attributes are syntactically permitted in a plethora of places. But specific attributes are only allowed in places related to their semantics. For function declarations (including definitions), this is either first, before anything else related to the declaration, or after the name of the function but before the parameter list. What I've proposed is that HotSpot code prefers putting function attributes first, rather than immediately after the function name (so before the parameter list). Standard attributes can also appear after the parameter list, but those apply to the function *type* rather than the function. That differs from gcc `__attribute__`. (Standard attributes can also appear after the return type but before the function name. Those attributes apply to the return type. Hoping we don't need any of those, other than perhaps for alignment.) I've no idea how standard attributes and gcc `__attribute__` or msvc `__declspec` interact syntactically. My hope would be that they can interleave, but I have no idea whether that's true or not. I guess some research is needed, though I think it doesn't matter for the current set. (I doubt anyone is going to declare a function both noreturn and always-inline, for example.) 
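To make the two function-attribute placements discussed above concrete, here is a minimal sketch. It is not taken from the style guide or from HotSpot; the function names and the `exits_by_throwing` helper are made up for illustration. Both declarations use the standard `[[noreturn]]` attribute; the first shows the "attribute first" form the proposal prefers, the second the form placed after the function name and before the parameter list.

```cpp
#include <cassert>
#include <stdexcept>

// Placement preferred by the proposal: the attribute comes first,
// before anything else in the declaration.
[[noreturn]] void fail_attr_first(const char* msg) {
  throw std::runtime_error(msg);
}

// Also accepted by the C++ grammar: the attribute follows the function
// name and precedes the parameter list. It still appertains to the function.
void fail_attr_after_name [[noreturn]] (const char* msg) {
  throw std::runtime_error(msg);
}

// Hypothetical helper, for illustration only: reports whether calling fn
// leaves by throwing, which is the only way a noreturn function may "return"
// control to its caller.
bool exits_by_throwing(void (*fn)(const char*)) {
  try {
    fn("boom");
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

Both functions behave identically; the difference is purely where the attribute is written, which is what the style guide wording is trying to pin down.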
My objection to the earlier proposal to (for example) change ATTRIBUTE_PRINTF to use `[[gnu::format(printf...)]]` instead of the `__attribute__` form was that it required moving a lot of the macro uses from the end of a declaration to its front (because of the above-mentioned difference), so lots of code changes for not much gain. Unrecognized attributes cause problems. C++17 changes that, requiring that unrecognized attributes be ignored. That might make some uses of compiler-specific attributes more palatable, as they can be used directly rather than behind a macro that is conditionally defined. OTOH, we might still end up putting some attributes behind conditionally defined macros because multiple platforms provide the same feature but with different spellings. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From dholmes at openjdk.org Mon Feb 13 05:17:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 05:17:27 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. Correction, now IIUC gcc made the change because the code was incorrect in the general sense, and so would only be used when enabled explicitly. So the VM use of fmod started compiling into a more correct but slower form. The claim here is that it is okay for the VM to use the "incorrect" form, and that is being provided directly by use of `fprem`. 
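For readers less familiar with the semantics in question, a small sketch (not code from the patch) of what the C library `fmod` computes, contrasted with the IEEE 754 remainder. The wrapper names are hypothetical. `std::fmod` truncates the quotient toward zero, so the result takes the sign of the dividend; `std::remainder` rounds the quotient to nearest. Whether `fprem` matches `fmod` in every case is exactly what the thread is trying to establish, so nothing here should be read as settling that.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical wrapper: truncating remainder, as computed by the C library fmod.
double truncated_rem(double x, double y) {
  return std::fmod(x, y);
}

// Hypothetical wrapper: IEEE 754 remainder (quotient rounded to nearest).
double ieee_rem(double x, double y) {
  return std::remainder(x, y);
}
```

For example, `truncated_rem(5.5, 2.0)` is `1.5` (quotient 2.75 truncated to 2), while `ieee_rem(5.5, 2.0)` is `-0.5` (quotient rounded to 3). Any replacement for `fmod` would also need to agree on special cases such as a zero divisor (NaN) and an infinite divisor (the dividend is returned unchanged).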
------------- PR: https://git.openjdk.org/jdk/pull/12508 From duke at openjdk.org Mon Feb 13 05:23:28 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Mon, 13 Feb 2023 05:23:28 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. Yes, I suspect GCC found it was incorrect. I did not find a counterexample showing why it was incorrect, but I plan to leave that up to the GCC developers. Yes, I believe I have checked all the corner cases and that `fprem` provides exactly what the OpenJDK VM expects. ------------- PR: https://git.openjdk.org/jdk/pull/12508 From dholmes at openjdk.org Mon Feb 13 05:27:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 05:27:24 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: <_sQRKbcNXAGTS09Z1LxhSqJRlxMPwgE1QL-8gXTQiVM=.e3c2fa32-1024-403a-9a4f-ff220ffff6a6@github.com> On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. The patch you link to: https://gcc.gnu.org/pipermail/gcc-patches/2014-September/400104.html seems to be i386 and FPU 387 specific. I can't see anything that suggests a more general impact of that patch to 64-bit x86 platform. ??
------------- PR: https://git.openjdk.org/jdk/pull/12508 From duke at openjdk.org Mon Feb 13 05:37:27 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Mon, 13 Feb 2023 05:37:27 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: <3M7L99lOVIWYoRHw-UZsK3fEUkQMjwK0XhoUukMY2Os=.3c3e39d1-dd39-49a0-8846-903a7705503c@github.com> On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. GCC has both 32-bit and 64-bit x86-64 (incl. third x32 which is a 32-bit x86_64) in a single directory `gcc/config/i386`: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/t-linux64;hb=HEAD Also I was verifying the GCC behavior only on x86-64 by reverting/applying [that patch](https://gcc.gnu.org/pipermail/gcc-patches/2014-September/400104.html). ------------- PR: https://git.openjdk.org/jdk/pull/12508 From jsjolen at openjdk.org Mon Feb 13 09:26:07 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 13 Feb 2023 09:26:07 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. 
They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Some more fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12326/files - new: https://git.openjdk.org/jdk/pull/12326/files/432ec5d5..e0747443 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=00-01 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/12326.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12326/head:pull/12326 PR: https://git.openjdk.org/jdk/pull/12326 From jsjolen at openjdk.org Mon Feb 13 09:26:07 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 13 Feb 2023 09:26:07 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them.
They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! I added the fixes, thanks! Last commit passed tier1. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From jsjolen at openjdk.org Mon Feb 13 09:28:35 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 13 Feb 2023 09:28:35 GMT Subject: RFR: JDK-8301499: Replace NULL with nullptr in cpu/zero In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:29 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/zero. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks!
Thanks ------------- PR: https://git.openjdk.org/jdk/pull/12327 From jsjolen at openjdk.org Mon Feb 13 09:28:36 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 13 Feb 2023 09:28:36 GMT Subject: Integrated: JDK-8301499: Replace NULL with nullptr in cpu/zero In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:29 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/zero. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated.
Changeset: 4e327db1 Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/4e327db1d127c652ef39e31c164e36ae429a0065 Stats: 147 lines in 24 files changed: 0 ins; 0 del; 147 mod 8301499: Replace NULL with nullptr in cpu/zero Reviewed-by: dholmes, rehn ------------- PR: https://git.openjdk.org/jdk/pull/12327 From tschatzl at openjdk.org Mon Feb 13 09:41:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 13 Feb 2023 09:41:27 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 In-Reply-To: References: Message-ID: On Sun, 12 Feb 2023 02:57:53 GMT, Kim Barrett wrote: >> Hi all, >> >> can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? >> >> * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. >> * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue.
After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > src/hotspot/share/gc/g1/g1YoungGCPreEvacuateTasks.cpp line 185: > >> 183: total_refinement_stats += _non_java_stats->refinement_stats(); >> 184: qset.update_refinement_stats(total_refinement_stats); >> 185: > > Shouldn't there be something here to delete `_java_stats` and `_non_java_stats`? Both are actually the tasks that after adding them to the batchtask are owned and deleted by it. I'll rename the variables because otherwise it is confusing (I went back and forth with the names of these multiple times and apparently chose the more confusing option). ------------- PR: https://git.openjdk.org/jdk/pull/12497 From ngasson at openjdk.org Mon Feb 13 09:54:27 2023 From: ngasson at openjdk.org (Nick Gasson) Date: Mon, 13 Feb 2023 09:54:27 GMT Subject: RFR: 8301012: [vectorapi]: Intrinsify CompressBitsV/ExpandBitsV and add the AArch64 SVE backend implementation In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 17:23:20 GMT, Bhavana Kilambi wrote: > This patch adds mid-end compiler vector IR nodes for the scalar CompressBits and ExpandBits nodes - CompressBitsV and ExpandBitsV and also adds aarch64 backend support for these nodes using SVE2 instructions (included in the svebitperm feature). As there are direct instructions in SVE2 that map to these operations, a huge speed up in performance can be observed and it might significantly benefit all those workloads that extensively run these operations on an SVE2(with svebitperm feature) supporting machine. 
> > All the JTREG tests under "test/jdk/jdk/incubator/vector" pass successfully with this patch on an SVE2 machine. > The JMH tests - COMPRESS_BITS and EXPAND_BITS from [1] and [2] were run on a 128-bit vector length, SVE2 and svebitperm supporting aarch64 machine. Following are the gains observed with this patch - > > > Benchmark (length) Mode Cnt Gain > IntMaxVector.COMPRESS_BITS 1024 thrpt 15 81.68x > IntMaxVector.EXPAND_BITS 1024 thrpt 15 85.65x > LongMaxVector.COMPRESS_BITS 1024 thrpt 15 70.78x > LongMaxVector.EXPAND_BITS 1024 thrpt 15 76.31x > > > The "Gain" column is the ratio between the throughput of benchmark runs with this patch and that of benchmark runs without this patch. This patch does not change the performance of these operations for all other machines that do not support these instructions or when run on a different architecture. > With this patch, vectorization of CompressBits and ExpandBits operations happens only through vectorapi for aarch64. Autovectorization does not take place as the current JDK source does not contain aarch64 backend implementation for scalar CompressBits and ExpandBits. However, this PR - https://github.com/openjdk/jdk/pull/10537 adds aarch64 backend implementation for CompressBits and ExpandBits and may lead to autovectorization of these nodes as well eventually but this PR is a standalone one and not dependent on the scalar implementation. > > [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/IntMaxVector.java > [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/LongMaxVector.java The test and changes under `cpu/aarch64` look OK to me but need someone else to review the C2 changes. ------------- Marked as reviewed by ngasson (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12446 From tschatzl at openjdk.org Mon Feb 13 09:55:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 13 Feb 2023 09:55:00 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v2] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? > > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. > * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: kbarrett review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12497/files - new: https://git.openjdk.org/jdk/pull/12497/files/ad492777..950b849d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=00-01 Stats: 19 lines in 3 files changed: 3 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/12497.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12497/head:pull/12497 PR: https://git.openjdk.org/jdk/pull/12497 From ayang at openjdk.org Mon Feb 13 10:22:30 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 13 Feb 2023 10:22:30 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 23:19:27 GMT, Kim Barrett wrote: >> Trivial removing dead code. > > src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp line 71: > >> 69: >> 70: template >> 71: inline void G1BarrierSet::write_ref_field_post(T* field) { > > So this function isn't applying the "same region" filter that's in the various other places. I wonder if > that's intentional, or a bug? It's safe because card-scanning is always prepared for "uninteresting" oops. It's used only by Access API; its counterpart in interpreter/c1/c2 checks for null and cross-region additionally. It's not obvious whether the discrepancy is intentional or not though. 
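As a rough illustration of the discrepancy discussed above, here is a simplified model of a filtered versus an unfiltered post-write barrier. This is not HotSpot code: the region size constant, the function names, and the use of plain integer addresses are all assumptions made for the sketch. The filtered variant skips null stores and stores where the field and the new value lie in the same region; the unfiltered variant always marks the card and relies on card scanning to tolerate "uninteresting" entries.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: regions are 1 MiB and region-aligned.
constexpr unsigned kLogRegionBytes = 20;

// True if the two addresses fall into different regions. XOR-ing the
// addresses and shifting away the in-region bits leaves zero only when
// both share the same region index.
bool crosses_region(std::uintptr_t field_addr, std::uintptr_t new_val_addr) {
  return ((field_addr ^ new_val_addr) >> kLogRegionBytes) != 0;
}

// Model of a filtered barrier (null check plus cross-region check).
bool filtered_barrier_marks_card(std::uintptr_t field_addr,
                                 std::uintptr_t new_val_addr) {
  return new_val_addr != 0 && crosses_region(field_addr, new_val_addr);
}

// Model of an unfiltered barrier: every store marks the card.
bool unfiltered_barrier_marks_card(std::uintptr_t /*field_addr*/,
                                   std::uintptr_t /*new_val_addr*/) {
  return true;
}
```

In this model the unfiltered barrier is never less safe than the filtered one; it only produces extra cards that later scanning has to look at and discard, which is consistent with the "it's safe" remark above.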
------------- PR: https://git.openjdk.org/jdk/pull/12489 From duke at openjdk.org Mon Feb 13 10:38:05 2023 From: duke at openjdk.org (Amit Kumar) Date: Mon, 13 Feb 2023 10:38:05 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken [v2] In-Reply-To: References: Message-ID: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. Amit Kumar has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'optimized_build_broken' of github.com:offamitkumar/jdk into optimized_build_broken - addressing comments from lutz and tyler ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12400/files - new: https://git.openjdk.org/jdk/pull/12400/files/6455f9db..fae3bf91 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12400&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12400&range=00-01 Stats: 19 lines in 3 files changed: 1 ins; 7 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/12400.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12400/head:pull/12400 PR: https://git.openjdk.org/jdk/pull/12400 From duke at openjdk.org Mon Feb 13 10:45:57 2023 From: duke at openjdk.org (Amit Kumar) Date: Mon, 13 Feb 2023 10:45:57 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken [v3] In-Reply-To: References: Message-ID: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only 
in ASSERT-def, Which is causing the build failure in Optimized-Debug. Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: reverting extra copyright header updates ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12400/files - new: https://git.openjdk.org/jdk/pull/12400/files/fae3bf91..a0dcae19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12400&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12400&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12400.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12400/head:pull/12400 PR: https://git.openjdk.org/jdk/pull/12400 From rkennke at openjdk.org Mon Feb 13 11:16:35 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 13 Feb 2023 11:16:35 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v11] In-Reply-To: <0oSGAyFe1v4NSMKc-zmIKOm9KIchWhj7TSOf4qGpM78=.f9ea7cd7-e60f-40ae-a003-c935a5400d2f@github.com> References: <0oSGAyFe1v4NSMKc-zmIKOm9KIchWhj7TSOf4qGpM78=.f9ea7cd7-e60f-40ae-a003-c935a5400d2f@github.com> Message-ID: On Thu, 19 Jan 2023 11:26:22 GMT, Stefan Karlsson wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: >> >> - Merge branch 'master' into JDK-8139457 >> - Fix gtest for correct base offsets in 32bit builds >> - Fix cast warning >> - Revert relaxation of array length >> - Add guards to ArrayBaseOffsets test to allow running with -UseCompressedClassPointers >> - Fix another cast warning >> - Clean cast warning from size_t to int >> - Clear leading 32bits in Z array allocator >> - SA adjustments >> - Test for 32bit build >> - ... and 15 more: https://git.openjdk.org/jdk/compare/500b45e1...c278a53b > > I tried to read the code very carefully, but it's not entirely easy to review a patch like this. 
I think I found two bugs, and a number of things that I think could help the readability of the code. > > I didn't look to closely at the tests, but I think I'd like to see a test that checks the boundary conditions of the max array calculations. > > I'd suggest that more Reviewers take a close look at the code. @stefank @shipilev are you ok with the change now? ------------- PR: https://git.openjdk.org/jdk/pull/11044 From tschatzl at openjdk.org Mon Feb 13 11:48:16 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 13 Feb 2023 11:48:16 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific Message-ID: Hi all, please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. Previously it had been used with CMS too, but that one's gone now. Testing: local compilation, gha Based on PR#12511 Thanks, Thomas ------------- Depends on: https://git.openjdk.org/jdk/pull/12511 Commit messages: - rename Changes: https://git.openjdk.org/jdk/pull/12531/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12531&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302312 Stats: 9 lines in 6 files changed: 2 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/12531.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12531/head:pull/12531 PR: https://git.openjdk.org/jdk/pull/12531 From lucy at openjdk.org Mon Feb 13 13:41:29 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 13 Feb 2023 13:41:29 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken [v3] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 10:45:57 GMT, Amit Kumar wrote: >> This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, 
Which is causing the build failure in Optimized-Debug. > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > reverting extra copyright header updates Changes look good to me. Thank you for tracking down and fixing this. As discussed off-list, you may want to file a separate RFE to execute the checking code only #ifdef ASSERT. ------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.org/jdk/pull/12400 From duke at openjdk.org Mon Feb 13 13:55:24 2023 From: duke at openjdk.org (Amit Kumar) Date: Mon, 13 Feb 2023 13:55:24 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken In-Reply-To: References: <7gymtH0DjIj_DnlQgE_IUmurAsluGij6pt_p-pBKwMM=.73c4d85b-5a1c-4bbd-81ff-2c0689fdf7ea@github.com> Message-ID: On Fri, 10 Feb 2023 17:26:25 GMT, Lutz Schmidt wrote: >>> As title states optimized build is broken with GCC-9.4.0 and After my code changes build is now again successful. I've done testing for optimised, release, fastdebug, slowdebug on s390 and they're successful with these code changes. >> >> I was confused a bit about this, because my own S390X optimized builds build fine: >> https://builds.shipilev.net/openjdk-jdk/build-logs/build-openjdk-jdk-linux-s390x-server-optimized-gcc10-glibc2.31.log >> >> So, current code is supposed to build fine. I think it is arguably correct to protect the asserts with `ASSERT`/`DEBUG` rather than `PRODUCT`, which would hide them in the optimized builds. But I look at the original issue, and the optimized build is actually failing on the assert, which is why my (cross-compiled) builds are fine! See: >> >> >> A fatal error has been detected by the Java Runtime Environment: >> >> Internal Error (src/hotspot/cpu/s390/macroAssembler_s390.cpp:5498), pid=3076078, tid=3076083 >> guarantee(false) failed: Z assembly code requires stop: killed Z_R14 >> >> >> So, this change only hides the issue, not fixes it? 
If so, I don't think this is a good way to solve the issue. >> I suggest you `git bisect` and identify which change introduced this failure, and work from there. > >> It also looks from @shipilev's comment, that s390x cross-compilation is working. >> > The cross-compile will not use the JVM it created to continue with the build. How would you run a s390x JVM on x64? Sure @RealLucy, I'm integrating it now, either you could sponsor it or I expect this from next reviewer. ------------- PR: https://git.openjdk.org/jdk/pull/12400 From coleenp at openjdk.org Mon Feb 13 13:59:31 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Feb 2023 13:59:31 GMT Subject: RFR: 8302108: Clean up placeholder supername code [v2] In-Reply-To: References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Fri, 10 Feb 2023 19:11:17 GMT, Coleen Phillimore wrote: >> Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Fix test for cleanup for nulling out supername when no longer used. > - Merge branch 'master' into placeholder-cleanup > - 8302108: Clean up placeholder supername code Thanks David. I had printf code to verify what SymbolHandle refcounting did to understand it but it's better that it just works rather than ad-hoc ref counting. Thanks for reviewing! 
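For context on why a handle type is preferable to ad-hoc reference counting, a toy sketch of the RAII idea follows. It does not reproduce HotSpot's `Symbol` or `SymbolHandle`; the `ToySymbol` and `ToySymbolHandle` types and the two probe functions are hypothetical. The point is only that the constructor and destructor keep the count balanced, so callers cannot forget a decrement.

```cpp
#include <cassert>
#include <utility>

// Toy stand-in for a reference-counted symbol (hypothetical type).
struct ToySymbol {
  int refcount = 0;
};

// Toy RAII handle: increments on acquisition, decrements on destruction.
class ToySymbolHandle {
  ToySymbol* _sym;
  void retain()  { if (_sym != nullptr) ++_sym->refcount; }
  void release() { if (_sym != nullptr) --_sym->refcount; }
 public:
  explicit ToySymbolHandle(ToySymbol* s = nullptr) : _sym(s) { retain(); }
  ToySymbolHandle(const ToySymbolHandle& other) : _sym(other._sym) { retain(); }
  // Copy-and-swap: the by-value parameter holds the new reference, and its
  // destructor releases the reference this handle held before the swap.
  ToySymbolHandle& operator=(ToySymbolHandle other) {
    std::swap(_sym, other._sym);
    return *this;
  }
  ~ToySymbolHandle() { release(); }
};

// Count observed while one handle is alive.
int refcount_while_held() {
  ToySymbol sym;
  ToySymbolHandle h(&sym);
  return sym.refcount;
}

// Count observed after the handle has gone out of scope.
int refcount_after_release() {
  ToySymbol sym;
  {
    ToySymbolHandle h(&sym);
    ToySymbolHandle copy(h);  // a second reference, also released at scope exit
  }
  return sym.refcount;
}
```

With a handle like this there is no separate "remember to decrement" step, which is the kind of ad-hoc code the cleanup removes.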
------------- PR: https://git.openjdk.org/jdk/pull/12495 From eosterlund at openjdk.org Mon Feb 13 14:19:31 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 13 Feb 2023 14:19:31 GMT Subject: RFR: 8301874: BarrierSetC2 should assign barrier data to stores In-Reply-To: References: Message-ID: <-CCHIGXy439QnBqAJWhOPI8TToEMlWhm7-XSJflEbbY=.e796f262-a97f-480f-a168-cde5e1c9d84e@github.com> On Sat, 11 Feb 2023 00:30:49 GMT, Vladimir Kozlov wrote: > Good. Thanks for the review! ------------- PR: https://git.openjdk.org/jdk/pull/12442 From duke at openjdk.org Mon Feb 13 14:47:08 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 13 Feb 2023 14:47:08 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test Message-ID: Fixes the issue by applying a fix that is already implemented in PPC. ------------- Commit messages: - Fix typo in ARM change - Fix ASGCT sanity test - Check more frames in ASGCT sanity test Changes: https://git.openjdk.org/jdk/pull/12535/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302320 Stats: 37 lines in 3 files changed: 33 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From eosterlund at openjdk.org Mon Feb 13 15:53:46 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 13 Feb 2023 15:53:46 GMT Subject: Integrated: 8300255: Introduce interface for GC oop verification in the assembler In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 15:32:50 GMT, Erik Österlund wrote: > In the assembly code, there is some generic oop verification code. Said verification may or may not fit a particular GC. In particular, it has not worked well for ZGC for a while, and there is an if (UseZGC).
This enhancement aims at generalizing this code, such that a collector can have its own oop verification policy. This pull request has now been integrated. Changeset: f4d4fa50 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/f4d4fa500c5038c85551bd7ed997e697d9f088eb Stats: 182 lines in 18 files changed: 114 ins; 58 del; 10 mod 8300255: Introduce interface for GC oop verification in the assembler Co-authored-by: Martin Doerr Co-authored-by: Axel Boldt-Christmas Co-authored-by: Yadong Wang Reviewed-by: fyang, aboldtch, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/12443 From kvn at openjdk.org Mon Feb 13 15:56:32 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 13 Feb 2023 15:56:32 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 01:31:21 GMT, David Holmes wrote: >> src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: >> >>> 301: jcc(Assembler::equal, L); >>> 302: stop("InterpreterMacroAssembler::call_VM_base:" >>> 303: " last_sp != nullptr"); >> >> `null` > > These are textual code fragments so I think `nullptr` is more appropriate - as per changes made to cpu/arm code. Right. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From tsteele at openjdk.org Mon Feb 13 16:14:29 2023 From: tsteele at openjdk.org (Tyler Steele) Date: Mon, 13 Feb 2023 16:14:29 GMT Subject: RFR: 8301697: [s390] Optimized-build is broken [v3] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 10:45:57 GMT, Amit Kumar wrote: >> This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. 
> > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > reverting extra copyright header updates Marked as reviewed by tsteele (Committer). ------------- PR: https://git.openjdk.org/jdk/pull/12400 From duke at openjdk.org Mon Feb 13 16:17:37 2023 From: duke at openjdk.org (Amit Kumar) Date: Mon, 13 Feb 2023 16:17:37 GMT Subject: Integrated: 8301697: [s390] Optimized-build is broken In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 04:34:46 GMT, Amit Kumar wrote: > This fix guards `__ asm_assert_eq("killed Z_R14", 0)` & `__asm_assert_mem8_is_zero(in_bytes(JavaThread::exception_pc_offset()), Z_thread, "exception pc already set : "FILE_AND_LINE, 0)` with ASSERT because on s390x `JavaThread::exception_oop_offset()` is always cleared but `JavaThread::exception_pc_offset()` is being cleared only in ASSERT-def, Which is causing the build failure in Optimized-Debug. This pull request has now been integrated. Changeset: 101db262 Author: Amit Kumar Committer: Tyler Steele URL: https://git.openjdk.org/jdk/commit/101db262e1eef9afcc316009740ebf74a7c598d9 Stats: 14 lines in 2 files changed: 2 ins; 5 del; 7 mod 8301697: [s390] Optimized-build is broken Reviewed-by: tsteele, lucy ------------- PR: https://git.openjdk.org/jdk/pull/12400 From shade at openjdk.org Mon Feb 13 16:50:37 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 13 Feb 2023 16:50:37 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: References: Message-ID: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> On Wed, 1 Feb 2023 17:09:21 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. 
This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Remove stale method Still have a few questions. src/hotspot/share/gc/shared/collectedHeap.cpp line 257: > 255: const size_t elements_per_word = HeapWordSize / sizeof(jint); > 256: int base_offset_in_ints = arrayOopDesc::base_offset_in_ints(T_INT); > 257: _filler_array_max_size = align_object_size((base_offset_in_ints + max_len) / elements_per_word); Isn't this expression susceptible to overflow, like the removed comment in `CollectedHeap::max_tlab_size` (below) states? I.e. max_len is probably very close to SIZE_MAX on 32-bit platforms, and adding the base offset gets dangerously close there. Not to mention that the positive side of the signed `int` domain is lower than SIZE_MAX to begin with. I think you need to keep doing the division `max_len / elements_per_word` first. src/hotspot/share/oops/arrayOop.hpp line 136: > 134: } > 135: > 136: // Return the maximum length (num elements) of an array of BasicType. The length can passed "(num elements)" addition here is unnecessary, I think. test/hotspot/gtest/oops/test_objArrayOop.cpp line 46: > 44: { 8, false, false, 4 }, // 12 byte header, 4 byte oops, wordsize 4 > 45: #endif > 46: { -1, false, false, -1 } Indentation compared to similar cases above. 
test/hotspot/jtreg/runtime/FieldLayout/ArrayBaseOffsets.java line 110: > 108: Asserts.assertEquals(unsafe.arrayBaseOffset(float[].class), intOffset, "Misplaced float array base"); > 109: Asserts.assertEquals(unsafe.arrayBaseOffset(double[].class), longOffset, "Misplaced double array base"); > 110: int expectedObjArrayOffset = (COOP || !Platform.is64bit()) ? intOffset : longOffset; `|| !Platform.is64Bit()` looks redundant, since `COOP` is guaranteed to be false on 32-bit? ------------- PR: https://git.openjdk.org/jdk/pull/11044 From shade at openjdk.org Mon Feb 13 16:50:39 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 13 Feb 2023 16:50:39 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v15] In-Reply-To: <33mXkqcQNVxZA7xZLMUQoJOGRbUJ8zSF3wh2Sg2GWLU=.0a0ec230-203f-427e-a6d1-bdbe76a4d161@github.com> References: <-oqaGCDF75UbtgkKVwsOE9F8-WQ48xCV6GxiN4ZbOxc=.e5042b09-8786-4b44-9929-4d6703303e63@github.com> <33mXkqcQNVxZA7xZLMUQoJOGRbUJ8zSF3wh2Sg2GWLU=.0a0ec230-203f-427e-a6d1-bdbe76a4d161@github.com> Message-ID: On Tue, 31 Jan 2023 11:31:27 GMT, Roman Kennke wrote: >> src/hotspot/share/opto/runtime.cpp line 312: >> >>> 310: BasicType elem_type = TypeArrayKlass::cast(array_type)->element_type(); >>> 311: size_t hs_bytes = arrayOopDesc::base_offset_in_bytes(elem_type); >>> 312: assert(is_aligned(hs_bytes, BytesPerInt), "must be 4 byte aligned"); >> >> This assert is misplaced, should be in the `if` block below? Otherwise checking for HeapWordSize alignment after assert-checking for BytesPerInt seems rather odd: the check is guaranteed to pass if assert does not fire. > > This assert checks int-alignment of the base offset (in bytes). This is always expected. Subsequently, we check if the base offset is 8-byte aligned, and if it is not, then we make it so. After that correction, we assert 8-byte-alignment. I believe it is correct as it is. Right, I misread this one, sorry. 
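A minimal sketch of the two-step check described in the quoted reply above: the base offset is first asserted to be 4-byte aligned, and is rounded up to 8 bytes only when it is not already 8-byte aligned. It is written in Java purely for illustration; the class name, the method names and the 12-byte header value are assumptions, and this is not the HotSpot code under review.

```java
public class AlignmentSketch {
    // True if value is a multiple of alignment (alignment must be a power of two).
    public static boolean isAligned(long value, long alignment) {
        return (value & (alignment - 1)) == 0;
    }

    // Rounds value up to the next multiple of alignment (alignment must be a power of two).
    public static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        // Hypothetical base offset in bytes, e.g. an 8-byte header plus a 4-byte length field.
        long headerBytes = 12;

        // Step 1: int alignment is always expected.
        if (!isAligned(headerBytes, 4)) {
            throw new AssertionError("must be 4 byte aligned");
        }

        // Step 2: correct to 8-byte alignment only if needed.
        long aligned = isAligned(headerBytes, 8) ? headerBytes : alignUp(headerBytes, 8);
        System.out.println(headerBytes + " -> " + aligned);
    }
}
```

With these hypothetical numbers the first check passes and the second step is what closes the gap between offsets 12 and 16 mentioned in the PR description.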
------------- PR: https://git.openjdk.org/jdk/pull/11044 From duke at openjdk.org Mon Feb 13 17:14:05 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Mon, 13 Feb 2023 17:14:05 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by the micro benchmark is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes.
>
> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>
> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>
> Baseline
>
> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>
> Crypto pmull
>
> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Change to UseCryptoPmullForCRC32 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12480/files - new: https://git.openjdk.org/jdk/pull/12480/files/d691eba6..f15ac5e9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12480&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12480&range=00-01 Stats: 7 lines in 3 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/12480.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12480/head:pull/12480 PR: https://git.openjdk.org/jdk/pull/12480 From duke at openjdk.org Mon Feb 13 17:14:05 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Mon, 13 Feb 2023 17:14:05 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 13:44:54 GMT, Andrew Haley wrote: >> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: >> >> Change to UseCryptoPmullForCRC32 > > src/hotspot/cpu/aarch64/globals_aarch64.hpp line 91: > >> 89: product(bool, UseCRC32, false, \ >> 90: "Use CRC32 instructions for CRC32 computation") \ >> 91: product(bool, UseCryptoPmull, false, \ > > Suggestion: > > product(bool, UsePmullForCRC32, false, \ Changed to UseCryptoPmullForCRC32 as there is another pmull version. 
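The improvement row in the PR description above appears to be the relative throughput gain between the two benchmark listings. The sketch below recomputes it for two input sizes, assuming the formula (optimized / baseline - 1) * 100; the formula and the class name are assumptions for illustration, while the scores are copied from the quoted listings.

```java
public class CrcImprovement {
    // Relative throughput gain in percent: (optimized / baseline - 1) * 100.
    public static double improvementPercent(double baselineOpsPerMs, double optimizedOpsPerMs) {
        return (optimizedOpsPerMs / baselineOpsPerMs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Scores for the 384-byte and 512-byte inputs, baseline first, crypto pmull second.
        System.out.printf("384 bytes: %.2f%%%n", improvementPercent(47229.319, 54784.739));
        System.out.printf("512 bytes: %.2f%%%n", improvementPercent(36584.565, 49297.609));
    }
}
```

Under that assumption the two results round to the 16.00% and 34.75% entries in the table.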
------------- PR: https://git.openjdk.org/jdk/pull/12480 From mseledtsov at openjdk.org Mon Feb 13 17:46:30 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Mon, 13 Feb 2023 17:46:30 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: <7DZ8e09Bh11t7wJFMJVwoPQzdaYtNvJd3-2njgDKlTA=.a2a8dd73-3886-4513-88a6-a595ecc3bf6a@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> <7DZ8e09Bh11t7wJFMJVwoPQzdaYtNvJd3-2njgDKlTA=.a2a8dd73-3886-4513-88a6-a595ecc3bf6a@github.com> Message-ID: On Thu, 9 Feb 2023 19:32:31 GMT, Mikhailo Seledtsov wrote: >> test/jtreg-ext/requires/VMProps.java line 517: >> >>> 515: // Returns comma-separated file names for stdout and stderr. >>> 516: private String redirectOutputToLogFile(String msg, ProcessBuilder pb, String fileNameBase) { >>> 517: if (!Boolean.getBoolean("jtreg.log.vmprops")) { >> >> Why not dump output always? It is printed only if exitcode is not zero anyway and doesn't add noise. >> However not having it in the case of failure require reproducing the bug. > > We have to be consistent. Since we agreed in earlier discussion to use property jtreg.log.vmprops as a guard for logging we should also use it for writing into a file. The user who runs the tests can always specify this property in the test execution script. If we take the approach of always logging to vmprops.log then this makes sense. If no objections I will always redirect output into a file. 
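For readers following the `redirectOutputToLogFile` discussion, here is a minimal sketch of redirecting a child process's stdout and stderr to files with `ProcessBuilder` and returning the comma-separated file names. The class name, the `.out.log`/`.err.log` suffixes and the `java -version` command are assumptions for illustration; this is not the VMProps code from the PR.

```java
import java.io.File;
import java.io.IOException;

public class RedirectSketch {
    // Builds the comma-separated "stdout,stderr" file names for a base name (hypothetical suffixes).
    public static String logFileNames(String fileNameBase) {
        return fileNameBase + ".out.log," + fileNameBase + ".err.log";
    }

    // Points the child's stdout and stderr at two files and returns their names.
    public static String redirectToFiles(ProcessBuilder pb, String fileNameBase) {
        String[] names = logFileNames(fileNameBase).split(",");
        pb.redirectOutput(new File(names[0]));
        pb.redirectError(new File(names[1]));
        return names[0] + "," + names[1];
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Any short-lived command works here; "java -version" is only an example.
        ProcessBuilder pb = new ProcessBuilder("java", "-version");
        String files = redirectToFiles(pb, "java-version");
        int exitCode = pb.start().waitFor();
        System.out.println("exit code " + exitCode + ", output in " + files);
    }
}
```

The files are created in the current working directory, which for a jtreg run would be the scratch directory mentioned earlier in the thread.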
------------- PR: https://git.openjdk.org/jdk/pull/12239 From mseledtsov at openjdk.org Mon Feb 13 17:46:33 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Mon, 13 Feb 2023 17:46:33 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v7] In-Reply-To: <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> <-mFwsAykBQ9bXZczBEHsQX_5f5OJY5E0zOMZ31aE7rk=.b19e0c2e-03bb-4a1f-8584-0fedc6b18272@github.com> Message-ID: On Thu, 9 Feb 2023 19:23:10 GMT, Leonid Mesnik wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Re-added conditional logging per review discussion > > test/jtreg-ext/requires/VMProps.java line 693: > >> 691: protected static void log(String msg) { >> 692: if (!Boolean.getBoolean("jtreg.log.vmprops")) { >> 693: return; > > Why not always log in to the file like vmprops.log? Also, add try/catch into the call() and redirect file to stderr if something is wrong. And redirect if the property is set for passed tests. I had a discussion with Leonid in person, and I am OK with such approach. To summarize: - always log to a file (vmprops.log or similar) - this way we always have information to base investigation on if things go wrong - conditionally log into stderr to avoid polluting the output and being too verbose If I do not hear any objections I will proceed with implementation tomorrow. 
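The approach summarized above (always log to a file, echo to stderr only on request) could look roughly like the sketch below. It assumes the `jtreg.log.vmprops` property from the thread as the guard for stderr; the class name, the method names and the sample message are hypothetical, and the eventual VMProps implementation may differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class VmPropsLogSketch {
    // Decides whether a message is also echoed to stderr.
    public static boolean echoToStderr() {
        return Boolean.getBoolean("jtreg.log.vmprops");
    }

    // Always appends to the log file; echoes to stderr only when the property is set.
    public static void log(Path logFile, String msg) {
        try {
            Files.writeString(logFile, msg + System.lineSeparator(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (echoToStderr()) {
            System.err.println(msg);
        }
    }

    public static void main(String[] args) {
        // Hypothetical message; the file name follows the "vmprops.log or similar" suggestion.
        log(Path.of("vmprops.log"), "checkDockerSupport(): entering");
    }
}
```

Running with `-Djtreg.log.vmprops=true` would then add the stderr output without changing what ends up in the file.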
------------- PR: https://git.openjdk.org/jdk/pull/12239 From duke at openjdk.org Mon Feb 13 17:53:28 2023 From: duke at openjdk.org (Scott Gibbons) Date: Mon, 13 Feb 2023 17:53:28 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Fri, 10 Feb 2023 23:18:47 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add URL to microbenchmark > > Marked as reviewed by redestad (Reviewer). @cl4es Can you please initiate the in-depth testing required for this fix? Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From rkennke at openjdk.org Mon Feb 13 17:57:37 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 13 Feb 2023 17:57:37 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> References: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> Message-ID: On Mon, 13 Feb 2023 16:15:56 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove stale method > > test/hotspot/jtreg/runtime/FieldLayout/ArrayBaseOffsets.java line 110: > >> 108: Asserts.assertEquals(unsafe.arrayBaseOffset(float[].class), intOffset, "Misplaced float array base"); >> 109: Asserts.assertEquals(unsafe.arrayBaseOffset(double[].class), longOffset, "Misplaced double array base"); >> 110: int expectedObjArrayOffset = (COOP || !Platform.is64bit()) ? intOffset : longOffset; > > `|| !Platform.is64Bit()` looks redundant, since `COOP` is guaranteed to be false on 32-bit? 
No, because on 32 bit this expression becomes true, which is what we want - the small (int) offset for reference arrays. ------------- PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Mon Feb 13 18:07:37 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 13 Feb 2023 18:07:37 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> References: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> Message-ID: On Mon, 13 Feb 2023 16:28:14 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove stale method > > src/hotspot/share/gc/shared/collectedHeap.cpp line 257: > >> 255: const size_t elements_per_word = HeapWordSize / sizeof(jint); >> 256: int base_offset_in_ints = arrayOopDesc::base_offset_in_ints(T_INT); >> 257: _filler_array_max_size = align_object_size((base_offset_in_ints + max_len) / elements_per_word); > > Isn't this expression susceptible to overflow, like the removed comment in `CollectedHeap::max_tlab_size` (below) states? I.e. max_len is probably very close to SIZE_MAX on 32-bit platforms, and adding the base offset gets dangerously close there. Not to mention that the positive side of the signed `int` domain is lower than SIZE_MAX to begin with. I think you need to keep doing the division `max_len / elements_per_word` first. As you say, the positive side of int32_t is much smaller than SIZE_MAX and thus we are basically guaranteed to not overflow here. Also, arrayOopDesc::max_array_length() is specifically designed to prevent such overflows (and even the more likely overflowing of size_t when converting). I am going to add a corresponding assert there. 
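The two positions in this exchange can be illustrated with a small model of a 32-bit `size_t`. The sketch below is written in Java and uses `long` to stand in for unsigned 32-bit arithmetic; the class name, the 4-int header and the bounds are assumptions for illustration, and the real bound on `max_len` comes from `arrayOopDesc::max_array_length()`, which is not shown in this thread.

```java
public class FillerSizeSketch {
    // Models a 32-bit size_t: true if base + maxLen is representable without wrapping.
    public static boolean fitsInUnsigned32(long baseOffsetInInts, long maxLen) {
        return baseOffsetInInts + maxLen <= 0xFFFF_FFFFL;
    }

    // Adds first, then divides, as in the quoted expression.
    public static long addThenDivide(long baseOffsetInInts, long maxLen, long elementsPerWord) {
        return (baseOffsetInInts + maxLen) / elementsPerWord;
    }

    public static void main(String[] args) {
        long base = 4; // hypothetical header size in ints

        // If maxLen is bounded by the positive range of a signed 32-bit int,
        // the sum stays well below 2^32 and adding before dividing is safe.
        System.out.println(fitsInUnsigned32(base, Integer.MAX_VALUE));

        // If maxLen were close to SIZE_MAX (the reviewer's concern), the sum would wrap.
        System.out.println(fitsInUnsigned32(base, 0xFFFF_FFFFL));
    }
}
```

This only shows why the bound on `max_len` matters; whether the bound holds in HotSpot is exactly what the proposed assert is meant to check.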
------------- PR: https://git.openjdk.org/jdk/pull/11044 From rehn at openjdk.org Mon Feb 13 19:04:11 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 13 Feb 2023 19:04:11 GMT Subject: RFR: 8302354: InstanceKlasss init state/thread should be atomic Message-ID: Hi, please consider. These states are read 'lock-less' should thus be atomic. Also the assert setting init thread state can be stricter. Passes t1-4 ------------- Commit messages: - Initial fix Changes: https://git.openjdk.org/jdk/pull/12542/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12542&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302354 Stats: 15 lines in 5 files changed: 1 ins; 0 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/12542.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12542/head:pull/12542 PR: https://git.openjdk.org/jdk/pull/12542 From pchilanomate at openjdk.org Mon Feb 13 19:51:12 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 13 Feb 2023 19:51:12 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime, PopFrame, ForceEarlyReturnXXX) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. 
Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I ran the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with three additional commits since the last revision: - use different @test to run with -XX:-VMContinuations - more removals of @requires vm.continuations - change BasicVirtualThread_klass -> BaseVirtualThread_klass ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/571f0d2e..99cf204c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=00-01 Stats: 105 lines in 22 files changed: 69 ins; 7 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Mon Feb 13 19:57:26 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 13 Feb 2023 19:57:26 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Sat, 11 Feb 2023 07:37:34 GMT, Alan Bateman wrote: > On tests, there are a few other tests that can be changed to drop `@requires vm.continuations`, e.g. the jshell tests that run with --enable-preview should no longer need the `@requires`. There are also a few JDI and at least one AppCDS test. A search will find them. > I removed the vm.continuations requirement on those tests (only SuspendAfterDeath.java and RedefineRunningMethods_Shared.java were actually creating a JVMTI environment though). >The tests re-running with -XX:-VMContinuations is fine but it does mean they will run twice on the ports without an implementation. The test runtime/jni/IsVirtualThread/IsVirtualThread.java is an example where it only runs once on these ports, at the cost of additional `@test` tag. 
> I see, I removed the extra run and created another @test tag for that with `@requires vm.continuations`, like it is done in IsVirtualThread.java. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Mon Feb 13 19:57:30 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 13 Feb 2023 19:57:30 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Mon, 13 Feb 2023 19:51:12 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). 
The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with three additional commits since the last revision: > > - use different @test to run with -XX:-VMContinuations > - more removals of @requires vm.continuations > - change BasicVirtualThread_klass -> BaseVirtualThread_klass Also fixed a bug in the description. PopFrame and ForceEarlyReturnXXX don't actually need to return UNSUPPORTED_OPERATION (optionally supported by the specs), but when looking at jvmtiEnv.cpp, I saw the "// No support for virtual threads (yet)" comment, so I just changed them to return UNSUPPORTED_OPERATION for the alternative implementation too to match the behavior with normal virtual threads. But I could revert that if agreed. In fact I left SetLocalXXX unmodified where currently the implementation has different support on virtual threads than platform threads. 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Mon Feb 13 19:57:33 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 13 Feb 2023 19:57:33 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Sat, 11 Feb 2023 07:31:00 GMT, Alan Bateman wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with three additional commits since the last revision: >> >> - use different @test to run with -XX:-VMContinuations >> - more removals of @requires vm.continuations >> - change BasicVirtualThread_klass -> BaseVirtualThread_klass > > src/hotspot/share/classfile/vmClassMacros.hpp line 92: > >> 90: do_klass(Thread_Constants_klass, java_lang_Thread_Constants ) \ >> 91: do_klass(ThreadGroup_klass, java_lang_ThreadGroup ) \ >> 92: do_klass(BasicVirtualThread_klass, java_lang_BaseVirtualThread ) \ > > It should be BaseVirtualThread_klass rather than BasicVirtualThread_klass (probably my fault when adding the alt implementation). Fixed. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From coleenp at openjdk.org Mon Feb 13 20:18:29 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Feb 2023 20:18:29 GMT Subject: RFR: 8302354: InstanceKlasss init state/thread should be atomic In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 18:56:19 GMT, Robbin Ehn wrote: > Hi, please consider. > > These states are read 'lock-less' should thus be atomic. > Also the assert setting init thread state can be stricter. > > Passes t1-4 This looks good but can you edit the bug and PR to not have so many 'sss' in InstanceKlass ? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12542 From iklam at openjdk.org Mon Feb 13 20:24:31 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 13 Feb 2023 20:24:31 GMT Subject: RFR: 8302108: Clean up placeholder supername code [v2] In-Reply-To: References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Fri, 10 Feb 2023 19:11:17 GMT, Coleen Phillimore wrote: >> Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Fix test for cleanup for nulling out supername when no longer used. > - Merge branch 'master' into placeholder-cleanup > - 8302108: Clean up placeholder supername code LGTM. Just a small nit. src/hotspot/share/classfile/placeholders.hpp line 102: > 100: bool remove_seen_thread(JavaThread* thread, PlaceholderTable::classloadAction action); > 101: > 102: SeenThread* superThreadQ() const { return _superThreadQ; } If you just want to change these function to private, instead of moving them, I would suggest adding `private:` and `public:` keywords around them. That way we can limit the delta. ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12495 From sspitsyn at openjdk.org Mon Feb 13 20:28:29 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 13 Feb 2023 20:28:29 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v7] In-Reply-To: References: Message-ID: > Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. > New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. > > Testing: > mach5 jobs are TBD (preliminary testing was completed): > - all JVMTI, JDWP, JDI and JDB tests have to be run > - Kitchensink > - tier5 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: - Merge branch 'master' into br21 - minor comments are resolved - fix race between VTMS_transition_disable_for_one and start_VTMS_transition - use owned_by_self vs is_locked in JvmtiVTMSTransition_lock asserts - address wqsecond round review comments - review comments are addressed - fix trailing white spaces in javaClasses.cpp and jvmtiThreadState.cpp - 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions ------------- Changes: https://git.openjdk.org/jdk/pull/11690/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11690&range=06 Stats: 198 lines in 8 files changed: 129 ins; 16 del; 53 mod Patch: https://git.openjdk.org/jdk/pull/11690.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11690/head:pull/11690 PR: https://git.openjdk.org/jdk/pull/11690 From sspitsyn at openjdk.org Mon Feb 13 20:28:33 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 13 Feb 2023 20:28:33 GMT Subject: RFR: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions [v6] In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 01:43:07 GMT, Serguei Spitsyn wrote: >> Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. >> New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. 
>> >> Testing: >> mach5 jobs are TBD (preliminary testing was completed): >> - all JVMTI, JDWP, JDI and JDB tests have to be run >> - Kitchensink >> - tier5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > minor comments are resolved Thank you a lot for reviewing, Leonid! Will update the copyrights. ------------- PR: https://git.openjdk.org/jdk/pull/11690 From coleenp at openjdk.org Mon Feb 13 20:34:28 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Feb 2023 20:34:28 GMT Subject: RFR: 8302108: Clean up placeholder supername code [v2] In-Reply-To: References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Mon, 13 Feb 2023 19:20:03 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Fix test for cleanup for nulling out supername when no longer used. >> - Merge branch 'master' into placeholder-cleanup >> - 8302108: Clean up placeholder supername code > > src/hotspot/share/classfile/placeholders.hpp line 102: > >> 100: bool remove_seen_thread(JavaThread* thread, PlaceholderTable::classloadAction action); >> 101: >> 102: SeenThread* superThreadQ() const { return _superThreadQ; } > > If you just want to change these function to private, instead of moving them, I would suggest adding `private:` and `public:` keywords around them. That way we can limit the delta. I really wanted them moved to the section of the class where the other private methods are. This class isn't so overgrown that we need to keep switching from private to public, and I want the public methods to be together. Java is nice in that you can have 'private' and 'public' keywords on each function, but I'd still like to keep the public ones together. 
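A minimal sketch of the layout preference described above: one `public:` section, then one `private:` section, rather than switching access specifiers around individual functions. The class `Entry` and its members are hypothetical and are not taken from `placeholders.hpp`.

```cpp
#include <cassert>

// Hypothetical class for illustration only; not the real PlaceholderEntry.
class Entry {
 public:
  // All public methods are kept together in a single section.
  void add_waiter() { bump(1); }

  // Returns false when there is nothing to remove.
  bool remove_waiter() {
    if (_waiters == 0) return false;
    bump(-1);
    return true;
  }

  int waiters() const { return _waiters; }

 private:
  // Helpers live next to the other private members, so the class
  // switches access level exactly once.
  void bump(int delta) { _waiters += delta; }

  int _waiters = 0;
};
```

The alternative raised in the review, wrapping the existing functions in `private:` / `public:` where they already sit, gives a smaller diff; the layout above trades a larger diff for a header that reads top to bottom by access level.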
------------- PR: https://git.openjdk.org/jdk/pull/12495 From rehn at openjdk.org Mon Feb 13 20:40:29 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 13 Feb 2023 20:40:29 GMT Subject: RFR: 8302354: InstanceKlass init state/thread should be atomic In-Reply-To: References: Message-ID: <7074xxAAiF-jADZ2q1M-pFLOnmKszY1WJ8LmGhT_fJo=.6bfee5c5-1d0b-4d8e-b4a6-5ff9e3d1c758@github.com> On Mon, 13 Feb 2023 20:15:46 GMT, Coleen Phillimore wrote: > This looks good but can you edit the bug and PR to not have so many 'sss' in InstanceKlass ? Thanks! Fixed! ------------- PR: https://git.openjdk.org/jdk/pull/12542 From coleenp at openjdk.org Mon Feb 13 21:02:40 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Feb 2023 21:02:40 GMT Subject: RFR: 8302108: Clean up placeholder supername code [v2] In-Reply-To: References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Fri, 10 Feb 2023 19:11:17 GMT, Coleen Phillimore wrote: >> Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Fix test for cleanup for nulling out supername when no longer used. > - Merge branch 'master' into placeholder-cleanup > - 8302108: Clean up placeholder supername code Thanks for reviewing Ioi! 
------------- PR: https://git.openjdk.org/jdk/pull/12495 From coleenp at openjdk.org Mon Feb 13 21:02:43 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 13 Feb 2023 21:02:43 GMT Subject: Integrated: 8302108: Clean up placeholder supername code In-Reply-To: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> References: <21a14dI3wKifmxIScLOW9c1C-ZTFVR7OPQ3wlvA_X0I=.79a4aacd-1d8b-4c5d-841e-20ce7a028917@github.com> Message-ID: On Thu, 9 Feb 2023 15:46:47 GMT, Coleen Phillimore wrote: > Please review change to make PlaceholderEntry::_supername a SymbolHandle to correctly refcount the symbol rather than have ad-hoc code. It also moves the private functions to private and asserts invariant that find_and_remove must always find a matching entry. Finally, this also more eagerly removes the supername symbol from the placeholder entry, so I have to fix the test in https://github.com/openjdk/jdk/pull/12491 before pushing this. > Tested with tier1-4. This pull request has now been integrated. Changeset: abbeb7e4 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/abbeb7e4d2f5739dff77b2c79e675fb69368db1e Stats: 53 lines in 4 files changed: 13 ins; 24 del; 16 mod 8302108: Clean up placeholder supername code Reviewed-by: dholmes, iklam ------------- PR: https://git.openjdk.org/jdk/pull/12495 From ccheung at openjdk.org Tue Feb 14 01:09:43 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 14 Feb 2023 01:09:43 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: References: Message-ID: On Fri, 3 Feb 2023 05:23:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Simplified relocation by using buffered address instead of requested address src/hotspot/share/cds/heapShared.cpp line 485: > 483: } > 484: bool success = archive_reachable_objects_from(level, subgraph_info, oop_field, is_closed_archive); > 485: int root_index = append_root(oop_field); Maybe add an assert after line 484 to ensure the `success` status? ------------- PR: https://git.openjdk.org/jdk/pull/12304 From sspitsyn at openjdk.org Tue Feb 14 01:32:03 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 14 Feb 2023 01:32:03 GMT Subject: Integrated: 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions In-Reply-To: References: Message-ID: On Thu, 15 Dec 2022 11:51:10 GMT, Serguei Spitsyn wrote: > Now the `JvmtiVTMSTransitionDisabler` mechanism supports disabling VTMS transitions for all virtual threads only. It should also support disabling transitions for any specific virtual thread as well. 
This will improve scalability of the JVMTI functions operating on target virtual threads as the functions can be executed concurrently without blocking each other execution when target virtual threads are different. > New constructor `JvmtiVTMSTransitionDisabler(jthread vthread)` is added which has jthread parameter of the target virtual thread. > > Testing: > mach5 jobs are TBD (preliminary testing was completed): > - all JVMTI, JDWP, JDI and JDB tests have to be run > - Kitchensink > - tier5 This pull request has now been integrated. Changeset: 13b1ebba Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/13b1ebba276940ff83e53b8ec3659280b3574204 Stats: 198 lines in 8 files changed: 129 ins; 16 del; 53 mod 8298853: JvmtiVTMSTransitionDisabler should support disabling one virtual thread transitions Reviewed-by: pchilanomate, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/11690 From sviswanathan at openjdk.org Tue Feb 14 02:00:58 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 14 Feb 2023 02:00:58 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com> On Thu, 9 Feb 2023 18:08:15 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ±
102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add URL to microbenchmark src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2399: > 2397: VM_Version::supports_avx512bw()) { > 2398: __ cmpl(length, 31); // 32-bytes is break-even for AVX-512 > 2399: __ jcc(Assembler::lessEqual, L_bruteForce); The AVX2 code needs the length to be at least 0x2c (44) bytes. We could go directly to the non-AVX code instead of L_bruteForce here, which would save one subtract/branch. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2658: > 2656: // Check for buffer too small (for algorithm) > 2657: __ subl(length, 0x2c); > 2658: __ jcc(Assembler::lessEqual, L_tailProc); This could be Assembler::less instead of Assembler::lessEqual. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2699: > 2697: __ addptr(dest, 0x18); > 2698: __ subl(length, 0x20); > 2699: __ jcc(Assembler::lessEqual, L_tailProc); This could be Assembler::less instead of Assembler::lessEqual. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jiefu at openjdk.org Tue Feb 14 03:36:29 2023 From: jiefu at openjdk.org (Jie Fu) Date: Tue, 14 Feb 2023 03:36:29 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 Message-ID: Hi all, Please review this small change to fix the client vm build failure. "__" redefined [-Werror] occurred if build without C2. Thanks.
Best regards, Jie ------------- Commit messages: - 8302368: [ZGC] Client build fails after JDK-8300255 Changes: https://git.openjdk.org/jdk/pull/12547/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12547&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302368 Stats: 11 lines in 3 files changed: 3 ins; 5 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12547.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12547/head:pull/12547 PR: https://git.openjdk.org/jdk/pull/12547 From iklam at openjdk.org Tue Feb 14 04:53:36 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 04:53:36 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values Message-ID: Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. Fixed two issues found by the assert: - `NonNMethodCodeHeapSize` - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` ------------- Commit messages: - 8224980: FLAG_SET_ERGO silently ignores invalid values Changes: https://git.openjdk.org/jdk/pull/12549/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8224980 Stats: 41 lines in 5 files changed: 22 ins; 1 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/12549.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12549/head:pull/12549 PR: https://git.openjdk.org/jdk/pull/12549 From iveresov at openjdk.org Tue Feb 14 04:53:37 2023 From: iveresov at openjdk.org (Igor Veresov) Date: Tue, 14 Feb 2023 04:53:37 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 04:03:36 GMT, Ioi Lam wrote: > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. 
> > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` Marked as reviewed by iveresov (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12549 From dholmes at openjdk.org Tue Feb 14 04:53:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 04:53:39 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values In-Reply-To: References: Message-ID: <_qn4sevll_Nuotb4JaMdtzeOe_AagXZ772m69kqC8yw=.d48f9e1e-7956-472c-9b56-e10cc64a05bf@github.com> On Tue, 14 Feb 2023 04:03:36 GMT, Ioi Lam wrote: > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. > > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 70: > 68: if (err != JVMFlag::SUCCESS) { > 69: if (origin == JVMFlagOrigin::ERGONOMIC) { > 70: fatal("FLAG_SET_ERGO cannot be used to set an invalid value for %s", flag->name()); This would be okay if we are guaranteed to see the `printError` message from the constraint function, but I don't think we are. If I'm reading things correctly we need `verbose` to be true to get the error message, but `verbose` comes from `AtParse` which seems to be set to 0 and never modified. ?? I think we need to force `verbose` to be true if `origin == JVMFlagOrigin::ERGONOMIC)`. 
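A toy sketch of the suggestion above, assuming a setter that only reports constraint violations when `verbose` is set. Every name here (`Origin`, `check_constraint`, `set_flag`, the `[0, 30]` range) is hypothetical and stands in for the real JVMFlag machinery, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <string>

// Hypothetical origins; the real code has more.
enum class Origin { COMMAND_LINE, ERGONOMIC };

// Toy constraint: valid values are 0..30. A diagnostic is written
// only when 'verbose' is true, mirroring the concern in the review.
inline bool check_constraint(int value, bool verbose, std::string& diagnostics) {
  if (value >= 0 && value <= 30) return true;
  if (verbose) {
    diagnostics = "value " + std::to_string(value) + " is outside [0, 30]";
  }
  return false;
}

// Toy setter: ergonomic updates always run the constraint verbosely,
// so a rejected value is never silent.
inline bool set_flag(int& flag, int value, Origin origin, bool verbose,
                     std::string& diagnostics) {
  if (origin == Origin::ERGONOMIC) {
    verbose = true;
  }
  if (!check_constraint(value, verbose, diagnostics)) {
    return false;  // flag is left unchanged
  }
  flag = value;
  return true;
}
```

With this shape, a failing ergonomic update always leaves a message behind for whatever reports the failure, while other origins keep their existing quiet behaviour.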
------------- PR: https://git.openjdk.org/jdk/pull/12549 From iklam at openjdk.org Tue Feb 14 05:15:15 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 05:15:15 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: References: Message-ID: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. > > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora comment - make sure errors are printed for FLAG_SET_ERGO ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12549/files - new: https://git.openjdk.org/jdk/pull/12549/files/a41c3989..adef377e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12549.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12549/head:pull/12549 PR: https://git.openjdk.org/jdk/pull/12549 From iklam at openjdk.org Tue Feb 14 05:15:15 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 05:15:15 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: <_qn4sevll_Nuotb4JaMdtzeOe_AagXZ772m69kqC8yw=.d48f9e1e-7956-472c-9b56-e10cc64a05bf@github.com> References: <_qn4sevll_Nuotb4JaMdtzeOe_AagXZ772m69kqC8yw=.d48f9e1e-7956-472c-9b56-e10cc64a05bf@github.com> Message-ID: On Tue, 14 Feb 2023 04:28:49 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora comment - make sure errors are 
printed for FLAG_SET_ERGO > > src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 70: > >> 68: if (err != JVMFlag::SUCCESS) { >> 69: if (origin == JVMFlagOrigin::ERGONOMIC) { >> 70: fatal("FLAG_SET_ERGO cannot be used to set an invalid value for %s", flag->name()); > > This would be okay if we are guaranteed to see the `printError` message from the constraint function, but I don't think we are. If I'm reading things correctly we need `verbose` to be true to get the error message, but `verbose` comes from `AtParse` which seems to be set to 0 and never modified. ?? > I think we need to force `verbose` to be true if `origin == JVMFlagOrigin::ERGONOMIC)`. I've forced `verbose` to be true in another code path. I added the same logic here to be complete. ------------- PR: https://git.openjdk.org/jdk/pull/12549 From jcking at openjdk.org Tue Feb 14 05:24:41 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 14 Feb 2023 05:24:41 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. Drive by comment. Glad to see this, was going to propose this at some point as well. ------------- PR: https://git.openjdk.org/jdk/pull/12521 From sspitsyn at openjdk.org Tue Feb 14 05:45:34 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 14 Feb 2023 05:45:34 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint Message-ID: The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. 
Testing: mach5 jobs are in progress: Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) ------------- Commit messages: - 8299240: rank of JvmtiVTMSTransition_lock can be safepoint Changes: https://git.openjdk.org/jdk/pull/12550/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12550&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8299240 Stats: 49 lines in 6 files changed: 8 ins; 30 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/12550.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12550/head:pull/12550 PR: https://git.openjdk.org/jdk/pull/12550 From iklam at openjdk.org Tue Feb 14 05:48:26 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 05:48:26 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v5] In-Reply-To: References: Message-ID: > Goals > > - Simplify the writing of the CDS archive heap > - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. > - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector > > Implementation - the following runs inside a safepoint so heap objects aren't moving > > - Find all the objects that should be archived > - Allocate buffer using a GrowableArray > - Copy all "open" objects into the buffer > - Copy all "closed" objects into the buffer > - Make sure the heap image can be mapped with all possible G1 region sizes: > - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) > - Align the bottom of the "open" and "closed" objects at 1MB boundary > - Write the buffer to disk as the image for the archive heap > > Testing: tiers 1 ~ 5 Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review - @calvinccheung review comment -- added assert) - Simplified relocation by using buffered address instead of requested address - @ashu-mehra review; added comments about relocating HeapShared::roots() - fixed windows and 32-bit builds which do not support archive heap - clean up - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 2: no intermediate buffer) - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 1: with buffer copying) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12304/files - new: https://git.openjdk.org/jdk/pull/12304/files/00dcedd7..f72e1803 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=03-04 Stats: 60984 lines in 1510 files changed: 25291 ins; 16216 del; 19477 mod Patch: https://git.openjdk.org/jdk/pull/12304.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12304/head:pull/12304 PR: https://git.openjdk.org/jdk/pull/12304 From fyang at openjdk.org Tue Feb 14 05:50:43 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 14 Feb 2023 05:50:43 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 03:29:51 GMT, Jie Fu wrote: > Hi all, > > Please review this small change to fix the client vm build failure. > > "__" redefined [-Werror] occurred if build without C2. > > Thanks. > Best regards, > Jie Changes requested by fyang (Reviewer). src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp line 447: > 445: #endif // COMPILER1 > 446: > 447: #undef __ I think we should remove the "#undef __" at line #354 of this file at the same time for riscv. 
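For readers unfamiliar with the `__` shorthand, here is a minimal sketch of the define/undef pairing that the "redefined" diagnostic is about. `ToyAssembler`, `Wrapper` and the emit functions are hypothetical; only the macro idiom itself is taken from the thread. Note that `__` is a reserved identifier in standard C++, so this relies on compilers tolerating it, as the quoted code does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler; not a HotSpot class.
struct ToyAssembler {
  std::vector<std::string> ops;
  void emit(const std::string& op) { ops.push_back(op); }
};

// Hypothetical holder, used to give the second section a different expansion.
struct Wrapper {
  ToyAssembler* masm;
};

// Section 1: "__" expands to a call through a local 'masm' pointer.
#define __ masm->
inline void emit_load_barrier(ToyAssembler* masm) {
  __ emit("load");
  __ emit("test");
}
#undef __  // must come before a definition with a different expansion

// Section 2: "__" is defined again with a different expansion. Without
// the #undef above, compilers diagnose the redefinition, and -Werror
// turns that diagnostic into a build failure.
#define __ wrapper.masm->
inline void emit_stub(Wrapper& wrapper) {
  __ emit("stub");
}
#undef __
```

The point of the pairing is that each `#define __` is matched by exactly one `#undef __` on every preprocessor path, including paths where a conditionally compiled section (for example one guarded by a compiler feature macro) is left out.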
------------- PR: https://git.openjdk.org/jdk/pull/12547 From jiefu at openjdk.org Tue Feb 14 06:04:11 2023 From: jiefu at openjdk.org (Jie Fu) Date: Tue, 14 Feb 2023 06:04:11 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 [v2] In-Reply-To: References: Message-ID: > Hi all, > > Please review this small change to fix the client vm build failure. > > "__" redefined [-Werror] occurred if build without C2. > > Thanks. > Best regards, > Jie Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Address review comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12547/files - new: https://git.openjdk.org/jdk/pull/12547/files/b223f2c5..d25035c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12547&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12547&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12547.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12547/head:pull/12547 PR: https://git.openjdk.org/jdk/pull/12547 From eosterlund at openjdk.org Tue Feb 14 06:04:11 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 14 Feb 2023 06:04:11 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 [v2] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 06:00:10 GMT, Jie Fu wrote: >> Hi all, >> >> Please review this small change to fix the client vm build failure. >> >> "__" redefined [-Werror] occurred if build without C2. >> >> Thanks. >> Best regards, >> Jie > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Address review comment Looks good. Thanks for fixing. ------------- Marked as reviewed by eosterlund (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12547 From jiefu at openjdk.org Tue Feb 14 06:04:12 2023 From: jiefu at openjdk.org (Jie Fu) Date: Tue, 14 Feb 2023 06:04:12 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 [v2] In-Reply-To: References: Message-ID: <38od2oxwdScXj0ZesHkCRAnoXkxRoFmB4xVA6_U_Qao=.85f7994e-8032-4f7e-a8da-e2532010ae39@github.com> On Tue, 14 Feb 2023 05:46:24 GMT, Fei Yang wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comment > > src/hotspot/cpu/riscv/gc/z/zBarrierSetAssembler_riscv.cpp line 447: > >> 445: #endif // COMPILER1 >> 446: >> 447: #undef __ > > I think we should remove the "#undef __" at line #354 of this file at the same time for riscv. Good catch! Updated. Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/12547 From dholmes at openjdk.org Tue Feb 14 06:16:47 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:16:47 GMT Subject: RFR: 8302354: InstanceKlass init state/thread should be atomic In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 18:56:19 GMT, Robbin Ehn wrote: > Hi, please consider. > > These states are read 'lock-less' should thus be atomic. > Also the assert setting init thread state can be stricter. > > Passes t1-4 src/hotspot/share/oops/instanceKlass.hpp line 237: > 235: > 236: Monitor* _init_monitor; // mutual exclusion to _init_state and _init_thread. > 237: JavaThread* volatile _init_thread; // Pointer to current thread doing initialization (to handle recursive initialization) `volatile` is not strictly needed here - see discussion above - but flags this as a "lock-free mutable field". (Though a special type - again see above.) src/hotspot/share/oops/instanceKlass.hpp line 237: > 235: > 236: Monitor* _init_monitor; // mutual exclusion to _init_state and _init_thread. 
> 237: JavaThread* volatile _init_thread; // Pointer to current thread doing initialization (to handle recursive initialization) Nit: please re-align comments with preceding lines. Thanks src/hotspot/share/oops/instanceKlass.hpp line 514: > 512: bool is_being_initialized() const { return init_state() == being_initialized; } > 513: bool is_in_error_state() const { return init_state() == initialization_error; } > 514: bool is_init_thread(JavaThread *thread) { return thread == Atomic::load(&_init_thread); } Just to comment ... `is_init_thread` only ever makes sense to be called by passing the current thread, in which case there is no need for an Atomic::load because a thread's own writes are always visible to itself, and no other thread will ever write a different thread's address into the field. So you could simplify this by doing: bool is_init_thread(JavaThread *current) { assert(current == JavaThread::current(), "invariant"); return current == _init_thread; } src/hotspot/share/oops/instanceKlass.hpp line 1083: > 1081: assert((thread == JavaThread::current() && _init_thread == nullptr) || > 1082: (thread == nullptr && _init_thread == JavaThread::current()), "Only one thread is allowed to own initialization"); > 1083: Atomic::store(&_init_thread, thread); As a corollary to the above comment, Atomic::store is not needed here when only the current thread can set the value to itself, or change the value from itself to nullptr. ------------- PR: https://git.openjdk.org/jdk/pull/12542 From dholmes at openjdk.org Tue Feb 14 06:30:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:30:45 GMT Subject: RFR: 8302354: InstanceKlass init state/thread should be atomic In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 18:56:19 GMT, Robbin Ehn wrote: > Hi, please consider. > > These states are read 'lock-less' should thus be atomic. > Also the assert setting init thread state can be stricter. > > Passes t1-4 Scratch previous review comments.
Use of Atomic::load/store guards against the theoretical possibility of non-atomic pointer reads/writes. I was confusing this with visibility issues. Sorry for the noise. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12542 From dholmes at openjdk.org Tue Feb 14 06:31:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:31:45 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 09:26:07 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Some more fixes LGTM! Thanks ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12326 From dholmes at openjdk.org Tue Feb 14 06:43:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:43:42 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 14:39:00 GMT, Johannes Bechberger wrote: > Fixes the issue by applying a fix that is already implemented in PPC. src/hotspot/cpu/x86/frame_x86.cpp line 72: > 70: > 71: // unextended sp must be within the stack > 72: if (!thread->is_in_full_stack_checked(unextended_sp)) { I'm not at all sure this relaxation is valid. What are you basing this on? ------------- PR: https://git.openjdk.org/jdk/pull/12535 From fyang at openjdk.org Tue Feb 14 06:54:45 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 14 Feb 2023 06:54:45 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 [v2] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 06:04:11 GMT, Jie Fu wrote: >> Hi all, >> >> Please review this small change to fix the client vm build failure. >> >> "__" redefined [-Werror] occurred if build without C2. >> >> Thanks. >> Best regards, >> Jie > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Address review comment Updated changes LGTM. Thanks. ------------- Marked as reviewed by fyang (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12547 From dholmes at openjdk.org Tue Feb 14 06:58:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:58:46 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> Message-ID: On Tue, 14 Feb 2023 05:15:15 GMT, Ioi Lam wrote: >> Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. >> >> Fixed two issues found by the assert: >> >> - `NonNMethodCodeHeapSize` >> - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora comment - make sure errors are printed for FLAG_SET_ERGO src/hotspot/share/code/codeCache.cpp line 1197: > 1195: } else { > 1196: // Use a single code heap > 1197: FLAG_SET_ERGO(NonNMethodCodeHeapSize, (uintx)os::vm_page_size()); Sanity check: this code does execute after `os::init()` doesn't it? src/hotspot/share/compiler/compilerDefinitions.cpp line 134: > 132: intx CompilerConfig::jvmflag_scaled_freq_log(intx freq_log) { > 133: return MAX2((intx)0, MIN2(scaled_freq_log(freq_log), (intx)30)); > 134: } If the passed in arguments are out of range this appears to cap them within range but the initial error is never made known to the caller. ?? src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 71: > 69: if (err != JVMFlag::SUCCESS) { > 70: if (origin == JVMFlagOrigin::ERGONOMIC) { > 71: fatal("FLAG_SET_ERGO cannot be used to set an invalid value for %s", flag->name()); Just occurred to me, should the `fatal` rather be a `vm_exit_during_initialization`? 
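The concern raised above for `jvmflag_scaled_freq_log`, that clamping keeps the value in range but hides the out-of-range input from the caller, can be illustrated with a small standalone sketch. This is plain C++ and not the HotSpot code: the helper names `clamp_freq_log` and `checked_freq_log` are hypothetical, and only the 0..30 bounds are taken from the quoted line.

```cpp
#include <algorithm>
#include <cstdint>

// Silent variant: same shape as MAX2(0, MIN2(value, 30)) in the quoted code.
// An out-of-range input is quietly pulled back into [0, 30].
inline std::intptr_t clamp_freq_log(std::intptr_t value) {
  return std::max<std::intptr_t>(0, std::min<std::intptr_t>(value, 30));
}

// Reporting variant (hypothetical): the caller learns that the input was
// out of range and can decide whether to warn, clamp, or fail.
inline bool checked_freq_log(std::intptr_t value, std::intptr_t* out) {
  if (value < 0 || value > 30) {
    return false;  // *out is left untouched on failure
  }
  *out = value;
  return true;
}
```

With the silent variant a caller passing 45 simply gets 30 back; with the reporting variant the same call returns false, which is the information the review comment says is currently lost.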
src/hotspot/share/runtime/globals_extension.hpp line 92: > 90: > 91: // FLAG_SET_ERGO must be always be called with a valid value. You cannot pass in a unvalidated > 92: // value and expect a return code for success/failure. Suggestion: // FLAG_SET_ERGO must be always be called with a valid value. If an invalid value // is detected then the VM will exit. ------------- PR: https://git.openjdk.org/jdk/pull/12549 From duke at openjdk.org Tue Feb 14 07:06:44 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Tue, 14 Feb 2023 07:06:44 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: Message-ID: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> On Tue, 14 Feb 2023 06:41:21 GMT, David Holmes wrote: >> Fixes the issue by applying a fix that is already implemented in PPC. > > src/hotspot/cpu/x86/frame_x86.cpp line 72: > >> 70: >> 71: // unextended sp must be within the stack >> 72: if (!thread->is_in_full_stack_checked(unextended_sp)) { > > I'm not at all sure this relaxation is valid. What are you basing this on? The test fails without this relaxation and the relaxation has been used on PPC to solve a similar issue a few years back. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rkennke at openjdk.org Tue Feb 14 07:22:28 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 07:22:28 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v20] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. 
> > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Add asserts in and around arrayOop::max_array_length() - Fix intendation in test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/24a95a7c..7254fcd1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=18-19 Stats: 74 lines in 4 files changed: 53 ins; 18 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Tue Feb 14 07:37:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 07:37:26 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v21] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. 
> > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix assert in collectedHeap ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/7254fcd1..0c6bdc73 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=19-20 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From duke at openjdk.org Tue Feb 14 07:40:48 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Tue, 14 Feb 2023 07:40:48 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> Message-ID: On Tue, 14 Feb 2023 07:04:16 GMT, Johannes Bechberger wrote: >> src/hotspot/cpu/x86/frame_x86.cpp line 72: >> >>> 70: >>> 71: // unextended sp must be within the stack >>> 72: if (!thread->is_in_full_stack_checked(unextended_sp)) { >> >> I'm not at all sure this relaxation is valid. What are you basing this on? 
> > The test fails without this relaxation and the relaxation has been used on PPC to solve a similar issue a few years back. The original assumption does not always hold anymore and should therefore be relaxed. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rkennke at openjdk.org Tue Feb 14 07:42:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 07:42:23 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v22] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16.
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Remove debug output ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/0c6bdc73..67d616a0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=20-21 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From jiefu at openjdk.org Tue Feb 14 07:53:43 2023 From: jiefu at openjdk.org (Jie Fu) Date: Tue, 14 Feb 2023 07:53:43 GMT Subject: RFR: 8302368: [ZGC] Client build fails after JDK-8300255 [v2] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 06:52:10 GMT, Fei Yang wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comment > > Updated changes LGTM. Thanks. Thanks @RealFYang and @fisk for the review. I'll push it tonight to fix the build breakage as soon as possible if there is no other comments. ------------- PR: https://git.openjdk.org/jdk/pull/12547 From sjohanss at openjdk.org Tue Feb 14 07:55:42 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 14 Feb 2023 07:55:42 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 11:40:24 GMT, Thomas Schatzl wrote: > Hi all, > > please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. 
Previously it had been used with CMS too, but that one's gone now. > > Testing: local compilation, gha > > Based on PR#12511 > > Thanks, > Thomas Looks good and trivial. ------------- Marked as reviewed by sjohanss (Reviewer). PR: https://git.openjdk.org/jdk/pull/12531 From rkennke at openjdk.org Tue Feb 14 08:33:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 08:33:23 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Clarify comment on arrayOopDesc::max_array_length() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/67d616a0..eab5720a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=21-22 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Tue Feb 14 08:33:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 08:33:23 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v17] In-Reply-To: References: Message-ID: <9pOXuIGGgh1tRnV1IIrXqZD_UM4ZvOEW9WOsIxiBfec=.9f114617-8e9a-46b5-99e0-d8dff3688db3@github.com> On Fri, 27 Jan 2023 17:12:37 GMT, Coleen Phillimore wrote: > I'm following occurrences of header_size() in the code because that might be where we have assumptions about where the payload for oops starts. Ignoring the occurrences from metadata, there is one unused one in instanceOop.hpp, not related to your change that I think should be removed. The one OopDesc::header_size() is used by min_dummy_object_size and min_fill_size in CollectedHeap. These usages look okay to me but if they were rewritten to use heap_word_size(), maybe that would be better. > > I ran tier1-8 on this for linux-x64-debug with no failures. Thanks, Coleen, for reviewing. 
I already removed the one occurrence of header_size() in instanceOop.hpp. The oopDesc::header_size() is problematic not only because it assumes a word-aligned object header, but also because it depends on sizeof(oopDesc) - with Lilliput, this will no longer be a useful way to measure. However, I'd probably tackle it in a separate PR to avoid scope creep? ------------- PR: https://git.openjdk.org/jdk/pull/11044 From stefank at openjdk.org Tue Feb 14 09:00:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 14 Feb 2023 09:00:55 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> References: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> Message-ID: On Tue, 7 Feb 2023 17:25:02 GMT, Justin King wrote: >> - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. >> - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. >> - Create aliases for old `mtXXX` names. >> - Remove `mt_number_of_types` from the enumeration. >> - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. >> - `NMTUtil` references are not updated to avoid increasing patch size. >> - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. >> - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. >> >> This change does not update variable verbiage. That should be something done over time. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Add precompiled header > > Signed-off-by: Justin King FWIW, I strongly dislike the uppercase MEMFLAGS name. I wouldn't mind this rename at all.
------------- PR: https://git.openjdk.org/jdk/pull/12454 From ayang at openjdk.org Tue Feb 14 09:02:50 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 14 Feb 2023 09:02:50 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v5] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:48:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. >> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review > - @calvinccheung review comment -- added assert) > - Simplified relocation by using buffered address instead of requested address > - @ashu-mehra review; added comments about relocating HeapShared::roots() > - fixed windows and 32-bit builds which do not support archive heap > - clean up > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 2: no intermediate buffer) > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 1: with buffer copying) The G1 part looks good. ------------- Marked as reviewed by ayang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12304 From tschatzl at openjdk.org Tue Feb 14 09:02:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 14 Feb 2023 09:02:50 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v5] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:48:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review > - @calvinccheung review comment -- added assert) > - Simplified relocation by using buffered address instead of requested address > - @ashu-mehra review; added comments about relocating HeapShared::roots() > - fixed windows and 32-bit builds which do not support archive heap > - clean up > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 2: no intermediate buffer) > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 1: with buffer copying) GC parts look good ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12304 From eosterlund at openjdk.org Tue Feb 14 09:21:54 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 14 Feb 2023 09:21:54 GMT Subject: Integrated: 8301874: BarrierSetC2 should assign barrier data to stores In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 15:12:03 GMT, Erik Österlund wrote: > BarrierSetC2 should assign barrier data to stores, like it does for loads and atomics. That way, a GC that uses barrier data for stores, can call the super class to generate the loads/stores, instead of copying that code. This pull request has now been integrated. Changeset: 7f71a104 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod 8301874: BarrierSetC2 should assign barrier data to stores Reviewed-by: rcastanedalo, kvn ------------- PR: https://git.openjdk.org/jdk/pull/12442 From rrich at openjdk.org Tue Feb 14 09:24:43 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 14 Feb 2023 09:24:43 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> Message-ID: On Tue, 14 Feb 2023 07:38:21 GMT, Johannes Bechberger wrote: >> The test fails without this relaxation and the relaxation has been used on PPC to solve a similar issue a few years back. > > The original assumption does not always hold anymore and should therefore relaxed. I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`).
------------- PR: https://git.openjdk.org/jdk/pull/12535 From rehn at openjdk.org Tue Feb 14 09:26:15 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 14 Feb 2023 09:26:15 GMT Subject: RFR: 8302354: InstanceKlass init state/thread should be atomic [v2] In-Reply-To: References: Message-ID: > Hi, please consider. > > These states are read 'lock-less' should thus be atomic. > Also the assert setting init thread state can be stricter. > > Passes t1-4 Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Re-align comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12542/files - new: https://git.openjdk.org/jdk/pull/12542/files/595c2349..5cceddba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12542&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12542&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12542.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12542/head:pull/12542 PR: https://git.openjdk.org/jdk/pull/12542 From rehn at openjdk.org Tue Feb 14 09:26:18 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 14 Feb 2023 09:26:18 GMT Subject: RFR: 8302354: InstanceKlass init state/thread should be atomic [v2] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 06:27:28 GMT, David Holmes wrote: > Scratch previous review comments. Use of Atomic::load/store guards against theoretical possibility of non-atomic ptr read/writes. I was confusing with visibility issues. Sorry for the noise. Thanks! Fixed "re-align comments". 
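The conclusion reached in this thread, that the atomic accessors are there to rule out torn pointer reads and writes rather than to solve a visibility problem, can be sketched outside HotSpot. The following is only an illustration using standard C++ `std::atomic` with relaxed ordering as an assumed stand-in for HotSpot's `Atomic::load`/`Atomic::store`; the `Thread` and `Klass` types and the two-argument `set_init_thread` are hypothetical and are not the actual InstanceKlass code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for JavaThread; only its address matters here.
struct Thread { int id; };

// Hypothetical stand-in for the relevant part of InstanceKlass.
class Klass {
  // Relaxed atomic access guarantees the pointer is read and written as a
  // single unit; it adds no ordering with respect to other memory accesses.
  std::atomic<Thread*> _init_thread{nullptr};

 public:
  // May be called without a lock while another thread is initializing.
  bool is_init_thread(const Thread* t) const {
    return t == _init_thread.load(std::memory_order_relaxed);
  }

  // The owner either claims the empty slot or clears its own entry,
  // mirroring the stricter assert quoted earlier in the review.
  void set_init_thread(Thread* current, Thread* value) {
    Thread* old = _init_thread.load(std::memory_order_relaxed);
    assert((value == current && old == nullptr) ||
           (value == nullptr && old == current));
    _init_thread.store(value, std::memory_order_relaxed);
  }
};
```

In this sketch the current thread is passed explicitly because there is no equivalent of `JavaThread::current()`; the assert only documents the ownership rule and does not make the update itself a compare-and-swap.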
------------- PR: https://git.openjdk.org/jdk/pull/12542 From ayang at openjdk.org Tue Feb 14 10:18:43 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 14 Feb 2023 10:18:43 GMT Subject: RFR: 8225409: G1: Remove the Hot Card Cache [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:51:31 GMT, Albert Mingkun Yang wrote: >> Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - merge > - g1-hcc Thanks for the review. ------------- PR: https://git.openjdk.org/jdk/pull/12452 From ayang at openjdk.org Tue Feb 14 10:21:56 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 14 Feb 2023 10:21:56 GMT Subject: Integrated: 8225409: G1: Remove the Hot Card Cache In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 11:22:42 GMT, Albert Mingkun Yang wrote: > Mostly straightforward removing G1 HotCardCache, as no performance benefit is observed for various benchmarks (specjbb, specjvm, dacapo, etc). > > Test: tier1-6 This pull request has now been integrated. Changeset: 7c50ab16 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/7c50ab1612fafaa5281cc72d8f511e388cdb1d97 Stats: 1259 lines in 34 files changed: 50 ins; 1176 del; 33 mod 8225409: G1: Remove the Hot Card Cache Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/12452 From tschatzl at openjdk.org Tue Feb 14 10:34:42 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 14 Feb 2023 10:34:42 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. 
Marked as reviewed by tschatzl (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12521 From redestad at openjdk.org Tue Feb 14 10:48:43 2023 From: redestad at openjdk.org (Claes Redestad) Date: Tue, 14 Feb 2023 10:48:43 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Fri, 10 Feb 2023 23:18:47 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add URL to microbenchmark > > Marked as reviewed by redestad (Reviewer). > @cl4es Can you please initiate the in-depth testing required for this fix? Thanks. I've checked out the sources and fired off a sanity run of the first couple of tiers. Holding off on testing higher tiers until @sviswa7's concerns has been resolved. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From jiefu at openjdk.org Tue Feb 14 11:00:58 2023 From: jiefu at openjdk.org (Jie Fu) Date: Tue, 14 Feb 2023 11:00:58 GMT Subject: Integrated: 8302368: [ZGC] Client build fails after JDK-8300255 In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 03:29:51 GMT, Jie Fu wrote: > Hi all, > > Please review this small change to fix the client vm build failure. > > "__" redefined [-Werror] occurred if build without C2. > > Thanks. > Best regards, > Jie This pull request has now been integrated. 
Changeset: 66742c83 Author: Jie Fu URL: https://git.openjdk.org/jdk/commit/66742c83d43fd114b86bfadc823d34448da3cec6 Stats: 13 lines in 3 files changed: 3 ins; 7 del; 3 mod 8302368: [ZGC] Client build fails after JDK-8300255 Reviewed-by: fyang, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/12547 From rrich at openjdk.org Tue Feb 14 14:22:32 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 14 Feb 2023 14:22:32 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false Message-ID: This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. Before this BasicExt.java was refactored for better argument processing and representation of the test modes. Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. 
------------- Commit messages: - Trimm compiled caller of deoptee to unextended sp - Deoptimize after yield - BasicExt.java test: use enums for parameters - BasicExt.java test: rethrow caught exception to indicate failure Changes: https://git.openjdk.org/jdk/pull/12557/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12557&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302158 Stats: 203 lines in 3 files changed: 148 ins; 13 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/12557.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12557/head:pull/12557 PR: https://git.openjdk.org/jdk/pull/12557 From alanb at openjdk.org Tue Feb 14 14:23:56 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 14 Feb 2023 14:23:56 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Mon, 13 Feb 2023 19:51:12 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. 
>> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with three additional commits since the last revision: > > - use different @test to run with -XX:-VMContinuations > - more removals of @requires vm.continuations > - change BasicVirtualThread_klass -> BaseVirtualThread_klass The changes in the updates look good. I think the main thing to decide on is whether JavaThread:: _is_bound_vthread is needed or not. If GetCurrentThreadCpuTime is changed to allow an implementation support this function then it looks like the additional field would not be needed. 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From kbarrett at openjdk.org Tue Feb 14 14:29:43 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 14 Feb 2023 14:29:43 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 11:40:24 GMT, Thomas Schatzl wrote: > Hi all, > > please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. Previously it had been used with CMS too, but that one's gone now. > > Testing: local compilation, gha > > Based on PR#12511 > > Thanks, > Thomas Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.org/jdk/pull/12531 From rkennke at openjdk.org Tue Feb 14 14:38:51 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 14:38:51 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v11] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. 
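
The original stack-locking step described in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not HotSpot code: `ObjectSketch`, `StackLockSketch` and the helper functions are hypothetical, and the tag value 01 for "unlocked" is an assumption of the sketch (the text only states that 00 means stack-locked).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kLockMask    = 0x3;  // lowest two header bits
constexpr std::uintptr_t kStackLocked = 0x0;  // 00: stack-locked (from the text)
constexpr std::uintptr_t kUnlocked    = 0x1;  // 01: unlocked (assumed here)

struct ObjectSketch {
  explicit ObjectSketch(std::uintptr_t mark) : header(mark) {}
  std::atomic<std::uintptr_t> header;
};

// Stack slot that holds the displaced header. The alignment guarantees that
// the low two bits of the slot's address are 00.
struct alignas(8) StackLockSketch {
  std::uintptr_t displaced_header = 0;
};

// Save the header in the stack slot, then CAS the slot's address into the
// object header. Returns false when a slow path would be needed.
bool try_stack_lock(ObjectSketch& obj, StackLockSketch& slot) {
  std::uintptr_t mark = obj.header.load();
  if ((mark & kLockMask) != kUnlocked) {
    return false;  // already locked or inflated
  }
  slot.displaced_header = mark;
  const auto locked = reinterpret_cast<std::uintptr_t>(&slot);
  assert((locked & kLockMask) == kStackLocked);
  return obj.header.compare_exchange_strong(mark, locked);
}

bool is_stack_locked(const ObjectSketch& obj) {
  return (obj.header.load() & kLockMask) == kStackLocked;
}

// The header now doubles as a pointer into the owner's stack.
const StackLockSketch* owner_slot(const ObjectSketch& obj) {
  return reinterpret_cast<const StackLockSketch*>(obj.header.load());
}
```

Note that while the object is locked its real header lives only in the stack slot, which is the overloading of the mark word that the description calls problematic.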
>
> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols.
>
> The lock-stack is grown when needed. This means that we need to check for potential overflow before attempting locking. When that is the case, locking fast-paths would call into the runtime to grow the stack and handle the locking. Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots of) such checks at locking sites.
>
> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth adding support for recursive fast-locking.
>
> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information.
What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. > > As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change enables to simplify (and speed-up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. 
This would be implemented in a separate PR
>
>
> Testing:
> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems, yet
>
> ### Performance
>
> #### Renaissance
>
> Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff
> -- | -- | -- | -- | -- | -- | --
> AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69%
> Reactors | 11444.511 | 11606.66 | -1.40% | 11382.594 | 11638.036 | -2.19%
> Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59%
> ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15%
> GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11%
> LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97%
> MovieLens | 8248.116 | 8314.576 | -0.80% | 9898.1 | 10097.867 | -1.98%
> NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54%
> PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11%
> FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40%
> FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12%
> ParMnemonics | 2527.228 | 2564.667 | -1.46% | 4278.225 | 4263.863 | 0.34%
> Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40%
> RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74%
> Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24%
> ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58%
> ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53%
> ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37%
> Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97%
> FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39%
> FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04%

Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 92 commits:

 - Merge branch 'master' into JDK-8291555-v2
 - Small fixes
 - Fix anon owner in fast-path, avoid runtime call (aarch64)
 - Fix anon owner in fast-path, avoid runtime call (x86_64)
 - Daniel's suggested changes
 - 8291555-dcubed-editorial-1
 - Merge branch 'master' into JDK-8291556-v2
 - Revert UseFastLocking to default to off
 - Change log message inflate(locked) -> inflate(has_locker)
 - Properly set ZF on anon-check path; avoid some conditional branches
 - ... and 82 more: https://git.openjdk.org/jdk/compare/8c2c8b3f...8bdeda39

-------------

Changes: https://git.openjdk.org/jdk/pull/10907/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=10
Stats: 2066 lines in 74 files changed: 1309 ins; 93 del; 664 mod
Patch: https://git.openjdk.org/jdk/pull/10907.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/10907/head:pull/10907

PR: https://git.openjdk.org/jdk/pull/10907

From rehn at openjdk.org Tue Feb 14 14:42:20 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Tue, 14 Feb 2023 14:42:20 GMT
Subject: Integrated: 8302354: InstanceKlass init state/thread should be atomic
In-Reply-To:
References:
Message-ID:

On Mon, 13 Feb 2023 18:56:19 GMT, Robbin Ehn wrote:

> Hi, please consider.
>
> These states are read 'lock-less' should thus be atomic.
> Also the assert setting init thread state can be stricter.
>
> Passes t1-4

This pull request has now been integrated.
Changeset: 77519e5f
Author: Robbin Ehn
URL: https://git.openjdk.org/jdk/commit/77519e5f4fe75f953c02fb3f15b7f9a58c933fea
Stats: 15 lines in 5 files changed: 1 ins; 0 del; 14 mod

8302354: InstanceKlass init state/thread should be atomic

Reviewed-by: coleenp, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/12542

From sgibbons at openjdk.org Tue Feb 14 15:11:45 2023
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Tue, 14 Feb 2023 15:11:45 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v16]
In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
Message-ID:

> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>
> Encode performance:
> **Old:**
>
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms
>
>
> **New:**
>
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms
>
>
> Decode performance:
> **Old:**
>
> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms
>
> **New:**
> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Branch around for very small buffers

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12126/files
  - new: https://git.openjdk.org/jdk/pull/12126/files/424b40d7..5077b4e8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=15
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=14-15

Stats: 11 lines in 1 file changed: 5 ins; 3 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/12126.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126

PR: https://git.openjdk.org/jdk/pull/12126

From sgibbons at openjdk.org Tue Feb 14 15:11:53 2023
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Tue, 14 Feb 2023 15:11:53 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15]
In-Reply-To: <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com>
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com>
Message-ID: <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com>

On Tue, 14 Feb 2023 01:48:37 GMT, Sandhya Viswanathan wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Add URL to microbenchmark
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2399:
>
>> 2397: VM_Version::supports_avx512bw()) {
>> 2398: __ cmpl(length, 31); // 32-bytes is break-even for AVX-512
>> 2399: __ jcc(Assembler::lessEqual, L_bruteForce);
>
> The avx2 code needs the length to be at least 0x2c (44) bytes. We could directly go to non-avx code instead of L_bruteForce here. We will save one subtract/branch.

Done.
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2658: > >> 2656: // Check for buffer too small (for algorithm) >> 2657: __ subl(length, 0x2c); >> 2658: __ jcc(Assembler::lessEqual, L_tailProc); > > This could be Assembler::less instead of Assembler::lessEqual. Why? There is no performance difference and the intent is clear. Is this just a "style" thing? > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2699: > >> 2697: __ addptr(dest, 0x18); >> 2698: __ subl(length, 0x20); >> 2699: __ jcc(Assembler::lessEqual, L_tailProc); > > This could be Assembler::less instead of Assembler::lessEqual. Why? There is no performance difference and the intent is clear. Is this just a "style" thing? ------------- PR: https://git.openjdk.org/jdk/pull/12126 From yzheng at openjdk.org Tue Feb 14 15:21:08 2023 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 14 Feb 2023 15:21:08 GMT Subject: RFR: 8302452: [JVMCI] Export _poly1305_processBlocks, JfrThreadLocal fields to JVMCI compiler. Message-ID: This PR allows JVMCI compiler to reuse the _poly1305_processBlocks stub and to update JfrThreadLocal fields on `Thread.setCurrentThread` events. ------------- Commit messages: - Export _poly1305_processBlocks, JfrThreadLocal fields to JVMCI compiler. 
Changes: https://git.openjdk.org/jdk/pull/12560/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12560&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302452 Stats: 12 lines in 3 files changed: 12 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12560.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12560/head:pull/12560 PR: https://git.openjdk.org/jdk/pull/12560 From redestad at openjdk.org Tue Feb 14 15:22:08 2023 From: redestad at openjdk.org (Claes Redestad) Date: Tue, 14 Feb 2023 15:22:08 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15] In-Reply-To: <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com> <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com> Message-ID: On Tue, 14 Feb 2023 15:03:50 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2699: >> >>> 2697: __ addptr(dest, 0x18); >>> 2698: __ subl(length, 0x20); >>> 2699: __ jcc(Assembler::lessEqual, L_tailProc); >> >> This could be Assembler::less instead of Assembler::lessEqual. > > Why? There is no performance difference and the intent is clear. Is this just a "style" thing? I think with `lessEqual` we'll jump to `L_tailProc` for the final 32-byte chunk in inputs that are divisible by 32 (starting from 64), no? Using `less` avoids that, while not affecting performance of any other inputs. 
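
To make the `less` versus `lessEqual` point concrete, here is a small model of a "subtract 32, test, process one chunk" loop. It is a simplified sketch, not the stub's real control flow: `vector_iterations` and `ExitCond` are hypothetical names, and the earlier 0x2c pre-check is deliberately left out.

```cpp
#include <cassert>

enum class ExitCond { LessEqual, Less };

// Counts how many full 32-byte chunks the modelled vector loop handles
// before it branches to the tail code.
int vector_iterations(int length, ExitCond cond) {
  assert(length >= 0);
  int iterations = 0;
  for (;;) {
    length -= 32;  // models: subl(length, 0x20)
    const bool to_tail = (cond == ExitCond::LessEqual) ? (length <= 0)
                                                       : (length < 0);
    if (to_tail) {
      break;       // models: jcc(cond, L_tailProc)
    }
    ++iterations;  // one 32-byte chunk handled by the vector loop
  }
  return iterations;
}
```

In this model a 64-byte input gets one vector iteration with `LessEqual` and two with `Less`, while a 63-byte input gets one iteration either way, which matches the observation that only inputs divisible by 32 are affected.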
------------- PR: https://git.openjdk.org/jdk/pull/12126 From ccheung at openjdk.org Tue Feb 14 16:41:45 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 14 Feb 2023 16:41:45 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v5] In-Reply-To: References: Message-ID: <3lENyl_YZkpI_b8POHOsgh8Mf8shIPB1TZT3_uaPZ0w=.651eae51-9c68-4728-ad33-c2309e1c5932@github.com> On Tue, 14 Feb 2023 05:48:26 GMT, Ioi Lam wrote: >> Goals >> >> - Simplify the writing of the CDS archive heap >> - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. >> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector >> >> Implementation - the following runs inside a safepoint so heap objects aren't moving >> >> - Find all the objects that should be archived >> - Allocate buffer using a GrowableArray >> - Copy all "open" objects into the buffer >> - Copy all "closed" objects into the buffer >> - Make sure the heap image can be mapped with all possible G1 region sizes: >> - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) >> - Align the bottom of the "open" and "closed" objects at 1MB boundary >> - Write the buffer to disk as the image for the archive heap >> >> Testing: tiers 1 ~ 5 > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review > - @calvinccheung review comment -- added assert) > - Simplified relocation by using buffered address instead of requested address > - @ashu-mehra review; added comments about relocating HeapShared::roots() > - fixed windows and 32-bit builds which do not support archive heap > - clean up > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 2: no intermediate buffer) > - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 1: with buffer copying) Updates looks good. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.org/jdk/pull/12304 From ccheung at openjdk.org Tue Feb 14 17:38:33 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 14 Feb 2023 17:38:33 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node Message-ID: Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. This patch is co-authored by @robehn. Passed tiers 1 - 4 testing. 
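
The difference between a node that points to its value and a node that embeds it can be sketched as below. This only illustrates the allocation count and is not the ConcurrentHashTable or Symbol code: all type and function names are hypothetical, and the fixed-size `SymbolSketch` is a simplification chosen for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Simplified stand-in for a symbol.
struct SymbolSketch {
  unsigned hash = 0;
  unsigned short length = 0;
  char text[14] = {};
};

static std::size_t g_allocations = 0;

// Heap-allocates a T and counts the allocation.
template <typename T>
std::unique_ptr<T> counted_new() {
  ++g_allocations;
  return std::make_unique<T>();
}

// Before: the node holds a pointer, so the value needs its own allocation.
struct PointerNode {
  PointerNode* next = nullptr;
  std::unique_ptr<SymbolSketch> value;
};

// After: the value lives inside the node, so one allocation covers both.
struct EmbeddedNode {
  EmbeddedNode* next = nullptr;
  SymbolSketch value{};
};

std::size_t allocations_for_pointer_node() {
  g_allocations = 0;
  auto node = counted_new<PointerNode>();
  node->value = counted_new<SymbolSketch>();
  assert(node->value != nullptr);
  return g_allocations;
}

std::size_t allocations_for_embedded_node() {
  g_allocations = 0;
  auto node = counted_new<EmbeddedNode>();
  return g_allocations;
}
```

Under these assumptions the pointer variant costs two allocations per entry and the embedded variant one, which is the malloc/free saving the description refers to.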
-------------

Commit messages:
 - 8301992: Embed SymbolTable CHT node

Changes: https://git.openjdk.org/jdk/pull/12562/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8301992
Stats: 175 lines in 4 files changed: 51 ins; 85 del; 39 mod
Patch: https://git.openjdk.org/jdk/pull/12562.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12562/head:pull/12562

PR: https://git.openjdk.org/jdk/pull/12562

From sviswanathan at openjdk.org Tue Feb 14 17:55:44 2023
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 14 Feb 2023 17:55:44 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15]
In-Reply-To: <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com>
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com> <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com>
Message-ID:

On Tue, 14 Feb 2023 15:03:49 GMT, Scott Gibbons wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2658:
>>
>>> 2656: // Check for buffer too small (for algorithm)
>>> 2657: __ subl(length, 0x2c);
>>> 2658: __ jcc(Assembler::lessEqual, L_tailProc);
>>
>> This could be Assembler::less instead of Assembler::lessEqual.
>
> Why? There is no performance difference and the intent is clear. Is this just a "style" thing?

The thought is that when the length is equal to 44 bytes, we could do the vector loop once before tail processing. The rest of the logic seems to allow that. 44 bytes of base64 -> 33 bytes decoded. So a 32-byte write in the vector loop would still be ok and we won't be writing beyond the buffer.
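
The arithmetic behind the 44-byte case can be checked with a few lines. This is a sketch under the assumption that the input is unpadded and a multiple of four characters long; `decoded_size` and `vector_store_fits` are hypothetical helpers, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Every 4 Base64 characters decode to 3 bytes (padding is ignored here).
inline std::size_t decoded_size(std::size_t encoded_chars) {
  assert(encoded_chars % 4 == 0);
  return encoded_chars / 4 * 3;
}

// True when a full-width vector store stays within the decoded output.
inline bool vector_store_fits(std::size_t encoded_chars,
                              std::size_t store_bytes = 32) {
  return decoded_size(encoded_chars) >= store_bytes;
}
```

With 44 input characters the decoded output is 33 bytes, so a single 32-byte store fits. With 40 characters it would be 30 bytes and the store would overrun.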
-------------

PR: https://git.openjdk.org/jdk/pull/12126

From sviswanathan at openjdk.org Tue Feb 14 17:55:46 2023
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 14 Feb 2023 17:55:46 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v15]
In-Reply-To:
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <4LXDemRbOxX-JiiDgh14WTZ9mc5VwbRLz4KjfxS29Cw=.9c16ea8e-dd1c-4aa3-a6f1-4736f3515edb@github.com> <-cgyVapky014oQOVjAUNJKLMXsppjXLvP4KiglJzKo4=.4818efca-5200-48fa-81ec-f6a812c5dc8e@github.com>
Message-ID:

On Tue, 14 Feb 2023 15:19:34 GMT, Claes Redestad wrote:

>> Why? There is no performance difference and the intent is clear. Is this just a "style" thing?
>
> I think with `lessEqual` we'll jump to `L_tailProc` for the final 32-byte chunk in inputs that are divisible by 32 (starting from 64), no? Using `less` avoids that, while not affecting performance of any other inputs.

As Claes mentioned, this would allow us to do one more iteration of vector loop.

-------------

PR: https://git.openjdk.org/jdk/pull/12126

From sgibbons at openjdk.org Tue Feb 14 18:22:32 2023
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Tue, 14 Feb 2023 18:22:32 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v17]
In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
Message-ID:

> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>
> Encode performance:
> **Old:**
>
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms
>
>
> **New:**
>
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms
>
>
> Decode performance:
> **Old:**
>
> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms
>
> **New:**
> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Last of review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12126/files
  - new: https://git.openjdk.org/jdk/pull/12126/files/5077b4e8..2adaa5da

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=16
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=15-16

Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/12126.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126

PR: https://git.openjdk.org/jdk/pull/12126

From sviswanathan at openjdk.org Tue Feb 14 19:00:46 2023
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Tue, 14 Feb 2023 19:00:46 GMT
Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v17]
In-Reply-To:
References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com>
Message-ID:

On Tue, 14 Feb 2023 18:22:32 GMT, Scott Gibbons wrote:

>> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>>
>> Encode performance:
>> **Old:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms
>>
>>
>> **New:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms
>>
>>
>> Decode performance:
>> **Old:**
>>
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms
>>
>> **New:**
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Last of review comments

Looks good to me.

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12126

From kvn at openjdk.org Tue Feb 14 19:19:45 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 14 Feb 2023 19:19:45 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 13 Feb 2023 09:26:07 GMT, Johan Sjölen wrote:

>> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
>> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>>
>> Here are some typical things to look out for:
>>
>> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
>> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
>> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
> Some more fixes

In addition to a few missed places, I noticed that you keep the cast in `mov_metadata(rbx, (Metadata*)nullptr);`. Is `Metadata*` something special that we need to cast to it?
And a more general question: do we need to keep some casts for `nullptr`? We kept them in previous PRs which are already pushed #12029
Should we update our "spec" for `nullptr` to remove all casts or keep some? What is the condition for keeping one?

src/hotspot/cpu/x86/interp_masm_x86.cpp line 1785:

> 1783: // // degenerate decision tree, rooted at row[2]
> 1784: // if (row[2].rec == rec) { row[2].incr(); goto done; }
> 1785: // if (row[2].rec != null) { count.incr(); goto done; } // overflow

Missed change to nullptr.

-------------

Changes requested by kvn (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12326

From kvn at openjdk.org Tue Feb 14 19:19:49 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 14 Feb 2023 19:19:49 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 6 Feb 2023 10:21:25 GMT, Johan Sjölen wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Some more fixes
>
> src/hotspot/cpu/x86/interp_masm_x86.cpp line 402:
>
>> 400: movptr(tmp, Address(rthread, JavaThread::jvmti_thread_state_offset()));
>> 401: testptr(tmp, tmp);
>> 402: jcc(Assembler::zero, L); // if (thread->jvmti_thread_state() == null) exit;
>
> nullptr

Missed change to nullptr as you suggested.

> src/hotspot/cpu/x86/interp_masm_x86.cpp line 1779:
>
>> 1777: // // main copy of decision tree, rooted at row[1]
>> 1778: // if (row[0].rec == rec) { row[0].incr(); goto done; }
>> 1779: // if (row[0].rec != null) {
>
> nullptr

Missed change to nullptr as you suggested.
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4265: > >> 4263: } >> 4264: >> 4265: // for (scan = klass->itable(); scan->interface() != null; scan += scan_step) { > > nullptr Missed change to nullptr as you suggested. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From jcking at openjdk.org Tue Feb 14 20:15:44 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 14 Feb 2023 20:15:44 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King Friendly poke. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From inakonechnyy at openjdk.org Tue Feb 14 22:05:35 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Tue, 14 Feb 2023 22:05:35 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error Message-ID: The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. 
Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. Jtreg for reproducing the issue also was added to the commit. The commit was tested with tier1 tests. ------------- Commit messages: - 8302491: NoClassDefFoundError omits the original cause of an error Changes: https://git.openjdk.org/jdk/pull/12566/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302491 Stats: 121 lines in 5 files changed: 118 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12566.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12566/head:pull/12566 PR: https://git.openjdk.org/jdk/pull/12566 From iklam at openjdk.org Tue Feb 14 22:06:24 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 22:06:24 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v3] In-Reply-To: References: Message-ID: > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. 
> > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora suggestion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12549/files - new: https://git.openjdk.org/jdk/pull/12549/files/adef377e..8cf4b6e3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12549.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12549/head:pull/12549 PR: https://git.openjdk.org/jdk/pull/12549 From iklam at openjdk.org Tue Feb 14 22:06:33 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 14 Feb 2023 22:06:33 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> Message-ID: <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> On Tue, 14 Feb 2023 06:47:38 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora comment - make sure errors are printed for FLAG_SET_ERGO > > src/hotspot/share/code/codeCache.cpp line 1197: > >> 1195: } else { >> 1196: // Use a single code heap >> 1197: FLAG_SET_ERGO(NonNMethodCodeHeapSize, (uintx)os::vm_page_size()); > > Sanity check: this code does execute after `os::init()` doesn't it? Yes. In fact, `FLAG_SET_ERGO` internally calls the constraint function for this flag, `VMPageSizeConstraintFunc()`, which internally calls `os::vm_page_size()`. 
> src/hotspot/share/compiler/compilerDefinitions.cpp line 134: > >> 132: intx CompilerConfig::jvmflag_scaled_freq_log(intx freq_log) { >> 133: return MAX2((intx)0, MIN2(scaled_freq_log(freq_log), (intx)30)); >> 134: } > > If the passed in arguments are out of range this appears to cap them within range but the initial error is never made known to the caller. ?? This code is called only by `CompilerConfig::set_compilation_policy_flags()` to scale 15 flags according to `CompileThresholdScaling`. If the user sets `CompileThresholdScaling` to be too high, then some of the calculated values will be capped. @veresov @TobiHartmann - do you think some warning messages are needed here? My thinking is that `CompileThresholdScaling` is a somewhat magical knob. If the user really wants to know what's happening, they should add `-XX:+PrintFlagsFinal` and see the effect on all the affected flags. > src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 71: > >> 69: if (err != JVMFlag::SUCCESS) { >> 70: if (origin == JVMFlagOrigin::ERGONOMIC) { >> 71: fatal("FLAG_SET_ERGO cannot be used to set an invalid value for %s", flag->name()); > > Just occurred to me, should the `fatal` rather be a `vm_exit_during_initialization`? `vm_exit_during_initialization` is usually for conditions not controllable by the VM (out of memory, bad command-line options, etc). Here we have a programming error in the VM's ergo code. So `fatal()` is more appropriate. It also generates an hs_err file that can help debugging. > src/hotspot/share/runtime/globals_extension.hpp line 92: > >> 90: >> 91: // FLAG_SET_ERGO must be always be called with a valid value. You cannot pass in a unvalidated >> 92: // value and expect a return code for success/failure. > > Suggestion: > > // FLAG_SET_ERGO must be always be called with a valid value. If an invalid value > // is detected then the VM will exit. Fixed.
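As a rough illustration of the capping discussed above for `jvmflag_scaled_freq_log`, here is a standalone sketch; it is not HotSpot code. `clamp_freq_log` mirrors the `MAX2`/`MIN2` expression quoted in the review, while the `ClampResult` struct, its `capped` field and the `intx` typedef are hypothetical stand-ins, showing one possible way a caller could learn that a value was out of range.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for HotSpot's intx; an assumption made to keep the sketch standalone.
typedef std::intptr_t intx;

// Hypothetical result type: the clamped value plus a flag telling the caller
// whether the input had to be capped.
struct ClampResult {
  intx value;
  bool capped;
};

// Clamp a scaled frequency log into [0, 30], the same range as the
// MAX2((intx)0, MIN2(..., (intx)30)) expression in the quoted patch.
inline ClampResult clamp_freq_log(intx scaled) {
  const intx lo = 0;
  const intx hi = 30;
  const intx clamped = std::max(lo, std::min(scaled, hi));
  return ClampResult{clamped, clamped != scaled};
}
```

A caller could, for example, print a warning when `capped` is true; whether such a warning is actually wanted is the open question put to the compiler folks above.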
------------- PR: https://git.openjdk.org/jdk/pull/12549 From kdnilsen at openjdk.org Tue Feb 14 22:08:49 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 14 Feb 2023 22:08:49 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 08:33:23 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Clarify comment on arrayOopDesc::max_array_length() Looks good to me. ------------- Marked as reviewed by kdnilsen (no project role). PR: https://git.openjdk.org/jdk/pull/11044 From lmesnik at openjdk.org Tue Feb 14 22:34:42 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 14 Feb 2023 22:34:42 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Mon, 13 Feb 2023 19:51:12 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. 
Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. 
I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with three additional commits since the last revision: > > - use different @test to run with -XX:-VMContinuations > - more removals of @requires vm.continuations > - change BasicVirtualThread_klass -> BaseVirtualThread_klass I've left one comment. Otherwise test changes looks good to me. test/hotspot/jtreg/serviceability/jvmti/vthread/BoundVThreadTest/BoundVThreadTest.java line 27: > 25: * @test > 26: * @summary Verifies correct JVMTI behavior for BoundVirtualThreads. > 27: * @compile --enable-preview -source ${jdk.version} BoundVThreadTest.java You could use @enablePreview instead. ------------- Marked as reviewed by lmesnik (Reviewer). PR: https://git.openjdk.org/jdk/pull/12512 From redestad at openjdk.org Tue Feb 14 22:44:43 2023 From: redestad at openjdk.org (Claes Redestad) Date: Tue, 14 Feb 2023 22:44:43 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v17] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: <2s_1x1gYW-OGAW6s4Cpx1xDwQeqxFfQY1x9WKx0WPzY=.46b006eb-3743-4516-afb7-e08102100432@github.com> On Tue, 14 Feb 2023 18:22:32 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ±
102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Last of review comments I've started tier1-5 testing internally. Will let you know if we find any issues. ------------- PR: https://git.openjdk.org/jdk/pull/12126 From sviswanathan at openjdk.org Tue Feb 14 22:49:46 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 14 Feb 2023 22:49:46 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v17] In-Reply-To: <2s_1x1gYW-OGAW6s4Cpx1xDwQeqxFfQY1x9WKx0WPzY=.46b006eb-3743-4516-afb7-e08102100432@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> <2s_1x1gYW-OGAW6s4Cpx1xDwQeqxFfQY1x9WKx0WPzY=.46b006eb-3743-4516-afb7-e08102100432@github.com> Message-ID: <61Kh7tlZRZQOK_58qHyZPtU5u9BCFQ_qhfWa60E7ibs=.45dfeea1-f1a3-46b2-9d4b-47fd7b6d3f38@github.com> On Tue, 14 Feb 2023 22:41:47 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Last of review comments > > I've started tier1-5 testing internally. Will let you know if we find any issues. Thanks a lot @cl4es for doing the tests for this PR.
------------- PR: https://git.openjdk.org/jdk/pull/12126 From mseledtsov at openjdk.org Wed Feb 15 00:38:15 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Wed, 15 Feb 2023 00:38:15 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v8] In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: > Add log() and logToFile() methods to VMProps for diagnostic purposes. Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Addressed recent review feedback from Leonid ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12239/files - new: https://git.openjdk.org/jdk/pull/12239/files/b85636da..e1890eaa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12239&range=06-07 Stats: 54 lines in 1 file changed: 20 ins; 11 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/12239.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12239/head:pull/12239 PR: https://git.openjdk.org/jdk/pull/12239 From kbarrett at openjdk.org Wed Feb 15 00:49:33 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 00:49:33 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option [v2] In-Reply-To: References: Message-ID: <28p327vANynlkA8_Ux2guSmP0DPOGX8k_5yKuguTXts=.7e06d2ba-0067-4b5b-b6b5-fb7078247aae@github.com> > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into SuppressErrorAt - remove SuppressErrorAt ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12521/files - new: https://git.openjdk.org/jdk/pull/12521/files/c7163aa3..fe1014b7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12521&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12521&range=00-01 Stats: 34920 lines in 510 files changed: 11699 ins; 12239 del; 10982 mod Patch: https://git.openjdk.org/jdk/pull/12521.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12521/head:pull/12521 PR: https://git.openjdk.org/jdk/pull/12521 From kbarrett at openjdk.org Wed Feb 15 00:49:35 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 00:49:35 GMT Subject: RFR: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: <-gWeeq1fOHwtN3ywqyvRDpoL_dmpijCOVXg_XkjVSto=.80bda9b4-0873-4582-bf30-4523054005e3@github.com> On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. Thanks for reviews. ------------- PR: https://git.openjdk.org/jdk/pull/12521 From kbarrett at openjdk.org Wed Feb 15 00:49:36 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 00:49:36 GMT Subject: Integrated: 8302262: Remove -XX:SuppressErrorAt develop option In-Reply-To: References: Message-ID: On Sat, 11 Feb 2023 03:17:38 GMT, Kim Barrett wrote: > Please review this change which removes the -XX:SuppressErrorAt develop option > and the associated implementation. This pull request has now been integrated. 
Changeset: f1d76fa9 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/f1d76fa92501e45f25a7d33d8c5eee7ef60973eb Stats: 224 lines in 8 files changed: 0 ins; 205 del; 19 mod 8302262: Remove -XX:SuppressErrorAt develop option Reviewed-by: stuefe, dholmes, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/12521 From coleenp at openjdk.org Wed Feb 15 00:49:50 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 15 Feb 2023 00:49:50 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 21:58:01 GMT, Ilarion Nakonechnyy wrote: > The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. > > Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. > > Jtreg for reproducing the issue also was added to the commit. > The commit was tested with tier1 tests. This seems like a reasonable fix for this feature. src/hotspot/share/classfile/javaClasses.cpp line 2783: > 2781: } > 2782: > 2783: Handle java_lang_Throwable::get_cause_simple(Handle throwable, TRAPS) { If the function doesn't throw an exception, don't pass TRAPS. Instead make the first argument JavaThread* current. src/hotspot/share/classfile/javaClasses.cpp line 2800: > 2798: > 2799: Symbol* exception_name = throwable()->klass()->name(); > 2800: CLEAR_PENDING_EXCEPTION; Is an exception possible here? I don't see one and this should clear the exception before calling this. 
src/hotspot/share/oops/instanceKlass.cpp line 987: > 985: // Retry with the simple method > 986: cause = java_lang_Throwable::get_cause_simple(exception, THREAD); > 987: CLEAR_PENDING_EXCEPTION; Reverse this: clear the pending exception first. test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 26: > 24: /** > 25: * @test > 26: * @bug 1234567 Fix the bug number. Do you need a @requires to run with -Xcomp? ------------- Changes requested by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12566 From coleenp at openjdk.org Wed Feb 15 00:54:51 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 15 Feb 2023 00:54:51 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 17:29:41 GMT, Calvin Cheung wrote: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. This looks good - I have just a couple of changes that I think you should make. src/hotspot/share/classfile/symbolTable.cpp line 469: > 467: const int alloc_size = Symbol::byte_size(len); > 468: u1* u1_buf = NEW_RESOURCE_ARRAY(u1, alloc_size); > 469: Symbol* tmp = ::new ((void*)u1_buf) Symbol((const u1*)name, len, (heap && !DumpSharedSpaces) ? 1 : PERM_REFCOUNT); There's a version of NEW_RESOURCE_ARRAY that takes the Thread argument, so it doesn't have to call Thread::current() src/hotspot/share/oops/symbol.cpp line 68: > 66: } > 67: > 68: Symbol::Symbol(const Symbol& s1) { Can you add a comment that this copies the symbol when added to the ConcurrentHashTable. ------------- Changes requested by coleenp (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12562 From kbarrett at openjdk.org Wed Feb 15 01:05:44 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 01:05:44 GMT Subject: RFR: 8297539: Consolidate the uses of the int<->float union conversion trick [v2] In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 10:31:21 GMT, Afshin Zafari wrote: >> ### Discussion >> >> There are different ways of casting/converting integral types to floating point types and vice versa in the code. >> These conversions are changed to invocations of a centralised `cast(From)` method in `primitiveConversions.hpp`. To make these changes build and run, some new `cast` methods are introduced and header dependencies changed. >> >> ### Patch >> >> - All instances of the `union` technique that are used for int<->float and long<->double type conversions are replaced by the `PrimitiveConversions::cast(From)` method. >> - Instances of, for example, converting `int`<->`intptr_t` are not changed. >> - After this change, `JavaValue` also uses the `PrimitiveConversions::cast(From)` templates. Therefore, all the `union` conversions for `int`<-> `float` and `long` <-> `double` in hotspot are now centralised in one location. >> - Templates in `PrimitiveConversions` class are numbered from 1-9 and all lines of code that use `PrimitiveConversions::cast(From)` have comments saying which templated `cast(From)` matches. Issue number 8297539 is also in the comments to make searching and reviewing easier. >> - The new `cast(From)` templates are numbered 6-9 for some instances where existing templates (1-5) did not match. >> - `globalDefinitions.hpp` now includes `primitiveConversions.hpp`. >> - `primitiveConversions.hpp` does NOT include `globalDefinitions.hpp` any more. Some typedefs are added to it to compensate for the missing types. >> - The following are not changed: >> - `jni.hpp` >> - `ciConstant.hpp` is not changed, since union is not used for converting `int` <->`float` or `long` <-> `double`.
There are some asserts that allow setter/getter used only for the same types. >> - `bytecodeInterpreter_zero.hpp` is not changed. Writing to and reading from the `VMJavaVal64` are always with the same type and no conversion is made. >> - union with bit fields are not changed. >> - `json.hpp` and `json.cpp` in src/hotspot/share/utilities. >> >> ### Test >> mach5 tiers 1-5 passed, tiers 6-8 in progress > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > 8297539: Consolidate the uses of the int<->float union conversion trick I think there are some significant problems with the [v2] version (webrev 1). Afshin and I are going to meet, for better bandwidth vs discussing in PR. Will summarize here later, but other folks might want to hold off looking at that version. ------------- PR: https://git.openjdk.org/jdk/pull/12384 From pchilanomate at openjdk.org Wed Feb 15 01:22:33 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 15 Feb 2023 01:22:33 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. 
> > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: use @enablePreview instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/99cf204c..f7f9553e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Wed Feb 15 01:22:35 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 15 Feb 2023 01:22:35 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Tue, 14 Feb 2023 14:20:30 GMT, Alan Bateman wrote: > The changes in the updates look good. I think the main thing to decide on is whether JavaThread:: _is_bound_vthread is needed or not. If GetCurrentThreadCpuTime is changed to allow an implementation support this function then it looks like the additional field would not be needed. > Great, thanks for looking at this Alan. I'll remove _is_bound_vthread if we change the specs for GetCurrentThreadCpuTime then. 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Wed Feb 15 01:22:37 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 15 Feb 2023 01:22:37 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v2] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Tue, 14 Feb 2023 22:31:43 GMT, Leonid Mesnik wrote: > I've left one comment. Otherwise test changes looks good to me. > Thanks for the review Leonid! > test/hotspot/jtreg/serviceability/jvmti/vthread/BoundVThreadTest/BoundVThreadTest.java line 27: > >> 25: * @test >> 26: * @summary Verifies correct JVMTI behavior for BoundVirtualThreads. >> 27: * @compile --enable-preview -source ${jdk.version} BoundVThreadTest.java > > You could use @enablePreview instead. Changed. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From inakonechnyy at openjdk.org Wed Feb 15 01:42:48 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Wed, 15 Feb 2023 01:42:48 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error In-Reply-To: References: Message-ID: <3PPkXDXtcUMLVD3xYuoFP0TobqmB1sz-piqpdqs84y0=.57906c0a-e5e8-44f4-a052-7da5ff0c639d@github.com> On Wed, 15 Feb 2023 00:46:23 GMT, Coleen Phillimore wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. >> >> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. 
>> >> Jtreg for reproducing the issue also was added to the commit. >> The commit was tested with tier1 tests. > > test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 26: > >> 24: /** >> 25: * @test >> 26: * @bug 1234567 > > Fix the bug number. > Do you need a @requires to run with -Xcomp? Do you propose to use something like `@requires vm.compMode != "Xint"` ------------- PR: https://git.openjdk.org/jdk/pull/12566 From iklam at openjdk.org Wed Feb 15 02:32:09 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 02:32:09 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v6] In-Reply-To: References: Message-ID: > Goals > > - Simplify the writing of the CDS archive heap > - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. > - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector > > Implementation - the following runs inside a safepoint so heap objects aren't moving > > - Find all the objects that should be archived > - Allocate buffer using a GrowableArray > - Copy all "open" objects into the buffer > - Copy all "closed" objects into the buffer > - Make sure the heap image can be mapped with all possible G1 region sizes: > - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) > - Align the bottom of the "open" and "closed" objects at 1MB boundary > - Write the buffer to disk as the image for the archive heap > > Testing: tiers 1 ~ 5 Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review - Merge branch 'master' into 8296344-remove-cds-heap-dump-dependency-on-G1-public-review - @calvinccheung review comment -- added assert) - Simplified relocation by using buffered address instead of requested address - @ashu-mehra review; added comments about relocating HeapShared::roots() - fixed windows and 32-bit builds which do not support archive heap - clean up - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 2: no intermediate buffer) - 8296344: Remove dependency on G1 for writing the CDS archive heap (step 1: with buffer copying) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12304/files - new: https://git.openjdk.org/jdk/pull/12304/files/f72e1803..4d3ec2e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12304&range=04-05 Stats: 2795 lines in 93 files changed: 1100 ins; 1434 del; 261 mod Patch: https://git.openjdk.org/jdk/pull/12304.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12304/head:pull/12304 PR: https://git.openjdk.org/jdk/pull/12304 From inakonechnyy at openjdk.org Wed Feb 15 02:39:24 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Wed, 15 Feb 2023 02:39:24 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: > The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. 
> > Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. > > Jtreg for reproducing the issue also was added to the commit. > The commit was tested with tier1 tests. Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: - Address review notes - Correct the jtreg test - check stacktrace only for NoClassDefFoundError ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12566/files - new: https://git.openjdk.org/jdk/pull/12566/files/7ee0adbd..f2e5b362 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=00-01 Stats: 15 lines in 4 files changed: 4 ins; 2 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/12566.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12566/head:pull/12566 PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Wed Feb 15 03:16:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 03:16:48 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 02:39:24 GMT, Ilarion Nakonechnyy wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. >> >> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. >> >> Jtreg for reproducing the issue also was added to the commit. >> The commit was tested with tier1 tests. 
> > Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: > > - Address review notes > - Correct the jtreg test - > check stacktrace only for NoClassDefFoundError So if I understand correctly the scenario is: 1. During execution of clinit of class C a StackOverflowError is thrown. 2. The VM sees the exception and creates the NoClassDefFoundError for C 3. The VM calls `get_cause_with_stack_trace` to get the original SOE but that triggers a secondary SOE. 4. The VM ignores the secondary SOE and the NCDFE is created without a cause. The fix is to retry step 3 avoiding the Java upcall and so avoiding a secondary SOE. Is that right? A few nits/suggestions below. Thanks src/hotspot/share/classfile/javaClasses.cpp line 2783: > 2781: } > 2782: > 2783: Handle java_lang_Throwable::get_cause_simple(JavaThread* current, Handle throwable) { How about `get_cause_without_stacktrace`? The two methods should be able to share a lot of code rather than duplicating most of it. src/hotspot/share/classfile/javaClasses.cpp line 2784: > 2782: > 2783: Handle java_lang_Throwable::get_cause_simple(JavaThread* current, Handle throwable) { > 2784: // Same as get_cause_with_stack_trace, but without calling a JVM. The comment at L2739: // Call to JVM ... is wrong as it is actually calling Java. Your comment should be more like: // Similar to `get_cause_with_strace` but avoids the Java upcall to get the stacktrace, and // so avoids triggering further exceptions. 
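For readers following the scenario above, the retry idea can be sketched in standalone C++. This is only an illustration under stated assumptions, not the HotSpot implementation: `Cause`, `cause_with_stack_trace`, `cause_without_stack_trace` and `record_initialization_cause` are hypothetical names, and the `upcall_fails` parameter merely simulates a secondary exception (such as a second StackOverflowError) during the Java upcall.

```cpp
#include <string>

// Hypothetical, simplified record of an exception cause; not a HotSpot type.
struct Cause {
  std::string class_name;
  std::string message;
  bool has_stack_trace;
};

// Stand-in for the path that needs a Java upcall to fetch the stack trace.
// Returns false when the (simulated) upcall raises a secondary exception.
inline bool cause_with_stack_trace(const std::string& name,
                                   const std::string& msg,
                                   bool upcall_fails,
                                   Cause* out) {
  if (upcall_fails) {
    return false;
  }
  out->class_name = name;
  out->message = msg;
  out->has_stack_trace = true;
  return true;
}

// Stand-in for the simpler path: keeps the class name and message but never
// makes the upcall, so it cannot trigger a further exception.
inline Cause cause_without_stack_trace(const std::string& name,
                                       const std::string& msg) {
  Cause c;
  c.class_name = name;
  c.message = msg;
  c.has_stack_trace = false;
  return c;
}

// Try the full version first; if it fails, fall back to the simple one
// (in the VM this is where the pending exception would be cleared first).
inline Cause record_initialization_cause(const std::string& name,
                                         const std::string& msg,
                                         bool upcall_fails) {
  Cause c;
  if (cause_with_stack_trace(name, msg, upcall_fails, &c)) {
    return c;
  }
  return cause_without_stack_trace(name, msg);
}
```

In this model the error record always carries a cause; only the stack trace is lost when the first attempt fails.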
src/hotspot/share/classfile/javaClasses.cpp line 2799: > 2797: } > 2798: > 2799: Symbol* exception_name = throwable()->klass()->name(); Nit: extra space after `=` src/hotspot/share/oops/cpCache.cpp line 478: > 476: oop cause = java_lang_Throwable::cause(PENDING_EXCEPTION); > 477: Symbol* cause_sym = NULL; > 478: Symbol* cause_msg = NULL; s/NULL/nullptr/ src/hotspot/share/oops/cpCache.cpp line 480: > 478: Symbol* cause_msg = NULL; > 479: > 480: if (cause != NULL ) { s/NULL/nullptr/ test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 27: > 25: * @test > 26: * @bug 8302491 > 27: * @requires vm.compMode != "Xint" Why? test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 28: > 26: * @bug 8302491 > 27: * @requires vm.compMode != "Xint" > 28: * @summary Test that StackOverflowError is correctly reporting in stack trace s/reporting/reported/ ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12566 From iklam at openjdk.org Wed Feb 15 03:38:41 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 03:38:41 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 17:29:41 GMT, Calvin Cheung wrote: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. Changes requested by iklam (Reviewer). 
------------- PR: https://git.openjdk.org/jdk/pull/12562 From iklam at openjdk.org Wed Feb 15 03:38:42 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 03:38:42 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 00:50:31 GMT, Coleen Phillimore wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. > > src/hotspot/share/classfile/symbolTable.cpp line 469: > >> 467: const int alloc_size = Symbol::byte_size(len); >> 468: u1* u1_buf = NEW_RESOURCE_ARRAY(u1, alloc_size); >> 469: Symbol* tmp = ::new ((void*)u1_buf) Symbol((const u1*)name, len, (heap && !DumpSharedSpaces) ? 1 : PERM_REFCOUNT); > > There's a version of NEW_RESOURCE_ARRAY that takes the Thread argument, so it doesn't have to call Thread::current() The name of the `heap` parameter to this function should be changed. The node, which contains the Symbol, is always C_HEAP allocated. I would suggest renaming it to `is_permanent`. Same for the other functions that have a `heap` parameter. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From sspitsyn at openjdk.org Wed Feb 15 05:01:43 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 15 Feb 2023 05:01:43 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Wed, 15 Feb 2023 01:22:33 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. 
Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. 
I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use @enablePreview instead src/hotspot/share/prims/jvmtiEnv.cpp line 1045: > 1043: JvmtiEnvBase::is_vthread_alive(vt_oop) && > 1044: !JvmtiVTSuspender::is_vthread_suspended(vt_oop)) || > 1045: java_thread->is_bound_vthread()) && This condition does not look correct. I'd expect it to be:

```
((java_lang_VirtualThread::is_instance(vt_oop) && JvmtiEnvBase::is_vthread_alive(vt_oop)) || java_thread->is_bound_vthread()) &&
!JvmtiVTSuspender::is_vthread_suspended(vt_oop) &&
```

It is important to check for `!JvmtiVTSuspender::is_vthread_suspended(vt_oop)` for `bound_vthread` as well. The same issue is in the `JvmtiEnv::ResumeAllVirtualThreads`. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From sspitsyn at openjdk.org Wed Feb 15 05:06:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 15 Feb 2023 05:06:46 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Wed, 15 Feb 2023 01:22:33 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). 
Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use @enablePreview instead Patricio, The fix looks pretty solid to me. 
I've already posted one comment and will post a couple of formatting nits. I agree with Alan, it'd be nice to get rid of `is_bound_vthread` if possible. I will make one more pass after your update. Also, I'd address the JVMTI spec issue (mentioned by Alan) in a separate CR that needs to go with a CSR. I'll file an RFE or spec bug on it. Thanks, Serguei ------------- PR: https://git.openjdk.org/jdk/pull/12512 From sspitsyn at openjdk.org Wed Feb 15 05:09:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 15 Feb 2023 05:09:46 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Wed, 15 Feb 2023 01:22:33 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. 
So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use @enablePreview instead src/hotspot/share/prims/jvmtiEnvBase.cpp line 1587: > 1585: if (!is_passive_cthread) { > 1586: assert(thread_h() != nullptr, "sanity check"); > 1587: assert(single_suspend || thread_h()->is_a(vmClasses::BaseVirtualThread_klass()), "SuspendAllVirtualThreads should never suspend non-virtual threads"); Nit: It is better to split this line by placing assert message on a separate line. 
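The grouping question raised earlier in this review for `jvmtiEnv.cpp` line 1045 can be checked with a small truth-table sketch. It models the four predicates as plain booleans and is not HotSpot code. The function and parameter names are hypothetical, and the first operand of the original condition (the `is_instance` check) is an assumption, since the quoted excerpt only starts at line 1043.

```cpp
// Grouping as quoted from the patch (first operand assumed): the "not
// suspended" check applies only to the non-bound virtual thread branch.
constexpr bool original_grouping(bool is_vthread, bool alive,
                                 bool suspended, bool bound) {
  return (is_vthread && alive && !suspended) || bound;
}

// Grouping suggested in the review: the "not suspended" check applies to
// both branches, including bound virtual threads.
constexpr bool suggested_grouping(bool is_vthread, bool alive,
                                  bool suspended, bool bound) {
  return ((is_vthread && alive) || bound) && !suspended;
}
```

The two forms differ for a bound virtual thread that is already suspended: the original grouping selects it again, while the suggested grouping skips it.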
src/hotspot/share/prims/jvmtiEnvBase.cpp line 1648: > 1646: if (!is_passive_cthread) { > 1647: assert(thread_h() != nullptr, "sanity check"); > 1648: assert(single_resume || thread_h()->is_a(vmClasses::BaseVirtualThread_klass()), "ResumeAllVirtualThreads should never resume non-virtual threads"); Nit: It is better to split this line by placing assert message on a separate line. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From iklam at openjdk.org Wed Feb 15 05:15:53 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 05:15:53 GMT Subject: RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v4] In-Reply-To: References: Message-ID: <7brPhKzjUPzxwqeS5etIYwRxR1l-nBHcY-9Oqp6jy80=.229ed39e-3c1c-4d23-9a73-1fbf6b999dd3@github.com> On Fri, 3 Feb 2023 14:06:34 GMT, Ashutosh Mehra wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplified relocation by using buffered address instead of requested address > > @iklam thank you for addressing the comments. It looks good to me. Thanks @ashu-mehra @calvinccheung @albertnetymk @tschatzl for the review. Passed tiers 1-5. ------------- PR: https://git.openjdk.org/jdk/pull/12304 From iklam at openjdk.org Wed Feb 15 05:15:55 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 05:15:55 GMT Subject: Integrated: 8296344: Remove dependency on G1 for writing the CDS archive heap In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 05:34:20 GMT, Ioi Lam wrote: > Goals > > - Simplify the writing of the CDS archive heap > - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector > > Implementation - the following runs inside a safepoint so heap objects aren't moving > > - Find all the objects that should be archived > - Allocate buffer using a GrowableArray > - Copy all "open" objects into the buffer > - Copy all "closed" objects into the buffer > - Make sure the heap image can be mapped with all possible G1 region sizes: > - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size) > - Align the bottom of the "open" and "closed" objects at 1MB boundary > - Write the buffer to disk as the image for the archive heap > > Testing: tiers 1 ~ 5 This pull request has now been integrated. Changeset: bdcbafb2 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/bdcbafb2196f0360466ee789b969f2db954ca85f Stats: 1928 lines in 26 files changed: 915 ins; 859 del; 154 mod 8296344: Remove dependency on G1 for writing the CDS archive heap Reviewed-by: ayang, tschatzl, ccheung ------------- PR: https://git.openjdk.org/jdk/pull/12304 From dholmes at openjdk.org Wed Feb 15 05:20:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 05:20:42 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> Message-ID: On Tue, 14 Feb 2023 09:22:00 GMT, Richard Reingruber wrote: >> The original assumption does not always hold anymore and should therefore relaxed. > > I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`). 
From `frame_x86.hpp`: // The interpreter and adapters will extend the frame of the caller. // Since oopMaps are based on the sp of the caller before extension // we need to know that value. However in order to compute the address // of the return address we need the real "raw" sp. By convention we // use sp() to mean "raw" sp and unextended_sp() to mean the caller's // original sp. by that definition `unextended_sp` is always > `sp`. If something has violated that definition/convention then we may have bigger problems! ------------- PR: https://git.openjdk.org/jdk/pull/12535 From dholmes at openjdk.org Wed Feb 15 06:23:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 06:23:44 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v3] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 22:06:24 GMT, Ioi Lam wrote: >> Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. >> >> Fixed two issues found by the assert: >> >> - `NonNMethodCodeHeapSize` >> - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora suggestion Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12549 From dholmes at openjdk.org Wed Feb 15 06:23:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 06:23:48 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> Message-ID: On Tue, 14 Feb 2023 21:59:49 GMT, Ioi Lam wrote: >> src/hotspot/share/compiler/compilerDefinitions.cpp line 134: >> >>> 132: intx CompilerConfig::jvmflag_scaled_freq_log(intx freq_log) { >>> 133: return MAX2((intx)0, MIN2(scaled_freq_log(freq_log), (intx)30)); >>> 134: } >> >> If the passed in arguments are out of range this appears to cap them within range but the initial error is never made known to the caller. ?? > > This code is called only by `CompilerConfig::set_compilation_policy_flags()` to scale 15 flags according to `CompileThresholdScaling`. If the user sets `CompileThresholdScaling` to be too high, then some of the calculated values will be capped. > > @veresov @TobiHartmann - do you think some warning messages are needed here? > > My thinking is that `CompileThresholdScaling` is a somewhat magical knob. If the user really wants to know what's happening, they should add `-XX:+PrintFlagsFinal` and see the effect on all the affected flags. Okay - whether this should produce warnings is a distinct issue for compiler folk to decide. >> src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 71: >> >>> 69: if (err != JVMFlag::SUCCESS) { >>> 70: if (origin == JVMFlagOrigin::ERGONOMIC) { >>> 71: fatal("FLAG_SET_ERGO cannot be used to set an invalid value for %s", flag->name()); >> >> Just occurred to me, should the `fatal` rather be a `vm_exit_during_initialization`? 
> > `vm_exit_during_initialization` is usually for conditions not controllable by the VM (out of memory, bad command-line options, etc). Here were have a programming error in the VM's ergo code. So `fatal()` is more appropriate. It also generates an hs_err file that can help debugging. Okay ------------- PR: https://git.openjdk.org/jdk/pull/12549 From thartmann at openjdk.org Wed Feb 15 06:31:42 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 15 Feb 2023 06:31:42 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> Message-ID: On Wed, 15 Feb 2023 06:20:02 GMT, David Holmes wrote: >> This code is called only by `CompilerConfig::set_compilation_policy_flags()` to scale 15 flags according to `CompileThresholdScaling`. If the user sets `CompileThresholdScaling` to be too high, then some of the calculated values will be capped. >> >> @veresov @TobiHartmann - do you think some warning messages are needed here? >> >> My thinking is that `CompileThresholdScaling` is a somewhat magical knob. If the user really wants to know what's happening, they should add `-XX:+PrintFlagsFinal` and see the effect on all the affected flags. > > Okay - whether this should produce warnings is a distinct issue for compiler folk to decide. I think it's fine to cap the values of the individual thresholds at their maximum without a warning, when `CompileThresholdScaling` is used. Similar issues were already discussed in https://github.com/openjdk/jdk/pull/8501. > I suggest to address this issue in a separate RFE. @tobiasholenstein did you file that separate RFE? 
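The silent capping discussed above can be illustrated in isolation. The sketch below assumes only what the quoted `jvmflag_scaled_freq_log` line shows, namely that the scaled value is clamped to the range 0 to 30. `cap_freq_log` and its `scaled` parameter are hypothetical, and how the real `scaled_freq_log` computes its result is not modelled.

```cpp
// Hypothetical helper: clamps an already-scaled log2 frequency to [0, 30],
// mirroring the MAX2/MIN2 expression quoted in the review.
constexpr long cap_freq_log(long scaled) {
  return scaled < 0 ? 0 : (scaled > 30 ? 30 : scaled);
}
```

An out-of-range input such as 45 comes back as 30 with no indication that it was adjusted, which is why the thread suggests `-XX:+PrintFlagsFinal` for users who want to see the effective values.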
------------- PR: https://git.openjdk.org/jdk/pull/12549 From yyang at openjdk.org Wed Feb 15 06:37:43 2023 From: yyang at openjdk.org (Yi Yang) Date: Wed, 15 Feb 2023 06:37:43 GMT Subject: RFR: 8298091: Dump native instruction along with nmethod name when using Compiler.codelist In-Reply-To: References: <1dSP2mbbyqiKLRmWwCfwGdb67ll7-K4cRtV_muUkS9I=.2d2dc2f8-acca-49de-89f5-70a825c057fa@github.com> Message-ID: <7jNRdw1-H6YZt3mBqA-j3yscAJ4mTQXFpdO_thLvIag=.c82048e4-ff31-428c-9422-6a302152e752@github.com> On Tue, 7 Feb 2023 05:02:29 GMT, David Holmes wrote: >> This patch adds new functionality for Compiler.codelist, it optionally prints assembly code along with compiled method line. This allows us to inspect assembly code for specified JIT method on the fly, also a manageable flag ForceLoadDisassembler is added to load hs-dis if it was not initially present when JVM starts. >> >> The output looks like this: >> >> $ jcmd Compiler.codelist decode=Thread.interrupt >> 76900: >> ... >> 2678 3 0 com.sun.tools.javac.api.JavacTaskPool$ReusableContext$1.scan(Lcom/sun/source/tree/Tree;Lcom/sun/tools/javac/code/Symtab;)Ljava/lang/Void; [0x00007fbe85105590, 0x00007fbe85105780 - 0x00007fbe85105ec0] >> 2683 3 0 java.lang.Thread.interrupted()Z [0x00007fbe85106090, 0x00007fbe85106220 - 0x00007fbe85106488] >> [Disassembly] >> -------------------------------------------------------------------------------- >> [Constant Pool (empty)] >> >> -------------------------------------------------------------------------------- >> >> [MachCode] >> 0x00007fbe85106220: 8984 2400 | c0fe ff55 | 4883 ec40 | 4181 7f20 | 0700 0000 | 7405 e8a5 | 8af9 0648 | be28 d2ca >> 0x00007fbe85106240: 3cbe 7f00 | 008b bef4 | 0000 0083 | c702 89be | f400 0000 | 81e7 fe07 | 0000 83ff | 000f 8461 >> 0x00007fbe85106260: 0100 0048 | be28 d2ca | 3cbe 7f00 | 0048 8386 | 3801 0000 | 0149 8bb7 | a802 0000 | 488b 3648 >> 0x00007fbe85106280: 3b06 488b | fe48 bb28 | d2ca 3cbe | 7f00 008b | 7f08 49ba | 0000 0000 | 0800 0000 | 
4903 fa48 >> 0x00007fbe851062a0: 3bbb 5801 | 0000 750d | 4883 8360 | 0100 0001 | e960 0000 | 0048 3bbb | 6801 0000 | 750d 4883 >> 0x00007fbe851062c0: 8370 0100 | 0001 e94a | 0000 0048 | 83bb 5801 | 0000 0075 | 1748 89bb | 5801 0000 | 48c7 8360 >> 0x00007fbe851062e0: 0100 0001 | 0000 00e9 | 2900 0000 | 4883 bb68 | 0100 0000 | 7517 4889 | bb68 0100 | 0048 c783 >> 0x00007fbe85106300: 7001 0000 | 0100 0000 | e908 0000 | 0048 8383 | 4801 0000 | 0148 bfc8 | 7036 3cbe | 7f00 008b >> 0x00007fbe85106320: 9ff4 0000 | 0083 c302 | 899f f400 | 0000 81e3 | feff 1f00 | 83fb 000f | 84ad 0000 | 000f be7e >> 0x00007fbe85106340: 3683 ff00 | 48bb c870 | 363c be7f | 0000 48b8 | 3801 0000 | 0000 0000 | 0f84 0a00 | 0000 48b8 >> 0x00007fbe85106360: 4801 0000 | 0000 0000 | 488b 1403 | 488d 5201 | 4889 1403 | 0f84 2e00 | 0000 897c | 2428 bb00 >> 0x00007fbe85106380: 0000 0088 | 5e36 f083 | 4424 c000 | 48be c870 | 363c be7f | 0000 4883 | 8658 0100 | 0001 90e8 >> 0x00007fbe851063a0: 5c11 fb06 | 8b7c 2428 | 83e7 0183 | e701 488b | c748 83c4 | 405d 493b | a778 0300 | 000f 8748 >> 0x00007fbe851063c0: 0000 00c3 | 49ba 98aa | 4200 0800 | 0000 4c89 | 5424 0848 | c704 24ff | ffff ffe8 | 20f7 0607 >> 0x00007fbe851063e0: e97e feff | ffe8 1684 | 0607 49ba | 7880 0100 | 0800 0000 | 4c89 5424 | 0848 c704 | 24ff ffff >> 0x00007fbe85106400: ffe8 faf6 | 0607 e932 | ffff ff49 | bab6 6310 | 85be 7f00 | 004d 8997 | 9003 0000 | e95f 77fb >> 0x00007fbe85106420: 0649 8b87 | 2804 0000 | 49c7 8728 | 0400 0000 | 0000 0049 | c787 3004 | 0000 0000 | 0000 4883 >> 0x00007fbe85106440: c440 5de9 | b86b 0607 | e833 af06 | 0748 bf42 | 29a7 a2be | 7f00 0048 | 83e4 f0e8 | b097 4b1d >> 0x00007fbe85106460: f449 ba61 | 6410 85be | 7f00 0041 | 52e9 ae69 | fb06 48bb | 0000 0000 | 0000 0000 | e9fb ffff >> 0x00007fbe85106480: fff4 f4f4 | f4f4 f4f4 >> [/MachCode] >> -------------------------------------------------------------------------------- >> [/Disassembly] >> 2684 3 0 java.io.FileInputStream.read()I [0x00007fbe85106590, 
0x00007fbe85106740 - 0x00007fbe85106928] >> 2686 3 0 jdk.internal.org.jline.utils.NonBlockingInputStream.read(J)I [0x00007fbe85106a90, 0x00007fbe85106c20 - 0x00007fbe85106de0] >> 2687 3 0 jdk.internal.org.jline.terminal.impl.AbstractPty.checkInterrupted()V [0x00007fbe85106e90, 0x00007fbe85107060 - 0x00007fbe85107458] >> >> Once we put hsdis into lib/server and forcibly load it after turning on ForceLoadDisassembler, we have the following output: >> >> 2677 3 0 com.sun.tools.javac.api.JavacTaskPool$ReusableContext$1.scan(Lcom/sun/source/tree/Tree;Ljava/lang/Object;)Ljava/lang/Object; [0x00007fbe85104c90, 0x00007fbe85104e20 - 0x00007fbe85105058] >> 2679 3 0 com.sun.source.util.TreeScanner.scan(Lcom/sun/source/tree/Tree;Ljava/lang/Object;)Ljava/lang/Object; [0x00007fbe85105110, 0x00007fbe851052a0 - 0x00007fbe851054c8] >> 2678 3 0 com.sun.tools.javac.api.JavacTaskPool$ReusableContext$1.scan(Lcom/sun/source/tree/Tree;Lcom/sun/tools/javac/code/Symtab;)Ljava/lang/Void; [0x00007fbe85105590, 0x00007fbe85105780 - 0x00007fbe85105ec0] >> 2683 3 0 java.lang.Thread.interrupted()Z [0x00007fbe85106090, 0x00007fbe85106220 - 0x00007fbe85106488] >> [Disassembly] >> -------------------------------------------------------------------------------- >> [Constant Pool (empty)] >> >> -------------------------------------------------------------------------------- >> >> [Verified Entry Point] >> # {method} {0x000000080042aa98} 'interrupted' '()Z' in 'java/lang/Thread' >> # [sp+0x50] (sp of caller) >> 0x00007fbe85106220: mov %eax,-0x14000(%rsp) >> 0x00007fbe85106227: push %rbp >> 0x00007fbe85106228: sub $0x40,%rsp >> 0x00007fbe8510622c: cmpl $0x7,0x20(%r15) >> 0x00007fbe85106234: je 0x00007fbe8510623b >> .... 
>> 0x00007fbe8510643e: add $0x40,%rsp >> 0x00007fbe85106442: pop %rbp >> 0x00007fbe85106443: jmpq 0x00007fbe8c16d000 ; {runtime_call unwind_exception Runtime1 stub} >> [Exception Handler] >> 0x00007fbe85106448: callq 0x00007fbe8c171380 ; {no_reloc} >> 0x00007fbe8510644d: mov $0x7fbea2a72942,%rdi ; {external_word} >> 0x00007fbe85106457: and $0xfffffffffffffff0,%rsp >> 0x00007fbe8510645b: callq 0x00007fbea25bfc10 ; {runtime_call MacroAssembler::debug64(char*, long, long*)} >> 0x00007fbe85106460: hlt >> [Deopt Handler Code] >> 0x00007fbe85106461: mov $0x7fbe85106461,%r10 ; {section_word} >> 0x00007fbe8510646b: push %r10 >> 0x00007fbe8510646d: jmpq 0x00007fbe8c0bce20 ; {runtime_call DeoptimizationBlob} >> 0x00007fbe85106472: mov $0x0,%rbx ; {static_stub} >> 0x00007fbe8510647c: jmpq 0x00007fbe8510647c ; {runtime_call} >> 0x00007fbe85106481: hlt >> 0x00007fbe85106482: hlt >> 0x00007fbe85106483: hlt >> 0x00007fbe85106484: hlt >> 0x00007fbe85106485: hlt >> 0x00007fbe85106486: hlt >> 0x00007fbe85106487: hlt >> -------------------------------------------------------------------------------- >> [/Disassembly] >> 2684 3 0 java.io.FileInputStream.read()I [0x00007fbe85106590, 0x00007fbe85106740 - 0x00007fbe85106928] >> 2686 3 0 jdk.internal.org.jline.utils.NonBlockingInputStream.read(J)I [0x00007fbe85106a90, 0x00007fbe85106c20 - 0x00007fbe85106de0] >> 2687 3 0 jdk.internal.org.jline.terminal.impl.AbstractPty.checkInterrupted()V [0x00007fbe85106e90, 0x00007fbe85107060 - 0x00007fbe85107458] >> ... >> >> >> A sample use case is we want to know where line of code we have high cache line contention once we know a JIT address from perf c2c tool: >> >> Cacheline 0x456017840 >> -- Peer Snoop -- ------- Store Refs ------ ------- CL -------- ---------- cycles ---------- Total cpu >> Rmt Lcl L1 Hit L1 Miss N/A Off Node PA cnt Code address rmt peer lcl peer load records cnt Symbol >> 0.00% 35.59% 0.00% 0.00% 0.00% 0x0 1 1 0xffff688f2a84 0 406 324 199524 1 [.] 
0x0000ffff688f2a84 [JIT] ti >> 0.00% 33.12% 0.00% 0.00% 0.00% 0x0 1 1 0xffff688f2ab8 0 411 329 190202 1 [.] >> ... >> >> But this example is too conservative. In fact, after adding this function, we can easily check the assembly representation of any JIT method, whether we find a potential performance problem with a JIT address, or we find it from the flame graph, or when we do some debugging. > > Encouraging the compiler folk to take a look at this. Hi @dholmes-ora @vnkozlov , sorry for late reply, I updated output example and sample usecase in PR description. Thanks for your reviews. ------------- PR: https://git.openjdk.org/jdk/pull/12381 From stuefe at openjdk.org Wed Feb 15 06:54:52 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 15 Feb 2023 06:54:52 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: > Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). > > > ### Background > > Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. > > To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. 
> > Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). This would allow us to test resilience in the face of native OOMs, among other things. > > ### Patch > > - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: > > > > Global form: > -XX:MallocLimit=[:] > Category-specific form: > -XX:MallocLimit=:[:][,:[:] ...] > Examples: > -XX:MallocLimit=3g > -XX:MallocLimit=3g:oom > -XX:MallocLimit=compiler:200m:oom > > > - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. > > - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. > > - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` > > - adds new gtests and new jtreg tests > > - removes a bunch of jtreg tests which are now better served by the new gtests. > > - gives us more precise error messages upon reaching a limit. For example: > > before: > > # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) > > > now: > > # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom - nullptr - merge and fix conflicts - Merge; fix conflicts - Feedback Gerard - Revert strchrnul - Copyrights - Feedback Johan - Merge - Merge branch 'master' into JDK-8293313-NMT-fake-oom - ... 
and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd ------------- Changes: https://git.openjdk.org/jdk/pull/11371/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11371&range=08 Stats: 1010 lines in 21 files changed: 653 ins; 257 del; 100 mod Patch: https://git.openjdk.org/jdk/pull/11371.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11371/head:pull/11371 PR: https://git.openjdk.org/jdk/pull/11371 From duke at openjdk.org Wed Feb 15 07:13:44 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Wed, 15 Feb 2023 07:13:44 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> Message-ID: <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> On Wed, 15 Feb 2023 05:17:31 GMT, David Holmes wrote: >> I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`). > > From `frame_x86.hpp`: > > // The interpreter and adapters will extend the frame of the caller. > // Since oopMaps are based on the sp of the caller before extension > // we need to know that value. However in order to compute the address > // of the return address we need the real "raw" sp. By convention we > // use sp() to mean "raw" sp and unextended_sp() to mean the caller's > // original sp. > > by that definition `unextended_sp` is always > `sp`. If something has violated that definition/convention then we may have bigger problems! 
I get the following values for `sp` and `unextended_sp` when running this test case: int sp = 0x7f9f1ec614e0, unextended sp = 0x7f9f1ec614e0 int sp = 0x7f9f1ec61538, unextended sp = 0x7f9f1ec61548 int sp = 0x7f9f1ec615a0, unextended sp = 0x7f9f1ec61598 // the assumption fails here int sp = 0x7f9f1ec61600, unextended sp = 0x7f9f1ec61608 int sp = 0x7f9f1ec61670, unextended sp = 0x7f9f1ec61688 ... ------------- PR: https://git.openjdk.org/jdk/pull/12535 From dholmes at openjdk.org Wed Feb 15 07:48:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 07:48:40 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> Message-ID: On Wed, 15 Feb 2023 07:10:29 GMT, Johannes Bechberger wrote: >> From `frame_x86.hpp`: >> >> // The interpreter and adapters will extend the frame of the caller. >> // Since oopMaps are based on the sp of the caller before extension >> // we need to know that value. However in order to compute the address >> // of the return address we need the real "raw" sp. By convention we >> // use sp() to mean "raw" sp and unextended_sp() to mean the caller's >> // original sp. >> >> by that definition `unextended_sp` is always > `sp`. If something has violated that definition/convention then we may have bigger problems! 
> > I get the following values for `sp` and `unextended_sp` when running this test case: > > > int sp = 0x7f9f1ec614e0, unextended sp = 0x7f9f1ec614e0 > int sp = 0x7f9f1ec61538, unextended sp = 0x7f9f1ec61548 > int sp = 0x7f9f1ec615a0, unextended sp = 0x7f9f1ec61598 // the assumption fails here > int sp = 0x7f9f1ec61600, unextended sp = 0x7f9f1ec61608 > int sp = 0x7f9f1ec61670, unextended sp = 0x7f9f1ec61688 > ... What is the test case? Is it sending signals and then using AGCT? If so then maybe a signal hits when a call is made but before the code has a chance to update unextended_sp? ------------- PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Wed Feb 15 07:53:40 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Wed, 15 Feb 2023 07:53:40 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> Message-ID: On Wed, 15 Feb 2023 07:46:02 GMT, David Holmes wrote: > What is the test case? The sanity test I amended with more checks. > Is it sending signals and then using AGCT? No. Just calling ASGCT directly in a native method which is called directly in the main method of a JTREG test. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rehn at openjdk.org Wed Feb 15 08:45:38 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 15 Feb 2023 08:45:38 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 17:29:41 GMT, Calvin Cheung wrote: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. @calvinccheung thanks! 
------------- PR: https://git.openjdk.org/jdk/pull/12562 From dholmes at openjdk.org Wed Feb 15 09:04:49 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 15 Feb 2023 09:04:49 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> Message-ID: <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> On Wed, 15 Feb 2023 07:50:56 GMT, Johannes Bechberger wrote: >> What is the test case? Is it sending signals and then using AGCT? If so then maybe a signal hits when a call is made but before the code has a chance to update unextended_sp? > >> What is the test case? > > The sanity test I amended with more checks. > >> Is it sending signals and then using AGCT? > > No. Just calling ASGCT directly in a native method which is called directly in the main method of a JTREG test. Are you referring to the test in test/hotspot/jtreg/serviceability/AsyncGetCallTrace/ ? ------------- PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Wed Feb 15 09:04:51 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Wed, 15 Feb 2023 09:04:51 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 09:00:23 GMT, David Holmes wrote: >>> What is the test case? >> >> The sanity test I amended with more checks. 
>> >>> Is it sending signals and then using AGCT? >> >> No. Just calling ASGCT directly in a native method which is called directly in the main method of a JTREG test. > > Are you referring to the test in test/hotspot/jtreg/serviceability/AsyncGetCallTrace/ ? Yes. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From redestad at openjdk.org Wed Feb 15 09:29:55 2023 From: redestad at openjdk.org (Claes Redestad) Date: Wed, 15 Feb 2023 09:29:55 GMT Subject: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v17] In-Reply-To: References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Tue, 14 Feb 2023 18:22:32 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error Units >> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units >> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Last of review comments No problem! Testing looks good.
------------- PR: https://git.openjdk.org/jdk/pull/12126 From sgibbons at openjdk.org Wed Feb 15 09:29:57 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Wed, 15 Feb 2023 09:29:57 GMT Subject: Integrated: JDK-8300808: Accelerate Base64 on x86 for AVX2 In-Reply-To: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> References: <9zN3bjyMEikXYOlpjTy9_QPc-9usYiBJMUFdb-TaVrQ=.e8f65ce9-ac0f-4f8d-9c6b-b39ae1c26cd2@github.com> Message-ID: On Sat, 21 Jan 2023 00:15:10 GMT, Scott Gibbons wrote: > Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error Units > Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units > Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms This pull request has now been integrated.
Changeset: 33bec207 Author: Scott Gibbons Committer: Claes Redestad URL: https://git.openjdk.org/jdk/commit/33bec207103acd520eb99afb093cfafa44aecfda Stats: 234 lines in 7 files changed: 208 ins; 5 del; 21 mod 8300808: Accelerate Base64 on x86 for AVX2 Reviewed-by: jbhateja, redestad, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/12126 From tschatzl at openjdk.org Wed Feb 15 09:42:48 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 15 Feb 2023 09:42:48 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. 
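As an aside on the proposal quoted above, the following is a minimal sketch of what marking a failure path as noreturn buys. It is not HotSpot code: `report_fatal_sketch` and `checked_divide_sketch` are hypothetical names chosen for illustration, and the real `guarantee`/`fatal` machinery is assumed to be considerably more involved. The sketch only shows that, once the reporting function carries `[[noreturn]]`, the compiler knows control cannot continue past the call, so the caller needs no dummy return value to avoid a "control reaches end of non-void function" warning.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an assertion-failure reporter.
// [[noreturn]] promises the compiler that this function never returns.
[[noreturn]] void report_fatal_sketch(const char* msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Hypothetical caller: the failure path ends in a noreturn call,
// so no artificial "return 0;" is needed after it.
int checked_divide_sketch(int a, int b) {
  if (b != 0) {
    return a / b;
  }
  report_fatal_sketch("division by zero");
}
```

Calling `checked_divide_sketch(10, 2)` returns 5; a zero divisor ends the process through `std::abort()` instead of continuing with an undefined result.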
> > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. lgtm ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.org/jdk/pull/12507 From simonis at openjdk.org Wed Feb 15 09:56:48 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 15 Feb 2023 09:56:48 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 17:14:05 GMT, Yi-Fan Tsai wrote: >> Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form raises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not changed: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. >> >> The following test has passed. >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> >> The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32.java) is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes. 
>> >> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 | >> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | >> | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% | >> >> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 | >> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | >> | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% | >> >> >> Baseline >> >> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms >> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms >> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms >> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms >> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms >> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms >> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms >> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms >> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms >> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms >> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms >> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms >> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms >> >> >> Crypto pmull >> >> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms >> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms >> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms >> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms >> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms >> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms >> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms >> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ±
511.416 ops/ms >> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms >> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms >> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms >> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms >> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Change to UseCryptoPmullForCRC32 In general your changes look good. I only have a few minor questions/comments below. Also, can you please update the copyright year in the files you've touched? Thank you and best regards, Volker src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 3777: > 3775: > 3776: if (UseCRC32) { > 3777: if (UseCryptoPmullForCRC32) { Why is this inside `UseCRC32`? It means that if we set `-XX:-UseCRC32 -XX:+UseCryptoPmullForCRC32` on the command line, we won't get `UseCryptoPmullForCRC32`. src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 257: > 255: } > 256: > 257: if (UseCryptoPmullForCRC32 && (!VM_Version::supports_sha3() || !VM_Version::supports_pmull())) { Why does `UseCryptoPmullForCRC32` depend on `VM_Version::supports_sha3()`? Does `VM_Version::supports_pmull()` really check for the `pmull` in the cryptographic extension? ------------- PR: https://git.openjdk.org/jdk/pull/12480 From stuefe at openjdk.org Wed Feb 15 10:11:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 15 Feb 2023 10:11:09 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v7] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Mon, 30 Jan 2023 09:23:48 GMT, Johan Sjölen wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains ten commits: >> >> - merge and fix conflicts >> - Merge; fix conflicts >> - Feedback Gerard >> - Revert strchrnul >> - Copyrights >> - Feedback Johan >> - Merge >> - Merge branch 'master' into JDK-8293313-NMT-fake-oom >> - MallocLimit > > Sorry, could you please convert the NULLs into nullptr :)? @jdksjolen @gerard-ziemski could you please re-approve? For some reason I cannot integrate and I suspect it's because reviews are stale. ------------- PR: https://git.openjdk.org/jdk/pull/11371 From aph at openjdk.org Wed Feb 15 10:21:40 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 15 Feb 2023 10:21:40 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 17:09:31 GMT, Yi-Fan Tsai wrote: > Changed to UseCryptoPmullForCRC32 as there is another pmull version. Can you give me a reference for that? I can only find one definition of pmull. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From tschatzl at openjdk.org Wed Feb 15 12:00:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 15 Feb 2023 12:00:28 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v3] In-Reply-To: References: Message-ID: <81vcVzZwHKTKZ3aNxc92CXVLH1TJFXwAdtkxBzu7T-8=.20d4b2e7-131e-4e23-a76a-bfbda3a90a3c@github.com> > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests)? > > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway.
> * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12497/files - new: https://git.openjdk.org/jdk/pull/12497/files/950b849d..165085a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=01-02 Stats: 55 lines in 1 file changed: 28 ins; 25 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12497.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12497/head:pull/12497 PR: https://git.openjdk.org/jdk/pull/12497 From kbarrett at openjdk.org Wed Feb 15 12:21:07 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 12:21:07 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Tue, 31 Jan 2023 20:05:21 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyright > > Signed-off-by: Justin King Changes requested by kbarrett (Reviewer). 
src/hotspot/cpu/aarch64/bytes_aarch64.hpp line 54: > 52: static inline void put_Java_u2(address p, u2 x) { put_native_u2(p, byteswap(x)); } > 53: static inline void put_Java_u4(address p, u4 x) { put_native_u4(p, byteswap(x)); } > 54: static inline void put_Java_u8(address p, u8 x) { put_native_u8(p, byteswap(x)); } Explicit type parameters for byteswap aren't needed or desirable; let the type be deduced. This is a common theme throughout, so I didn't comment on all occurrences. src/hotspot/share/jfr/utilities/jfrBigEndian.hpp line 40: > 38: # define bigendian_16(x) byteswap(x) > 39: # define bigendian_32(x) byteswap(x) > 40: # define bigendian_64(x) byteswap(x) The explicit template parameters might really be needed, unlike (nearly?) every other call. src/hotspot/share/utilities/byteswap.hpp line 36: > 34: > 35: #include "metaprogramming/enableIf.hpp" > 36: #include "utilities/debug.hpp" I think debug.hpp is only being included for STATIC_ASSERT? We have static_assert now, which likely provides better error messages. src/hotspot/share/utilities/byteswap.hpp line 38: > 36: #include "utilities/debug.hpp" > 37: #include "utilities/globalDefinitions.hpp" > 38: #include "utilities/macros.hpp" Not sure why macros.hpp is being directly included. I don't see anything from it that is being used here. src/hotspot/share/utilities/byteswap.hpp line 54: > 52: struct ByteswapImpl; > 53: > 54: template ::value)> I don't think we need CanByteswapImpl here. Just use `std::is_integral::value` and let the sized specializations of ByteswapImpl determine what sizes are available. Yes, one might get a better error message this way if passing in some extended integer value, but that's not something I think we need to worry about. src/hotspot/share/utilities/byteswap.hpp line 55: > 53: > 54: template ::value)> > 55: ALWAYSINLINE T byteswap(T x) { Can this be constexpr? I think it probably can, assuming all the various intrinsics are constexpr. 
Also, we don't generally use ALWAYSINLINE unless there is a demonstrated reason to do so. There are some operator new's that use it to help with the allocation recording. The oop_oop_iterate suite uses it because they were found to often not inline, and ensuring they were inlined was found to be performance critical. Other than that, there are a handful of uses, some of which might not actually be needed. src/hotspot/share/utilities/byteswap.hpp line 55: > 53: > 54: template ::value)> > 55: ALWAYSINLINE T byteswap(T x) { There should be a description comment for this function. src/hotspot/share/utilities/byteswap.hpp line 57: > 55: ALWAYSINLINE T byteswap(T x) { > 56: using U = std::make_unsigned_t; > 57: STATIC_ASSERT(sizeof(T) == sizeof(U)); I don't think this static assert adds anything other than clutter. src/hotspot/share/utilities/byteswap.hpp line 70: > 68: // We support 8-bit integer types to be compatible with C++23's std::byteswap. > 69: template > 70: struct ByteswapFallbackImpl { I think we don't need this specialization, nor do we need the copied size==1 specializations for ByteswapImpl. Just have one unconditional specialization of ByteswapImpl. src/hotspot/share/utilities/byteswap.hpp line 72: > 70: struct ByteswapFallbackImpl { > 71: STATIC_ASSERT(CanByteswapImpl::value); > 72: STATIC_ASSERT(sizeof(T) == 1); (Throughout) I don't think these static asserts add anything other than clutter. src/hotspot/share/utilities/byteswap.hpp line 138: > 136: > 137: template > 138: struct ByteswapImpl final { (Throughout) I don't think `final` adds any value here. src/hotspot/share/utilities/byteswap.hpp line 143: > 141: > 142: ALWAYSINLINE T operator()(T x) const { > 143: return static_cast(__builtin_bswap16(static_cast(x))); (Throughout) I think we can just replace T with uint16_t / whatever, and dispense with the casts. src/hotspot/share/utilities/byteswap.hpp line 176: > 174: // architecture-specific byteswap instruction, if available. 
If it is not available, GCC emits the > 175: // exact same implementation that underpins its __builtin_bswap in libgcc as there is really only > 176: // one way to implement it, as we have in fallback. Well that's kind of annoying. Or alternatively, it's annoying that we can't use the portable version everywhere and depend on compilers to reliably recognize the pattern and replace with an architecture-specific implementation if available. Oh well. Good that you noticed this. src/hotspot/share/utilities/byteswap.hpp line 243: > 241: #elif defined(TARGET_COMPILER_xlc) > 242: > 243: #include Not needed if we (can) use clang `__builtin_bswapN`? src/hotspot/share/utilities/byteswap.hpp line 264: > 262: ALWAYSINLINE T operator()(T x) const { > 263: unsigned short y; > 264: __store2r(static_cast(x), &y); Can we not use `__builtin_bswapN` for these, since the compiler is clang-based? src/hotspot/share/utilities/byteswap.hpp line 282: > 280: }; > 281: > 282: #if defined(_ARCH_PWR7) && defined(_ARCH_PPC64) Does the need for this conditionalization go away if we use `__builtin_bswap64`? src/hotspot/share/utilities/copy.cpp line 131: > 129: > 130: if (swap) { > 131: tmp = byteswap(tmp); I think the explicit template argument shouldn't be needed - just let it be deduced from `tmp`. src/hotspot/share/utilities/moveBits.hpp line 31: > 29: #include "metaprogramming/enableIf.hpp" > 30: #include "utilities/byteswap.hpp" > 31: #include "utilities/debug.hpp" I think debug.hpp is only being included for STATIC_ASSERT? We have static_assert now, which likely provides better error messages. src/hotspot/share/utilities/moveBits.hpp line 51: > 49: struct ReverseBitsImpl; > 50: > 51: template ::value)> Similarly to byteswap, I don't think we need CanReverseBitsImpl, just std::is_integral and let the supported sizes be determined from the provided specializations. 
src/hotspot/share/utilities/moveBits.hpp line 52: > 50: > 51: template ::value)> > 52: ALWAYSINLINE T reverse_bits(T x) { So far as I know, ALWAYSINLINE does not imply constexpr. So this is a feature regression. src/hotspot/share/utilities/moveBits.hpp line 53: > 51: template ::value)> > 52: ALWAYSINLINE T reverse_bits(T x) { > 53: return ReverseBitsImpl{}(x); I think I would have preferred to have the change to bit reversal be separate from byteswap changes. With these changes, I also think this file should be renamed to reverse_bits.hpp or something like that. It was called moveBits because it contained both bit and byte swapping. src/hotspot/share/utilities/moveBits.hpp line 81: > 79: x = ((x & rep_3333) << 2) | ((x >> 2) & rep_3333); > 80: x = ((x & rep_0F0F) << 4) | ((x >> 4) & rep_0F0F); > 81: return byteswap(static_cast(x)); If we happen to use this on a platform that has hardware support for bitswap, and the implementation of byteswap is an intrinsic, it seems like there might be a risk of interfering with the pattern recognition that could transform into the hardware-supported method. src/hotspot/share/utilities/moveBits.hpp line 171: > 169: > 170: template > 171: struct ReverseBitsImpl final : public ReverseBitsFallbackImpl {}; This compiler is clang-based, so may have the built-ins. test/hotspot/gtest/utilities/test_moveBits.cpp line 81: > 79: > 80: int32_t code_quality_reverse_bytes_32(int32_t x) { > 81: return byteswap(x); This belongs in test_byteswap now. test/hotspot/gtest/utilities/test_moveBits.cpp line 89: > 87: > 88: int64_t code_quality_reverse_bytes_64(int64_t x) { > 89: return byteswap(x); This belongs in test_byteswap now.
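The shape suggested in the review comments above (gate on `std::is_integral`, let sized specializations decide which widths exist, deduce the type at the call site, keep everything `constexpr`) can be sketched roughly as follows. This is not the code from the PR: `byteswap_sketch` and `ByteswapSketchImpl` are hypothetical names, the sketch uses plain `std::enable_if_t` rather than HotSpot's `ENABLE_IF` macro, and it uses portable shifts and masks where the real patch is assumed to prefer compiler built-ins.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Primary template is only declared: an unsupported size fails to compile.
template <typename T, std::size_t N = sizeof(T)>
struct ByteswapSketchImpl;

// 1 byte: nothing to swap (accepted for parity with C++23 std::byteswap).
template <typename T>
struct ByteswapSketchImpl<T, 1> {
  constexpr T operator()(T x) const { return x; }
};

template <typename T>
struct ByteswapSketchImpl<T, 2> {
  constexpr T operator()(T x) const {
    return static_cast<T>(((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
  }
};

template <typename T>
struct ByteswapSketchImpl<T, 4> {
  constexpr T operator()(T x) const {
    return static_cast<T>(((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
                          ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24));
  }
};

template <typename T>
struct ByteswapSketchImpl<T, 8> {
  constexpr T operator()(T x) const {
    // Swap each 32-bit half, then exchange the halves.
    const std::uint32_t lo = static_cast<std::uint32_t>(x);
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    return (static_cast<T>(ByteswapSketchImpl<std::uint32_t>{}(lo)) << 32) |
           static_cast<T>(ByteswapSketchImpl<std::uint32_t>{}(hi));
  }
};

// Entry point: any integral type is accepted; the work is done on the
// unsigned counterpart so that shifts never touch a sign bit.
template <typename T, std::enable_if_t<std::is_integral<T>::value, int> = 0>
constexpr T byteswap_sketch(T x) {
  using U = std::make_unsigned_t<T>;
  return static_cast<T>(ByteswapSketchImpl<U>{}(static_cast<U>(x)));
}
```

With this shape a call such as `byteswap_sketch(std::uint32_t{0x12345678})` deduces the type from its argument and yields `0x78563412`, and because every step is `constexpr` the result can be checked in a `static_assert`.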
------------- PR: https://git.openjdk.org/jdk/pull/12114 From goetz at openjdk.org Wed Feb 15 12:38:41 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Wed, 15 Feb 2023 12:38:41 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 14:10:08 GMT, Richard Reingruber wrote: > This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. > > A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. > > Before this BasicExt.java was refactored for better argument processing and representation of the test modes. > Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. > > Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. LGTM src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 3094: > 3092: > 3093: // Freezing continuation frames requires that the caller is trimmed to unextended sp if compiled. > 3094: Register caller_sp = R23_tmp3; Can the caller be interpreted? I.e., not compiled? Then it might be helpful to comment like // If not compiled this contains sp() resulting in a resize of 0. ------------- Marked as reviewed by goetz (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12557 From ayang at openjdk.org Wed Feb 15 12:40:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 15 Feb 2023 12:40:45 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v3] In-Reply-To: <81vcVzZwHKTKZ3aNxc92CXVLH1TJFXwAdtkxBzu7T-8=.20d4b2e7-131e-4e23-a76a-bfbda3a90a3c@github.com> References: <81vcVzZwHKTKZ3aNxc92CXVLH1TJFXwAdtkxBzu7T-8=.20d4b2e7-131e-4e23-a76a-bfbda3a90a3c@github.com> Message-ID: On Wed, 15 Feb 2023 12:00:28 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? >> >> * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. >> * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. 
I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review Minor subjective suggestion. src/hotspot/share/gc/g1/g1YoungCollector.cpp line 476: > 474: G1PreEvacuateCollectionSetBatchTask cl; > 475: G1CollectedHeap::heap()->run_batch_task(&cl); > 476: phase_times()->record_pre_evacuate_prepare_time_ms((Ticks::now() - start).seconds() * 1000.0); "PreEvacuate" and "pre_evacuate_prepare" seems too abstract/generic. Maybe sth more concrete to reflect what it is, not how it is connected/compared to its context. ------------- Marked as reviewed by ayang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12497 From tschatzl at openjdk.org Wed Feb 15 12:49:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 15 Feb 2023 12:49:29 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v4] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? > > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. > * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. 
The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. > > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12497/files - new: https://git.openjdk.org/jdk/pull/12497/files/165085a3..53f887b9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=02-03 Stats: 6 lines in 3 files changed: 0 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12497.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12497/head:pull/12497 PR: https://git.openjdk.org/jdk/pull/12497 From tschatzl at openjdk.org Wed Feb 15 12:49:33 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 15 Feb 2023 12:49:33 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v3] In-Reply-To: References: <81vcVzZwHKTKZ3aNxc92CXVLH1TJFXwAdtkxBzu7T-8=.20d4b2e7-131e-4e23-a76a-bfbda3a90a3c@github.com> Message-ID: On Wed, 15 Feb 2023 12:37:08 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request 
incrementally with one additional commit since the last revision: >> >> ayang review > > src/hotspot/share/gc/g1/g1YoungCollector.cpp line 476: > >> 474: G1PreEvacuateCollectionSetBatchTask cl; >> 475: G1CollectedHeap::heap()->run_batch_task(&cl); >> 476: phase_times()->record_pre_evacuate_prepare_time_ms((Ticks::now() - start).seconds() * 1000.0); > > "PreEvacuate" and "pre_evacuate_prepare" seems too abstract/generic. Maybe sth more concrete to reflect what it is, not how it is connected/compared to its context. There is intention to put more stuff into that phase (like for the post evacuate tasks), and I would like to avoid (significant) renaming later. As it looks now there will be a second batch task after evaluating the collection set as there are dependencies on its result (after all) that do not seem to be easily untangled. I would like to put that one into the `g1YoungGCPreEvacuateTasks.hpp` file similar to the post evacuate tasks too, hence the somewhat generic naming. ------------- PR: https://git.openjdk.org/jdk/pull/12497 From shade at openjdk.org Wed Feb 15 13:41:59 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 15 Feb 2023 13:41:59 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: References: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> Message-ID: On Mon, 13 Feb 2023 17:55:06 GMT, Roman Kennke wrote: >> test/hotspot/jtreg/runtime/FieldLayout/ArrayBaseOffsets.java line 110: >> >>> 108: Asserts.assertEquals(unsafe.arrayBaseOffset(float[].class), intOffset, "Misplaced float array base"); >>> 109: Asserts.assertEquals(unsafe.arrayBaseOffset(double[].class), longOffset, "Misplaced double array base"); >>> 110: int expectedObjArrayOffset = (COOP || !Platform.is64bit()) ? intOffset : longOffset; >> >> `|| !Platform.is64Bit()` looks redundant, since `COOP` is guaranteed to be false on 32-bit? 
> > No, because on 32 bit this expression becomes true, which is what we want - the small (int) offset for reference arrays. Right, OK. ------------- PR: https://git.openjdk.org/jdk/pull/11044 From coleenp at openjdk.org Wed Feb 15 13:43:47 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 15 Feb 2023 13:43:47 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 03:35:50 GMT, Ioi Lam wrote: >> src/hotspot/share/classfile/symbolTable.cpp line 469: >> >>> 467: const int alloc_size = Symbol::byte_size(len); >>> 468: u1* u1_buf = NEW_RESOURCE_ARRAY(u1, alloc_size); >>> 469: Symbol* tmp = ::new ((void*)u1_buf) Symbol((const u1*)name, len, (heap && !DumpSharedSpaces) ? 1 : PERM_REFCOUNT); >> >> There's a version of NEW_RESOURCE_ARRAY that takes the Thread argument, so it doesn't have to call Thread::current() > > The name of the `heap` parameter to this function should be changed. The node, which contains the Symbol, is always C_HEAP allocated. I would suggest renaming it to `is_permanent`. Same for the other functions that have a `heap` parameter. If the symbol is permanent, it's allocated to an arena not C heap but I agree that the variable name should be changed. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From tschatzl at openjdk.org Wed Feb 15 13:52:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 15 Feb 2023 13:52:27 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v5] In-Reply-To: References: Message-ID: <8P_TVw6hR1Ass_t8G1ZpV44nD04zUK3vktrRJH7hBFs=.9cb55249-4db5-4245-8988-33fc0cb70693@github.com> > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? 
> > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. > * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. 
> > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Fix compilation on osx ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12497/files - new: https://git.openjdk.org/jdk/pull/12497/files/53f887b9..1e37161d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12497&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12497.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12497/head:pull/12497 PR: https://git.openjdk.org/jdk/pull/12497 From duke at openjdk.org Wed Feb 15 14:19:15 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Wed, 15 Feb 2023 14:19:15 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. > To ascertain (b) it would be good to see the DivisionDemo reformulated as a [JMH](https://github.com/openjdk/jmh) benchmark. https://github.com/openjdk/jmh-jdk-microbenchmarks/compare/master...jankratochvil:jmh-jdk-microbenchmarks:java?expand=1 `java -jar target/micros-jdk8-1.0-SNAPSHOT.jar org.openjdk.bench.java.lang.FloatDoubleRem -i 2 -r 2 -wi 2 -f 2`
unpatched dd15d306a68caa02659dd95d16b71d0f1a437bc6:
Benchmark                            Mode  Cnt   Score   Error  Units
FloatDoubleRem.calcDoubleJava        avgt    4  79.259 ± 1.466  ns/op
FloatDoubleRem.calcFloatJava         avgt    4  79.582 ± 0.617  ns/op
FloatDoubleRem.cornercaseDoubleJava  avgt    4   9.660 ± 0.213  ns/op
FloatDoubleRem.cornercaseFloatJava   avgt    4   9.629 ± 1.099  ns/op
patched dd15d306a68caa02659dd95d16b71d0f1a437bc6:
Benchmark                            Mode  Cnt   Score   Error  Units
FloatDoubleRem.calcDoubleJava        avgt    4  10.211 ± 1.982  ns/op
FloatDoubleRem.calcFloatJava         avgt    4  10.303 ± 1.325  ns/op
FloatDoubleRem.cornercaseDoubleJava  avgt    4   9.712 ± 0.565  ns/op
FloatDoubleRem.cornercaseFloatJava   avgt    4   9.694 ± 0.419  ns/op
------------- PR: https://git.openjdk.org/jdk/pull/12508 From kbarrett at openjdk.org Wed Feb 15 14:30:14 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 14:30:14 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: <58B2nA55wsAS8uZIdtKyx9q-r-05dIUNUg-fWPwnh6w=.8cf19318-fd3b-4998-a250-5a84557f9544@github.com> On Thu, 9 Feb 2023 10:56:10 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.org/jdk/pull/12489 From kbarrett at openjdk.org Wed Feb 15 14:30:19 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 15 Feb 2023 14:30:19 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 10:20:00 GMT, Albert Mingkun Yang wrote: >> src/hotspot/share/gc/g1/g1BarrierSet.inline.hpp line 71: >> >>> 69: >>> 70: template >>> 71: inline void G1BarrierSet::write_ref_field_post(T* field) { >> >> So this function isn't applying the "same region" filter that's in the various other places. I wonder if >> that's intentional, or a bug? > > It's safe because card-scanning is always prepared for "uninteresting" oops. > > It's used only by Access API; its counterpart in interpreter/c1/c2 checks for null and cross-region additionally. It's not obvious whether the discrepancy is intentional or not though. Yeah, it looks okay. Not a hot path I think, so the additional optimization probably isn't important.
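For readers unfamiliar with the "same region" filter discussed in this thread, here is a minimal sketch of the idea. It is not the G1 code: the region size (`kLogRegionBytes`, assumed to be 1 MiB here) and both function names are hypothetical, and the sketch assumes regions are power-of-two sized and aligned to their own size.

```cpp
#include <cstdint>

// Assumption for illustration only: 1 MiB regions (2^20 bytes).
// A real collector would pick the region size at startup.
constexpr unsigned kLogRegionBytes = 20;

// Two addresses lie in the same size-aligned region exactly when they
// agree on every bit above the region-offset bits.
inline bool same_region(std::uintptr_t field_addr, std::uintptr_t value_addr) {
  return ((field_addr ^ value_addr) >> kLogRegionBytes) == 0;
}

// Hypothetical post-write filter: returns true only when the store creates
// a non-null, cross-region reference, i.e. when card marking is worthwhile.
inline bool needs_post_barrier(std::uintptr_t field_addr, std::uintptr_t value_addr) {
  if (value_addr == 0) {
    return false;  // storing null never creates a cross-region reference
  }
  return !same_region(field_addr, value_addr);
}
```

Under these assumptions the filter only saves work. As noted in the thread, leaving it out stays correct because card scanning tolerates "uninteresting" entries.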
------------- PR: https://git.openjdk.org/jdk/pull/12489 From ayang at openjdk.org Wed Feb 15 14:35:26 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 15 Feb 2023 14:35:26 GMT Subject: RFR: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:56:10 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. ------------- PR: https://git.openjdk.org/jdk/pull/12489 From ayang at openjdk.org Wed Feb 15 14:38:23 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 15 Feb 2023 14:38:23 GMT Subject: Integrated: 8302127: Remove unused arg in write_ref_field_post In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 10:56:10 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 28f5250f Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/28f5250fa5cfd7938bb0899a2c17847b7458536c Stats: 8 lines in 6 files changed: 0 ins; 0 del; 8 mod 8302127: Remove unused arg in write_ref_field_post Reviewed-by: phh, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/12489 From stefank at openjdk.org Wed Feb 15 14:54:41 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 15 Feb 2023 14:54:41 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: References: Message-ID: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> On Tue, 14 Feb 2023 08:33:23 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. 
>> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Clarify comment on arrayOopDesc::max_array_length() Changes requested by stefank (Reviewer). src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp line 202: > 200: assert(is_aligned(hdr_size_in_bytes, BytesPerInt), "must be 32-bit-aligned"); > 201: strw(zr, Address(obj, hdr_size_in_bytes)); > 202: hdr_size_in_bytes += BytesPerInt; With this change `hdr_size_in_bytes` stops containing the header size in bytes. This could be confusing to readers and a potential source of bugs in the future. This comment applies to more places where we adjust the header size / base offset in the code. Given that this is compiler code, I'll let you and the compiler maintainers decide if this should be changed. src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp line 166: > 164: void C1_MacroAssembler::allocate_array(Register obj, Register len, > 165: Register tmp1, Register tmp2, Register tmp3, > 166: int header_size_in_bytes, int element_size, Other platforms call this parameter base_offset_in_bytes. I looked at the code below, but couldn't convince myself that the MinObjAlignmentInBytesMask usages were correct. Maybe they are because this is 32-bit code, but it could be worth taking a closer look. src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp line 303: > 301: } > 302: add2reg(arr_size, base_offset_in_bytes + MinObjAlignmentInBytesMask); // Add space for header & alignment.
> 303: z_nill(arr_size, (~MinObjAlignmentInBytesMask) & 0xffff); // Align array size. Comments became unaligned. src/hotspot/share/gc/shared/collectedHeap.cpp line 426: > 424: int payload_start_in_words = payload_start / HeapWordSize; > 425: Copy::fill_to_words(start + payload_start_in_words, > 426: words - payload_start_in_words, value); `start` refers to an address and `payload_start` refers to an offset. It would be good to not reuse the word `start` for two different types. Maybe rename to `payload_offset`? src/hotspot/share/gc/shared/jvmFlagConstraintsGC.cpp line 340: > 338: return JVMFlag::VIOLATES_CONSTRAINT; > 339: } > 340: if (value > ((uint64_t)ThreadLocalAllocBuffer::max_size() * HeapWordSize)) { Was this change needed because _filler_array_max_size was changed and max_tlab_size uses it? Could you motivate why this is needed or add an assert that we don't overflow size_t? I see that we have another place where we don't guard against a size_t overflow: static size_t max_size_in_bytes() { return max_size() * BytesPerWord; } If we need to care about overflow in MinTLABSizeConstraintFunc, don't we need to care about it in usages of max_size_in_bytes()? src/hotspot/share/gc/z/zObjArrayAllocator.cpp line 63: > 61: assert(is_aligned(base_offset, HeapWordSize), "remaining array base must be 64 bit aligned"); > 62: > 63: const size_t header = heap_word_size(base_offset); `base_offset` has already been aligned by the code above, so we don't have to call heap_word_size here and can simply divide by HeapWordSize. src/hotspot/share/opto/type.cpp line 5010: > 5008: if( _offset != 0 ) { > 5009: BasicType basic_elem_type = elem()->basic_type(); > 5010: int header_size = arrayOopDesc::base_offset_in_bytes(basic_elem_type); This function now calls base_offset_in_bytes twice: once to set header_size and once to set the array_base. Maybe clean that up?
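To make the layout question behind this PR concrete, the following is a minimal sketch of how an array base offset could be derived under the old and the proposed alignment rule. It is not the HotSpot implementation: every name below is hypothetical, and it assumes a 64-bit VM with an 8-byte mark word, a 4-byte (compressed) or 8-byte class pointer, and a 4-byte length field.

```cpp
#include <cstddef>

// Hypothetical constants for a 64-bit VM; illustration only.
constexpr std::size_t kMarkWordBytes = 8;
constexpr std::size_t kLengthFieldBytes = 4;
constexpr std::size_t kHeapWordBytes = 8;

// Round value up to the next multiple of a power-of-two alignment.
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Bytes occupied by mark word, class pointer and length field.
constexpr std::size_t header_bytes(bool compressed_class_pointers) {
  return kMarkWordBytes + (compressed_class_pointers ? 4 : 8) + kLengthFieldBytes;
}

// Old rule in this sketch: every array base is heap-word aligned.
constexpr std::size_t base_offset_word_aligned(bool compressed_class_pointers) {
  return align_up(header_bytes(compressed_class_pointers), kHeapWordBytes);
}

// Proposed rule in this sketch: only 8-byte elements need word alignment,
// smaller elements may start right after the 4-byte length field.
constexpr std::size_t base_offset_element_aligned(bool compressed_class_pointers,
                                                  std::size_t element_bytes) {
  return align_up(header_bytes(compressed_class_pointers),
                  element_bytes >= kHeapWordBytes ? kHeapWordBytes : kLengthFieldBytes);
}
```

Under these assumptions, with uncompressed class pointers the header ends at byte 20. Word alignment then starts every array at 24, while element-based alignment lets `int[]` and smaller start at 20 and keeps `long[]` at 24. With compressed class pointers both rules give 16.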
------------- PR: https://git.openjdk.org/jdk/pull/11044 From stefank at openjdk.org Wed Feb 15 14:54:42 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 15 Feb 2023 14:54:42 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v11] In-Reply-To: References: <0oSGAyFe1v4NSMKc-zmIKOm9KIchWhj7TSOf4qGpM78=.f9ea7cd7-e60f-40ae-a003-c935a5400d2f@github.com> Message-ID: On Tue, 24 Jan 2023 18:10:35 GMT, Aleksey Shipilev wrote: >> You are right that it aligns up. I agree that it's not 100% clear that check_for_bad_heap_word_value is correct. > > Similar thing applies to `filler_array_min_size` hunk above? Although there it is masked by "larger" `align_object_size`. I think this should be rewritten to not use the word-based MemRegion class. We need to be able to return something that properly divide the header part from the payload. ------------- PR: https://git.openjdk.org/jdk/pull/11044 From mdoerr at openjdk.org Wed Feb 15 15:00:35 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 15 Feb 2023 15:00:35 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 14:10:08 GMT, Richard Reingruber wrote: > This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. > > A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. 
> > Before this BasicExt.java was refactored for better argument processing and representation of the test modes. > Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. > > Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. LGTM. Thanks for fixing! src/hotspot/cpu/ppc/frame_ppc.cpp line 424: > 422: > 423: intptr_t *frame::initial_deoptimization_info() { > 424: // `this` is the caller of the deoptee. We want to trimm it, if compiled, to I believe "to trim" should contain only one 'm' in English. ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.org/jdk/pull/12557 From mbaesken at openjdk.org Wed Feb 15 15:03:30 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 15 Feb 2023 15:03:30 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v2] In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 12:26:57 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > 8299817: Update copyright Hi Lutz, is the JIT_TIMER still needed?
------------- PR: https://git.openjdk.org/jdk/pull/11967 From jcking at openjdk.org Wed Feb 15 15:39:14 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 15:39:14 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v14] In-Reply-To: References: Message-ID: <-b783DPmWbWFeigKf7F7SFYddDKwErM4AdFcfRx01eM=.5794fa08-a0d7-4761-a449-8ebfd639e30d@github.com> > Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. Justin King has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 24 additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into byteswap - Update based on review Signed-off-by: Justin King - Fix copyright Signed-off-by: Justin King - Update copyright Signed-off-by: Justin King - Add missing include Signed-off-by: Justin King - Remove unused include Signed-off-by: Justin King - Reorganize tests Signed-off-by: Justin King - Fix test Signed-off-by: Justin King - Merge remote-tracking branch 'upstream/master' into byteswap - Be restrict on requiring 1, 2, 4, or 8 byte integers Signed-off-by: Justin King - ... 
and 14 more: https://git.openjdk.org/jdk/compare/8697eb93...223d733b ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12114/files - new: https://git.openjdk.org/jdk/pull/12114/files/f2ea0038..223d733b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12114&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12114&range=12-13 Stats: 78274 lines in 2100 files changed: 30636 ins; 20323 del; 27315 mod Patch: https://git.openjdk.org/jdk/pull/12114.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12114/head:pull/12114 PR: https://git.openjdk.org/jdk/pull/12114 From jcking at openjdk.org Wed Feb 15 15:39:45 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 15:39:45 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Wed, 15 Feb 2023 10:47:21 GMT, Kim Barrett wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright >> >> Signed-off-by: Justin King > > src/hotspot/cpu/aarch64/bytes_aarch64.hpp line 54: > >> 52: static inline void put_Java_u2(address p, u2 x) { put_native_u2(p, byteswap(x)); } >> 53: static inline void put_Java_u4(address p, u4 x) { put_native_u4(p, byteswap(x)); } >> 54: static inline void put_Java_u8(address p, u8 x) { put_native_u8(p, byteswap(x)); } > > Explicit type parameters for byteswap aren't needed or desirable; let the type be deduced. This is a > common theme throughout, so I didn't comment on all occurrences. Done. Think I got them all. > src/hotspot/share/jfr/utilities/jfrBigEndian.hpp line 40: > >> 38: # define bigendian_16(x) byteswap(x) >> 39: # define bigendian_32(x) byteswap(x) >> 40: # define bigendian_64(x) byteswap(x) > > The explicit template parameters might really be needed, unlike (nearly?) every other call. 
Yeah, they are required here based on the existing code. > src/hotspot/share/utilities/byteswap.hpp line 36: > >> 34: >> 35: #include "metaprogramming/enableIf.hpp" >> 36: #include "utilities/debug.hpp" > > I think debug.hpp is only being included for STATIC_ASSERT? We have static_assert now, which likely provides > better error messages. Done. > src/hotspot/share/utilities/byteswap.hpp line 38: > >> 36: #include "utilities/debug.hpp" >> 37: #include "utilities/globalDefinitions.hpp" >> 38: #include "utilities/macros.hpp" > > Not sure why macros.hpp is being directly included. I don't see anything from it that is being used here. Habit, you are correct nothing from there is used. > src/hotspot/share/utilities/byteswap.hpp line 54: > >> 52: struct ByteswapImpl; >> 53: >> 54: template ::value)> > > I don't think we need CanByteswapImpl here. Just use `std::is_integral::value` and let the sized specializations of ByteswapImpl determine what sizes are available. Yes, one might get a better error > message this way if passing in some extended integer value, but that's not something I think we need > to worry about. Fair. Fixed. > src/hotspot/share/utilities/byteswap.hpp line 55: > >> 53: >> 54: template ::value)> >> 55: ALWAYSINLINE T byteswap(T x) { > > There should be a description comment for this function. It was above, I moved it to be located correctly now. > src/hotspot/share/utilities/byteswap.hpp line 57: > >> 55: ALWAYSINLINE T byteswap(T x) { >> 56: using U = std::make_unsigned_t; >> 57: STATIC_ASSERT(sizeof(T) == sizeof(U)); > > I don't think this static assert adds anything other than clutter. Yeah, paranoia on my part. If the STL is doing something incorrect, we have worse problems. > src/hotspot/share/utilities/byteswap.hpp line 70: > >> 68: // We support 8-bit integer types to be compatible with C++23's std::byteswap. 
>> 69: template >> 70: struct ByteswapFallbackImpl { > > I think we don't need this specialization, nor do we need the copied size==1 specializations for ByteswapImpl. > Just have one unconditional specialization of ByteswapImpl. Done. > src/hotspot/share/utilities/byteswap.hpp line 72: > >> 70: struct ByteswapFallbackImpl { >> 71: STATIC_ASSERT(CanByteswapImpl::value); >> 72: STATIC_ASSERT(sizeof(T) == 1); > > (Throughout) I don't think these static asserts add anything other than clutter. Done. > src/hotspot/share/utilities/byteswap.hpp line 138: > >> 136: >> 137: template >> 138: struct ByteswapImpl final { > > (Throughout) I don't think `final` adds any value here. Fair. In other projects which expose C++ APIs to consume it's sometimes worth it to prevent people from doing odd things, so marking as final is another barrier. Here we do not have that problem. > src/hotspot/share/utilities/byteswap.hpp line 143: > >> 141: >> 142: ALWAYSINLINE T operator()(T x) const { >> 143: return static_cast(__builtin_bswap16(static_cast(x))); > > (Throughout) I think we can just replace T with uint16_t / whatever, and dispense with the casts. Hm. I believe we can. Originally I was worried about implicit integer promotion, for example if some compiler decided that uint32_t was unsigned int, but unsigned long was also the same size. It would promote unsigned long to unsigned long long causing issues. But we shouldn't have that here with the current approach. > src/hotspot/share/utilities/byteswap.hpp line 243: > >> 241: #elif defined(TARGET_COMPILER_xlc) >> 242: >> 243: #include > > Not needed if we (can) use clang `__builtin_bswapN`? Done. > src/hotspot/share/utilities/byteswap.hpp line 264: > >> 262: ALWAYSINLINE T operator()(T x) const { >> 263: unsigned short y; >> 264: __store2r(static_cast(x), &y); > > Can we not use `__builtin_bswapN` for these, since the compiler is clang-based? Hmm. We might be able to, let's see. 
> src/hotspot/share/utilities/byteswap.hpp line 282: > >> 280: }; >> 281: >> 282: #if defined(_ARCH_PWR7) && defined(_ARCH_PPC64) > > Does the need for this conditionalization go away if we use `__builtin_bswap64`? Done. > src/hotspot/share/utilities/copy.cpp line 131: > >> 129: >> 130: if (swap) { >> 131: tmp = byteswap(tmp); > > I think the explicit template argument shouldn't be needed - just let it be deduced from `tmp`. Done. > src/hotspot/share/utilities/moveBits.hpp line 31: > >> 29: #include "metaprogramming/enableIf.hpp" >> 30: #include "utilities/byteswap.hpp" >> 31: #include "utilities/debug.hpp" > > I think debug.hpp is only being included for STATIC_ASSERT? We have static_assert now, which > likely provides better error messages. STATIC_ASSERT is static_assert, but sets the second argument to the stringification of the first. static_assert only accepts a single argument starting with, I think, C++17. But since it's removed here anyway, it's no longer needed. > src/hotspot/share/utilities/moveBits.hpp line 51: > >> 49: struct ReverseBitsImpl; >> 50: >> 51: template ::value)> > > Similarly to byteswap, I don't think we need CanReverseBitsImpl, just std::is_integral and let the supported > sizes be determined from the provided specializations. Done. > src/hotspot/share/utilities/moveBits.hpp line 53: > >> 51: template ::value)> >> 52: ALWAYSINLINE T reverse_bits(T x) { >> 53: return ReverseBitsImpl{}(x); > > I think I would have preferred to have the change to bit reversal be separate from byteswap changes. > With these changes, I also think this file should be renamed to reverse_bits.hpp or something like that. > It was called moveBits because it contained both bit and byte swapping. Out of curiosity, why `reverse_bits.hpp` instead of `reverseBits.hpp`? I have seen a few files in `utilities/` use the former, but the rest of Hotspot seems to always use the latter. Is it because they contain a single function?
I went with `reverse_bits.hpp` for now as suggested for consistency with the other bit fiddling headers in `utilities/`. > src/hotspot/share/utilities/moveBits.hpp line 171: > >> 169: >> 170: template >> 171: struct ReverseBitsImpl final : public ReverseBitsFallbackImpl {}; > > This compiler is clang-based, so may have the built-ins. Yeah, I merge XLC with the GCC block in both. Let's see. > test/hotspot/gtest/utilities/test_moveBits.cpp line 81: > >> 79: >> 80: int32_t code_quality_reverse_bytes_32(int32_t x) { >> 81: return byteswap(x); > > This belongs in test_byteswap now. Done. > test/hotspot/gtest/utilities/test_moveBits.cpp line 89: > >> 87: >> 88: int64_t code_quality_reverse_bytes_64(int64_t x) { >> 89: return byteswap(x); > > This belongs in test_byteswap now. Done. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From jcking at openjdk.org Wed Feb 15 15:43:32 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 15:43:32 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: <77RcHUS2MpVKficmIr8ezFtOeX6YKQ0R4F3uXmWk3WQ=.14f9f2b3-3356-4770-adb4-01f45c782319@github.com> On Wed, 15 Feb 2023 10:59:58 GMT, Kim Barrett wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright >> >> Signed-off-by: Justin King > > src/hotspot/share/utilities/byteswap.hpp line 55: > >> 53: >> 54: template ::value)> >> 55: ALWAYSINLINE T byteswap(T x) { > > Can this be constexpr? I think it probably can, assuming all the various intrinsics are constexpr. > > Also, we don't generally use ALWAYSINLINE unless there is a demonstrated reason to do so. There are some > operator new's that use it to help with the allocation recording. 
The oop_oop_iterate suite uses it because > they were found to often not inline, and ensuring they were inlined was found to be performance critical. > Other than that, there are a handful of uses, some of which might not actually be needed. `__builtin_bswapX` and `__builtin_bitreverseX` were not always `constexpr` compatible in Clang or GCC, though both will usually generate compile time constants in `-O2` if given compile time constants as inputs. That support for use with `constexpr` was only added in the last couple of versions in preparation for C++20. To be able to be `constexpr` compatible, while using intrinsics ensuring we get the best generated code, we would have to replicate something like `std::is_constant_evaluated()` from C++20 and `__has_constexpr_builtin`. However, no existing usages in Hotspot rely on them being `constexpr` now, so it should be fine. I replaced all uses of `ALWAYSINLINE` with just `inline`, until we can prove that `ALWAYSINLINE` is needed as suggested. > src/hotspot/share/utilities/moveBits.hpp line 52: > >> 50: >> 51: template ::value)> >> 52: ALWAYSINLINE T reverse_bits(T x) { > > So far as I know, ALWAYSINLINE does not imply constexpr. So this is a feature regression. Same as above, no existing usages rely on this being `constexpr`. So we have to choose between ensuring the best generated code vs. `constexpr`. ------------- PR: https://git.openjdk.org/jdk/pull/12114 From ccheung at openjdk.org Wed Feb 15 16:11:19 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 15 Feb 2023 16:11:19 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: <-FlSnuiw1rqY5DXqquMUerIifbxK-weAOcpQSM1pwfA=.0586c79f-0005-438e-9599-cf4d02cb8ed5@github.com> On Wed, 15 Feb 2023 13:41:20 GMT, Coleen Phillimore wrote: >> The name of the `heap` parameter to this function should be changed. The node, which contains the Symbol, is always C_HEAP allocated. I would suggest renaming it to `is_permanent`. 
Same for the other functions that have a `heap` parameter. > > If the symbol is permanent, it's allocated to an arena not C heap but I agree that the variable name should be changed. > There's a version of NEW_RESOURCE_ARRAY that takes the Thread argument, so it doesn't have to call Thread::current() I've changed it to: `u1* u1_buf = NEW_RESOURCE_ARRAY_IN_THREAD(current, u1, alloc_size);` ------------- PR: https://git.openjdk.org/jdk/pull/12562 From ccheung at openjdk.org Wed Feb 15 16:11:20 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 15 Feb 2023 16:11:20 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node In-Reply-To: <-FlSnuiw1rqY5DXqquMUerIifbxK-weAOcpQSM1pwfA=.0586c79f-0005-438e-9599-cf4d02cb8ed5@github.com> References: <-FlSnuiw1rqY5DXqquMUerIifbxK-weAOcpQSM1pwfA=.0586c79f-0005-438e-9599-cf4d02cb8ed5@github.com> Message-ID: On Wed, 15 Feb 2023 16:07:10 GMT, Calvin Cheung wrote: >> If the symbol is permanent, it's allocated to an arena not C heap but I agree that the variable name should be changed. > >> There's a version of NEW_RESOURCE_ARRAY that takes the Thread argument, so it doesn't have to call Thread::current() > > I've changed it to: > `u1* u1_buf = NEW_RESOURCE_ARRAY_IN_THREAD(current, u1, alloc_size);` > The name of the `heap` parameter to this function should be changed. The node, which contains the Symbol, is always C_HEAP allocated. I would suggest renaming it to `is_permanent`. Same for the other functions that have a `heap` parameter. Which other functions did you notice? 
------------- PR: https://git.openjdk.org/jdk/pull/12562 From jcking at openjdk.org Wed Feb 15 16:21:26 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 16:21:26 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v14] In-Reply-To: <-b783DPmWbWFeigKf7F7SFYddDKwErM4AdFcfRx01eM=.5794fa08-a0d7-4761-a449-8ebfd639e30d@github.com> References: <-b783DPmWbWFeigKf7F7SFYddDKwErM4AdFcfRx01eM=.5794fa08-a0d7-4761-a449-8ebfd639e30d@github.com> Message-ID: <0vB6_VqHFyB2dJZJijPOkFQO9Fz9U1myzjeBhX0XNUM=.d62b2289-4e70-4723-8d53-692a509d7c02@github.com> On Wed, 15 Feb 2023 15:39:14 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23. Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 24 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into byteswap > - Update based on review > > Signed-off-by: Justin King > - Fix copyright > > Signed-off-by: Justin King > - Update copyright > > Signed-off-by: Justin King > - Add missing include > > Signed-off-by: Justin King > - Remove unused include > > Signed-off-by: Justin King > - Reorganize tests > > Signed-off-by: Justin King > - Fix test > > Signed-off-by: Justin King > - Merge remote-tracking branch 'upstream/master' into byteswap > - Be restrict on requiring 1, 2, 4, or 8 byte integers > > Signed-off-by: Justin King > - ... and 14 more: https://git.openjdk.org/jdk/compare/99bfcf91...223d733b Okay, cleaned up based on review. PTAL again when you get a chance. Thanks! 
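[Editorial sketch, not part of the original message.] The pull request description above talks about consolidating byte swapping behind one `std::byteswap`-style function, and the earlier review comments suggest constraining on `std::is_integral` and letting the supported sizes follow from the provided specializations. A minimal sketch of that shape is below. It assumes a pure shift-and-mask fallback (so it can be `constexpr` in C++14) and does not use compiler intrinsics; the names `ByteswapFallback` and `byteswap_sketch` are hypothetical and are not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical size-dispatched fallback. Only 1, 2, 4 and 8 byte integers
// have a specialization, so any other size fails to compile.
template <typename T, std::size_t N = sizeof(T)>
struct ByteswapFallback;

template <typename T>
struct ByteswapFallback<T, 1> {
  constexpr T operator()(T x) const { return x; }  // nothing to swap
};

template <typename T>
struct ByteswapFallback<T, 2> {
  constexpr T operator()(T x) const {
    uint16_t u = static_cast<uint16_t>(x);
    return static_cast<T>(static_cast<uint16_t>((u >> 8) | (u << 8)));
  }
};

template <typename T>
struct ByteswapFallback<T, 4> {
  constexpr T operator()(T x) const {
    uint32_t u = static_cast<uint32_t>(x);
    return static_cast<T>(((u & 0x000000FFu) << 24) |
                          ((u & 0x0000FF00u) << 8) |
                          ((u & 0x00FF0000u) >> 8) |
                          ((u & 0xFF000000u) >> 24));
  }
};

template <typename T>
struct ByteswapFallback<T, 8> {
  constexpr T operator()(T x) const {
    uint64_t u = static_cast<uint64_t>(x);
    // Swap adjacent bytes, then adjacent 16-bit halves, then 32-bit halves.
    u = ((u & 0x00FF00FF00FF00FFull) << 8) | ((u >> 8) & 0x00FF00FF00FF00FFull);
    u = ((u & 0x0000FFFF0000FFFFull) << 16) | ((u >> 16) & 0x0000FFFF0000FFFFull);
    u = (u << 32) | (u >> 32);
    return static_cast<T>(u);
  }
};

// Constrained on std::is_integral; the template argument is deduced from x.
template <typename T,
          typename = typename std::enable_if<std::is_integral<T>::value>::type>
constexpr T byteswap_sketch(T x) {
  return ByteswapFallback<T>{}(x);
}

// Usable in constant expressions because no intrinsic is involved.
static_assert(byteswap_sketch(uint32_t{0x01020304u}) == 0x04030201u,
              "32-bit swap");
```

The trade-off discussed in the review still applies: a pure fallback like this is `constexpr`-friendly, while an intrinsic-based version may produce better code but is not guaranteed to be usable in constant expressions on older compilers.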
------------- PR: https://git.openjdk.org/jdk/pull/12114 From jcking at openjdk.org Wed Feb 15 16:21:32 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 16:21:32 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: <_hS5Yq24QuSZzgw8m4-MFbFdFqVS_gCffmmdIrfwDAE=.21f9197c-6bbd-4e6c-bc69-e609c40545d1@github.com> On Wed, 15 Feb 2023 12:10:30 GMT, Kim Barrett wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix copyright >> >> Signed-off-by: Justin King > > src/hotspot/share/utilities/moveBits.hpp line 81: > >> 79: x = ((x & rep_3333) << 2) | ((x >> 2) & rep_3333); >> 80: x = ((x & rep_0F0F) << 4) | ((x >> 4) & rep_0F0F); >> 81: return byteswap(static_cast(x)); > > If we happen to use this on a platform that has hardware support for bitswap, and the implementation > of byteswap is an intrinsic, it seems like there might be a risk of interfering with the pattern recognition > that could transform into the hardware-supported method. Looks like neither GCC nor Clang is good at recognizing bit reverse, even with the existing implementation. At least based on compiler explorer output. The only way to get Clang to generate `RBIT` on AArch64 is to use `__builtin_bitreverse` or `__rbit` from ``. GCC will also only generate `RBIT` when using `__rbit` from ``. Using `__rbit` won't work too well because we would have to use preprocessor detection to detect whether the target actually has the instruction first. There does not seem to be a consistent way to do this across GCC/Clang. 
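[Editorial sketch, not part of the original message.] The fallback quoted above reverses the bits inside each byte with three mask-and-shift steps and then reverses the byte order with `byteswap`. A self-contained 32-bit version of that idea might look like the following. The function names are hypothetical, the first step (the `rep_5555` swap of adjacent bits) is assumed from the pattern of the two quoted lines, and the use of `__builtin_bswap32` under a GCC/Clang guard is an illustration rather than what the patch does.

```cpp
#include <cstdint>

// Hypothetical helper: reverse the byte order of a 32-bit value.
inline uint32_t byteswap32_sketch(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
  return __builtin_bswap32(x);  // GCC/Clang builtin
#else
  return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
         ((x & 0x00FF0000u) >> 8)  | ((x & 0xFF000000u) >> 24);
#endif
}

// Hypothetical helper: reverse all 32 bits of x.
inline uint32_t reverse_bits32_sketch(uint32_t x) {
  const uint32_t rep_5555 = 0x55555555u;  // every other bit
  const uint32_t rep_3333 = 0x33333333u;  // every other bit pair
  const uint32_t rep_0F0F = 0x0F0F0F0Fu;  // every other nibble
  x = ((x & rep_5555) << 1) | ((x >> 1) & rep_5555);  // swap adjacent bits
  x = ((x & rep_3333) << 2) | ((x >> 2) & rep_3333);  // swap adjacent pairs
  x = ((x & rep_0F0F) << 4) | ((x >> 4) & rep_0F0F);  // swap nibbles
  // Bits are now reversed within each byte; reversing the byte order
  // completes the full 32-bit reversal.
  return byteswap32_sketch(x);
}
```

Whether a given compiler turns this into a single `RBIT` instruction is exactly the open question in the message above; the sketch only shows the portable computation.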
------------- PR: https://git.openjdk.org/jdk/pull/12114 From phh at openjdk.org Wed Feb 15 17:18:41 2023 From: phh at openjdk.org (Paul Hohensee) Date: Wed, 15 Feb 2023 17:18:41 GMT Subject: RFR: 8302070: Factor null-check into load_klass() calls In-Reply-To: References: Message-ID: <8volOklgs1I2ud_FE06W3kI1aRum8mYdUwl8KaKzHCE=.d497a20f-9658-4db4-bfe8-120af3df1e1e@github.com> On Wed, 8 Feb 2023 12:55:29 GMT, Roman Kennke wrote: > In several asm routines, a pattern like the following can be found: > > __ null_check(obj, oopDesc::klass_offset_in_bytes()); > __ load_klass(klass, obj, tmp); > > The null-check logically belongs into load_klass(), though. This becomes apparent in Lilliput, where the klass-offset is different, depending on whether or not Lilliput is enabled. > > Testing: > - [x] tier1 x86_64 > - [x] tier1 aarch64 Refactoring looks good. ------------- Marked as reviewed by phh (Reviewer). PR: https://git.openjdk.org/jdk/pull/12473 From coleenp at openjdk.org Wed Feb 15 17:25:44 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 15 Feb 2023 17:25:44 GMT Subject: RFR: 8302070: Factor null-check into load_klass() calls In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 12:55:29 GMT, Roman Kennke wrote: > In several asm routines, a pattern like the following can be found: > > __ null_check(obj, oopDesc::klass_offset_in_bytes()); > __ load_klass(klass, obj, tmp); > > The null-check logically belongs into load_klass(), though. This becomes apparent in Lilliput, where the klass-offset is different, depending on whether or not Lilliput is enabled. > > Testing: > - [x] tier1 x86_64 > - [x] tier1 aarch64 Seems fine. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12473 From ccheung at openjdk.org Wed Feb 15 17:48:01 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 15 Feb 2023 17:48:01 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v2] In-Reply-To: References: Message-ID: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: @coleenp and @iklam comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12562/files - new: https://git.openjdk.org/jdk/pull/12562/files/5c7a5845..f7a7812e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=00-01 Stats: 12 lines in 3 files changed: 2 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/12562.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12562/head:pull/12562 PR: https://git.openjdk.org/jdk/pull/12562 From ccheung at openjdk.org Wed Feb 15 17:50:43 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 15 Feb 2023 17:50:43 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 00:51:29 GMT, Coleen Phillimore wrote: >> Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: >> >> @coleenp and @iklam comments > > src/hotspot/share/oops/symbol.cpp line 68: > >> 66: } >> 67: >> 68: Symbol::Symbol(const Symbol& s1) { > > Can you add a comment that this copies the symbol when added to the ConcurrentHashTable. I've added a comment and have pushed a commit based on your and Ioi's comments. 
For now, I've changed the parameter name in `do_add_if_needed()` from `heap` to `is_permanent` and adjusted the callsites accordingly. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From jvernee at openjdk.org Wed Feb 15 20:28:58 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 15 Feb 2023 20:28:58 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 09:01:29 GMT, Johannes Bechberger wrote: >> Are you referring to the test in test/hotspot/jtreg/serviceability/AsyncGetCallTrace/ ? > > Yes. > I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`). I think we're talking about a case where an interpreted caller calls an interpreted callee through a MethodHandle. If my understanding is correct, in that case there are 2 sender sps, one that is computed based on the frame pointer of the callee. This just adds 2 words (skipping the return address), and the other is the sp which is saved into `r13` before e.g. a c2i adapter extends the stack [here](https://github.com/openjdk/jdk/blob/50dcc2aec5b16c0826e27d58e49a7f55a5f5ad38/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L631) Which then seems to be saved in the interpreted callee's frame as well (I'm not super familiar with interpreter frame layouts though...). 
I think your conclusion is correct wrt `unextended_sp < sp` in that case. For certain MH linker calls, an additional `MemberName` 'appendix' is passed as the last argument. The MH adapter/trampoline stub pops the appendix argument from the stack for an interpreted caller, and uses it to find out the actual callee. But that means that, the sp saved into `r13` will be one slot too large. In other words, the MethodHandle adapter for interpreted callers 'consumes' the appendix argument, but doesn't adjust `r13`. I think this could be done though, i.e. just add: __ subptr(r13, Interpreter::stackElementSize); // adjust unexteded_sp for callee, for proper stack walking In the linked code in `MethodHandles::generate_method_handle_interpreter_entry` to make this work. (note that we can not just leave the appendix on the stack, since the callee doesn't expect it, so we would break argument oop GC if we left it there) I think doing that would be preferable to adjusting `safe_for_sender`. That seems like it would uphold the `sp <= unextended_sp` invariant as well (maybe an assert to that effect in the constructor of `frame` wouldn't be a bad idea as well). 
It seems like PPC is not modifying the stack in this case (?): Register R19_member = R19_method; // MemberName ptr; incoming method ptr is dead now __ ld(R19_member, RegisterOrConstant((intptr_t)8), R15_argbase); __ addi(R15_argbase, R15_argbase, Interpreter::stackElementSize); generate_method_handle_dispatch(_masm, iid, tmp_recv, R19_member, not_for_compiler_entry); ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rrich at openjdk.org Wed Feb 15 20:30:00 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 15 Feb 2023 20:30:00 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false [v2] In-Reply-To: References: Message-ID: > This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. > > A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. > > Before this BasicExt.java was refactored for better argument processing and representation of the test modes. > Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. > > Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. 
Richard Reingruber has updated the pull request incrementally with three additional commits since the last revision: - Improve comment - Improve comment - Spelling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12557/files - new: https://git.openjdk.org/jdk/pull/12557/files/7ca17977..9d647388 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12557&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12557&range=00-01 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12557.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12557/head:pull/12557 PR: https://git.openjdk.org/jdk/pull/12557 From rrich at openjdk.org Wed Feb 15 20:30:22 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 15 Feb 2023 20:30:22 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 12:35:52 GMT, Goetz Lindenmaier wrote: >> Richard Reingruber has updated the pull request incrementally with three additional commits since the last revision: >> >> - Improve comment >> - Improve comment >> - Spelling > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 3094: > >> 3092: >> 3093: // Freezing continuation frames requires that the caller is trimmed to unextended sp if compiled. >> 3094: Register caller_sp = R23_tmp3; > > Can the caller be interpreted? I.e., not compiled? > Then it might be helpful to comment like > // If not compiled this contains sp() resulting in a resize of 0. Yes it can be interpreted. I've extended the comment for this case. 
------------- PR: https://git.openjdk.org/jdk/pull/12557 From ccheung at openjdk.org Wed Feb 15 20:30:30 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 15 Feb 2023 20:30:30 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: Message-ID: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: remove unused allocate_symbol() declaration in symbolTable.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12562/files - new: https://git.openjdk.org/jdk/pull/12562/files/f7a7812e..9b4c63d9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12562.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12562/head:pull/12562 PR: https://git.openjdk.org/jdk/pull/12562 From jcking at openjdk.org Wed Feb 15 20:36:00 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 20:36:00 GMT Subject: RFR: JDK-8299554: Remove EnableIf from metaprogramming/enableIf.hpp [v4] In-Reply-To: References: <4VZUOmguEjK6C8NEMI3KrS_E5HNU-G1H-h8MJty2mlY=.2aa0c645-1695-45ee-b53d-f701772884d8@github.com> Message-ID: On Wed, 4 Jan 2023 02:02:11 GMT, Justin King wrote: >> As the title says, remove `EnableIf` type alias from `metaprogramming/enableIf.hpp` leaving only `ENABLE_IF` and `ENABLE_IF_SDEFN`. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > s//::value > > Signed-off-by: Justin King Withdrawing. 
------------- PR: https://git.openjdk.org/jdk/pull/11835 From jcking at openjdk.org Wed Feb 15 20:36:00 2023 From: jcking at openjdk.org (Justin King) Date: Wed, 15 Feb 2023 20:36:00 GMT Subject: Withdrawn: JDK-8299554: Remove EnableIf from metaprogramming/enableIf.hpp In-Reply-To: <4VZUOmguEjK6C8NEMI3KrS_E5HNU-G1H-h8MJty2mlY=.2aa0c645-1695-45ee-b53d-f701772884d8@github.com> References: <4VZUOmguEjK6C8NEMI3KrS_E5HNU-G1H-h8MJty2mlY=.2aa0c645-1695-45ee-b53d-f701772884d8@github.com> Message-ID: On Tue, 3 Jan 2023 19:34:26 GMT, Justin King wrote: > As the title says, remove `EnableIf` type alias from `metaprogramming/enableIf.hpp` leaving only `ENABLE_IF` and `ENABLE_IF_SDEFN`. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/11835 From duke at openjdk.org Wed Feb 15 20:42:34 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Wed, 15 Feb 2023 20:42:34 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 20:24:16 GMT, Jorn Vernee wrote: >> Yes. > >> I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`). > > I think we're talking about a case where an interpreted caller calls an interpreted callee through a MethodHandle. 
If my understanding is correct (looking at `frame::sender_for_interpreter_frame`), in that case there are 2 sender sps, one that is computed based on the frame pointer of the callee. This just adds 2 words (skipping the return address), and the other is the sp which is saved into `r13` before e.g. a c2i adapter extends the stack [here](https://github.com/openjdk/jdk/blob/50dcc2aec5b16c0826e27d58e49a7f55a5f5ad38/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L631) Which then seems to be saved in the interpreted callee's frame as well (I'm not super familiar with interpreter frame layouts though...). The latter being what ends up as `unextended_sp`. > > I think your conclusion is correct wrt `unextended_sp < sp` in that case. For certain MH linker calls, an additional `MemberName` 'appendix' is passed as the last argument. The MH adapter/trampoline stub pops the appendix argument from the stack for an interpreted caller, and uses it to find out the actual callee. But that means that, the sp saved into `r13` will be one slot too 'large' (small? accounting for one too many arguments). In other words, the MethodHandle adapter for interpreted callers 'consumes' the appendix argument, but doesn't adjust `r13`. I think this could be done though, i.e. just add: > > > __ addptr(r13, Interpreter::stackElementSize); // adjust unexteded_sp for callee, for proper stack walking > > > In the linked code in `MethodHandles::generate_method_handle_interpreter_entry` to make this work. > > (note that we can not just leave the appendix on the stack, since the callee doesn't expect it, so we would break argument oop GC if we left it there) > > If that works, I think doing that would be preferable to adjusting `safe_for_sender`. That seems like it would uphold the `sp <= unextended_sp` invariant as well (maybe an assert to that effect in the constructor of `frame` wouldn't be a bad idea as well). 
> > It seems like PPC is not modifying the stack in this case (?): > > > Register R19_member = R19_method; // MemberName ptr; incoming method ptr is dead now > __ ld(R19_member, RegisterOrConstant((intptr_t)8), R15_argbase); > __ addi(R15_argbase, R15_argbase, Interpreter::stackElementSize); > generate_method_handle_dispatch(_masm, iid, tmp_recv, R19_member, not_for_compiler_entry); This is an interesting insight, I'm going to test your fix idea tomorrow. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From pchilanomate at openjdk.org Wed Feb 15 20:59:06 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 15 Feb 2023 20:59:06 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v4] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual threads but didn't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. 
> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: split assert message ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/f7f9553e..7872f94c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=02-03 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Wed Feb 15 20:59:10 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 15 Feb 2023 20:59:10 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Wed, 15 Feb 2023 05:04:21 GMT, Serguei Spitsyn wrote: > Patricio, The fix looks pretty solid to me. I've already posted one comment and will post a couple of formatting nits. I agree with Alan, it'd be nice to get rid of `is_bound_vthread` if possible. I will make one more pass after your update. Also, I'd address the JVMTI spec issue (mentioned by Alan) in a separate CR that needs to go with a CSR. I'll file an RFE or spec bug on it. Thanks, Serguei > Thanks for looking at this Serguei. > src/hotspot/share/prims/jvmtiEnv.cpp line 1045: > >> 1043: JvmtiEnvBase::is_vthread_alive(vt_oop) && >> 1044: !JvmtiVTSuspender::is_vthread_suspended(vt_oop)) || >> 1045: java_thread->is_bound_vthread()) && > > This condition does not look correct. 
> I'd expect it to be: > ``` > ((java_lang_VirtualThread::is_instance(vt_oop) && > JvmtiEnvBase::is_vthread_alive(vt_oop)) || > java_thread->is_bound_vthread()) && > !JvmtiVTSuspender::is_vthread_suspended(vt_oop) && > > It is important to check for `!JvmtiVTSuspender::is_vthread_suspended(vt_oop)` for `bound_vthread` as well. > The same issue is in the `JvmtiEnv::ResumeAllVirtualThreads`. The thing is that JvmtiVTSuspender::is_vthread_suspended() checks _suspended_list/not_suspended_list which are not set with single suspend/resume functions(SuspendThread/SuspendThreadList/ResumeThread/ResumeThreadList) on a BoundVirtualThread. For single suspend/resume on a BoundVirtualThread we go through the same path as for platform threads. In any case, the suspend_thread() call will just return JVMTI_ERROR_THREAD_SUSPENDED if a thread is already suspended, but if you want to avoid the call altogether I can add the equivalent check which would be !java_thread->is_suspended(). Now, I realized that although single-suspends will not change _suspended_list/not_suspended_list for a BoundVirtualThread, SuspendAllVirtualThreads/ResumeAllVirtualThreads can if a BoundVirtualThread is part of the exception list. In the SuspendAllVirtualThreads case for example, when restoring the resume state, the call to JvmtiVTSuspender::is_vthread_suspended() would return true (we are already in SR_all mode and the BoundVirtualThread will not be in _not_suspended_list), meaning the BoundVirtualThread will be added to the not_suspended_list. So we have to decide if we want to keep track of BoundVirtualThreads on these lists or not. Given that I see JvmtiVTSuspender::is_vthread_suspended() is always called on a VirtualThread the easiest thing would be to not track them, and just add a check in SuspendAllVirtualThreads/ResumeAllVirtualThreads where the elist is populated so only VirtualThreads are added. What do you think? 
> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1587: > >> 1585: if (!is_passive_cthread) { >> 1586: assert(thread_h() != nullptr, "sanity check"); >> 1587: assert(single_suspend || thread_h()->is_a(vmClasses::BaseVirtualThread_klass()), "SuspendAllVirtualThreads should never suspend non-virtual threads"); > > Nit: It is better to split this line by placing assert message on a separate line. Fixed. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1648: > >> 1646: if (!is_passive_cthread) { >> 1647: assert(thread_h() != nullptr, "sanity check"); >> 1648: assert(single_resume || thread_h()->is_a(vmClasses::BaseVirtualThread_klass()), "ResumeAllVirtualThreads should never resume non-virtual threads"); > > Nit: It is better to split this line by placing assert message on a separate line. Fixed. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From rrich at openjdk.org Wed Feb 15 21:07:53 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 15 Feb 2023 21:07:53 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 20:38:04 GMT, Johannes Bechberger wrote: >>> I think `unextended_sp < sp` is possible on x86 when interpreter entries generated by `MethodHandles::generate_method_handle_interpreter_entry()` are executed. They modify `sp` ([here for example](https://github.com/openjdk/jdk/blob/7f71a1040d9c03f72d082e329ccaf2c4a3c060a6/src/hotspot/cpu/x86/methodHandles_x86.cpp#L310-L314)) but not `sender_sp` (`InterpreterMacroAssembler ::_bcp_register`). >> >> I think we're talking about a case where an interpreted caller calls an interpreted callee through a MethodHandle. 
If my understanding is correct (looking at `frame::sender_for_interpreter_frame`), in that case there are 2 sender sps, one that is computed based on the frame pointer of the callee. This just adds 2 words (skipping the return address), and the other is the sp which is saved into `r13` in `InterpreterMacroAssembler::prepare_to_jump_from_interpreted`, and before e.g. a c2i adapter extends the stack [here](https://github.com/openjdk/jdk/blob/50dcc2aec5b16c0826e27d58e49a7f55a5f5ad38/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L631) Which then seems to be saved in the interpreted callee's frame as well (I'm not super familiar with interpreter frame layouts though...). The latter being what ends up as `unextended_sp`. >> >> I think @reinrich's conclusion is correct wrt `unextended_sp < sp` in that case. For certain MH linker calls, an additional `MemberName` 'appendix' is passed as the last argument. The MH adapter/trampoline stub pops the appendix argument from the stack for an interpreted caller, and uses it to find out the actual callee. But that means that, the sp saved into `r13` will be one slot too 'large' (small? accounting for one too many arguments). In other words, the MethodHandle adapter for interpreted callers 'consumes' the appendix argument, but doesn't adjust `r13`. I think this could be done though, i.e. just add: >> >> >> __ addptr(r13, Interpreter::stackElementSize); // adjust unexteded_sp for callee, for proper stack walking >> >> >> In the linked code in `MethodHandles::generate_method_handle_interpreter_entry` to make this work. >> >> (note that we can not just leave the appendix on the stack, since the callee doesn't expect it, so we would break argument oop GC if we left it there) >> >> If that works, I think doing that would be preferable to adjusting `safe_for_sender`. That seems like it would uphold the `sp <= unextended_sp` invariant as well (maybe an assert to that effect in the constructor of `frame` wouldn't be a bad idea as well). 
>> >> It seems like PPC is not modifying the stack in this case (?): >> >> >> Register R19_member = R19_method; // MemberName ptr; incoming method ptr is dead now >> __ ld(R19_member, RegisterOrConstant((intptr_t)8), R15_argbase); >> __ addi(R15_argbase, R15_argbase, Interpreter::stackElementSize); >> generate_method_handle_dispatch(_masm, iid, tmp_recv, R19_member, not_for_compiler_entry); > > This is an interesting insight, I'm going to test your fix idea tomorrow. > I think your conclusion is correct wrt `unextended_sp < sp` in that case. For certain MH linker calls, an additional `MemberName` 'appendix' is passed as the last argument. The MH adapter/trampoline stub pops the appendix argument from the stack for an interpreted caller, and uses it to find out the actual callee. But that means that, the sp saved into `r13` will be one slot too 'large' (small? accounting for one too many arguments). In other words, the MethodHandle adapter for interpreted callers 'consumes' the appendix argument, but doesn't adjust `r13`. I think this could be done though, i.e. just add: > > ``` > __ addptr(r13, Interpreter::stackElementSize); // adjust unexteded_sp for callee, for proper stack walking > ``` I don't think you can do that. Modifying the unextended_sp would break accesses to compiled frames (gc, locals). The original unextended_sp is also needed to undo the c2i extension. > In the linked code in `MethodHandles::generate_method_handle_interpreter_entry` to make this work. > > (note that we can not just leave the appendix on the stack, since the callee doesn't expect it, so we would break argument oop GC if we left it there) > > If that works, I think doing that would be preferable to adjusting `safe_for_sender`. That seems like it would uphold the `sp <= unextended_sp` invariant as well (maybe an assert to that effect in the constructor of `frame` wouldn't be a bad idea as well). 
> > It seems like PPC is not modifying the stack in this case (?): > > ``` > Register R19_member = R19_method; // MemberName ptr; incoming method ptr is dead now > __ ld(R19_member, RegisterOrConstant((intptr_t)8), R15_argbase); > __ addi(R15_argbase, R15_argbase, Interpreter::stackElementSize); > generate_method_handle_dispatch(_masm, iid, tmp_recv, R19_member, not_for_compiler_entry); > ``` Only the expression stack pointer (R15) is modified but not the sp (R1). unextended_sp < sp is possible on PPC because the expression stack is allocated with its maximum size and it's trimmed to the used part when making a call. The sender sp is the original sp. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From jvernee at openjdk.org Wed Feb 15 21:15:03 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 15 Feb 2023 21:15:03 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 21:05:07 GMT, Richard Reingruber wrote: > I don't think you can do that. Modifying the unextended_sp would break accesses to compiled frames (gc, locals). This is for interpreter entries, so the caller is definitely interpreted. If the callee is compiled, it shouldn't care about the contents of `r13` right? And if the callee is interpreted, it will get the correct `r13`. > The original unextended_sp is also needed to undo the c2i extension. Can you explain this? We shouldn't have a c2i adapters in this case. since the caller is interpreted. 
------------- PR: https://git.openjdk.org/jdk/pull/12535 From rkennke at openjdk.org Wed Feb 15 21:15:54 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 15 Feb 2023 21:15:54 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v12] In-Reply-To: References: Message-ID: > This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overloading of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). > > What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. > > This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typically remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. 
More complex queries like 'which thread owns X?' are performed only in paths that are not performance-critical (usually in code like JVMTI or deadlock detection), where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. > > The lock-stack is grown when needed. This means that we need to check for potential overflow before attempting locking. When overflow is possible, locking fast-paths would call into the runtime to grow the stack and handle the locking. Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots of) such checks at locking sites. > > In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When a thread locks recursively, the fast-lock gets inflated to a full monitor. It is not clear whether it is worth adding support for recursive fast-locking. > > One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, thus handing over to the contending thread. > > As an alternative, I considered removing stack-locking altogether, and only using heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. 
The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it. > > This change makes it possible to simplify (and speed up!) a lot of code: > > - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header. > - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in monitor deflation protocol. This would be implemented in a separate PR > > > Testing: > - [x] tier1 x86_64 x aarch64 x +UseFastLocking > - [x] tier2 x86_64 x aarch64 x +UseFastLocking > - [x] tier3 x86_64 x aarch64 x +UseFastLocking > - [x] tier4 x86_64 x aarch64 x +UseFastLocking > - [x] tier1 x86_64 x aarch64 x -UseFastLocking > - [x] tier2 x86_64 x aarch64 x -UseFastLocking > - [x] tier3 x86_64 x aarch64 x -UseFastLocking > - [x] tier4 x86_64 x aarch64 x -UseFastLocking > - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far > > ### Performance > > #### Renaissance
>
> Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 difference | aarch64 stack-locking | aarch64 fast-locking | aarch64 difference
> -- | -- | -- | -- | -- | -- | --
> AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69%
> Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21%
> Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59%
> ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15%
> GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11%
> LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97%
> MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01%
> NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54%
> PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11%
> FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40%
> FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12%
> ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34%
> Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40%
> RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74%
> Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24%
> ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58%
> ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53%
> ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37%
> Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97%
> FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39%
> FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04%

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix merge error (move done label into correct places) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/10907/files - new: https://git.openjdk.org/jdk/pull/10907/files/8bdeda39..ed611b0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10907&range=10-11 Stats: 12 lines in 2 files changed: 6 ins; 5 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/10907.diff Fetch: git fetch https://git.openjdk.org/jdk pull/10907/head:pull/10907 PR: https://git.openjdk.org/jdk/pull/10907 From rrich at openjdk.org Wed Feb 15 21:25:27 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 15 Feb 2023 21:25:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 21:12:19 GMT, Jorn Vernee wrote: >>> I think your conclusion is correct wrt `unextended_sp < sp` in that case. For certain MH linker calls, an additional `MemberName` 'appendix' is passed as the last argument. The MH adapter/trampoline stub pops the appendix argument from the stack for an interpreted caller, and uses it to find out the actual callee. But that means that, the sp saved into `r13` will be one slot too 'large' (small? accounting for one too many arguments). In other words, the MethodHandle adapter for interpreted callers 'consumes' the appendix argument, but doesn't adjust `r13`. I think this could be done though, i.e. 
just add: >>> >>> ``` >>> __ addptr(r13, Interpreter::stackElementSize); // adjust unexteded_sp for callee, for proper stack walking >>> ``` >> >> I don't think you can do that. Modifying the unextended_sp would break accesses to compiled frames (gc, locals). The original unextended_sp is also needed to undo the c2i extension. >> >>> In the linked code in `MethodHandles::generate_method_handle_interpreter_entry` to make this work. >>> >>> (note that we can not just leave the appendix on the stack, since the callee doesn't expect it, so we would break argument oop GC if we left it there) >>> >>> If that works, I think doing that would be preferable to adjusting `safe_for_sender`. That seems like it would uphold the `sp <= unextended_sp` invariant as well (maybe an assert to that effect in the constructor of `frame` wouldn't be a bad idea as well). >>> >>> It seems like PPC is not modifying the stack in this case (?): >>> >>> ``` >>> Register R19_member = R19_method; // MemberName ptr; incoming method ptr is dead now >>> __ ld(R19_member, RegisterOrConstant((intptr_t)8), R15_argbase); >>> __ addi(R15_argbase, R15_argbase, Interpreter::stackElementSize); >>> generate_method_handle_dispatch(_masm, iid, tmp_recv, R19_member, not_for_compiler_entry); >>> ``` >> >> Only the expression stack pointer (R15) is modified but not the sp (R1). >> >> unextended_sp < sp is possible on PPC because the expression stack is allocated with its maximum size and it's trimmed to the used part when making a call. The sender sp is the original sp. > >> I don't think you can do that. Modifying the unextended_sp would break accesses to compiled frames (gc, locals). > > This is for interpreter entries, so the caller is definitely interpreted. If the callee is compiled, it shouldn't care about the contents of `r13` right? And if the callee is interpreted, it will get the correct `r13`. > >> The original unextended_sp is also needed to undo the c2i extension. > > Can you explain this? 
We shouldn't have a c2i adapter in this case, since the caller is interpreted. ~If the callee is compiled, we will go through a c2i adapter for that, but in that case the adapter overwrites `r13` again.~ It's been too long since I was working on low-level MH stuff. If an ordinary interpreter entry is used for a call this means the callee will be interpreted. The caller is not necessarily interpreted. It can be compiled. Is this different here? ------------- PR: https://git.openjdk.org/jdk/pull/12535 From jvernee at openjdk.org Wed Feb 15 21:43:17 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 15 Feb 2023 21:43:17 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 21:22:48 GMT, Richard Reingruber wrote: > If an ordinary interpreter entry is used for a call this means the callee will be interpreted Maybe we're not talking about the same thing? `Method` has a `from_interpreted_entry` and a `from_compiled_entry`, but which one is used depends on the caller, and which calling convention they are using. The callee can be either interpreted or compiled (and if there's a mismatch, we go through an adapter). The entry generated by `generate_method_handle_interpreter_entry` is essentially the `from_interpreted_entry`, AFAIU. 
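To make the entry-point selection described above a bit more concrete, here is a toy model in plain C++. It is only a sketch of the statement as written: the `Mode` enum, the `call_path` function, and the returned strings are made up for illustration and are not HotSpot code, and the model covers ordinary methods only, not the MH linker special cases discussed in this thread.

```cpp
#include <cassert>  // only needed for the checks in the usage note
#include <string>

// Hypothetical model; none of these names exist in HotSpot.
enum class Mode { Interpreted, Compiled };

// Describes the path a call takes in this toy model: the caller picks the
// entry that matches its own calling convention, and an adapter is inserted
// only when the callee's mode differs from the caller's.
std::string call_path(Mode caller, Mode callee) {
  std::string path = (caller == Mode::Interpreted) ? "from_interpreted_entry"
                                                   : "from_compiled_entry";
  if (caller != callee) {
    path += (caller == Mode::Interpreted) ? " -> i2c adapter"
                                          : " -> c2i adapter";
  }
  return path;
}
```

For example, `assert(call_path(Mode::Compiled, Mode::Interpreted) == "from_compiled_entry -> c2i adapter");` holds in this model, which is the mismatch case where an adapter sits between caller and callee.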
------------- PR: https://git.openjdk.org/jdk/pull/12535 From coleenp at openjdk.org Wed Feb 15 22:56:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 15 Feb 2023 22:56:15 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: Message-ID: <3m1Da1W_zA5LWWIjpPkkyaR9ng8hEc3jrK8P_dmjVGk=.e38fb70b-7e0f-480b-a412-1d0aac523c9b@github.com> On Wed, 15 Feb 2023 20:30:30 GMT, Calvin Cheung wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. > > Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: > > remove unused allocate_symbol() declaration in symbolTable.hpp This changes look great! thank you! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12562 From rrich at openjdk.org Wed Feb 15 23:23:10 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 15 Feb 2023 23:23:10 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 21:39:52 GMT, Jorn Vernee wrote: > The entry generated by generate_method_handle_interpreter_entry is essentially the from_interpreted_entry, AFAIU. But it is also stored in `Method::_i2i_entry` which is [used by c2i adapters](https://github.com/openjdk/jdk/blob/3ba156082b73c4a8e9d890a57a42fb68df2bf98f/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L748-L750). 
(Maybe too) long version: The interpreter has entries depending on the `MethodKind`. See `AbstractInterpreter::entry_for_kind()`. The entries are stored in `AbstractInterpreter::_entry_table`. In `MethodHandlesAdapterGenerator::generate()` the interpreter entries for method handle intrinsics MethodKinds are generated using the generator `MethodHandles::generate_method_handle_interpreter_entry()`. In `Method::link_method()` the entry points are set depending on the `MethodKind`. The interpreter entry (see above) for the method's `MethodKind` is looked up and stored in `Method::_i2i_entry` which is also [used by c2i adapters](https://github.com/openjdk/jdk/blob/3ba156082b73c4a8e9d890a57a42fb68df2bf98f/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#L748-L750). So to me it looks as if the entry generated by `generate_method_handle_interpreter_entry()` can be reached coming from a compiled caller. The generator for ordinary methods is `TemplateInterpreterGenerator::generate_normal_entry()`. The entries it generates are surely reached by compiled callers. I might be missing something in the case of MH intrinsics but right now it looks to me as if the caller can be compiled too. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From iklam at openjdk.org Wed Feb 15 23:40:31 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 15 Feb 2023 23:40:31 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 20:30:30 GMT, Calvin Cheung wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. 
> > Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: > > remove unused allocate_symbol() declaration in symbolTable.hpp src/hotspot/share/classfile/symbolTable.cpp line 161: > 159: FreeHeap(memory); > 160: SymbolTable::item_removed(); > 161: } We don't call afree() anymore if the symbol is permanent. Is this intentional? Actually the old code has a bug. A non-permanent symbol allocated from the C heap can overflow the refcount and become permanent. If it needs to be freed, we cannot call afree. Anyway, how is it possible for a permanent symbol to be freed here? Maybe this should be changed to an assert? ------------- PR: https://git.openjdk.org/jdk/pull/12562 From coleenp at openjdk.org Thu Feb 16 00:28:43 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 16 Feb 2023 00:28:43 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 23:37:15 GMT, Ioi Lam wrote: >> Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: >> >> remove unused allocate_symbol() declaration in symbolTable.hpp > > src/hotspot/share/classfile/symbolTable.cpp line 161: > >> 159: FreeHeap(memory); >> 160: SymbolTable::item_removed(); >> 161: } > > We don't call afree() anymore if the symbol is permanent. Is this intentional? > > Actually the old code has a bug. A non-permanent symbol allocated from the C heap can overflow the refcount and become permanent. If it needs to be freed, we cannot call afree. > > Anyway, how is it possible for a permanent symbol to be freed here? Maybe this should be changed to an assert? Yes, it seemed dubious if we ever freed the entry from the symbol arena, so I think it's ok. It would only free the Symbol if it was the last one allocated and there was a race when inserting the same permanent symbol in the table. 
You're right there would be a bug if we tried to free a CHeap allocated symbol from the arena if the refcount became permanent. I'm not sure what would happen in that case. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From coleenp at openjdk.org Thu Feb 16 00:28:45 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 16 Feb 2023 00:28:45 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: Message-ID: <9GcPkoPBeh5mcKE9kgn6um-5eDO9aABZVMoeOO5CNk0=.91bf2883-a84a-48b6-b3ab-91e7b22d4555@github.com> On Wed, 15 Feb 2023 23:48:12 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/symbolTable.cpp line 161: >> >>> 159: FreeHeap(memory); >>> 160: SymbolTable::item_removed(); >>> 161: } >> >> We don't call afree() anymore if the symbol is permanent. Is this intentional? >> >> Actually the old code has a bug. A non-permanent symbol allocated from the C heap can overflow the refcount and become permanent. If it needs to be freed, we cannot call afree. >> >> Anyway, how is it possible for a permanent symbol to be freed here? Maybe this should be changed to an assert? > > Yes, it seemed dubious if we ever freed the entry from the symbol arena, so I think it's ok. It would only free the Symbol if it was the last one allocated and there was a race when inserting the same permanent symbol in the table. > > You're right there would be a bug if we tried to free a CHeap allocated symbol from the arena if the refcount became permanent. I'm not sure what would happen in that case. Oh but if it overflows the refcount to become permanent, the permanence is sticky. We won't free it, so it's not a bug. 
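As an illustration of the "sticky" point (a sketch only, not HotSpot's `Symbol` implementation): assume a hypothetical 16-bit counter whose maximum value means "permanent". The class name `StickyRefCount`, the sentinel `kPermanent`, and the single-threaded updates are all assumptions made for the example; a real VM would need atomic updates.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a symbol's reference count; not HotSpot's Symbol.
class StickyRefCount {
 public:
  // Assumed sentinel: the maximum value marks the count as permanent.
  static constexpr uint16_t kPermanent = UINT16_MAX;

  explicit StickyRefCount(uint16_t initial = 1) : _count(initial) {}

  bool is_permanent() const { return _count == kPermanent; }

  // Saturates at kPermanent instead of wrapping around to zero.
  void increment() {
    if (_count != kPermanent) {
      ++_count;
    }
  }

  // Returns true when the count dropped to zero, i.e. the caller may free.
  // A permanent count is never decremented, so it can never reach zero.
  bool decrement() {
    if (_count == kPermanent) {
      return false;
    }
    assert(_count > 0 && "decrement of a dead refcount");
    return --_count == 0;
  }

 private:
  uint16_t _count;
};
```

Once `increment()` has pushed the count to the sentinel, every later `decrement()` is a no-op that reports "do not free", which is the behaviour the argument above relies on.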
------------- PR: https://git.openjdk.org/jdk/pull/12562 From jvernee at openjdk.org Thu Feb 16 00:44:01 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 16 Feb 2023 00:44:01 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Wed, 15 Feb 2023 23:20:00 GMT, Richard Reingruber wrote: > (Maybe too) long version: Not at all :) I don't know much about the interpreter, so this is pretty helpful. Ok. I think the main point is that a MH linker method doesn't have a c2i adapter, it instead has separate versions for interpreted and compiled callers altogether (except `linkToNative` which doesn't have an interpreter version atm) Longer version: I think there are 4 theoretical cases (are there more?): 1. interpreted -> MH interpreter entry -> interpreted (the case we want to fix, I think) 2. interpreted -> MH interpreter entry -> i2c -> compiled 3. compiled -> c2i -> MH interpreter entry -> interpreted 4. compiled -> c2i -> MH interpreter entry -> i2c -> compiled So... my understanding is that a c2i adapter is used when a callee is interpreted, so its `from_compiled_entry` points to the c2i adapter. A compiled caller calls the `from_compiled_entry`, so it ends up going through the `c2i` adapter of an interpreted callee. However, for method handle linkers, the `from_compiled_entry` never points to a c2i adapter, but to the MH linker compiler entry generated in `gen_special_dispatch` in e.g `sharedRuntime_x86_64`. So, in other words, the 3. and 4. scenarios above are not possible, AFAICS. An MH linker's interpreted/compiled entries simply replace the regular `i2c` and `c2i` adapters for a MH linker method. 
Though, the actual target method can be either compiled or interpreted, so we can go into either a `c2i` adapter from a MH linker compiler entry, or a `i2c` adapter from a MH linker interpreter entry (which is scenario 2.). For scenario 1.: caller sets `r13` before jumping to callee (MH linker), we end up in the MH adapter, pop the last argument, and adjust `r13` for the actual callee. The MH interpreter entry uses `from_interpreted_entry` of the target method, so we don't go through any adapter. For scenario 2.: mostly the same as 1., but we go through an `i2c` adapter, and the compiled callee doesn't care about what's in `r13` anyway. (There's a comment in the i2c adapter code about preserving `r13` for the case that we might go back into a c2i adapter if we lose a race where the callee is made non-entrant, but `r13` is also used as a scratch register later in the i2c code, and the c2i code overwrites `r13` anyway. So, I think that is an outdated comment...) ------------- PR: https://git.openjdk.org/jdk/pull/12535 From iklam at openjdk.org Thu Feb 16 01:29:13 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 01:29:13 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: <9GcPkoPBeh5mcKE9kgn6um-5eDO9aABZVMoeOO5CNk0=.91bf2883-a84a-48b6-b3ab-91e7b22d4555@github.com> References: <9GcPkoPBeh5mcKE9kgn6um-5eDO9aABZVMoeOO5CNk0=.91bf2883-a84a-48b6-b3ab-91e7b22d4555@github.com> Message-ID: On Wed, 15 Feb 2023 23:51:26 GMT, Coleen Phillimore wrote: >> Yes, it seemed dubious if we ever freed the entry from the symbol arena, so I think it's ok. It would only free the Symbol if it was the last one allocated and there was a race when inserting the same permanent symbol in the table. >> >> You're right there would be a bug if we tried to free a CHeap allocated symbol from the arena if the refcount became permanent. I'm not sure what would happen in that case. 
> > Oh but if it overflows the refcount to become permanent, the permanence is sticky. We won't free it, so it's not a bug. From the comment above this code, it seems we do free permanent symbols, but only when the current thread lost the race to insert a symbol for the boot loader. So the old code is correct, and the new code would have a leak. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From duke at openjdk.org Thu Feb 16 02:33:47 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 16 Feb 2023 02:33:47 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 09:51:45 GMT, Volker Simonis wrote: > Why does `UseCryptoPmullForCRC32` depend on `VM_Version::supports_sha3()`? The implementation uses [eor3](https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/EOR3--Three-way-Exclusive-OR-) as well, and it is only available when SHA3 is implemented. > Does `VM_Version::supports_pmull()` really check for the `pmull` in the cryptographic extension? supports_pmull() returns true when CPU_PMULL/HWCAP_PMULL of VM_Version::_features is [enabled](https://github.com/openjdk/jdk/blob/3ba156082b73c4a8e9d890a57a42fb68df2bf98f/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp#L121). HWCAP_PMULL refers to the [functionality](https://docs.kernel.org/arm64/elf_hwcaps.html) implied by ID_AA64ISAR0_EL1.AES == 0b0010, i.e. [PMULL/PMULL2](https://developer.arm.com/documentation/ddi0595/2021-03/AArch64-Registers/ID-AA64ISAR0-EL1--AArch64-Instruction-Set-Attribute-Register-0) instructions operating on 64-bit data quantities. 
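The gating logic in the two answers above can be sketched outside the VM as follows. This is a minimal sketch under stated assumptions: the bit positions are meant to mirror `HWCAP_PMULL` and `HWCAP_SHA3` from the Linux arm64 `<asm/hwcap.h>` header and should be checked against the header on the target system, and the helper name `can_use_crypto_pmull_for_crc32` is hypothetical, not a HotSpot function.

```cpp
#include <cassert>  // only needed for the checks in the usage note
#include <cstdint>

// Assumed to match HWCAP_PMULL and HWCAP_SHA3 in Linux's arm64 <asm/hwcap.h>.
constexpr uint64_t kHwcapPmull = UINT64_C(1) << 4;
constexpr uint64_t kHwcapSha3  = UINT64_C(1) << 17;

// Hypothetical helper: the pmull-based CRC32 path needs both the 64-bit
// PMULL/PMULL2 instructions and EOR3, which is only available with SHA3.
constexpr bool can_use_crypto_pmull_for_crc32(uint64_t hwcap) {
  return (hwcap & kHwcapPmull) != 0 && (hwcap & kHwcapSha3) != 0;
}
```

On Linux/AArch64 the mask would typically come from `getauxval(AT_HWCAP)` in `<sys/auxv.h>`; for instance `assert(!can_use_crypto_pmull_for_crc32(kHwcapPmull));` shows that PMULL alone is not sufficient in this sketch.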
------------- PR: https://git.openjdk.org/jdk/pull/12480 From iklam at openjdk.org Thu Feb 16 02:42:46 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 02:42:46 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v4] In-Reply-To: References: Message-ID: > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. > > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into 8224980-FLAG_SET_ERGO-ignores-invalid-values - @dholmes-ora suggestion - @dholmes-ora comment - make sure errors are printed for FLAG_SET_ERGO - 8224980: FLAG_SET_ERGO silently ignores invalid values ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12549/files - new: https://git.openjdk.org/jdk/pull/12549/files/8cf4b6e3..cfc685f4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12549&range=02-03 Stats: 7004 lines in 229 files changed: 3490 ins; 2392 del; 1122 mod Patch: https://git.openjdk.org/jdk/pull/12549.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12549/head:pull/12549 PR: https://git.openjdk.org/jdk/pull/12549 From iklam at openjdk.org Thu Feb 16 03:58:43 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 03:58:43 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v3] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 06:20:48 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora 
suggestion > > Looks good. > > Thanks Thanks @dholmes-ora @veresov @TobiHartmann for the review ------------- PR: https://git.openjdk.org/jdk/pull/12549 From iklam at openjdk.org Thu Feb 16 03:58:47 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 03:58:47 GMT Subject: Integrated: 8224980: FLAG_SET_ERGO silently ignores invalid values In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 04:03:36 GMT, Ioi Lam wrote: > Clarify the `FLAG_SET_ERGO` API - the caller must ensure that a valid value is passed. Added an assert to check this. > > Fixed two issues found by the assert: > > - `NonNMethodCodeHeapSize` > - `XXXThreshold` and `XXXNotifyFreqLog` flags in `CompilerConfig::set_compilation_policy_flags()` This pull request has now been integrated. Changeset: 573c316c Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/573c316c5764ccd8d483f1f187fd6eb21ceeea63 Stats: 42 lines in 5 files changed: 23 ins; 1 del; 18 mod 8224980: FLAG_SET_ERGO silently ignores invalid values Reviewed-by: iveresov, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/12549 From dholmes at openjdk.org Thu Feb 16 04:23:17 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 16 Feb 2023 04:23:17 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v2] In-Reply-To: References: Message-ID: > This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. > > I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. > > Thanks. David Holmes has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Merge branch 'master' into 8280419-verifyThread - 8280419: Remove dead code related to VerifyThread and verify_thread() ------------- Changes: https://git.openjdk.org/jdk/pull/12506/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12506&range=01 Stats: 79 lines in 18 files changed: 0 ins; 62 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/12506.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12506/head:pull/12506 PR: https://git.openjdk.org/jdk/pull/12506 From mdoerr at openjdk.org Thu Feb 16 06:02:28 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 16 Feb 2023 06:02:28 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false [v2] In-Reply-To: References: Message-ID: <5AzezQ8wN3nMxXccPmkQCEGjPHomkmXFOqjfXoMKWv8=.87f1eb49-25a9-4945-b6d4-4ae148205b0d@github.com> On Wed, 15 Feb 2023 20:30:00 GMT, Richard Reingruber wrote: >> This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. >> >> A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. >> >> Before this BasicExt.java was refactored for better argument processing and representation of the test modes. >> Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. 
>> >> Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. > > Richard Reingruber has updated the pull request incrementally with three additional commits since the last revision: > > - Improve comment > - Improve comment > - Spelling Thanks! ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.org/jdk/pull/12557 From duke at openjdk.org Thu Feb 16 06:13:03 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 16 Feb 2023 06:13:03 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v3] In-Reply-To: References: Message-ID: > The pmull and pmull2 instructions support operating on 64-bit data in the Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using the crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32.java) is measured on an EC2 c7g instance. The optimization shows an 11 - 99% improvement when the input is at least 384 bytes.
>
> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>
> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>
> Baseline
>
> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>
> Crypto pmull
>
> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms
Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Make UseCryptoPmullForCRC32 independent of UseCRC32 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12480/files - new: https://git.openjdk.org/jdk/pull/12480/files/f15ac5e9..d92625aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12480&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12480&range=01-02 Stats: 14 lines in 4 files changed: 5 ins; 5 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12480.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12480/head:pull/12480 PR: https://git.openjdk.org/jdk/pull/12480 From duke at openjdk.org Thu Feb 16 06:15:28 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 16 Feb 2023 06:15:28 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 09:48:54 GMT, Volker Simonis wrote: >> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: >> >> Change to UseCryptoPmullForCRC32 > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 3777: > >> 3775: >> 3776: if (UseCRC32) { >> 3777: if (UseCryptoPmullForCRC32) { > > Why is this inside `UseCRC32`? It means that if we set `-XX:-UseCRC32 -XX:+UseCryptoPmullForCRC32` on the command line, we won't get `UseCryptoPmullForCRC32`. The crc32 instructions are used to process the remaining input that is not a multiple of 128 bytes. The dependency between these two options has been removed. ------------- PR: https://git.openjdk.org/jdk/pull/12480 From duke at openjdk.org Thu Feb 16 06:25:28 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 16 Feb 2023 06:25:28 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v3] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 10:18:33 GMT, Andrew Haley wrote: >> Changed to UseCryptoPmullForCRC32 as there is another pmull version.
> >> Changed to UseCryptoPmullForCRC32 as there is another pmull version. > > Can you give me a reference for that? I can only find one definition of pmull. There will be two pmull implementations: 1) `-XX:+UseCryptoPmullForCRC32` to use this 64x64 pmull implementation 2) `-XX:-UseCryptoPmullForCRC32 -XX:-UseCRC32 -XX:+UseNeon` to use the [8x8 pmull implementation](https://github.com/openjdk/jdk/blob/1ef9f6507ba45419f0fa896915eec064762c5153/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L3753). ------------- PR: https://git.openjdk.org/jdk/pull/12480 From sspitsyn at openjdk.org Thu Feb 16 07:54:28 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 16 Feb 2023 07:54:28 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Wed, 15 Feb 2023 20:53:23 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/prims/jvmtiEnv.cpp line 1045: >> >>> 1043: JvmtiEnvBase::is_vthread_alive(vt_oop) && >>> 1044: !JvmtiVTSuspender::is_vthread_suspended(vt_oop)) || >>> 1045: java_thread->is_bound_vthread()) && >> >> This condition does not look correct. >> I'd expect it to be: >> ``` >> ((java_lang_VirtualThread::is_instance(vt_oop) && >> JvmtiEnvBase::is_vthread_alive(vt_oop)) || >> java_thread->is_bound_vthread()) && >> !JvmtiVTSuspender::is_vthread_suspended(vt_oop) && >> ``` >> >> It is important to check for `!JvmtiVTSuspender::is_vthread_suspended(vt_oop)` for `bound_vthread` as well. >> The same issue is in the `JvmtiEnv::ResumeAllVirtualThreads`. > > The thing is that JvmtiVTSuspender::is_vthread_suspended() checks _suspended_list/not_suspended_list which are not set with single suspend/resume functions (SuspendThread/SuspendThreadList/ResumeThread/ResumeThreadList) on a BoundVirtualThread. For single suspend/resume on a BoundVirtualThread we go through the same path as for platform threads.
In any case, the suspend_thread() call will just return JVMTI_ERROR_THREAD_SUSPENDED if a thread is already suspended, but if you want to avoid the call altogether I can add the equivalent check which would be !java_thread->is_suspended(). > > Now, I realized that although single-suspends will not change _suspended_list/not_suspended_list for a BoundVirtualThread, SuspendAllVirtualThreads/ResumeAllVirtualThreads can if a BoundVirtualThread is part of the exception list. In the SuspendAllVirtualThreads case for example, when restoring the resume state, the call to JvmtiVTSuspender::is_vthread_suspended() would return true (we are already in SR_all mode and the BoundVirtualThread will not be in _not_suspended_list), meaning the BoundVirtualThread will be added to the not_suspended_list. > So we have to decide if we want to keep track of BoundVirtualThreads on these lists or not. Given that I see JvmtiVTSuspender::is_vthread_suspended() is always called on a VirtualThread the easiest thing would be to not track them, and just add a check in SuspendAllVirtualThreads/ResumeAllVirtualThreads where the elist is populated so only VirtualThreads are added. What do you think? The `SuspendAllVirtualThreads` spec has this statement: "Suspend all virtual threads except those in the exception list. Virtual threads that are currently suspended do not change state." And the `ResumeAllVirtualThreads` spec has this statement: "Resume all virtual threads except those in the exception list. Virtual threads that are currently resumed do not change state." So `SuspendAllVirtualThreads` should do nothing with the already suspended virtual threads, and `ResumeAllVirtualThreads` should do nothing with the already resumed virtual threads. This does not depend on the except_list. It is better to avoid suspending already suspended threads and resuming already resumed ones. So, at least a check for `!java_thread->is_suspended()` is needed.
Did you observe any JVMTI or JDI suspend/resume tests failing? > Now, I realized that although single-suspends will not change _suspended_list/not_suspended_list for a BoundVirtualThread, SuspendAllVirtualThreads/ResumeAllVirtualThreads can if a BoundVirtualThread is part of the exception list. In the SuspendAllVirtualThreads case for example, when restoring the resume state, the call to JvmtiVTSuspender::is_vthread_suspended() would return true (we are already in SR_all mode and the BoundVirtualThread will not be in _not_suspended_list), meaning the BoundVirtualThread will be added to the not_suspended_list. > So we have to decide if we want to keep track of BoundVirtualThreads on these lists or not. Given that I see JvmtiVTSuspender::is_vthread_suspended() is always called on a VirtualThread the easiest thing would be to not track them, and just add a check in SuspendAllVirtualThreads/ResumeAllVirtualThreads where the elist is populated so only VirtualThreads are added. What do you think? I feel that following the VirtualThread implementation's code path would be more consistent, unified and free of surprises. But I also see that following the platform threads code path looks pretty simple, so it can be okay. For this approach, I think the code fragments that collect and restore the resumed/suspended state for virtual threads need to be bypassed by the BoundVirtualThreads implementation. Please let me know if it does not work for any reason. We will also have to rely on the testing here.
------------- PR: https://git.openjdk.org/jdk/pull/12512 From stuefe at openjdk.org Thu Feb 16 07:58:28 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 16 Feb 2023 07:58:28 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v2] In-Reply-To: References: Message-ID: <5LeM-ceP-wJRKbqN49Me5WuHImOlIWwj6v-pUuM62DY=.a3d23351-423f-4269-a376-73fe1e9221f7@github.com> On Thu, 16 Feb 2023 04:23:17 GMT, David Holmes wrote: >> This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. >> >> I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. >> >> Thanks. > > David Holmes has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - Merge branch 'master' into 8280419-verifyThread > - 8280419: Remove dead code related to VerifyThread and verify_thread() Marked as reviewed by stuefe (Reviewer). Changes look good to me. GHAs are not there though. I tested arm manually and it runs fine, rejecting VerifyThread as expected. Maybe @RealLucy can eyeball the IBM platform changes. ------------- PR: https://git.openjdk.org/jdk/pull/12506 From lucy at openjdk.org Thu Feb 16 08:15:29 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 16 Feb 2023 08:15:29 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 04:23:17 GMT, David Holmes wrote: >> This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. >> >> I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. >> >> Thanks. > > David Holmes has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: > > - Merge branch 'master' into 8280419-verifyThread > - 8280419: Remove dead code related to VerifyThread and verify_thread() Changes for ppc and s390 look good to me. There is one copyright typo you may want to fix. src/hotspot/cpu/s390/interp_masm_s390.cpp line 2: > 1: /* > 2: * Copyright (c) 2016, 20223, Oracle and/or its affiliates. All rights reserved. The copyright year is a bit far in the future. ------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.org/jdk/pull/12506 From mbaesken at openjdk.org Thu Feb 16 08:30:32 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 16 Feb 2023 08:30:32 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v2] In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 12:26:57 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > 8299817: Update copyright Some small typos found: nesessary -> necessary; sace extended SP -> save extended SP. Why do you clear memory only in the ASSERT case here?
src/hotspot/cpu/s390/stubGenerator_s390.cpp #ifdef ASSERT __ clear_mem(Address(Z_SP, (intptr_t)8), resize_len - 8); #endif ------------- PR: https://git.openjdk.org/jdk/pull/11967 From jsjolen at openjdk.org Thu Feb 16 09:07:33 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 16 Feb 2023 09:07:33 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Wed, 15 Feb 2023 06:54:52 GMT, Thomas Stuefe wrote: >> Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). >> >> >> ### Background >> >> Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. >> >> To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. >> >> Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). This would allow us to test resilience in the face of native OOMs, among other things. >> >> ### Patch >> >> - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: >> >> >> >> Global form: >> -XX:MallocLimit=[:] >> Category-specific form: >> -XX:MallocLimit=:[:][,:[:] ...] 
>> Examples: >> -XX:MallocLimit=3g >> -XX:MallocLimit=3g:oom >> -XX:MallocLimit=compiler:200m:oom >> >> >> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. >> >> - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. >> >> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` >> >> - adds new gtests and new jtreg tests >> >> - removes a bunch of jtreg tests which are now better served by the new gtests. >> >> - gives us more precise error messages upon reaching a limit. For example: >> >> before: >> >> # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) >> >> >> now: >> >> # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom > - nullptr > - merge and fix conflicts > - Merge; fix conflicts > - Feedback Gerard > - Revert strchrnul > - Copyrights > - Feedback Johan > - Merge > - Merge branch 'master' into JDK-8293313-NMT-fake-oom > - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd Re-approving! ------------- Marked as reviewed by jsjolen (Committer). 
PR: https://git.openjdk.org/jdk/pull/11371 From rrich at openjdk.org Thu Feb 16 11:03:28 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 16 Feb 2023 11:03:28 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: On Thu, 16 Feb 2023 00:36:03 GMT, Jorn Vernee wrote: > So... my understanding is that a c2i adapter is used when a callee is > interpreted, so its `from_compiled_entry` points to the c2i adapter. A > compiled caller calls the `from_compiled_entry`, so it ends up going through > the `c2i` adapter of an interpreted callee. However, for method handle > linkers, the `from_compiled_entry` never points to a c2i adapter, but to the > MH linker compiler entry generated in `gen_special_dispatch` in e.g > `sharedRuntime_x86_64`. So, in other words, the 3. and 4. scenarios above are > not possible, AFAICS. I see. Thanks! That's what I've been missing. It's implemented in `SystemDictionary::find_method_handle_intrinsic()` I think. I agree that it will be ok to modify r13 as you suggested. Maybe explaining why we never reach here coming from a c2i adapter: because native wrappers are generated eagerly in `SystemDictionary::find_method_handle_intrinsic()` unless -Xint is given but then the caller cannot be compiled. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From lucy at openjdk.org Thu Feb 16 11:51:59 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 16 Feb 2023 11:51:59 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v3] In-Reply-To: References: Message-ID: > This PR addresses an issue in the AES-CTR mode intrinsic on s390. 
When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. > > This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. > > Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. > > **Reviews and comments are very much appreciated.** > > @backwaterred could you please run some "official" s390 tests? Thanks. Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: 829817: fixed typos, removed JIT_TIMER references ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11967/files - new: https://git.openjdk.org/jdk/pull/11967/files/33722f27..ca40de84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11967&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11967&range=01-02 Stats: 11 lines in 1 file changed: 0 ins; 8 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/11967.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11967/head:pull/11967 PR: https://git.openjdk.org/jdk/pull/11967 From mbaesken at openjdk.org Thu Feb 16 11:56:30 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 16 Feb 2023 11:56:30 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v3] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 11:51:59 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand.
>> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > 829817: fixed typos, removed JIT_TIMER references Marked as reviewed by mbaesken (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/11967 From lucy at openjdk.org Thu Feb 16 11:56:31 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 16 Feb 2023 11:56:31 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v2] In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 12:26:57 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > 8299817: Update copyright Typos repaired, JIT_TIMER references removed. The allocated stack space is cleared for debugging purposes only.
Thanks for having a look, @MBaesken ------------- PR: https://git.openjdk.org/jdk/pull/11967 From tholenstein at openjdk.org Thu Feb 16 12:21:41 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Thu, 16 Feb 2023 12:21:41 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> Message-ID: On Wed, 15 Feb 2023 06:28:57 GMT, Tobias Hartmann wrote: >> Okay - whether this should produce warnings is a distinct issue for compiler folk to decide. > > I think it's fine to cap the values of the individual thresholds at their maximum without a warning, when `CompileThresholdScaling` is used. > > Similar issues were already discussed in https://github.com/openjdk/jdk/pull/8501. > >> I suggest to address this issue in a separate RFE. > > @tobiasholenstein did you file that separate RFE? > I think it's fine to cap the values of the individual thresholds at their maximum without a warning, when `CompileThresholdScaling` is used. > > Similar issues were already discussed in #8501. > > > I suggest to address this issue in a separate RFE. > > @tobiasholenstein did you file that separate RFE? No I did not file an RFE back then. But it seems like this PR resolves the issue ------------- PR: https://git.openjdk.org/jdk/pull/12549 From simonis at openjdk.org Thu Feb 16 12:37:30 2023 From: simonis at openjdk.org (Volker Simonis) Date: Thu, 16 Feb 2023 12:37:30 GMT Subject: RFR: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 [v3] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 06:13:03 GMT, Yi-Fan Tsai wrote: >> Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form raises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. 
The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using the crc32 instruction. >> >> The following test has passed. >> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java >> >> The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32.java) is measured on an EC2 c7g instance. The optimization shows an 11 - 99% improvement when the input is at least 384 bytes.
>>
>> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
>> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
>> | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>>
>> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
>> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
>> | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>>
>> Baseline
>>
>> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
>> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
>> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
>> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
>> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
>> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
>> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
>> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
>> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
>> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
>> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
>> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
>> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>>
>> Crypto pmull
>>
>> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
>> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
>> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
>> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
>> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
>> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
>> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
>> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
>> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms
>> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
>> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
>> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
>> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms
> > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Make UseCryptoPmullForCRC32 independent of UseCRC32 Looks good now. Thanks for the explanation and for decoupling `UseCryptoPmullForCRC32` and `UseCRC32`. ------------- Marked as reviewed by simonis (Reviewer). PR: https://git.openjdk.org/jdk/pull/12480 From rehn at openjdk.org Thu Feb 16 12:51:24 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 16 Feb 2023 12:51:24 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms Message-ID: Hi all, please consider. The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. All methods would already be marked, but the actual work (deoptimizing, making the nmethods not entrant, patching PCs on stacks, and patching post-call nops) was not yet done, which meant thread 2 could 'pass' thread 1. Most places did deopt under Compile_lock, so this was not an issue there, but WB and clearCallSiteContext do not.
Since a handshake may take a long time to complete and Compile_lock is used for so much more than deopts, the fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake, even when everything was already marked. This turned out to be problematic during startup: for example, the number of deopt handshakes in a jetty dry-run startup went from 5 to 39. This fix first adds a barrier that you do not pass until the requested deopts have happened, and it coalesces the handshakes. Secondly, it moves the handshake part out of the Compile_lock where possible. This means we fix the performance bug and reduce the contention on Compile_lock, giving higher throughput in the compiler and in things such as class loading. It passes t1-t7 with flying colours! t8 has still not completed, and I'm redoing some testing due to last-minute simplifications. Thanks, Robbin ------------- Commit messages: - Fixed WS - Deopt scopes Changes: https://git.openjdk.org/jdk/pull/12585/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8300926 Stats: 380 lines in 24 files changed: 192 ins; 91 del; 97 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From thartmann at openjdk.org Thu Feb 16 12:59:37 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 16 Feb 2023 12:59:37 GMT Subject: RFR: 8224980: FLAG_SET_ERGO silently ignores invalid values [v2] In-Reply-To: References: <8JBp5wYLz4IbjkPlhWUa845BAR4YnRKdhViQ0SfCnGo=.3f34ba5c-ba98-458d-8243-05f5440d1d32@github.com> <6KXZyhQfRrWk8VoZ6zep3O3wruWNdkkCjHVOxGzyly0=.1b8a5a04-4434-4388-b36a-d890fba3a9c6@github.com> Message-ID: On Thu, 16 Feb 2023 12:18:22 GMT, Tobias Holenstein wrote: >> I think it's fine to cap the values of the individual thresholds at their maximum without a warning, when `CompileThresholdScaling` is used.
>> >> Similar issues were already discussed in https://github.com/openjdk/jdk/pull/8501. >> >>> I suggest to address this issue in a separate RFE. >> >> @tobiasholenstein did you file that separate RFE? > >> I think it's fine to cap the values of the individual thresholds at their maximum without a warning, when `CompileThresholdScaling` is used. >> >> Similar issues were already discussed in #8501. >> >> > I suggest to address this issue in a separate RFE. >> >> @tobiasholenstein did you file that separate RFE? > > No I did not file an RFE back then. But it seems like this PR resolves the issue Even better, thanks for verifying! ------------- PR: https://git.openjdk.org/jdk/pull/12549 From stuefe at openjdk.org Thu Feb 16 13:00:37 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 16 Feb 2023 13:00:37 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: References: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> Message-ID: On Tue, 14 Feb 2023 08:57:38 GMT, Stefan Karlsson wrote: > FWIW, I strongly dislike the uppercase MEMFLAGS name. I wouldn't mind this rename at all. Thinking this through again, I somewhat regret my strong opposition. Reeks too much of resisting change for resisting's sake, and we don't want that. While I still think the change is much too big, I never have liked MEMFLAGS myself either. In comments and code, it has been named "nmt category" in places, which I like more. I think simple renaming change MEMFLAGS->xx, with xx being something like "NMTCategory" or "NMTTags" or similar, would be okay. If other refactorings are omitted, such an isolated change could be backported more easily to older releases. 
------------- PR: https://git.openjdk.org/jdk/pull/12454 From lucy at openjdk.org Thu Feb 16 14:02:33 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 16 Feb 2023 14:02:33 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Wed, 15 Feb 2023 06:54:52 GMT, Thomas Stuefe wrote: >> Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). >> >> >> ### Background >> >> Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. >> >> To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. >> >> Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). This would allow us to test resilience in the face of native OOMs, among other things. >> >> ### Patch >> >> - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: >> >> >> >> Global form: >> -XX:MallocLimit=[:] >> Category-specific form: >> -XX:MallocLimit=:[:][,:[:] ...] 
>> Examples: >> -XX:MallocLimit=3g >> -XX:MallocLimit=3g:oom >> -XX:MallocLimit=compiler:200m:oom >> >> >> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. >> >> - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. >> >> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` >> >> - adds new gtests and new jtreg tests >> >> - removes a bunch of jtreg tests which are now better served by the new gtests. >> >> - gives us more precise error messages upon reaching a limit. For example: >> >> before: >> >> # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) >> >> >> now: >> >> # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom > - nullptr > - merge and fix conflicts > - Merge; fix conflicts > - Feedback Gerard > - Revert strchrnul > - Copyrights > - Feedback Johan > - Merge > - Merge branch 'master' into JDK-8293313-NMT-fake-oom > - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd LGTM. Thanks for the work and relentless discussion. ------------- Marked as reviewed by lucy (Reviewer). 
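For readers who want to see how the three example values quoted above (`3g`, `3g:oom`, `compiler:200m:oom`) could be taken apart, here is a minimal C++ sketch. It is not the parser from `mallocLimit.cpp`; the names `LimitSpec`, `parse_size` and `parse_limit_clause` are hypothetical, and the grammar (optional category, size with a k/m/g suffix, optional `oom` mode, a single clause without comma lists) is an assumption inferred only from those examples.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type, not a HotSpot structure.
struct LimitSpec {
  std::string category;      // empty means the global limit
  std::uint64_t bytes = 0;   // limit in bytes
  bool oom_mode = false;     // true: mimic a native OOM; false: fatal error
};

// Parse "<digits>[k|m|g]" into bytes. Returns false on malformed input.
// Assumption: binary units; overflow of very long digit strings is not handled.
static bool parse_size(const std::string& s, std::uint64_t* out) {
  if (s.empty() || !std::isdigit(static_cast<unsigned char>(s[0]))) return false;
  std::size_t pos = 0;
  std::uint64_t value = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    value = value * 10 + static_cast<std::uint64_t>(s[pos] - '0');
    ++pos;
  }
  if (pos == s.size()) { *out = value; return true; }  // plain byte count
  if (pos + 1 != s.size()) return false;                // junk after the suffix
  switch (std::tolower(static_cast<unsigned char>(s[pos]))) {
    case 'k': *out = value << 10; return true;
    case 'm': *out = value << 20; return true;
    case 'g': *out = value << 30; return true;
    default:  return false;
  }
}

// Split on a separator, keeping empty fields.
static std::vector<std::string> split(const std::string& s, char sep) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  for (;;) {
    std::size_t end = s.find(sep, start);
    if (end == std::string::npos) { parts.push_back(s.substr(start)); break; }
    parts.push_back(s.substr(start, end - start));
    start = end + 1;
  }
  return parts;
}

// Parse one clause such as "3g", "3g:oom" or "compiler:200m:oom".
bool parse_limit_clause(const std::string& clause, LimitSpec* out) {
  std::vector<std::string> parts = split(clause, ':');
  LimitSpec spec;
  std::size_t i = 0;
  // A leading token that does not start with a digit is taken as a category.
  if (!parts[0].empty() && !std::isdigit(static_cast<unsigned char>(parts[0][0]))) {
    spec.category = parts[0];
    i = 1;
  }
  if (i >= parts.size() || !parse_size(parts[i], &spec.bytes)) return false;
  ++i;
  if (i < parts.size()) {
    if (parts[i] != "oom") return false;  // only the mode shown in the examples
    spec.oom_mode = true;
    ++i;
  }
  if (i != parts.size()) return false;    // trailing fields are rejected
  *out = spec;
  return true;
}
```

With these assumptions, `compiler:200m:oom` yields category `compiler`, 200 MiB and oom mode, while a bare category without a size is rejected.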
PR: https://git.openjdk.org/jdk/pull/11371 From mbaesken at openjdk.org Thu Feb 16 14:13:35 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 16 Feb 2023 14:13:35 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Wed, 15 Feb 2023 06:54:52 GMT, Thomas Stuefe wrote: >> Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). >> >> >> ### Background >> >> Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. >> >> To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. >> >> Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). This would allow us to test resilience in the face of native OOMs, among other things. >> >> ### Patch >> >> - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: >> >> >> >> Global form: >> -XX:MallocLimit=[:] >> Category-specific form: >> -XX:MallocLimit=:[:][,:[:] ...] 
>> Examples: >> -XX:MallocLimit=3g >> -XX:MallocLimit=3g:oom >> -XX:MallocLimit=compiler:200m:oom >> >> >> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. >> >> - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. >> >> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` >> >> - adds new gtests and new jtreg tests >> >> - removes a bunch of jtreg tests which are now better served by the new gtests. >> >> - gives us more precise error messages upon reaching a limit. For example: >> >> before: >> >> # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) >> >> >> now: >> >> # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom > - nullptr > - merge and fix conflicts > - Merge; fix conflicts > - Feedback Gerard > - Revert strchrnul > - Copyrights > - Feedback Johan > - Merge > - Merge branch 'master' into JDK-8293313-NMT-fake-oom > - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd LGTM. Will there be a release note documenting the changed/removed XX-flags ? Maybe there should be one . ------------- Marked as reviewed by mbaesken (Reviewer). PR: https://git.openjdk.org/jdk/pull/11371 From duke at openjdk.org Thu Feb 16 14:25:15 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Thu, 16 Feb 2023 14:25:15 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: > Fixes the issue by applying a fix that is already implemented in PPC. 
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Implement addptr suggestion by @JohnVernee and @reinrich ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12535/files - new: https://git.openjdk.org/jdk/pull/12535/files/b32cf7c0..e63ab44a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=00-01 Stats: 7 lines in 3 files changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Thu Feb 16 14:25:18 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Thu, 16 Feb 2023 14:25:18 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: <2oXvjl1UFJq6x70IUcUgLKUVQgnjLC9U7YsB3XQg7BQ=.d651fdb1-df5c-40bc-9adc-4921045014a0@github.com> <4kKR1KP5iJ7G1_lvNKt4jOgRcCIyzDvlLDfO4rtxm2w=.c6fab761-d1ed-405b-b3b5-001a8e19f5db@github.com> <3T7nJ7BvnPjezFb-EzwCNkjq-zRjjsK-LrZTdRgZmzY=.c9ad8912-fbab-4b96-a9c1-9fe104cc904d@github.com> Message-ID: <3z3oBmOCLdwQRnBfeei0LU7RFQ3J-0IyAk0C8BUwGvU=.5c45065f-6600-4c0d-8aca-201c96239bb6@github.com> On Thu, 16 Feb 2023 11:00:14 GMT, Richard Reingruber wrote: >>> (Maybe too) long version: >> >> Not at all :) I don't know much about the interpreter, so this is pretty helpful. >> >> Ok. I think the main point is that a MH linker method doesn't ~have~ use a c2i adapter, it instead has separate versions for interpreted and compiled callers altogether (except `linkToNative` which doesn't have an interpreter version atm) >> >> Longer version: >> >> I think there are 4 theoretical cases (are there more?): >> 1. interpreted -> MH interpreter entry -> interpreted (the case we want to fix, I think) >> 2. interpreted -> MH interpreter entry -> i2c -> compiled >> 3. 
compiled -> c2i -> MH interpreter entry -> interpreted >> 4. compiled -> c2i -> MH interpreter entry -> i2c -> compiled >> >> So... my understanding is that a c2i adapter is used when a callee is interpreted, so its `from_compiled_entry` points to the c2i adapter. A compiled caller calls the `from_compiled_entry`, so it ends up going through the `c2i` adapter of an interpreted callee. However, for method handle linkers, the `from_compiled_entry` never points to a c2i adapter, but to the MH linker compiler entry generated in `gen_special_dispatch` in e.g `sharedRuntime_x86_64`. So, in other words, the 3. and 4. scenarios above are not possible, AFAICS. An MH linker's interpreted/compiled entries simply replace the regular `i2c` and `c2i` adapters for a MH linker method. Though, the actual target method can be either compiled or interpreted, so we can go into either a `c2i` adapter from a MH linker compiler entry, or a `i2c` adapter from a MH linker interpreter entry (which is scenario 2.). >> >> For scenario 1.: caller sets `r13` before jumping to callee (MH linker), we end up in the MH adapter, pop the last argument, and adjust `r13` for the actual callee. The MH interpreter entry uses `from_interpreted_entry` of the target method, so we don't go through any adapter. >> >> For scenario 2.: mostly the same as 1., but we go through an `i2c` adapter, and the compiled callee doesn't care about what's in `r13` any ways. (there's a comment in the ic2 adapter code about preserving `r13` for the case that we might go back into a c2i adapter if we lose a race where the callee is made non-entrant, but `r13` is also used as a scratch register later in the i2c code, and the c2i code overwrites `r13` any ways. So, I think that is an outdated comment...) > >> So... my understanding is that a c2i adapter is used when a callee is >> interpreted, so its `from_compiled_entry` points to the c2i adapter. 
A >> compiled caller calls the `from_compiled_entry`, so it ends up going through >> the `c2i` adapter of an interpreted callee. However, for method handle >> linkers, the `from_compiled_entry` never points to a c2i adapter, but to the >> MH linker compiler entry generated in `gen_special_dispatch` in e.g >> `sharedRuntime_x86_64`. So, in other words, the 3. and 4. scenarios above are >> not possible, AFAICS. > > I see. Thanks! That's what I've been missing. It's implemented in `SystemDictionary::find_method_handle_intrinsic()` I think. > > I agree that it will be ok to modify r13 as you suggested. Maybe explaining why we never reach here coming from a c2i adapter: because native wrappers are generated eagerly in `SystemDictionary::find_method_handle_intrinsic()` unless -Xint is given but then the caller cannot be compiled. I integrated your suggestion, it seems to work now. I'm hopeful that our internal queue does not find a test error. I'm happy for any commenting suggestions. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From jwaters at openjdk.org Thu Feb 16 14:57:30 2023 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 16 Feb 2023 14:57:30 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 04:57:48 GMT, Kim Barrett wrote: >> As for the function declaration, it's pretty much the entire function, keywords and all, as the C++ Standard always has the attribute specifier sequence appear before everything else (alignas is also an attribute specifier, to that extent), and never between keywords. 
I'm in favour of adding an extra line also forbidding having the attributes placed behind the function's parameter list brackets though, since HotSpot currently has a ton of attributes that are listed behind the whole function >> >> I think Kim has stated before that compiler specific attributes don't give much benefit and isn't really interested in having them, that said it would be nice to have them in the future with perhaps the compiler's namespace required in the attribute (msvc::noinline in C++20 guarantees no inlining of a function as compared to our current __declspec(noinline) for instance). > > Standard attributes are syntactically permitted in a plethora of places. But > specific attributes are only allowed in places related to their semantics. > For function declarations (including definitions), this is either first, > before anything else related to the declaration, or after the name of the > function but before the parameter list. > > What I've proposed is that HotSpot code prefers putting function attributes > first, rather than immediately after the function name (so before the > parameter list). > > Standard attributes can also appear after the parameter list, but those apply > to the function *type* rather than the function. That differs from gcc > `__attribute__`. > > (Standard attributes can also appear after the return type but before the > function name. Those attributes apply to the return type. Hoping we don't > need any of those, other than perhaps for alignment.) > > I've no idea how standard attributes and gcc `__attribute__` or msvc `__declspec` > interact syntactically. My hope would be that they can interleave, but I have > no idea whether that's true or not. I guess some research is needed, though I > think it doesn't matter for the current set. (I doubt anyone is going to > declare a function both noreturn and always-inline, for example.) 
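As an illustration of the placement preference described above (attribute first, before anything else in the declaration), here is a small C++ sketch. The functions `report_fatal` and `checked_divide` are hypothetical and are not HotSpot code; throwing an exception merely stands in for whatever a real fatal-error path would do.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: the standard attribute appears first in the
// declaration, ahead of the return type and the function name.
[[noreturn]] void report_fatal(const std::string& msg);

[[noreturn]] void report_fatal(const std::string& msg) {
  // Leaving via an exception is allowed; returning normally is not.
  throw std::runtime_error(msg);
}

// Because report_fatal never returns, control cannot fall off the end of
// this function, so no trailing return statement is needed.
int checked_divide(int a, int b) {
  if (b != 0) {
    return a / b;
  }
  report_fatal("division by zero");
}
```

The attribute only needs to be present on the first declaration; it is repeated on the definition here for readability.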
> > My objection to the earlier proposal to (for example) change ATTRIBUTE_PRINTF > to use `[[gnu::format(printf...)]]` instead of the `__attribute__` form was > that it required moving a lot of the macro uses from the end of a declaration > to its front (because of the above-mentioned difference), so lots of code > changes for not much gain. > > Unrecognized attributes cause problems. C++17 changes that, requiring that > unrecognized attributes be ignored. That might make some uses of > compiler-specific attributes more palatable, as they can be used directly > rather than behind a macro that is conditionally defined. > > OTOH, we might still end up putting some attributes behind conditionally > defined macros because multiple platforms provide the same feature but with > different spellings. Ah, I see, thanks for the clarification behind the location of the attribute changing where it applies to, that was something I'd forgotten. I'd argue that eventually biting the bullet and changing the attribute positions that we already have under macros (can be done in one fell swoop honestly) is a net gain since they now have well defined behaviour according to C++ and we know what they apply to exactly, but I don't know how much you'd agree on that point Speaking of moving to C++17 though, what about C++20? 
It's the C++ version in which Visual C++ gives access to the noinline and forceinline attributes, which are guaranteed to never inline or always inline. That contrasts with what we have now, where globalDefinitions_visCPP notes that we are uncertain whether they actually work. While it's probably far-fetched to jump an entire version of C++ just for that, I found it pretty interesting and wondered what you'd have to say about it ------------- PR: https://git.openjdk.org/jdk/pull/12507 From eosterlund at openjdk.org Thu Feb 16 16:07:30 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 16 Feb 2023 16:07:30 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 08:38:42 GMT, Robbin Ehn wrote: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All methods will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshake may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turned out to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible.
> > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin Changes requested by eosterlund (Reviewer). src/hotspot/share/runtime/deoptimization.cpp line 110: > 108: MutexLocker ml(CompiledMethod_lock, Mutex::_no_safepoint_check_flag); > 109: // If there is nothing to deopt _gen is the same as comitted. > 110: _gen = DeoptimizationScope::_committed_deopt_gen; Could we find a better name for the "_gen" member? If I'm reading this right, it's some kind of _required_gen, right? As in we need to have committed up until this number, or it isn't safe to continue. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From thomas.stuefe at gmail.com Thu Feb 16 16:12:34 2023 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Thu, 16 Feb 2023 17:12:34 +0100 Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: Thanks, guys! On Thu, Feb 16, 2023 at 3:13 PM Matthias Baesken wrote: > On Wed, 15 Feb 2023 06:54:52 GMT, Thomas Stuefe > wrote: > > >> Rework NMT `MallocLimit` to make it more useful for native OOM > analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant > with NMT (for background, see JDK-8293292). > >> > >> > >> ### Background > >> > >> Some months ago we had [JDK-8291919]( > https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that > caused compiler arenas to grow boundlessly. Process was usually OOM-killed > before we could take a look, so this was difficult to analyze. > >> > >> To improve analysis options, we added NMT *malloc limits* with > [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878).
Malloc limits > let us set one or multiple limits to malloc load. Surpassing a limit causes > the VM to stop with a fatal error in the allocation function, giving us a > hs-err file and a core right at the point that (probably) causes the leak. > This makes error analysis a lot simpler, and is also valuable for > regression-testing footprint usage. > >> > >> Some people wished for a way to not end with a fatal error but to mimic > a native OOM (causing os::malloc to return NULL). This would allow us to > test resilience in the face of native OOMs, among other things. > >> > >> ### Patch > >> > >> - Expands the `MallocLimit` option syntax, allowing for an "oom mode" > that mimics an oom: > >> > >> > >> > >> Global form: > >> -XX:MallocLimit=[:] > >> Category-specific form: > >> -XX:MallocLimit=:[:][,:[:] > ...] > >> Examples: > >> -XX:MallocLimit=3g > >> -XX:MallocLimit=3g:oom > >> -XX:MallocLimit=compiler:200m:oom > >> > >> > >> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into > `mallocLimit.hpp/cpp`, and rewrites it. > >> > >> - changes when limits are checked. Before, the limits were checked > *after* the allocation that caused the threshold overflow. Now we check > beforehand. > >> > >> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` > >> > >> - adds new gtests and new jtreg tests > >> > >> - removes a bunch of jtreg tests which are now better served by the new > gtests. > >> > >> - gives us more precise error messages upon reaching a limit. For > example: > >> > >> before: > >> > >> # fatal error: MallocLimit: category "Compiler" reached limit (size: > 20997680, limit: 20971520) > >> > >> > >> now: > >> > >> # fatal error: MallocLimit: reached category "Compiler" limit > (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) > > > > Thomas Stuefe has updated the pull request with a new target base due to > a merge or a rebase. 
The pull request now contains 11 commits: > > > > - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom > > - nullptr > > - merge and fix conflicts > > - Merge; fix conflicts > > - Feedback Gerard > > - Revert strchrnul > > - Copyrights > > - Feedback Johan > > - Merge > > - Merge branch 'master' into JDK-8293313-NMT-fake-oom > > - ... and 1 more: > https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd > > LGTM. > > Will there be a release note documenting the changed/removed XX-flags ? > Maybe there should be one . > > ------------- > > Marked as reviewed by mbaesken (Reviewer). > > PR: https://git.openjdk.org/jdk/pull/11371 > From stuefe at openjdk.org Thu Feb 16 16:15:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 16 Feb 2023 16:15:43 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Wed, 15 Feb 2023 06:54:52 GMT, Thomas Stuefe wrote: >> Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). >> >> >> ### Background >> >> Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. >> >> To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak.
This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. >> >> Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). This would allow us to test resilience in the face of native OOMs, among other things. >> >> ### Patch >> >> - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: >> >> >> >> Global form: >> -XX:MallocLimit=[:] >> Category-specific form: >> -XX:MallocLimit=:[:][,:[:] ...] >> Examples: >> -XX:MallocLimit=3g >> -XX:MallocLimit=3g:oom >> -XX:MallocLimit=compiler:200m:oom >> >> >> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. >> >> - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. >> >> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` >> >> - adds new gtests and new jtreg tests >> >> - removes a bunch of jtreg tests which are now better served by the new gtests. >> >> - gives us more precise error messages upon reaching a limit. For example: >> >> before: >> >> # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) >> >> >> now: >> >> # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom > - nullptr > - merge and fix conflicts > - Merge; fix conflicts > - Feedback Gerard > - Revert strchrnul > - Copyrights > - Feedback Johan > - Merge > - Merge branch 'master' into JDK-8293313-NMT-fake-oom > - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd Thanks guys! > LGTM. 
> > Will there be a release note documenting the changed/removed XX-flags ? Maybe there should be one . That is not necessary since this was a diagnostic switch. Also, the new format is backward compatible with the old one (so no behavioral change if someone had set it) Cheers, Thomas ------------- PR: https://git.openjdk.org/jdk/pull/11371 From stuefe at openjdk.org Thu Feb 16 16:32:46 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 16 Feb 2023 16:32:46 GMT Subject: Integrated: JDK-8293313: NMT: Rework MallocLimit In-Reply-To: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: <__0oH6M7zaEGaLm922QJPrApOeCSJq_RQnzDrAMF_uU=.5942b04b-1e9e-4f39-acc0-5775916932f6@github.com> On Fri, 25 Nov 2022 17:34:19 GMT, Thomas Stuefe wrote: > Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292). > > > ### Background > > Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. Process was usually OOM-killed before we could take a look, so this was difficult to analyze. > > To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or multiple limits to malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us a hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage. > > Some people wished for a way to not end with a fatal error but to mimic a native OOM (causing os::malloc to return NULL). 
This would allow us to test resilience in the face of native OOMs, among other things. > > ### Patch > > - Expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics an oom: > > > > Global form: > -XX:MallocLimit=[:] > Category-specific form: > -XX:MallocLimit=:[:][,:[:] ...] > Examples: > -XX:MallocLimit=3g > -XX:MallocLimit=3g:oom > -XX:MallocLimit=compiler:200m:oom > > > - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it. > > - changes when limits are checked. Before, the limits were checked *after* the allocation that caused the threshold overflow. Now we check beforehand. > > - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit` > > - adds new gtests and new jtreg tests > > - removes a bunch of jtreg tests which are now better served by the new gtests. > > - gives us more precise error messages upon reaching a limit. For example: > > before: > > # fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520) > > > now: > > # fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K) This pull request has now been integrated. 
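The two behavioural points in the description above (the limit is now checked before the allocation, and oom mode makes the allocator report failure instead of stopping the VM) can be pictured with a small C++ sketch. This is not NMT code: `MallocBudget` and `limited_malloc` are hypothetical names, only the oom mode is modelled, and the exact comparison used by the real implementation is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping, not NMT's real data structures.
struct MallocBudget {
  std::size_t limit;      // total bytes allowed
  std::size_t allocated;  // bytes handed out so far (never exceeds limit)
  explicit MallocBudget(std::size_t l) : limit(l), allocated(0) {}
};

// Check the limit *before* allocating. If the request would exceed the
// limit, mimic a native OOM by returning nullptr without calling malloc.
void* limited_malloc(MallocBudget& budget, std::size_t size) {
  // Written as a subtraction so the comparison cannot overflow.
  if (size > budget.limit - budget.allocated) {
    return nullptr;
  }
  void* p = std::malloc(size);
  if (p != nullptr) {
    budget.allocated += size;  // account only for successful allocations
  }
  return p;
}
```

A caller that copes with `nullptr` here is exercising the same path it would take on a genuine native OOM, which is the resilience testing the description mentions.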
Changeset: 90e09228 Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/90e092280f67dc1f77ff04c7c8f46317a00e3af9 Stats: 1010 lines in 21 files changed: 653 ins; 257 del; 100 mod 8293313: NMT: Rework MallocLimit 8293292: Remove MallocMaxTestWords Reviewed-by: jsjolen, gziemski, lucy, mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/11371 From ccheung at openjdk.org Thu Feb 16 16:56:56 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 16 Feb 2023 16:56:56 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: References: Message-ID: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: add Afree() for permanent symbols ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12562/files - new: https://git.openjdk.org/jdk/pull/12562/files/9b4c63d9..b22eb3b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=02-03 Stats: 10 lines in 1 file changed: 9 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12562.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12562/head:pull/12562 PR: https://git.openjdk.org/jdk/pull/12562 From ccheung at openjdk.org Thu Feb 16 16:58:28 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 16 Feb 2023 16:58:28 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v3] In-Reply-To: References: <9GcPkoPBeh5mcKE9kgn6um-5eDO9aABZVMoeOO5CNk0=.91bf2883-a84a-48b6-b3ab-91e7b22d4555@github.com> Message-ID: On Thu, 16 Feb 2023 01:12:51 GMT, Ioi Lam wrote: >> Oh but if it overflows the refcount to become 
permanent, the permanence is sticky. We won't free it, so it's not a bug. > > From the comment above this code, it seems we do free permanent symbols, but only when the current thread lost when competing to insert a symbol for the boot loader. So the old code is correct, and the new code would have a leak. I've added back the `Afree()` for permanent symbols. It passed tier1 testing. Testing on the other tiers is currently running. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From pchilanomate at openjdk.org Thu Feb 16 16:59:05 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 16 Feb 2023 16:59:05 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v5] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual threads but didn't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations.
> Of the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (of those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The one remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added a new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I ran the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests.
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: - restore state should be only for VirtualThreads - add suspend check ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/7872f94c..3cb32734 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=03-04 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Thu Feb 16 17:06:30 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 16 Feb 2023 17:06:30 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Thu, 16 Feb 2023 07:48:16 GMT, Serguei Spitsyn wrote: > The `SuspendAllVirtualThreads` spec has this statement: > > ``` > Suspend all virtual threads except those in the exception list. > Virtual threads that are currently suspended do not change state. > ``` > > And the `ResumeAllVirtualThreads` spec has this statement: > > ``` > Resume all virtual threads except those in the exception list. > Virtual threads that are currently resumed do not change state > ``` > > So that, `SuspendAllVirtualThreads` should do nothing with the already suspended virtual threads. And `ResumeAllVirtualThreads` should do nothing with the already resumed virtual threads. It does not depend on the except_list. It is better to avoid suspending already suspended threads and resuming already resumed. So, at least a check for `!java_thread->is_suspended()` is needed. 
> Yes, the elist comment was related to the other issue. The reason why I didn't explicitly add this check is because it will be done in the suspend/resume call already (while holding the handshake lock which serializes the suspend/resume requests). So inside the suspend handshake for example either the target is suspended, so nothing is done, or the target is not suspended and we suspend it. Is there an actual issue I'm missing or is this to make a fast check and avoid the call? In any case I added the check. Also I see the specs for the single suspend/resume functions say the same thing about not changing the state and, for example, in SuspendThreadList we are also not making an explicit check before the call to suspend_thread(). So if there is an actual issue here, don't we need to add that check for them too? >Did you observe any JVMTI or JDI suspend/resume tests failing? > No. I'll retest the patch again though once I resolve all comments. > > Now, I realized that although single-suspends will not change _suspended_list/not_suspended_list for a BoundVirtualThread, SuspendAllVirtualThreads/ResumeAllVirtualThreads can if a BoundVirtualThread is part of the exception list. In the SuspendAllVirtualThreads case for example, when restoring the resume state, the call to JvmtiVTSuspender::is_vthread_suspended() would return true (we are already in SR_all mode and the BoundVirtualThread will not be in _not_suspended_list), meaning the BoundVirtualThread will be added to the not_suspended_list. > > So we have to decide if we want to keep track of BoundVirtualThreads on these lists or not. Given that I see JvmtiVTSuspender::is_vthread_suspended() is always called on a VirtualThread the easiest thing would be to not track them, and just add a check in SuspendAllVirtualThreads/ResumeAllVirtualThreads where the elist is populated so only VirtualThreads are added. What do you think?
> > I feel that following the VirtualThread's implementation code path would be more consistent, unified and free of surprises. But I also see that following the platform threads code path looks pretty simple. So, it can be okay. So, for this approach, I think, the code fragments which collect and restore resumed/suspended state for virtual threads need to be bypassed by the BoundVirtualThreads implementation, so the `_suspended_list`/`not_suspended_list` won't be impacted. Please, let me know if it does not work by any reason. We also will have to rely on the testing here. > Sounds good. Yes, just following the platform thread code path is easier. Plus the other functions that are supported on virtual threads already have a branch for platform thread vs virtual thread, and the BoundVirtualThread already takes the platform thread path. Last commit is a fix for this issue in SuspendAllVirtualThreads/ResumeAllVirtualThreads, please take a look. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From kbarrett at openjdk.org Thu Feb 16 17:29:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 16 Feb 2023 17:29:26 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 14:55:06 GMT, Julian Waters wrote: >> Standard attributes are syntactically permitted in a plethora of places. But >> specific attributes are only allowed in places related to their semantics. >> For function declarations (including definitions), this is either first, >> before anything else related to the declaration, or after the name of the >> function but before the parameter list. >> >> What I've proposed is that HotSpot code prefers putting function attributes >> first, rather than immediately after the function name (so before the >> parameter list). >> >> Standard attributes can also appear after the parameter list, but those apply >> to the function *type* rather than the function. 
That differs from gcc >> `__attribute__`. >> >> (Standard attributes can also appear after the return type but before the >> function name. Those attributes apply to the return type. Hoping we don't >> need any of those, other than perhaps for alignment.) >> >> I've no idea how standard attributes and gcc `__attribute__` or msvc `__declspec` >> interact syntactically. My hope would be that they can interleave, but I have >> no idea whether that's true or not. I guess some research is needed, though I >> think it doesn't matter for the current set. (I doubt anyone is going to >> declare a function both noreturn and always-inline, for example.) >> >> My objection to the earlier proposal to (for example) change ATTRIBUTE_PRINTF >> to use `[[gnu::format(printf...)]]` instead of the `__attribute__` form was >> that it required moving a lot of the macro uses from the end of a declaration >> to its front (because of the above-mentioned difference), so lots of code >> changes for not much gain. >> >> Unrecognized attributes cause problems. C++17 changes that, requiring that >> unrecognized attributes be ignored. That might make some uses of >> compiler-specific attributes more palatable, as they can be used directly >> rather than behind a macro that is conditionally defined. >> >> OTOH, we might still end up putting some attributes behind conditionally >> defined macros because multiple platforms provide the same feature but with >> different spellings. > > Ah, I see, thanks for the clarification behind the location of the attribute changing where it applies to, that was something I'd forgotten. I'd argue that eventually biting the bullet and changing the attribute positions that we already have under macros (can be done in one fell swoop honestly) is a net gain since they now have well defined behaviour according to C++ and we know what they apply to exactly, but I don't know how much you'd agree on that point > > Speaking of moving to C++17 though, what about C++20? 
It's the C++ version that Visual C++ gives access to the noinline and forceinline attributes which are guaranteed to not inline or always inline, as opposed to what we have now, which globalDefinitions_visCPP notes that we are uncertain whether they actually work, and while it's probably farfetched to jump an entire version of C++ just for that, I found it pretty interesting and wondered what you'd have to say about that Moving to a newer C++ language version requires compiler support, for all supported OpenJDK ports (*). Only recently has the xlclang compiler used by the aix-ppc port had a version that supported C++17. (I think it was released last summer/fall.) I don't know when they might release a version supporting C++20. Use of that new version is being investigated by the port maintainers. I've done a little bit of peeking ahead to C++17 as well, and haven't yet run into any problems with the Oracle-supported ports (which is what I expected). (*) Non-open licensee ports also need to be considered, though I'm not sure what the requirements are there. I know of one such that didn't yet have complete C++14 support when we adopted it for OpenJDK. ------------- PR: https://git.openjdk.org/jdk/pull/12507 From sspitsyn at openjdk.org Thu Feb 16 18:14:13 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 16 Feb 2023 18:14:13 GMT Subject: RFR: 8302615: make JVMTI thread cpu time functions optional for virtual threads Message-ID: This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. The CSR is: [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads No testing is needed. 
------------- Commit messages: - 8302615: make JVMTI thread cpu time functions optional for virtual threads Changes: https://git.openjdk.org/jdk/pull/12604/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12604&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302615 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12604.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12604/head:pull/12604 PR: https://git.openjdk.org/jdk/pull/12604 From coleenp at openjdk.org Thu Feb 16 18:48:29 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 16 Feb 2023 18:48:29 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: On Thu, 16 Feb 2023 16:56:56 GMT, Calvin Cheung wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. > > Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: > > add Afree() for permanent symbols This seems fine. I haven't actually observed us getting to this code, except maybe once, but you might as well keep it. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12562 From iklam at openjdk.org Thu Feb 16 18:56:30 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 18:56:30 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: On Thu, 16 Feb 2023 16:56:56 GMT, Calvin Cheung wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. > > Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: > > add Afree() for permanent symbols src/hotspot/share/classfile/symbolTable.cpp line 168: > 166: if (!SymbolTable::arena()->Afree(memory, alloc_size)) { > 167: log_trace_symboltable_helper(&value, "Leaked permanent symbol"); > 168: } We should avoid freeing if DumpSharedSpaces is true, so that the code is symmetrical with allocate_node_impl(). 
Maybe this: #if INCLUDE_CDS if (DumpSharedSpaces) { // no deallocation is needed } else #endif if (value.refcount() != PERM_REFCOUNT) { ------------- PR: https://git.openjdk.org/jdk/pull/12562 From ccheung at openjdk.org Thu Feb 16 19:38:33 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 16 Feb 2023 19:38:33 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: On Thu, 16 Feb 2023 18:53:17 GMT, Ioi Lam wrote: >> Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: >> >> add Afree() for permanent symbols > > src/hotspot/share/classfile/symbolTable.cpp line 168: > >> 166: if (!SymbolTable::arena()->Afree(memory, alloc_size)) { >> 167: log_trace_symboltable_helper(&value, "Leaked permanent symbol"); >> 168: } > > We should avoid freeing if DumpSharedSpaces is true, so that the code is symmetrical with allocate_node_impl(). Maybe this: > > > #if INCLUDE_CDS > if (DumpSharedSpaces) { > // no deallocation is needed > } else > #endif > if (value.refcount() != PERM_REFCOUNT) { How about performing the above check at the beginning of the function, right after the assert statement as follows? Let me know if the added assert is necessary. 
--- a/src/hotspot/share/classfile/symbolTable.cpp +++ b/src/hotspot/share/classfile/symbolTable.cpp @@ -151,6 +151,13 @@ public: // If #2, then the symbol is dead (refcount==0) assert(value.is_permanent() || (value.refcount() == 1) || (value.refcount() == 0), "refcount %d", value.refcount()); +#if INCLUDE_CDS + if (DumpSharedSpaces) { + assert(value.is_permanent(), "symbol should be permanent during CDS dumping"); + // no deallocation is needed + return; + } +#endif if (value.refcount() == 1) { value.decrement_refcount(); assert(value.refcount() == 0, "expected dead symbol"); ------------- PR: https://git.openjdk.org/jdk/pull/12562 From alanb at openjdk.org Thu Feb 16 19:40:27 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 16 Feb 2023 19:40:27 GMT Subject: RFR: 8302615: make JVMTI thread cpu time functions optional for virtual threads In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 18:05:54 GMT, Serguei Spitsyn wrote: > This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. > > The CSR is: > [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads > > No testing is needed. Marked as reviewed by alanb (Reviewer). I assume we'll rev the JVMTI version at some point in 21. The change looks fine. It might be that we'll need to add a new capability or add a field to jvmtiTimerInfo to say whether the implementation supports CPU time for virtual threads, but what we have now is okay.
------------- PR: https://git.openjdk.org/jdk/pull/12604 From iklam at openjdk.org Thu Feb 16 19:55:28 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 16 Feb 2023 19:55:28 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: On Thu, 16 Feb 2023 19:35:45 GMT, Calvin Cheung wrote: >> src/hotspot/share/classfile/symbolTable.cpp line 168: >> >>> 166: if (!SymbolTable::arena()->Afree(memory, alloc_size)) { >>> 167: log_trace_symboltable_helper(&value, "Leaked permanent symbol"); >>> 168: } >> >> We should avoid freeing if DumpSharedSpaces is true, so that the code is symmetrical with allocate_node_impl(). Maybe this: >> >> >> #if INCLUDE_CDS >> if (DumpSharedSpaces) { >> // no deallocation is needed >> } else >> #endif >> if (value.refcount() != PERM_REFCOUNT) { > > How about performing the above check at the beginning of the function, right after the assert statement as follows? > Let me know if the added assert is necessary. > > --- a/src/hotspot/share/classfile/symbolTable.cpp > +++ b/src/hotspot/share/classfile/symbolTable.cpp > @@ -151,6 +151,13 @@ public: > // If #2, then the symbol is dead (refcount==0) > assert(value.is_permanent() || (value.refcount() == 1) || (value.refcount() == 0), > "refcount %d", value.refcount()); > +#if INCLUDE_CDS > + if (DumpSharedSpaces) { > + assert(value.is_permanent(), "symbol should be permanent during CDS dumping"); > + // no deallocation is needed > + return; > + } > +#endif > if (value.refcount() == 1) { > value.decrement_refcount(); > assert(value.refcount() == 0, "expected dead symbol"); Currently, in `allocate_node_impl()` there's no guarantee that the refcount is always permanent when DumpSharedSpaces is true. During CDS dumping, some classes may be loaded by non-boot loaders, and their symbols may initially have a non-permanent count. 
Eventually, we force all the archived symbols to be permanent, but theoretically calls to `free_node()` may happen before that. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From dcubed at openjdk.org Thu Feb 16 21:06:42 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 16 Feb 2023 21:06:42 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: <2kWWJCUwEsAKhx1DVvvj3auS_H98JngAThEH9_1ZMHA=.88c64a25-207f-458e-85a4-62e699392be4@github.com> On Thu, 16 Feb 2023 08:38:42 GMT, Robbin Ehn wrote: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. 
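For readers following the 8300926 thread, the barrier described in the quoted text above can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the HotSpot code: `DeoptGenerations`, `mark()`, `deoptimize()`, `run_handshake()` and `run_demo()` are hypothetical names, a `std::mutex` stands in for the VM lock, and a counter stands in for the actual deopt handshake. It only tries to show the two properties the description names: a requester does not pass until its requested generation is committed, and requests that arrive for an already covered generation share one handshake.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the deopt generation bookkeeping; not HotSpot code.
class DeoptGenerations {
 public:
  // Tag a mark with the currently active generation and return it.
  // The caller must later wait until this generation is committed.
  uint64_t mark() {
    std::lock_guard<std::mutex> guard(lock_);
    return active_gen_;
  }

  // Barrier: returns only once required_gen has been committed.
  void deoptimize(uint64_t required_gen) {
    std::unique_lock<std::mutex> guard(lock_);
    for (;;) {
      if (committed_gen_ >= required_gen) {
        return;  // another requester already covered our marks
      }
      if (!in_progress_) {
        break;   // nobody is committing, so we perform the handshake
      }
      done_.wait(guard);  // a handshake is running; re-check when it ends
    }
    in_progress_ = true;
    const uint64_t committing = active_gen_;
    ++active_gen_;  // marks made from now on belong to the next generation
    guard.unlock();

    run_handshake();  // done outside the lock, like the real handshake

    guard.lock();
    if (committed_gen_ < committing) {
      committed_gen_ = committing;  // committed must never go backwards
    }
    in_progress_ = false;
    done_.notify_all();
  }

  int handshakes() const { return handshakes_.load(); }

 private:
  // Placeholder for the expensive operation that is being coalesced.
  void run_handshake() { ++handshakes_; }

  std::mutex lock_;
  std::condition_variable done_;
  uint64_t active_gen_ = 1;     // generation assigned to new marks
  uint64_t committed_gen_ = 0;  // highest generation fully processed
  bool in_progress_ = false;    // true while a handshake is running
  std::atomic<int> handshakes_{0};
};

// Two requests for the same generation share one handshake; a later mark
// gets a new generation and needs its own.
int run_demo() {
  DeoptGenerations gens;
  const uint64_t first = gens.mark();
  const uint64_t second = gens.mark();  // same generation as first
  gens.deoptimize(first);               // performs a handshake
  gens.deoptimize(second);              // already committed, returns at once
  const uint64_t third = gens.mark();   // next generation
  gens.deoptimize(third);               // performs another handshake
  return gens.handshakes();
}
```

The demo is single-threaded for brevity; the wait loop is what would hold back a second thread that asks for a generation while another thread's handshake is still running.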
> > Thanks, Robbin I have a bunch of editorial nits from my first read thru. I'm going to post those first and then do another read thru to try and make sense of the logic changes. src/hotspot/share/code/dependencyContext.cpp line 66: > 64: // Walk the list of dependent nmethods searching for nmethods which > 65: // are dependent on the changes that were passed in and mark them for > 66: // deoptimization. Returns the number of nmethods found. This part of the comment is now stale: Returns the number of nmethods found. src/hotspot/share/code/dependencyContext.cpp line 73: > 71: if (b->count() > 0) { > 72: if (nm->is_marked_for_deoptimization()) { > 73: deopt_scope->dependant(nm); nit typo: s/dependant/dependent/ src/hotspot/share/prims/methodHandles.cpp line 953: > 951: } > 952: > 953: void MethodHandles::flush_dependent_nmethods(DeoptimizationScope* deopt_scope, Handle call_site, Handle target) { Is `flush_dependent_nmethods` really the right name for this method? Perhaps `mark_dependent_nmethods` is a better name. Yes, I know that this name predates this PR... src/hotspot/share/prims/methodHandles.cpp line 1334: > 1332: { > 1333: NoSafepointVerifier nsv; > 1334: MutexLocker mu2(THREAD, CodeCache_lock, Mutex::_no_safepoint_check_flag); Since we no longer have a `mu1`, the name shouldn't be `mu2`. Also, the outer braces on L1330 and L1341 are not needed anymore which also means that the code from L1331 -> L1342 doesn't have to be indented the extra level. src/hotspot/share/runtime/deoptimization.cpp line 123: > 121: // If it's already marked but we still need it to be deopted. 
> 122: if (cm->is_marked_for_deoptimization()) { > 123: dependant(cm); nit typo: s/dependant/dependent/ src/hotspot/share/runtime/deoptimization.cpp line 139: > 137: } > 138: > 139: void DeoptimizationScope::dependant(CompiledMethod* cm) { nit typo: s/dependant/dependent/ src/hotspot/share/runtime/deoptimization.cpp line 150: > 148: > 149: void DeoptimizationScope::deoptimize_marked() { > 150: assert(!_deopted, "Already deopt"); nit typo: s/deopt/deopted/ src/hotspot/share/runtime/deoptimization.cpp line 152: > 150: assert(!_deopted, "Already deopt"); > 151: > 152: // Safepoint are a special case, handled here. nit typo: s/Safepoint/Safepoints/ src/hotspot/share/runtime/deoptimization.cpp line 190: > 188: } else { > 189: // Performs the handshake. > 190: Deoptimization::deoptimize_all_marked(); // May safepoint and an additional deopt may have occured. nit typo: s/occured/occurred/ src/hotspot/share/runtime/deoptimization.cpp line 195: > 193: MutexLocker ml(CompiledMethod_lock->owned_by_self() ? nullptr : CompiledMethod_lock, > 194: Mutex::_no_safepoint_check_flag); > 195: // Make sure that committed don't go backwards. nit typo: s/don't/doesn't/ src/hotspot/share/runtime/deoptimization.hpp line 59: > 57: DeoptimizationScope(); > 58: ~DeoptimizationScope(); > 59: // Mark a method, if already marked as dependant. nit typo: s/dependant/dependent/ src/hotspot/share/runtime/deoptimization.hpp line 61: > 59: // Mark a method, if already marked as dependant. > 60: void mark(CompiledMethod* cm, bool inc_recompile_counts = true); > 61: // Record this as a dependant method. nit typo: s/dependant/dependent/ src/hotspot/share/runtime/deoptimization.hpp line 65: > 63: > 64: // Execute the deoptimization. > 65: // Mmake the nmethods not entrant, stackwalks and patch return pcs and sets post call nops. nit typo: s/Mmake/Make/ ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 16 21:06:44 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Thu, 16 Feb 2023 21:06:44 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 16:04:11 GMT, Erik Österlund wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > src/hotspot/share/runtime/deoptimization.cpp line 110: > >> 108: MutexLocker ml(CompiledMethod_lock, Mutex::_no_safepoint_check_flag); >> 109: // If there is nothing to deopt _gen is the same as comitted. >> 110: _gen = DeoptimizationScope::_committed_deopt_gen; > > Could we find a better name for the "_gen" member?
If I'm reading this right, it's some kind of _required_gen, right? As in we need to have committed up until this number, or it isn't safe to continue. The comment from deoptimization.hpp is this: // The highest gen we need to execute/wait for uint64_t _gen; so better names might be: `_highest_gen` or `_highest_required_gen` or Erik's `_required_gen`. In any case, if you rename this field, please update the comment in deoptimization.hpp to match... ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 16 21:21:28 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 16 Feb 2023 21:21:28 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 08:38:42 GMT, Robbin Ehn wrote: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible. 
> > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin src/hotspot/share/runtime/deoptimization.hpp line 49: > 47: // What gen to mark a method with, hence larger than _committed_deopt_gen. > 48: static uint64_t _active_deopt_gen; > 49: // Indicate a in-progress deopt handshake. nit typo: s/a in-progress/an in-progress/ ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 16 21:28:27 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 16 Feb 2023 21:28:27 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 08:38:42 GMT, Robbin Ehn wrote: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. 
> Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin src/hotspot/share/runtime/deoptimization.cpp line 131: > 129: Atomic::store(&cm->_deoptimization_status, status); > 130: > 131: // Make sure active is not committted nit typo: s/committted/committed/ ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 16 21:51:32 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 16 Feb 2023 21:51:32 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 08:38:42 GMT, Robbin Ehn wrote: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. 
> Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin src/hotspot/share/code/codeCache.cpp line 1419: > 1417: void CodeCache::flush_dependents_on_method_for_breakpoint(const methodHandle& m_h) { > 1418: // --- Compile_lock is not held. However we are at a safepoint. > 1419: assert_locked_or_safepoint(Compile_lock); Not your bug, but this comment doesn't make sense with respect to the `assert_locked_or_safepoint(Compile_lock)` call. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From ccheung at openjdk.org Thu Feb 16 23:12:17 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 16 Feb 2023 23:12:17 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v5] In-Reply-To: References: Message-ID: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. 
Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: do nothing in free_node() if DumpSharedSpaces ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12562/files - new: https://git.openjdk.org/jdk/pull/12562/files/b22eb3b0..0f1d5a81 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12562&range=03-04 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12562.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12562/head:pull/12562 PR: https://git.openjdk.org/jdk/pull/12562 From ccheung at openjdk.org Thu Feb 16 23:12:24 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 16 Feb 2023 23:12:24 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: On Thu, 16 Feb 2023 19:52:31 GMT, Ioi Lam wrote: >> How about performing the above check at the beginning of the function, right after the assert statement as follows? >> Let me know if the added assert is necessary. >> >> --- a/src/hotspot/share/classfile/symbolTable.cpp >> +++ b/src/hotspot/share/classfile/symbolTable.cpp >> @@ -151,6 +151,13 @@ public: >> // If #2, then the symbol is dead (refcount==0) >> assert(value.is_permanent() || (value.refcount() == 1) || (value.refcount() == 0), >> "refcount %d", value.refcount()); >> +#if INCLUDE_CDS >> + if (DumpSharedSpaces) { >> + assert(value.is_permanent(), "symbol should be permanent during CDS dumping"); >> + // no deallocation is needed >> + return; >> + } >> +#endif >> if (value.refcount() == 1) { >> value.decrement_refcount(); >> assert(value.refcount() == 0, "expected dead symbol"); > > Currently, in `allocate_node_impl()` there's no guarantee that the refcount is always permanent when DumpSharedSpaces is true. 
> > During CDS dumping, some classes may be loaded by non-boot loaders, and their symbols may initially have a non-permanent count. Eventually, we force all the archived symbols to be permanent, but theoretically calls to `free_node()` may happen before that. Ok. I've pushed a commit without the assert statement. ------------- PR: https://git.openjdk.org/jdk/pull/12562 From sspitsyn at openjdk.org Thu Feb 16 23:28:30 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 16 Feb 2023 23:28:30 GMT Subject: RFR: 8302615: make JVMTI thread cpu time functions optional for virtual threads In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 18:05:54 GMT, Serguei Spitsyn wrote: > This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. > > The CSR is: > [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads > > No testing is needed. Thank you for review, Alan. ------------- PR: https://git.openjdk.org/jdk/pull/12604 From iklam at openjdk.org Fri Feb 17 03:10:04 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 17 Feb 2023 03:10:04 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v5] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 23:12:17 GMT, Calvin Cheung wrote: >> Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. >> >> This patch is co-authored by @robehn. >> >> Passed tiers 1 - 4 testing. > > Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: > > do nothing in free_node() if DumpSharedSpaces LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12562 From dholmes at openjdk.org Fri Feb 17 03:10:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:10:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 14:25:15 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue by modifying `MethodHandles` code. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Implement addptr suggestion by @JohnVernee and @reinrich There are crashes in stack walking with this change. Try e.g. vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.java # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (c:\sb\prod\1676562544\workspace\open\src\hotspot\share\code/codeCache.inline.hpp:49), pid=55852, tid=36120 # assert(cb != nullptr) failed: must be Stack: [0x0000001077100000,0x0000001077200000] Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [jvm.dll+0xbf7311] os::win32::platform_print_native_stack+0xf1 (os_windows_x86.cpp:236) V [jvm.dll+0xe705d0] VMError::report+0x10c0 (vmError.cpp:813) V [jvm.dll+0xe7220e] VMError::report_and_die+0x6ce (vmError.cpp:1593) V [jvm.dll+0xe72954] VMError::report_and_die+0x64 (vmError.cpp:1352) V [jvm.dll+0x576ef6] report_vm_error+0x96 (debug.cpp:181) V [jvm.dll+0x3fb595] frame::sender_for_compiled_frame+0x515 (frame_x86.inline.hpp:441) V [jvm.dll+0x3fb85d] frame::sender_raw+0x1dd (frame_x86.inline.hpp:386) V [jvm.dll+0x524f4f] vframeStreamCommon::next+0x52f (vframe.inline.hpp:103) V [jvm.dll+0x7d7350] java_lang_Throwable::fill_in_stack_trace+0x810 (javaClasses.cpp:2589) V [jvm.dll+0x7d6b16] java_lang_Throwable::fill_in_stack_trace+0x86 (javaClasses.cpp:2649) V [jvm.dll+0x8b723d] JVM_FillInStackTrace+0x1ad (jvm.cpp:505) C [java.dll+0x7162] 
------------- PR: https://git.openjdk.org/jdk/pull/12535 From dholmes at openjdk.org Fri Feb 17 03:11:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:11:54 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v8] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 15 Feb 2023 00:38:15 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Addressed recent review feedback from Leonid Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12239 From lmesnik at openjdk.org Fri Feb 17 03:11:58 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 17 Feb 2023 03:11:58 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v8] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 15 Feb 2023 00:38:15 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Addressed recent review feedback from Leonid just refresh my approval ------------- Marked as reviewed by lmesnik (Reviewer). PR: https://git.openjdk.org/jdk/pull/12239 From duke at openjdk.org Fri Feb 17 03:15:56 2023 From: duke at openjdk.org (SUN Guoyun) Date: Fri, 17 Feb 2023 03:15:56 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 07:01:43 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. 
>> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - Fix copyright header > - Address review comments > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - 8276799: Implementation of JEP 422: Linux/RISC-V Port src/hotspot/cpu/riscv/riscv.ad line 8758: > 8756: %{ > 8757: // Same match rule as `far_cmpU_loop'. > 8758: match(CountedLoopEnd cmp (CmpU op1 op2)); Which testcases can test this instruct and the following instructs? match(CountedLoopEnd cmp (CmpP op1 op2)); match(CountedLoopEnd cmp (CmpN op1 op2)); match(CountedLoopEnd cmp (CmpF op1 op2)); match(CountedLoopEnd cmp (CmpD op1 op2)); I suspect this instruction is useless. ------------- PR: https://git.openjdk.org/jdk/pull/6294 From dholmes at openjdk.org Fri Feb 17 03:15:58 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:15:58 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Thu, 16 Feb 2023 16:12:12 GMT, Thomas Stuefe wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 11 commits: >> >> - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom >> - nullptr >> - merge and fix conflicts >> - Merge; fix conflicts >> - Feedback Gerard >> - Revert strchrnul >> - Copyrights >> - Feedback Johan >> - Merge >> - Merge branch 'master' into JDK-8293313-NMT-fake-oom >> - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd > > Thanks guys! > >> LGTM. >> >> Will there be a release note documenting the changed/removed XX-flags? Maybe there should be one. > > That is not necessary since this was a diagnostic switch. Also, the new format is backward compatible with the old one (so no behavioral change if someone had set it) > > Cheers, Thomas @tstuefe with regards to a release note ... while the flag is now diagnostic that is a change made in JDK 21, so in terms of users of JDK 20 migrating to JDK 21, a release note may be needed to both explain the flag is now diagnostic (and subject to change without notice), and that "none" is no longer a valid value. ------------- PR: https://git.openjdk.org/jdk/pull/11371 From dholmes at openjdk.org Fri Feb 17 03:17:04 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:17:04 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v3] In-Reply-To: References: Message-ID: > This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. > > I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. > > Thanks. David Holmes has updated the pull request incrementally with two additional commits since the last revision: - Merge branch '8280419-verifyThread' of github.com:dholmes-ora/jdk into 8280419-verifyThread - Fix copyright.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12506/files - new: https://git.openjdk.org/jdk/pull/12506/files/96f6b52e..3d3031ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12506&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12506&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12506.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12506/head:pull/12506 PR: https://git.openjdk.org/jdk/pull/12506 From dholmes at openjdk.org Fri Feb 17 03:17:17 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:17:17 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v2] In-Reply-To: <5LeM-ceP-wJRKbqN49Me5WuHImOlIWwj6v-pUuM62DY=.a3d23351-423f-4269-a376-73fe1e9221f7@github.com> References: <5LeM-ceP-wJRKbqN49Me5WuHImOlIWwj6v-pUuM62DY=.a3d23351-423f-4269-a376-73fe1e9221f7@github.com> Message-ID: On Thu, 16 Feb 2023 07:55:54 GMT, Thomas Stuefe wrote: > GHAs are not there though. @tstuefe , sorry the GHA was a manual run: https://github.com/dholmes-ora/jdk/actions/runs/4140195206 Thanks for the reviews @tstuefe and @RealLucy ! ------------- PR: https://git.openjdk.org/jdk/pull/12506 From dholmes at openjdk.org Fri Feb 17 03:17:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:17:35 GMT Subject: RFR: 8280419: Remove dead code related to VerifyThread and verify_thread() [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 08:02:12 GMT, Lutz Schmidt wrote: >> David Holmes has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: >> >> - Merge branch 'master' into 8280419-verifyThread >> - 8280419: Remove dead code related to VerifyThread and verify_thread() > > src/hotspot/cpu/s390/interp_masm_s390.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2016, 20223, Oracle and/or its affiliates. All rights reserved. 
> > Copyright year a bit far in the future Fixed ------------- PR: https://git.openjdk.org/jdk/pull/12506 From dholmes at openjdk.org Fri Feb 17 03:17:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 03:17:38 GMT Subject: Integrated: 8280419: Remove dead code related to VerifyThread and verify_thread() In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 02:12:59 GMT, David Holmes wrote: > This is code that was copied from the sparc port into the arm, ppc and s390 ports, but was never actually implemented. > > I can't really test those platforms, but hopefully GHA builds will suffice, otherwise I need the various porters to confirm the changes. > > Thanks. This pull request has now been integrated. Changeset: b242eef9 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/b242eef93e23ad2fce428e975a1b6c150cf6f17c Stats: 79 lines in 18 files changed: 0 ins; 62 del; 17 mod 8280419: Remove dead code related to VerifyThread and verify_thread() Reviewed-by: stuefe, lucy ------------- PR: https://git.openjdk.org/jdk/pull/12506 From sspitsyn at openjdk.org Fri Feb 17 04:43:30 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 04:43:30 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) PING! This is relatively simple update. The testing is clean and has not found any regressions. 
------------- PR: https://git.openjdk.org/jdk/pull/12550 From dholmes at openjdk.org Fri Feb 17 05:14:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 05:14:59 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) This looks a lot cleaner having the `JvmtiVTMSTransition_lock` be a normal safepoint-honouring monitor, but that said, it means there are now many places that can block for safepoints that previously wouldn't, and it is very hard to tell if that is okay or not. src/hotspot/share/prims/jvmtiEnv.cpp line 952: > 950: } > 951: // protect thread_oop as a safepoint can be reached in disabler destructor > 952: self_tobj = Handle(current, thread_oop); If you move this after line 953 then you can delete line 932 and just have: Handle self_tobj = Handle(current, thread_oop); ------------- PR: https://git.openjdk.org/jdk/pull/12550 From kbarrett at openjdk.org Fri Feb 17 05:41:59 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 17 Feb 2023 05:41:59 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v14] In-Reply-To: <-b783DPmWbWFeigKf7F7SFYddDKwErM4AdFcfRx01eM=.5794fa08-a0d7-4761-a449-8ebfd639e30d@github.com> References: <-b783DPmWbWFeigKf7F7SFYddDKwErM4AdFcfRx01eM=.5794fa08-a0d7-4761-a449-8ebfd639e30d@github.com> Message-ID: On Wed, 15 Feb 2023 15:39:14 GMT, Justin King wrote: >> Deduplicate byte swapping implementations by consolidating them into `utilities/byteswap.hpp`, following `std::byteswap` introduced in C++23.
Further simplification of `Bytes` will follow in https://github.com/openjdk/jdk/pull/12078. > > Justin King has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 24 additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into byteswap > - Update based on review > > Signed-off-by: Justin King > - Fix copyright > > Signed-off-by: Justin King > - Update copyright > > Signed-off-by: Justin King > - Add missing include > > Signed-off-by: Justin King > - Remove unused include > > Signed-off-by: Justin King > - Reorganize tests > > Signed-off-by: Justin King > - Fix test > > Signed-off-by: Justin King > - Merge remote-tracking branch 'upstream/master' into byteswap > - Be restrict on requiring 1, 2, 4, or 8 byte integers > > Signed-off-by: Justin King > - ... and 14 more: https://git.openjdk.org/jdk/compare/1c13bbfd...223d733b Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.org/jdk/pull/12114 From kbarrett at openjdk.org Fri Feb 17 05:42:01 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 17 Feb 2023 05:42:01 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Wed, 15 Feb 2023 15:34:41 GMT, Justin King wrote: >> src/hotspot/share/utilities/moveBits.hpp line 53: >> >>> 51: template ::value)> >>> 52: ALWAYSINLINE T reverse_bits(T x) { >>> 53: return ReverseBitsImpl{}(x); >> >> I think I would have preferred to have the change to bit reversal by separate from byteswap changes. >> With these changes, I also think this file should be renamed to reverse_bits.hpp or something like that. >> It was called moveBits because it contained both bit and byte swapping. 
> > Out of curiosity, why `reverse_bits.hpp` instead of `reverseBits.hpp`? I have seen a few files in `utilities/` use the former, but the rest of Hotspot seems to always use the latter? Is it because they contain a single function? > > I went with `reverse_bits.hpp` for now as suggested for consistency with the other bit fiddling headers in `utilities/`. The usual camel-case with lowercase start convention typically refers to a class name in the file. I'm not sure snake-case for files with a single function was a good idea, even though it might have been my fault. But it's what we're doing (at least for now). ------------- PR: https://git.openjdk.org/jdk/pull/12114 From dholmes at openjdk.org Fri Feb 17 06:30:19 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 17 Feb 2023 06:30:19 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 09:06:56 GMT, Jan Kratochvil wrote: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. Thanks for doing the benchmark. Could you please include it in the PR by adding to test/micro/org/openjdk/bench/vm/floating-point. The code still needs to be factored out into an x86-specific file e.g. src/hotspot/cpu/x86/sharedRuntime_x86.cpp. You can either name it `SharedRuntime::frem` and ifdef out the "shared" version; or else define `SharedRuntime::frem_x86` and call it from `SharedRuntime::frem` (similar to the Windows `fmod_winx64`). It should suffice to use X86 as the conditional rather than the compiler(?) macros `__x86_64_` and `__i386_`.
Thanks ------------- PR: https://git.openjdk.org/jdk/pull/12508 From stuefe at openjdk.org Fri Feb 17 07:10:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 17 Feb 2023 07:10:32 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Thu, 16 Feb 2023 16:12:12 GMT, Thomas Stuefe wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: >> >> - Merge branch 'openjdk:master' into JDK-8293313-NMT-fake-oom >> - nullptr >> - merge and fix conflicts >> - Merge; fix conflicts >> - Feedback Gerard >> - Revert strchrnul >> - Copyrights >> - Feedback Johan >> - Merge >> - Merge branch 'master' into JDK-8293313-NMT-fake-oom >> - ... and 1 more: https://git.openjdk.org/jdk/compare/98a392c4...2d02a5bd > > Thanks guys! > >> LGTM. >> >> Will there be a release note documenting the changed/removed XX-flags? Maybe there should be one. > > That is not necessary since this was a diagnostic switch. Also, the new format is backward compatible with the old one (so no behavioral change if someone had set it) > > Cheers, Thomas > @tstuefe with regards to a release note ... while the flag is now diagnostic that is a change made in JDK 21, so in terms of users of JDK 20 migrating to JDK 21, a release note may be needed to both explain the flag is now diagnostic (and subject to change without notice), and that "none" is no longer a valid value. I assume you mean JDK-8302129, right? The MetaspaceReclaimPolicy=none removal? Makes sense, I will do that. I opened https://bugs.openjdk.org/browse/JDK-8302708 to track that.
------------- PR: https://git.openjdk.org/jdk/pull/11371 From rehn at openjdk.org Fri Feb 17 07:25:17 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 07:25:17 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 19:56:42 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/runtime/deoptimization.cpp line 110: >> >>> 108: MutexLocker ml(CompiledMethod_lock, Mutex::_no_safepoint_check_flag); >>> 109: // If there is nothing to deopt _gen is the same as comitted. >>> 110: _gen = DeoptimizationScope::_committed_deopt_gen; >> >> Could we find a better name for the "_gen" member? If I'm reading this right, it's some kind of _required_gen, right? As in we need to have committed up until this number, or it isn't safe to continue. > > The comment from deoptimization.hpp is this: > > // The highest gen we need to execute/wait for > uint64_t _gen; > > so better names might be: `_highest_gen` or `_highest_required_gen` or Erik's > `_required_gen`. In any case, if you rename this field, please update the comment > in deoptimization.hpp to match... Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 17 07:25:23 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 07:25:23 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: <2kWWJCUwEsAKhx1DVvvj3auS_H98JngAThEH9_1ZMHA=.88c64a25-207f-458e-85a4-62e699392be4@github.com> References: <2kWWJCUwEsAKhx1DVvvj3auS_H98JngAThEH9_1ZMHA=.88c64a25-207f-458e-85a4-62e699392be4@github.com> Message-ID: On Thu, 16 Feb 2023 19:58:43 GMT, Daniel D. Daugherty wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. 
>> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > src/hotspot/share/runtime/deoptimization.cpp line 123: > >> 121: // If it's already marked but we still need it to be deopted. >> 122: if (cm->is_marked_for_deoptimization()) { >> 123: dependant(cm); > > nit typo: s/dependant/dependent/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.cpp line 139: > >> 137: } >> 138: >> 139: void DeoptimizationScope::dependant(CompiledMethod* cm) { > > nit typo: s/dependant/dependent/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.hpp line 59: > >> 57: DeoptimizationScope(); >> 58: ~DeoptimizationScope(); >> 59: // Mark a method, if already marked as dependant. > > nit typo: s/dependant/dependent/ Fixed, thanks! 
> src/hotspot/share/runtime/deoptimization.hpp line 61: > >> 59: // Mark a method, if already marked as dependant. >> 60: void mark(CompiledMethod* cm, bool inc_recompile_counts = true); >> 61: // Record this as a dependant method. > > nit typo: s/dependant/dependent/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.hpp line 65: > >> 63: >> 64: // Execute the deoptimization. >> 65: // Mmake the nmethods not entrant, stackwalks and patch return pcs and sets post call nops. > > nit typo: s/Mmake/Make/ Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 17 07:44:36 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 07:44:36 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: <2kWWJCUwEsAKhx1DVvvj3auS_H98JngAThEH9_1ZMHA=.88c64a25-207f-458e-85a4-62e699392be4@github.com> References: <2kWWJCUwEsAKhx1DVvvj3auS_H98JngAThEH9_1ZMHA=.88c64a25-207f-458e-85a4-62e699392be4@github.com> Message-ID: On Thu, 16 Feb 2023 20:17:29 GMT, Daniel D. Daugherty wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. 
>> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > src/hotspot/share/code/dependencyContext.cpp line 66: > >> 64: // Walk the list of dependent nmethods searching for nmethods which >> 65: // are dependent on the changes that were passed in and mark them for >> 66: // deoptimization. Returns the number of nmethods found. > > This part of the comment is now stale: Returns the number of nmethods found. Fixed, thanks! > src/hotspot/share/code/dependencyContext.cpp line 73: > >> 71: if (b->count() > 0) { >> 72: if (nm->is_marked_for_deoptimization()) { >> 73: deopt_scope->dependant(nm); > > nit typo: s/dependant/dependent/ Thanks, fixed. > src/hotspot/share/prims/methodHandles.cpp line 953: > >> 951: } >> 952: >> 953: void MethodHandles::flush_dependent_nmethods(DeoptimizationScope* deopt_scope, Handle call_site, Handle target) { > > Is `flush_dependent_nmethods` really the right name for this method? > Perhaps `mark_dependent_nmethods` is a better name. > > Yes, I know that this name predates this PR... Yes agreed, fixed, thanks! > src/hotspot/share/prims/methodHandles.cpp line 1334: > >> 1332: { >> 1333: NoSafepointVerifier nsv; >> 1334: MutexLocker mu2(THREAD, CodeCache_lock, Mutex::_no_safepoint_check_flag); > > Since we no longer have a `mu1`, the name shouldn't be `mu2`. > > Also, the outer braces on L1330 and L1341 are not needed anymore which > also means that the code from L1331 -> L1342 doesn't have to be indented > the extra level. Yes, fixed, thanks! 
> src/hotspot/share/runtime/deoptimization.cpp line 131: > >> 129: Atomic::store(&cm->_deoptimization_status, status); >> 130: >> 131: // Make sure active is not committted > > nit typo: s/committted/committed/ Thanks, fixed! > src/hotspot/share/runtime/deoptimization.cpp line 150: > >> 148: >> 149: void DeoptimizationScope::deoptimize_marked() { >> 150: assert(!_deopted, "Already deopt"); > > nit typo: s/deopt/deopted/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.cpp line 152: > >> 150: assert(!_deopted, "Already deopt"); >> 151: >> 152: // Safepoint are a special case, handled here. > > nit typo: s/Safepoint/Safepoints/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.cpp line 190: > >> 188: } else { >> 189: // Performs the handshake. >> 190: Deoptimization::deoptimize_all_marked(); // May safepoint and an additional deopt may have occured. > > nit typo: s/occured/occurred/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.cpp line 195: > >> 193: MutexLocker ml(CompiledMethod_lock->owned_by_self() ? nullptr : CompiledMethod_lock, >> 194: Mutex::_no_safepoint_check_flag); >> 195: // Make sure that committed don't go backwards. > > nit typo: s/don't/doesn't/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.hpp line 49: > >> 47: // What gen to mark a method with, hence larger than _committed_deopt_gen. >> 48: static uint64_t _active_deopt_gen; >> 49: // Indicate a in-progress deopt handshake. > > nit typo: s/a in-progress/an in-progress/ Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From duke at openjdk.org Fri Feb 17 07:55:19 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Fri, 17 Feb 2023 07:55:19 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 14:25:15 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue by modifying `MethodHandles` code. 
> > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Implement addptr suggestion by @JohnVernee and @reinrich There also errors on Linux, e.g. `vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.java`: - stepBreakPopReturn.cpp, 57: Setting debuggee class name to Lvm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test; - stepBreakPopReturn.cpp, 86: Entering method: Lvm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test;.setDebuggeeMethodName - stepBreakPopReturn.cpp, 51: Setting debuggee method name to target ### TRACE 0: Call site 1 - stepBreakPopReturn.cpp, 86: Entering method: Lvm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test;.bootstrap ### TRACE 0: Lookup vm.mlvm.indy.func.jvmti.stepBreakPopReturn.INDIFY_Test; method name = greet; method type = (Object,String,int)int - stepBreakPopReturn.cpp, 86: Entering method: Lvm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test;.target - stepBreakPopReturn.cpp, 115: Single step event: Lvm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test; .target :1 - stepBreakPopReturn.cpp, 128: Pop a frame # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x0000000000000315, pid=41551, tid=41572 # # JRE version: OpenJDK Runtime Environment (21.0) (fastdebug build 21-internal-adhoc.openjdk.jdk-dev) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 21-internal-adhoc.openjdk.jdk-dev, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # C 0x00000000e755a950 # # Core dump will be written. 
Default location: /priv/jvmtests/output_openjdk21_dev_dbgU_linuxx86_64/jtreg_hotspot_tier4_work/JTwork/scratch/0 # # An error report file with more information is saved as: # /priv/jvmtests/output_openjdk21_dev_dbgU_linuxx86_64/jtreg_hotspot_tier4_work/JTwork/scratch/hs_err_pid41551.log # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # So back to my original suggestion which is simple and worked (ran it through our internal test queue and found no errors)? ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rrich at openjdk.org Fri Feb 17 08:21:17 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 17 Feb 2023 08:21:17 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: <5xO0svcjCbnRvfhW7LCQMYJHaIioUptiuBgrld_DPYE=.a975b1e5-45b8-42ed-975b-64b474955f80@github.com> On Thu, 16 Feb 2023 14:25:15 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue by modifying `MethodHandles` code. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Implement addptr suggestion by @JohnVernee and @reinrich Java stack # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x0000000000000315, pid=41551, tid=41572 # # JRE version: OpenJDK Runtime Environment (21.0) (fastdebug build 21-internal-adhoc.openjdk.jdk-dev) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 21-internal-adhoc.openjdk.jdk-dev, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # C 0x00000000e755a950 # # Core dump will be written. 
Default location: .../jtreg_hotspot_tier4_work/JTwork/scratch/0 # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # --------------- S U M M A R Y ------------ Command Line: -Dtest.vm.opts=-Xmx768m -Djava.awt.headless=true -Djava.util.prefs.userRoot=.../jtreg_hotspot_tier4_work/tmp -Djava.io.tmpdir=.../jtreg_hotspot_tier4_work/tmp -Dtest.getfreeport.max.tries=40 -ea -esa -Dtest.tool.vm.opts=-J-Xmx768m -J-Djava.awt.headless=true -J-Djava.util.prefs.userRoot=.../jtreg_hotspot_tier4_work/tmp -J-Djava.io.tmpdir=.../jtreg_hotspot_tier4_work/tmp -J-Dtest.getfreeport.max.tries=40 -J-ea -J-esa -Dtest.compiler.opts= -Dtest.java.opts= -Dtest.jdk=.../sapjvm_21 -Dcompile.jdk=.../sapjvm_21 -Dtest.timeout.factor=6.0 -Dtest.nativepath=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/native -Dtest.root=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg -Dtest.name=vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.java -Dtest.file=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.java -Dtest.src=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/ vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn -Dtest.src.path=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn:.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/vmTestbase:.../grmpf/testdata/jtreg/jtreg_test_21/test/lib -Dtest.classes=.../jtreg_hotspot_tier4_work/JTwork/classes/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.d -Dtest.class.path=.../jtreg_hotspot_tier4_work/JTwork/classes/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.d:.../jtreg_hotspot_tier4_work/JTwork/classes/vmTestbase:.../jtreg_hotspot_tier4_work/JTwork/classes/test/lib 
-Dtest.class.path.prefix=.../jtreg_hotspot_tier4_work/JTwork/classes/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.d:.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn:.../jtreg_hotspot_tier4_work/JTwork/classes/vmTestbase:.../jtreg_hotspot_tier4_work/ JTwork/classes/test/lib -Xmx768m -Djava.awt.headless=true -Djava.util.prefs.userRoot=.../jtreg_hotspot_tier4_work/tmp -Djava.io.tmpdir=.../jtreg_hotspot_tier4_work/tmp -Dtest.getfreeport.max.tries=40 -ea -esa -Djava.library.path=.../grmpf/testdata/jtreg/jtreg_test_21/test/hotspot/jtreg/native -agentlib:stepBreakPopReturn=verbose= com.sun.javatest.regtest.agent.MainWrapper .../jtreg_hotspot_tier4_work/JTwork/vmTestbase/vm/mlvm/indy/func/jvmti/stepBreakPopReturn/INDIFY_Test.d/main.0.jta Host: ..., Intel(R) Xeon(R) Platinum 8260M CPU @ 2.40GHz, 8 cores, 23G, SUSE Linux Enterprise Server 15 SP3 Time: Fri Feb 17 01:34:06 2023 CET elapsed time: 2.781186 seconds (0d 0h 0m 2s) --------------- T H R E A D --------------- Current thread (0x00007fbc94422600): JavaThread "MainThread" [_thread_in_Java, id=41572, stack(0x00007fbc6af26000,0x00007fbc6b027000)] Stack: [0x00007fbc6af26000,0x00007fbc6b027000], sp=0x00007fbc6b025118, free space=1020k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) C 0x00000000e755a950 j java.lang.invoke.LambdaForm$MH+0x0000000801018400.linkToTargetMethod(Ljava/lang/Object;Ljava/lang/Object;ILjava/lang/Object;)I+7 java.base at 21-internal j vm.mlvm.indy.func.jvmti.stepBreakPopReturn.INDIFY_Test.run()Z+52 j vm.mlvm.share.MlvmTestExecutor.runMlvmTestInstance(Lvm/mlvm/share/MlvmTest;)Z+82 j vm.mlvm.share.MlvmTestExecutor.runMlvmTest(Ljava/lang/Class;[Ljava/lang/Object;)Z+21 j vm.mlvm.share.MlvmTestExecutor.launch(Ljava/lang/Class;[Ljava/lang/Object;)V+28 j vm.mlvm.share.MlvmTestExecutor.launch([Ljava/lang/Object;)V+20 j 
vm.mlvm.share.MlvmTestExecutor.launch([Ljava/lang/String;[Ljava/lang/Object;)V+5 j vm.mlvm.share.MlvmTest.launch([Ljava/lang/String;)V+2 j vm.mlvm.indy.func.jvmti.stepBreakPopReturn.INDIFY_Test.main([Ljava/lang/String;)V+1 j java.lang.invoke.LambdaForm$DMH+0x0000000801001800.invokeStatic(Ljava/lang/Object;Ljava/lang/Object;)V+10 java.base at 21-internal j java.lang.invoke.LambdaForm$MH+0x0000000801002c00.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+33 java.base at 21-internal j java.lang.invoke.Invokers$Holder.invokeExact_MT(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+20 java.base at 21-internal j jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+55 java.base at 21-internal j jdk.internal.reflect.DirectMethodHandleAccessor.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+23 java.base at 21-internal j java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+102 java.base at 21-internal j com.sun.javatest.regtest.agent.MainWrapper$MainThread.run()V+134 j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base at 21-internal j java.lang.Thread.run()V+19 java.base at 21-internal v ~StubRoutines::call_stub 0x00007fbc8417fd21 V [libjvm.so+0xeec41a] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x4da (javaCalls.cpp:415) V [libjvm.so+0xeeca90] JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, JavaThread*)+0x290 (javaCalls.cpp:329) ------------- PR: https://git.openjdk.org/jdk/pull/12535 From sspitsyn at openjdk.org Fri Feb 17 08:26:20 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 08:26:20 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v5] In-Reply-To: References: 
<7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Thu, 16 Feb 2023 16:59:05 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. 
For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: > > - restore state should be only for VirtualThreads > - add suspend check test/hotspot/jtreg/serviceability/jvmti/vthread/BoundVThreadTest/libBoundVThreadTest.cpp line 35: > 33: static int vthread_start_count = 0; > 34: static int vthread_end_count = 0; > 35: static bool passed = JNI_TRUE; Nit: It is better to rename: passed => status. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From sspitsyn at openjdk.org Fri Feb 17 08:32:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 08:32:18 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v5] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: <9wwOk3VkUENUkmqGVBeprYDaaFs_I1Ofv21DvwwWQms=.c79c56dc-f373-499f-ab63-c1959ff039ee@github.com> On Thu, 16 Feb 2023 16:59:05 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). 
Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: > > - restore state should be only for VirtualThreads > - add suspend check The fix looks good to me. 
Thank you for the update. I'm not sure if you wanted to get rid of `is_bound_vthread` though. Just want to know your intention. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12512 From sspitsyn at openjdk.org Fri Feb 17 08:47:27 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 08:47:27 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v3] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Thu, 16 Feb 2023 17:03:32 GMT, Patricio Chilano Mateo wrote: > Yes, the elist comment was related to the other issue. The reason why I didn't explicitly add this check is because it will be done in the suspend/resume call already (while holding the handshake lock which serializes the suspend/resume requests). So inside the suspend handshake for example either the target is suspended, so nothing is done, or the target is not suspended and we suspend it. Is there an actual issue I'm missing or is this to make a fast check and avoid the call? In any case I added the check. Also I see the specs for the single suspend/resume functions say the same thing about not changing the state and, for example, in SuspendThreadList we are also not making an explicit check before the call to suspend_thread(). So if there is an actual issue here, don't we need to add that check for them too? I agree with your analysis. And you are right about `SuspendThreadList` as well. `SuspendThreadList` also returns an error code for each suspend in the result array. It'd be nice to add a check there as well to make it consistent, but I'd suggest being conservative here and leaving it as it is now.
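For illustration only (this is not HotSpot or JVMTI code): a minimal single-threaded sketch of the behavior discussed above, where the "already suspended" check and the state change happen in one step and a list operation reports one result per element. `ToyThread`, `SuspendResult`, `suspend_one` and `suspend_list` are hypothetical names invented for this sketch; the handshake lock that serializes real suspend/resume requests is not modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result codes; the real API uses jvmtiError values.
enum class SuspendResult { Ok, AlreadySuspended };

// Hypothetical stand-in for a target thread's suspend state.
struct ToyThread {
  bool suspended = false;
};

// The check and the state change live in the same step, so a caller
// does not need a separate "is it suspended?" test before calling.
inline SuspendResult suspend_one(ToyThread& t) {
  if (t.suspended) {
    return SuspendResult::AlreadySuspended;  // state is left unchanged
  }
  t.suspended = true;
  return SuspendResult::Ok;
}

// List variant: one result per request, in request order.
inline std::vector<SuspendResult> suspend_list(const std::vector<ToyThread*>& threads) {
  std::vector<SuspendResult> results;
  results.reserve(threads.size());
  for (ToyThread* t : threads) {
    results.push_back(suspend_one(*t));
  }
  return results;
}
```

Because each element gets its own result, a request for an already suspended target shows up in the result array rather than as a failure of the whole call, which is the property the message above relies on.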
------------- PR: https://git.openjdk.org/jdk/pull/12512 From rehn at openjdk.org Fri Feb 17 09:02:22 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 09:02:22 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 21:40:14 GMT, Daniel D. Daugherty wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > src/hotspot/share/code/codeCache.cpp line 1419: > >> 1417: void CodeCache::flush_dependents_on_method_for_breakpoint(const methodHandle& m_h) { >> 1418: // --- Compile_lock is not held. 
However we are at a safepoint. >> 1419: assert_locked_or_safepoint(Compile_lock); > > Not your bug, but this comment doesn't make sense with respect to > the `assert_locked_or_safepoint(Compile_lock)` call. Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 17 09:25:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 09:25:55 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: Message-ID: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! 
t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Review fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12585/files - new: https://git.openjdk.org/jdk/pull/12585/files/558daa45..22dbc2b4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=00-01 Stats: 43 lines in 8 files changed: 1 ins; 4 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 17 09:28:23 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 17 Feb 2023 09:28:23 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 16:04:20 GMT, Erik Österlund wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > Changes requested by eosterlund (Reviewer). Thanks @fisk and @dcubed-ojdk for having a look! Please see the update when you get the chance. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From jsjolen at openjdk.org Fri Feb 17 09:44:28 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 17 Feb 2023 09:44:28 GMT Subject: RFR: JDK-8301481: Replace NULL with nullptr in os/windows [v5] In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 10:06:22 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Changed to zero All ~changes~ *checks* successful and two approvals, integrating this. ------------- PR: https://git.openjdk.org/jdk/pull/12317 From jsjolen at openjdk.org Fri Feb 17 09:44:30 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 17 Feb 2023 09:44:30 GMT Subject: Integrated: JDK-8301481: Replace NULL with nullptr in os/windows In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 10:16:14 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory os/windows. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. 
We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here; we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated. Changeset: c91cd281 Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/c91cd2814baa8dee2af8af0fecf9185d4a0a44cf Stats: 652 lines in 21 files changed: 0 ins; 0 del; 652 mod 8301481: Replace NULL with nullptr in os/windows Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/12317 From goetz at openjdk.org Fri Feb 17 10:43:54 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Fri, 17 Feb 2023 10:43:54 GMT Subject: RFR: JDK-8300080: offset_of for GCC/Clang exhibits undefined behavior and is not always a compile-time constant [v4] In-Reply-To: References: Message-ID: On Fri, 13 Jan 2023 16:06:44 GMT, Justin King wrote: >> The implementation of `offset_of` for GCC/Clang only deals with types that are aligned to 16 bytes or less; if the alignment is larger, as for `zCollectedHeap`, the behavior is undefined. UBSan also suggests that `offset_of` is not always a compile-time constant, as the stack trace came from the dynamic loader during library loading. This patch changes `offset_of` to use `offsetof` and disables the warning `invalid-offsetof` for the JVM. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Move attribute on lambda to correct location > > Signed-off-by: Justin King @kimbarret, is this ready to be sponsored?
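For illustration only (this is not the patch under review): a minimal sketch of why the standard `offsetof` macro is attractive here. It yields an integral constant expression, so it can be checked with `static_assert`, and it does not depend on the type's alignment. `OverAligned`, `payload_offset` and `offsets_match_addresses` are hypothetical names; the struct is deliberately standard-layout, so the `invalid-offsetof` warning mentioned above does not come into play in this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical over-aligned type, standing in for a class whose
// alignment exceeds 16 bytes. Not taken from HotSpot.
struct alignas(64) OverAligned {
  char tag;
  long payload;
  int  count;
};

// offsetof is usable in constant expressions, so these checks are
// evaluated by the compiler rather than at library load time.
static_assert(alignof(OverAligned) == 64, "type is over-aligned");
static_assert(offsetof(OverAligned, tag) == 0, "first member sits at offset 0");
static_assert(offsetof(OverAligned, payload) >= sizeof(char), "payload follows tag");

// A constexpr wrapper, showing the value is available at compile time.
constexpr std::size_t payload_offset() {
  return offsetof(OverAligned, payload);
}

// Run-time cross-check against the address of a real object.
inline bool offsets_match_addresses() {
  OverAligned obj{};
  const char* base  = reinterpret_cast<const char*>(&obj);
  const char* field = reinterpret_cast<const char*>(&obj.payload);
  return static_cast<std::size_t>(field - base) == payload_offset();
}
```

The run-time helper is only a sanity check; the point of the sketch is that the offset is already known to the compiler.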
------------- PR: https://git.openjdk.org/jdk/pull/11978 From jsjolen at openjdk.org Fri Feb 17 11:37:50 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 17 Feb 2023 11:37:50 GMT Subject: RFR: JDK-8301494: Replace NULL with nullptr in cpu/arm [v2] In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 10:09:05 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Fix dholmes' comment Passes tier1, integrating.
------------- PR: https://git.openjdk.org/jdk/pull/12322 From jsjolen at openjdk.org Fri Feb 17 11:37:53 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 17 Feb 2023 11:37:53 GMT Subject: Integrated: JDK-8301494: Replace NULL with nullptr in cpu/arm In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:39:38 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/arm. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated.
Changeset: c4ffe4bf Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/c4ffe4bf6369d5b271aa8689b8648f3fe8dcabed Stats: 314 lines in 38 files changed: 0 ins; 0 del; 314 mod 8301494: Replace NULL with nullptr in cpu/arm Reviewed-by: dholmes, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/12322 From rrich at openjdk.org Fri Feb 17 13:25:57 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 17 Feb 2023 13:25:57 GMT Subject: RFR: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 20:30:00 GMT, Richard Reingruber wrote: >> This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. >> >> A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth executing all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. >> >> Before this BasicExt.java was refactored for better argument processing and representation of the test modes. >> Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. >> >> Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. > > Richard Reingruber has updated the pull request incrementally with three additional commits since the last revision: > > - Improve comment > - Improve comment > - Spelling Thanks for the reviews!
------------- PR: https://git.openjdk.org/jdk/pull/12557 From rrich at openjdk.org Fri Feb 17 13:43:27 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 17 Feb 2023 13:43:27 GMT Subject: Integrated: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 14:10:08 GMT, Richard Reingruber wrote: > This fixes the linked issue by trimming the caller of a frame to be deoptimized back to its `unextended_sp` iff it is compiled. The creation of the section `dead after deoptimization` shown in the attachment [yield_after_deopt_failure.log](https://bugs.openjdk.org/secure/attachment/102602/yield_after_deopt_failure.log) is prevented by this. > > A new mode is added to the test BasicExt.java where all frames are deoptimized after a yield operation. The issue can be deterministically reproduced with the new mode. It's not worth to execute all test cases with the new mode though. Instead `ContinuationCompiledFramesWithStackArgs_3c4` is always executed a 2nd time in this mode. > > Before this BasicExt.java was refactored for better argument processing and representation of the test modes. > Also the try-catch-clause in the main method had to be changed to rethrow the caught exception because without this the test would have succeeded. > > Testing: jtreg tests tier 1-4 on standard platforms and also on ppc64le. This pull request has now been integrated. 
Changeset: b8c9d6cd Author: Richard Reingruber URL: https://git.openjdk.org/jdk/commit/b8c9d6cdf60ea5e680eb00d5c01a1c4d2ed04006 Stats: 207 lines in 3 files changed: 152 ins; 13 del; 42 mod 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false Reviewed-by: goetz, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/12557 From jwaters at openjdk.org Fri Feb 17 14:08:34 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 17 Feb 2023 14:08:34 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: > DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'openjdk:master' into patch-6 - Error: Failed to load ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12596/files - new: https://git.openjdk.org/jdk/pull/12596/files/43afc03a..5f4370f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=00-01 Stats: 4003 lines in 132 files changed: 2148 ins; 635 del; 1220 mod Patch: https://git.openjdk.org/jdk/pull/12596.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12596/head:pull/12596 PR: https://git.openjdk.org/jdk/pull/12596 From jvernee at openjdk.org Fri Feb 17 14:18:50 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 17 Feb 2023 14:18:50 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 14:25:15 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue by modifying `MethodHandles` code. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Implement addptr suggestion by @JohnVernee and @reinrich Thanks for trying it out! I ran tier 1-4 here as well, and all failing tests are in `vmTestbase/vm/mlvm/indy/func/jvmti`. They all seem to be using `PopFrame` before a crash happens, so I think that might be where the problem is. I don't think it's impossible that there is other code out there that has its own workaround for the incorrect unextended_sp. Also, I later realized that interpreter frames also save the sp before making a call (`interpreter_frame_last_sp_offset`), and I can find at least some code that seems to assume `caller.unextended_sp() == (intptr_t*)caller.at(frame::interpreter_frame_last_sp_offset)`, so we'd just be breaking another invariant with my "fix". 
There's probably a bit of tech debt here (filed: https://bugs.openjdk.org/browse/JDK-8302745). I had a closer look at the original fix, and it looks like the `unextended_sp` is only used to check the size of the caller frame for interpreter -> interpreter calls. So, I think it's okay to relax the `unextended_sp >= sp` check (the invariant it tests turns out not to hold for otherwise valid stacks anyway). But, in that case, please leave a comment that explains why we can have `sp > unextended_sp`. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From jcking at openjdk.org Fri Feb 17 14:19:20 2023 From: jcking at openjdk.org (Justin King) Date: Fri, 17 Feb 2023 14:19:20 GMT Subject: RFR: JDK-8300783: Consolidate byteswap implementations [v13] In-Reply-To: References: <4AfWUfBaHKZwVsm6XhCkKc0krgKgWTw1RVI2R1-MdcM=.48008df7-1411-42d6-aeb1-3276f443dfe3@github.com> Message-ID: On Fri, 17 Feb 2023 05:38:41 GMT, Kim Barrett wrote: >> Out of curiosity, why `reverse_bits.hpp` instead of `reverseBits.hpp`? I have seen a few files in `utilities/` use the former, but the rest of Hotspot seems to always use the latter? Is it because they contain a single function? >> >> I went with `reverse_bits.hpp` for now as suggested for consistency with the other bit fiddling headers in `utilities/`. > > The usual camel-case with lowercase start convention typically refers to a class name in the file. > I'm not sure snake-case for files with a single function was a good idea, even though it might have > been my fault. But it's what we're doing (at least for now). Works for me, just want to make sure I am consistent with the rest. 
------------- PR: https://git.openjdk.org/jdk/pull/12114 From duke at openjdk.org Fri Feb 17 14:29:30 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Fri, 17 Feb 2023 14:29:30 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:00:45 GMT, Jorn Vernee wrote: > But, in that case, please leave a comment that explains why we can have sp > unextended_sp. Could you be so kind to synthesize a comment for me? I'm slightly overwhelmed by the discussion. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From jvernee at openjdk.org Fri Feb 17 14:29:31 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 17 Feb 2023 14:29:31 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:13:04 GMT, Johannes Bechberger wrote: > Could you be so kind to synthesize a comment for me? I'm slightly overwhelmed by the discussion. Something like: // Note: sp can be greater than unextended_sp in the case of // interpreted -> interpreted calls that go through a method handle linker, // since those pop the last argument (the appendix) from the stack. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From tholenstein at openjdk.org Fri Feb 17 14:45:55 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 17 Feb 2023 14:45:55 GMT Subject: RFR: JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes Message-ID: When printing bytecode and method data with `-XX:+CIPrintMethodCodes` we got some lines with missing whitespace: ``` 11760bci: 1895VirtualCallData count(0) nonprofiled_count(0) entries(0) ``` The reason for this is the use of `st->fill_to(6);` which fills up the stream with whitespace up to length 6. In our case we have 2 whitespaces before and then a numbers >=1000 which already adds up to 6+ characters in the stream. 
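The `fill_to` behaviour described above can be illustrated with a minimal sketch. It assumes a simplified stand-in for a HotSpot output stream: `MiniStream` and `render_prefix` are hypothetical names made up for this illustration, not HotSpot code, and the two leading spaces and the column width of 6 are taken from the description in the message.

```cpp
#include <string>

// Hypothetical stand-in for a HotSpot output stream; tracks a single line only.
struct MiniStream {
    std::string buf;

    void print(const std::string& s) { buf += s; }

    // Pad with spaces until the current column reaches `col`.
    // A no-op when the line is already at or past `col`.
    void fill_to(std::string::size_type col) {
        while (buf.size() < col) buf += ' ';
    }
};

// Render the start of one entry: two leading spaces, the data index, then "bci:".
// With `always_space` set, one extra space follows the padding, as in the fix.
std::string render_prefix(int data_index, bool always_space) {
    MiniStream st;
    st.print("  " + std::to_string(data_index));
    st.fill_to(6);                    // no effect once the index has 4+ digits
    if (always_space) st.print(" ");  // guarantees a separator in every case
    st.print("bci:");
    return st.buf;
}
```

In this sketch, `render_prefix(11760, false)` yields `  11760bci:` because the line is already 7 characters long when `fill_to(6)` runs, while `render_prefix(11760, true)` yields `  11760 bci:`.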
**Solution**: Always add a whitespace afterwards. Now
0 bci: 2 VirtualCallData count(0) nonprofiled_count(0) entries(0)
48 bci: 19 VirtualCallData count(0) nonprofiled_count(0) entries(0)
552 bci: 109 VirtualCallData count(0) nonprofiled_count(0) entries(0)
7320 bci: 1207 VirtualCallData count(0) nonprofiled_count(0) entries(0)
11760 bci: 1895 VirtualCallData count(0) nonprofiled_count(0) entries(0)
------------- Commit messages: - JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes Changes: https://git.openjdk.org/jdk/pull/12591/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12591&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302656 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12591.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12591/head:pull/12591 PR: https://git.openjdk.org/jdk/pull/12591 From duke at openjdk.org Fri Feb 17 14:46:54 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 17 Feb 2023 14:46:54 GMT Subject: Integrated: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai wrote: > Instruction pmull and pmull2 support operating on 64-bit data in Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 using pmull could perform better than using crc32 instruction. > > The following test has passed. > test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java > > The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32.java) is measured on an EC2 c7g instance. The optimization shows 11 - 99% improvement when the input is at least 384 bytes. 
>
> | input | 64 | 128 | 256 | 384 | 511 | 512 | 1,024 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 0.02% | 0.02% | 0.00% | 16.00% | 11.94% | 34.75% | 69.80% |
>
> | input | 2,048 | 4,096 | 8,192 | 16,384 | 32,768 | 65,536 |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement | 77.61% | 92.33% | 95.98% | 97.95% | 99.33% | 98.36% |
>
>
> Baseline
>
> TestCRC32.testCRC32Update 64 thrpt 12 173126.358 ± 118.330 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112910.118 ± 47.305 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66601.990 ± 7.294 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 47229.319 ± 3.949 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 33733.119 ± 4.076 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 36584.565 ± 4.211 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 19239.083 ± 1.040 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 9875.652 ± 0.435 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 5004.425 ± 0.290 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 2519.185 ± 0.169 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 1263.909 ± 0.194 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 632.018 ± 0.053 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 315.471 ± 0.095 ops/ms
>
>
> Crypto pmull
>
> TestCRC32.testCRC32Update 64 thrpt 12 173168.669 ± 4.746 ops/ms
> TestCRC32.testCRC32Update 128 thrpt 12 112933.519 ± 4.583 ops/ms
> TestCRC32.testCRC32Update 256 thrpt 12 66602.462 ± 3.150 ops/ms
> TestCRC32.testCRC32Update 384 thrpt 12 54784.739 ± 2.110 ops/ms
> TestCRC32.testCRC32Update 511 thrpt 12 37760.816 ± 69.911 ops/ms
> TestCRC32.testCRC32Update 512 thrpt 12 49297.609 ± 21.983 ops/ms
> TestCRC32.testCRC32Update 1024 thrpt 12 32667.507 ± 90.610 ops/ms
> TestCRC32.testCRC32Update 2048 thrpt 12 17539.986 ± 511.416 ops/ms
> TestCRC32.testCRC32Update 4096 thrpt 12 9625.249 ± 9.713 ops/ms
> TestCRC32.testCRC32Update 8192 thrpt 12 4937.135 ± 6.121 ops/ms
> TestCRC32.testCRC32Update 16384 thrpt 12 2501.936 ± 1.270 ops/ms
> TestCRC32.testCRC32Update 32768 thrpt 12 1259.831 ± 0.119 ops/ms
> TestCRC32.testCRC32Update 65536 thrpt 12 625.773 ± 0.242 ops/ms
This pull request has now been integrated. Changeset: 57fde75b Author: Yi-Fan Tsai Committer: Volker Simonis URL: https://git.openjdk.org/jdk/commit/57fde75b2a9d853c2abe1396ace6a83d198dd284 Stats: 215 lines in 5 files changed: 213 ins; 0 del; 2 mod 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64 Reviewed-by: simonis ------------- PR: https://git.openjdk.org/jdk/pull/12480 From duke at openjdk.org Fri Feb 17 14:47:56 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Fri, 17 Feb 2023 14:47:56 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: References: Message-ID: > Extends the existing AsyncGetCallTrace test case and fixes the issue. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Revert previous change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12535/files - new: https://git.openjdk.org/jdk/pull/12535/files/e63ab44a..da706d3b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=01-02 Stats: 9 lines in 2 files changed: 3 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Fri Feb 17 14:47:57 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Fri, 17 Feb 2023 14:47:57 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v2] In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 14:25:15 GMT, Johannes Bechberger wrote: >> Extends the existing
AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Implement addptr suggestion by @JohnVernee and @reinrich Thanks, I modified my PR accordingly. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From luhenry at openjdk.org Fri Feb 17 15:24:26 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 17 Feb 2023 15:24:26 GMT Subject: RFR: 8302776: Fix typo CSR_INSTERT to CSR_INSTRET Message-ID: <1nO3RE1e_4wr81-UlV-WXIEERv6vyYswpijNwVYBsYA=.d71b98a8-9d84-4728-98d4-5e970b55fbaa@github.com> 8302776: Fix typo CSR_INSTERT to CSR_INSTRET ------------- Commit messages: - 8302776: Fix typo CSR_INSTERT to CSR_INSTRET Changes: https://git.openjdk.org/jdk/pull/12616/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12616&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302776 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12616.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12616/head:pull/12616 PR: https://git.openjdk.org/jdk/pull/12616 From pchilanomate at openjdk.org Fri Feb 17 15:30:51 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 17 Feb 2023 15:30:51 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v6] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). 
Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I ran the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: remove _is_bound_vthread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/3cb32734..d8f52713 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=04-05 Stats: 28 lines in 6 files changed: 0 ins; 20 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Fri Feb 17 15:36:25 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 17 Feb 2023 15:36:25 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v5] In-Reply-To: <9wwOk3VkUENUkmqGVBeprYDaaFs_I1Ofv21DvwwWQms=.c79c56dc-f373-499f-ab63-c1959ff039ee@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> <9wwOk3VkUENUkmqGVBeprYDaaFs_I1Ofv21DvwwWQms=.c79c56dc-f373-499f-ab63-c1959ff039ee@github.com> Message-ID: On Fri, 17 Feb 2023 08:29:53 GMT, Serguei Spitsyn wrote: > The fix looks good to me. Thank you for the update. I'm not sure if you wanted to get rid of `is_bound_vthread` though. Just want to know you intention. Thanks, Serguei > Great. Yes, I just removed is_bound_vthread, please take a look when you have time. I'll also re-run all testing again before integration. Thanks for the review Serguei! 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Fri Feb 17 15:48:02 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 17 Feb 2023 15:48:02 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v5] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 17 Feb 2023 08:23:13 GMT, Serguei Spitsyn wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision: >> >> - restore state should be only for VirtualThreads >> - add suspend check > > test/hotspot/jtreg/serviceability/jvmti/vthread/BoundVThreadTest/libBoundVThreadTest.cpp line 35: > >> 33: static int vthread_start_count = 0; >> 34: static int vthread_end_count = 0; >> 35: static bool passed = JNI_TRUE; > > Nit: It is better to rename: passed => status. Fixed. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Fri Feb 17 16:07:50 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 17 Feb 2023 16:07:50 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). 
Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 were under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I ran the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: s/passed/status ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12512/files - new: https://git.openjdk.org/jdk/pull/12512/files/d8f52713..6b3ec24e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12512&range=05-06 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/12512.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12512/head:pull/12512 PR: https://git.openjdk.org/jdk/pull/12512 From epeter at openjdk.org Fri Feb 17 16:13:51 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 17 Feb 2023 16:13:51 GMT Subject: RFR: 8302146: Move TestOverloadCompileQueues.java to tier3 Message-ID: This test is expected to run long, as it has to fill the compile queue. But it does not need to run at a low tier. We just want to run it occasionally. Moved it to `tier3`, by adding it to `hotspot_slow_compiler`, which is excluded from `tier1` and `tier2`. ------------- Commit messages: - 8302146: Move TestOverloadCompileQueues.java to tier3 Changes: https://git.openjdk.org/jdk/pull/12617/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12617&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302146 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12617.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12617/head:pull/12617 PR: https://git.openjdk.org/jdk/pull/12617 From epeter at openjdk.org Fri Feb 17 16:50:50 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 17 Feb 2023 16:50:50 GMT Subject: RFR: 8302144: Move ZeroTLABTest.java to tier3 Message-ID: This is a "Hello World" test for the flags `UseTLAB` and `ZeroTLAB`, running with `-Xcomp`. It takes about 30 seconds. This test does not need to run on `tier1`. So I pushed it out to `tier3`. 
Moved it to tier3, by adding it to hotspot_slow_compiler, which is excluded from tier1 and tier2. ------------- Commit messages: - 8302144: Move ZeroTLABTest.java to tier3 Changes: https://git.openjdk.org/jdk/pull/12620/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12620&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302144 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12620.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12620/head:pull/12620 PR: https://git.openjdk.org/jdk/pull/12620 From mchung at openjdk.org Fri Feb 17 17:25:35 2023 From: mchung at openjdk.org (Mandy Chung) Date: Fri, 17 Feb 2023 17:25:35 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:08:34 GMT, Julian Waters wrote: >> DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - Error: Failed to load This change is fine. `expandArgFile` reports `DLL_ERROR4` when it fails to read `@arg-file` which is unrelated to shared libraries. Would you mind also adding `ARG_ERROR18` to replace `DLL_ERROR4`? 
------------- PR: https://git.openjdk.org/jdk/pull/12596 From kvn at openjdk.org Fri Feb 17 18:11:46 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 17 Feb 2023 18:11:46 GMT Subject: RFR: 8302146: Move TestOverloadCompileQueues.java to tier3 In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 15:08:38 GMT, Emanuel Peter wrote: > This test is expected to run long, as it has to fill the compile queue. But it does not need to run at a low tier. We just want to run it occasionally. > > Moved it to `tier3`, by adding it to `hotspot_slow_compiler`, which is excluded from `tier1` and `tier2`. Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12617 From kvn at openjdk.org Fri Feb 17 18:11:49 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 17 Feb 2023 18:11:49 GMT Subject: RFR: JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 13:39:58 GMT, Tobias Holenstein wrote: > When printing bytecode and method data with `-XX:+CIPrintMethodCodes` we got some lines with missing whitespace: > ``` > 11760bci: 1895VirtualCallData count(0) nonprofiled_count(0) entries(0) > ``` > > The reason for this is the use of `st->fill_to(6);` which fills up the stream with whitespace up to length 6. In our case we have 2 whitespaces before and then a numbers >=1000 which already adds up to 6+ characters in the stream. > > **Solution**: Always add a whitespace afterwards. Now > > > 0 bci: 2 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 48 bci: 19 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 552 bci: 109 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 7320 bci: 1207 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 11760 bci: 1895 VirtualCallData count(0) nonprofiled_count(0) entries(0) Good and trivial. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12591 From mseledtsov at openjdk.org Fri Feb 17 18:32:04 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Fri, 17 Feb 2023 18:32:04 GMT Subject: RFR: 8294402: Add diagnostic logging to VMProps.checkDockerSupport [v8] In-Reply-To: References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Wed, 15 Feb 2023 00:38:15 GMT, Mikhailo Seledtsov wrote: >> Add log() and logToFile() methods to VMProps for diagnostic purposes. > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Addressed recent review feedback from Leonid Thank you David, Leonid and Dan. ------------- PR: https://git.openjdk.org/jdk/pull/12239 From ccheung at openjdk.org Fri Feb 17 19:21:29 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 17 Feb 2023 19:21:29 GMT Subject: RFR: 8301992: Embed SymbolTable CHT node [v4] In-Reply-To: References: <2mGYkBwMFEqx5ec71tIta6hboHgsSj0G6NyK7oTBW5E=.4c557bd6-852f-47cb-9796-e9475ff81b1d@github.com> Message-ID: <6sSN0wroVi6C1mAAALOXg0-zbfcO7gR35GCr1i0G6GU=.c89da404-6cab-4c62-ac30-31228a5303a7@github.com> On Thu, 16 Feb 2023 18:45:39 GMT, Coleen Phillimore wrote: >> Calvin Cheung has updated the pull request incrementally with one additional commit since the last revision: >> >> add Afree() for permanent symbols > > This seems fine. I haven't actually observed us getting to this code, except maybe once, but you might as well keep it. Thanks @coleenp, @iklam for the review! 
------------- PR: https://git.openjdk.org/jdk/pull/12562 From mseledtsov at openjdk.org Fri Feb 17 19:41:11 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Fri, 17 Feb 2023 19:41:11 GMT Subject: Integrated: 8294402: Add diagnostic logging to VMProps.checkDockerSupport In-Reply-To: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> References: <_X-xpw_Wz6dtbi5Z6u2b_g68sKSQqX5W_99c0aOXaAY=.a520a70a-66b4-4c51-a57a-388836404541@github.com> Message-ID: On Thu, 26 Jan 2023 23:23:16 GMT, Mikhailo Seledtsov wrote: > Add log() and logToFile() methods to VMProps for diagnostic purposes. This pull request has now been integrated. Changeset: 03d613bb Author: Mikhailo Seledtsov URL: https://git.openjdk.org/jdk/commit/03d613bbab99dd84dfc5115a5034c60f4e510259 Stats: 89 lines in 1 file changed: 87 ins; 0 del; 2 mod 8294402: Add diagnostic logging to VMProps.checkDockerSupport Reviewed-by: dholmes, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/12239 From ccheung at openjdk.org Fri Feb 17 19:55:35 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 17 Feb 2023 19:55:35 GMT Subject: Integrated: 8301992: Embed SymbolTable CHT node In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 17:29:41 GMT, Calvin Cheung wrote: > Please review this patch for embedding the Symbol inside a ConcurrentHashTable Node instead of having a pointer to a Symbol. This eliminates malloc/free for each Symbol. > > This patch is co-authored by @robehn. > > Passed tiers 1 - 4 testing. This pull request has now been integrated. 
Changeset: 86b9fce9 Author: Calvin Cheung URL: https://git.openjdk.org/jdk/commit/86b9fce9807eb5cbada90f9fa4d3763e3bff84cb Stats: 198 lines in 4 files changed: 66 ins; 84 del; 48 mod 8301992: Embed SymbolTable CHT node Co-authored-by: Robbin Ehn Reviewed-by: coleenp, iklam ------------- PR: https://git.openjdk.org/jdk/pull/12562 From pchilanomate at openjdk.org Fri Feb 17 19:56:56 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 17 Feb 2023 19:56:56 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 17 Feb 2023 17:28:19 GMT, Alan Bateman wrote: > Thanks for dropping is_bound_vthread, I think it's much better now. What would you think about changing it to test the thread type is_a vmClasses::BaseVirtualThread_klass() so the VM doesn't need to know about ThreadBuilders$BoundVirtualThread. > Yes, we could do that. Although I think that checking for BoundVirtualThread_klass is more descriptive of what we are trying to do and will be less confusing when reading the code. How about checking for `(!VMContinuations && x->is_a(vmClasses::BaseVirtualThread_klass()))`? ------------- PR: https://git.openjdk.org/jdk/pull/12512 From coleenp at openjdk.org Fri Feb 17 20:00:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 17 Feb 2023 20:00:54 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 09:25:55 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. 
>> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes src/hotspot/share/classfile/systemDictionary.cpp line 1505: > 1503: // Add to systemDictionary - so other classes can see it. > 1504: // Grabs and releases SystemDictionary_lock > 1505: update_dictionary(THREAD, k, loader_data); All these patterns are the same in class loading, except here we update the SystemDictionary after deoptimizing holding the Compile_lock and in your change we update it beforehand. I was going to suggest future cleanup of moving add_to_hierarchy and the associated deopt_scope code to InstanceKlass, but now I'm not sure if this order matters. 
Holding the Compile_lock while adding a class to the SystemDictionary and the code to hold the Compile_lock in ciEnv::get_klass_by_name() now doesn't make sense to me. Klass is read lock free in the dictionary and temporarily holding the Compile_lock doesn't seem to do what it says:
```
  {
    // Grabbing the Compile_lock prevents systemDictionary updates
    // during compilations.
```
At any rate, I think the SystemDictionary might need to be updated after deoptimization is done. Can you check this?
------------- PR: https://git.openjdk.org/jdk/pull/12585 From simonis at openjdk.org Fri Feb 17 20:03:54 2023 From: simonis at openjdk.org (Volker Simonis) Date: Fri, 17 Feb 2023 20:03:54 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4.
the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year Here are some numbers. I basically measured the overhead in Metaspace and native memory (i.e. `malloc()`) for 10_000 Lambda classes created with `LambdaMetafactory.metafactory()`. For the Metaspace numbers I took the "used" column for both "Class" and "Non-Class" areas of all class loaders (i.e. "Total Usage") reported by `jcmd VM.metaspace show-loaders`. This is basically the same like the "Total" "BlockSz" reported by `jcmd VM.classloader_stats`. For native memory overhead I took the "malloc" numbers from the "Class" section in the output from `VM.native_memory`. I did run the test program with `-Xint -XX:+UseSerialGC -Xms512m -Xmx512m -XX:MetaspaceSize=100M -Xlog:gc -XX:NativeMemoryTracking=detail LambdaClassLeak 10000` to make sure I have no garbage collections except the explicit one triggered by me. 
The results in the first two columns are the difference between the numbers before and after creating the 10_000 Lambda classes and the results in the last two columns are the numbers after doing a `System.gc()` and potentially unloading the created Lambda classes (because they are all not referenced from the program any more).

| JDK | Metaspace (KB) | malloc (KB) | - after GC -> | Metaspace (KB) | malloc (KB) |
|:---:| --------------:| -----------:| ------------- | --------------:| -----------:|
| 11  | 16_328 | 6_561 | | 0      | 232 |
| 15  | 13_624 | 240   | | 13_624 | 92  |
| 17  | 13_394 | 338   | | 13_394 | 93  |
| tip | 13_451 | 337   | | 13_451 | 93  |
| fix | 13_437 | 5_957 | | 0      | 10  |

As you can see, with JDK 11, where Lambda classes are implemented by VM Anonymous classes, the 10_000 classes take up ~16 MB of Metaspace and ~6 MB of native memory. The native memory is required for the `ClassLoaderData` structures because each VM Anonymous class lives in its own `ClassLoaderData`. After a GC, all the Metaspace and most of the native memory gets freed up.

Starting with JDK 15, Lambda classes are now backed by the newly introduced "Hidden Classes". The Metaspace consumption slightly drops from ~16 MB to ~13 MB (because the classes are a little smaller) and the native consumption drops significantly from ~6 MB to 240 KB. That's because the newly generated hidden classes are all strongly coupled to their defining class loader and therefore live in its `ClassLoaderData` section. While this saves ~6 MB of native memory, it also prevents unloading of these classes. These numbers haven't changed up to the current JDK development version.

With the fix proposed in this PR, we basically trade ~6 MB of native memory (i.e. ~600 bytes per Lambda class) for the ability to easily unload such Lambda classes once they are not referenced any more. I still think doing this change is reasonable because:

1. `LambdaMetafactory.metafactory()` does not mention that the classes it creates will not be unloadable. And they actually were unloadable up to JDK 11 while they were backed by VM anonymous classes. This is unexpected for users.
2. I'd argue that Metaspace is more "*valuable*" than native memory because it is limited by `-XX:MetaspaceSize` whereas native memory is only limited by the amount of available memory on the system. Also, growing Metaspace will trigger garbage collections which might be unexpected.
3. I think only "real" users of Lambda functions should pay the price for them and not users who don't need them any more.
4. Finally, the PR fixes a regression compared to the JDK 11 behavior.

In general it seems to me that creating a new `ClassLoaderData` for each hidden (or VM anonymous) class is a lot of overhead and was probably only done as "a hack" to simplify the implementation of class unloading for such classes. This might have been appropriate in the "early days" when we had just a few VM anonymous classes. But as the usage of `LambdaMetafactory.metafactory()` becomes more common it may make sense to revise the implementation such that for example all hidden classes defined by a class loader can live in a single `ClassLoaderData` structure. I think this should be technically possible although it would complicate the unloading logic (and should therefore be postponed to a later change).

------------- PR: https://git.openjdk.org/jdk/pull/12493 From duke at openjdk.org Fri Feb 17 20:07:16 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 17 Feb 2023 20:07:16 GMT Subject: RFR: 8302783: Improve CRC32C intrinsic with crypto pmull on AArch64 Message-ID:

The pmull and pmull2 instructions support operating on 64-bit data in the Cryptographic Extension. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1 while the latency remains 2. The CRC32C instructions did not change: latency 2, throughput 1.
As a result, computing CRC32C using pmull could perform better than using crc32c instruction.

The following test has passed.

test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java

The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32C.java) is measured on an EC2 c7g instance. The optimization shows 10 - 99% improvement when the input is at least 384 bytes.

| input         | 64         | 128        | 256        | 384        | 511        | 512        | 1,024      |
| ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| improvement   | 1.60%      | 0.00%      | 0.00%      | 15.24%     | 10.76%     | 34.32%     | 72.39%     |

| input         | 2,048      | 4,096      | 8,192      | 16,384     | 32,768     | 65,536     |
| ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
| improvement   | 84.96%     | 92.59%     | 96.19%     | 98.02%     | 99.32%     | 98.36%     |

Baseline

    Benchmark                    (count)   Mode  Cnt       Score      Error   Units
    TestCRC32C.testCRC32CUpdate       64  thrpt   12  196575.739 ± 1824.113  ops/ms
    TestCRC32C.testCRC32CUpdate      128  thrpt   12  123666.570 ±    2.730  ops/ms
    TestCRC32C.testCRC32CUpdate      256  thrpt   12   70188.989 ±    2.002  ops/ms
    TestCRC32C.testCRC32CUpdate      384  thrpt   12   49000.690 ±    1.421  ops/ms
    TestCRC32C.testCRC32CUpdate      511  thrpt   12   34106.279 ±   25.390  ops/ms
    TestCRC32C.testCRC32CUpdate      512  thrpt   12   37638.349 ±    1.039  ops/ms
    TestCRC32C.testCRC32CUpdate     1024  thrpt   12   19526.513 ±    0.439  ops/ms
    TestCRC32C.testCRC32CUpdate     2048  thrpt   12    9951.392 ±    4.803  ops/ms
    TestCRC32C.testCRC32CUpdate     4096  thrpt   12    5023.268 ±    0.240  ops/ms
    TestCRC32C.testCRC32CUpdate     8192  thrpt   12    2523.877 ±    0.062  ops/ms
    TestCRC32C.testCRC32CUpdate    16384  thrpt   12    1265.011 ±    0.047  ops/ms
    TestCRC32C.testCRC32CUpdate    32768  thrpt   12     632.291 ±    0.058  ops/ms
    TestCRC32C.testCRC32CUpdate    65536  thrpt   12     315.396 ±    0.160  ops/ms

Crypto pmull

    Benchmark                    (count)   Mode  Cnt       Score      Error   Units
    TestCRC32C.testCRC32CUpdate       64  thrpt   12  199726.599 ±  166.477  ops/ms
    TestCRC32C.testCRC32CUpdate      128  thrpt   12  123669.385 ±    1.821  ops/ms
    TestCRC32C.testCRC32CUpdate      256  thrpt   12   70188.727 ±    1.313  ops/ms
    TestCRC32C.testCRC32CUpdate      384  thrpt   12   56468.837 ±   76.524  ops/ms
    TestCRC32C.testCRC32CUpdate      511  thrpt   12   37777.205 ±  406.431  ops/ms
    TestCRC32C.testCRC32CUpdate      512  thrpt   12   50554.555 ±   17.169  ops/ms
    TestCRC32C.testCRC32CUpdate     1024  thrpt   12   33661.006 ±  140.471  ops/ms
    TestCRC32C.testCRC32CUpdate     2048  thrpt   12   18406.482 ±  205.952  ops/ms
    TestCRC32C.testCRC32CUpdate     4096  thrpt   12    9674.159 ±   20.390  ops/ms
    TestCRC32C.testCRC32CUpdate     8192  thrpt   12    4951.562 ±    6.566  ops/ms
    TestCRC32C.testCRC32CUpdate    16384  thrpt   12    2504.970 ±    1.883  ops/ms
    TestCRC32C.testCRC32CUpdate    32768  thrpt   12    1260.278 ±    0.484  ops/ms
    TestCRC32C.testCRC32CUpdate    65536  thrpt   12     625.608 ±    0.300  ops/ms

------------- Commit messages: - pmull CRC32C Changes: https://git.openjdk.org/jdk/pull/12624/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12624&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302783 Stats: 78 lines in 3 files changed: 77 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12624.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12624/head:pull/12624 PR: https://git.openjdk.org/jdk/pull/12624 From stuefe at openjdk.org Fri Feb 17 20:26:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 17 Feb 2023 20:26:26 GMT Subject: RFR: JDK-8294266: Add a way to pre-touch java thread stacks [v4] In-Reply-To: References: Message-ID: On Mon, 14 Nov 2022 10:16:58 GMT, Thomas Stuefe wrote: >> When doing performance- and footprint analysis, `AlwaysPreTouch` option is very handy for reducing noise. It would be good to have a similar option for pre-touching thread stacks. In addition to reducing noise, it can serve as worst-case test for thread costs, as well as a test for NMT regressions. >> >> Patch adds a new diagnostic switch, `AlwaysPreTouchStacks`, as a companion switch to `AlwaysPreTouch`.
Touching is super-simple using `alloca()`. Also, regression test.
>>
>> Examples:
>>
>> NMT, thread stacks, 10000 Threads, default:
>>
>> - Thread (reserved=10332400KB, committed=331828KB)
>>     (thread #10021)
>>     (stack: reserved=10301560KB, committed=300988KB)
>>     (malloc=19101KB #60755)
>>     (arena=11739KB #20037)
>>
>> NMT, thread stacks, 10000 Threads, +AlwaysPreTouchStacks:
>>
>> - Thread (reserved=10332400KB, committed=10284360KB)
>>     (thread #10021)
>>     (stack: reserved=10301560KB, committed=10253520KB)
>>     (malloc=19101KB #60755)
>>     (arena=11739KB #20037)
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
> test changes, comment change

not yet bot

------------- PR: https://git.openjdk.org/jdk/pull/10403 From forax at openjdk.org Fri Feb 17 20:36:30 2023 From: forax at openjdk.org (Rémi Forax) Date: Fri, 17 Feb 2023 20:36:30 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17.
>> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year Hi Volker, in the future, we may change the implementation of lambdas again, by example Valhalla parametric generics, Isolated methods (JDK-8158765), JDK-8087219 or JDK-8001537, may allow to re-use the same class file/method for all lambdas implementing the same interface. I fear that forcing unloading may block future refactoring. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From kbarrett at openjdk.org Fri Feb 17 20:52:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 17 Feb 2023 20:52:29 GMT Subject: RFR: JDK-8300080: offset_of for GCC/Clang exhibits undefined behavior and is not always a compile-time constant [v4] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 10:32:52 GMT, Goetz Lindenmaier wrote: > @kimbarret, is this ready to be sponsored? It needs a second reviewer. 
------------- PR: https://git.openjdk.org/jdk/pull/11978 From dcubed at openjdk.org Fri Feb 17 21:21:39 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 17 Feb 2023 21:21:39 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 09:25:55 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. 
>> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes Another round of editorial changes. src/hotspot/share/code/codeCache.cpp line 1416: > 1414: } > 1415: > 1416: // Flushes compiled methods dependent on dependee nit typo: s/Flushes/Marks/ src/hotspot/share/code/codeCache.hpp line 316: > 314: > 315: // RedefineClasses support > 316: // Flushing and deoptimization in case of evolution nit typo: s/Flushing/Marking/ src/hotspot/share/oops/instanceKlass.cpp line 1184: > 1182: DeoptimizationScope deopt_scope; > 1183: { > 1184: // Now flush all code that assume the class is not linked. nit typo: s/assume/assumes/ src/hotspot/share/prims/methodHandles.cpp line 959: > 957: { > 958: NoSafepointVerifier nsv; > 959: MutexLocker mu2(CodeCache_lock, Mutex::_no_safepoint_check_flag); I missed this naming issue earlier: s/mu2/mu/ since we only have one Mutex here. src/hotspot/share/prims/methodHandles.cpp line 1222: > 1220: MethodHandles::mark_dependent_nmethods(&deopt_scope, call_site, target); > 1221: java_lang_invoke_CallSite::set_target(call_site(), target()); > 1222: // This assumed to be an 'atomic' operation by verification. nit typo: s/This assumed/This is assumed/ src/hotspot/share/prims/methodHandles.cpp line 1238: > 1236: MethodHandles::mark_dependent_nmethods(&deopt_scope, call_site, target); > 1237: java_lang_invoke_CallSite::set_target_volatile(call_site(), target()); > 1238: // This assumed to be an 'atomic' operation by verification. nit typo: s/This assumed/This is assumed/ src/hotspot/share/prims/methodHandles.cpp line 1336: > 1334: DependencyContext deps = java_lang_invoke_MethodHandleNatives_CallSiteContext::vmdependencies(context()); > 1335: deps.remove_and_mark_for_deoptimization_all_dependents(&deopt_scope); > 1336: // This assumed to be an 'atomic' operation by verification. 
nit typo: s/This assumed/This is assumed/ src/hotspot/share/prims/methodHandles.hpp line 82: > 80: static void clean_dependency_context(oop call_site); > 81: > 82: static void mark_dependent_nmethods(DeoptimizationScope* deopt_scope, Handle call_site, Handle target); Need to update copyright year in this file. src/hotspot/share/runtime/deoptimization.cpp line 109: > 107: > 108: MutexLocker ml(CompiledMethod_lock, Mutex::_no_safepoint_check_flag); > 109: // If there is nothing to deopt _gen is the same as comitted. nit typo: s/_gen/_required_gen/ src/hotspot/share/runtime/deoptimization.cpp line 143: > 141: Mutex::_no_safepoint_check_flag); > 142: // A method marked by someone else may have a _gen lower than what we marked with. > 143: // Therefore only store it if it's higher than _gen. nit typo: s/_gen/_required_gen/ in two places ------------- PR: https://git.openjdk.org/jdk/pull/12585 From jvernee at openjdk.org Fri Feb 17 21:25:26 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 17 Feb 2023 21:25:26 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 20:01:08 GMT, Volker Simonis wrote: > (because they are all not referenced from the program any more). So, it sounds like this is testing a case where `LambdaMetafactory.metafactory` is being called directly? I'd like to point out that, while I buy that people are doing this (I see enough of this on StackOverflow), `LambdaMetafactory` is a runtime API specifically designed to support a language feature. Calling it directly is an unintended use case. We should cater towards the intended use case instead (the language feature) which benefits from the strong tie to the class loader. This change also seems to hurt the common case in favor of the rare case. 
The native memory cost should be multiplied by the ratio of use case that see a regression (lambdas) to the use cases that see an improvement (direct calls to LambdaMetafactory). I think it is safe to say that that ratio is quite large. i.e. we seem to be trading one regression for another, far worse one. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From stuefe at openjdk.org Fri Feb 17 21:48:33 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 17 Feb 2023 21:48:33 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. 
the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year Hi Volker, just a short feedback since I'm entering vacation mode. > In general it seems to me that creating a new `ClassLoaderData` for each hidden (or VM anonymous) class is a lot of overhead and was probably only done as "a hack" to simplify the implementation of class unloading for such classes. This might have been appropriate in the "early days" when we had just a few VM anonymous classes. But as the usage of `LambdaMetafactory.metafactory()` becomes more common it may make sense to revise the implementation such that for example all hidden classes defined by a class loader can live in a single `ClassLoaderData` structure. I think this should be technically possible although it would complicate the unloading logic (and should therefore be postponed to a later change). Interesting numbers. I underestimated the impact of native allocations. A side note, the amount of waste avoided by JEP 371 depends highly on the actual size of the lambda (precisely, how close a lambda would be to the chunk size). 
That's pretty much random, but nice to know since it can shift the numbers from almost nil overhead to almost twofold overhead. >I'd argue that Metaspace is more "valuable" than native memory because it is limited by -XX:MetaspaceSize, whereas native memory is only limited by the amount of available memory on the system. Also, growing Metaspace will trigger garbage collections which might be unexpected. I don't follow that logic, for several reasons: - Metaspace is more easily reclaimed, since it returns memory to the OS eagerly (since JEP 387). C-Heap, otoh, sticks to the process unless you (atm) manually trim. Depends on the libc used, but with glibc - arguable the main platform - this is the state. We are working on C-heap autotrim, but it is not completed yet. So while Metaspace is limited, it recovers from spikes more easily. - The metaspace GC threshold is an (impreceise) mechanism to trigger GC based on native resource consumption. That this is not affected by C-heap memory is just a side effect of an imperfect implementation - ideally, all native resources associated with a class loader - be it metaspace or C-heap - together would weight into the decision of when to do a GC. In fact, your numbers may be taken as a hint that maybe CLDs should live in Metaspace instead. --- As I stated before, we do have the ability to prematurely repurpose metaspace before the loader dies via Metaspace::deallocate(). It would be nice if there were a mechanism to trigger this at least from Java. E.g. by making Lambdas closable. 
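
For readers following the 8302154 thread: the measurements under discussion come from a program that calls `LambdaMetafactory.metafactory()` directly rather than through a compiler-generated lambda expression. The following is only a minimal sketch of what such a direct call can look like; it is not the `LambdaClassLeak.java` test attached to the JBS issue, and the class name `DirectMetafactorySketch` and the methods `hello()` and `makeSupplier()` are hypothetical names chosen for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Supplier;

// Hypothetical illustration class; not taken from the PR or the JBS test case.
public class DirectMetafactorySketch {

    // Target method that the generated Supplier delegates to.
    public static String hello() {
        return "hello";
    }

    // Asks the metafactory for a new Supplier implementation at run time,
    // instead of going through an invokedynamic call site emitted by javac.
    @SuppressWarnings("unchecked")
    public static Supplier<String> makeSupplier() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                    DirectMetafactorySketch.class, "hello",
                    MethodType.methodType(String.class));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "get",                                  // name of Supplier's abstract method
                    MethodType.methodType(Supplier.class),  // factory type: () -> Supplier
                    MethodType.methodType(Object.class),    // erased signature of Supplier.get
                    impl,                                   // method the proxy forwards to
                    MethodType.methodType(String.class));   // instantiated signature
            return (Supplier<String>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        Supplier<String> a = makeSupplier();
        Supplier<String> b = makeSupplier();
        System.out.println(a.get());
        // Printing the implementation classes shows what each call produced.
        System.out.println(a.getClass().getName());
        System.out.println(b.getClass().getName());
    }
}
```

Calling `makeSupplier()` in a loop and dropping the results is, presumably, the shape of workload the thread is arguing about: whether the classes backing those suppliers can be unloaded once nothing references them depends on how the JDK defines them, which is exactly what the PR proposes to change.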
------------- PR: https://git.openjdk.org/jdk/pull/12493 From sspitsyn at openjdk.org Fri Feb 17 21:48:32 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 21:48:32 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 17 Feb 2023 16:07:50 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. 
>> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > s/passed/status Updates look good to me. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12512 From sspitsyn at openjdk.org Fri Feb 17 21:52:33 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 21:52:33 GMT Subject: RFR: 8302615: make JVMTI thread cpu time functions optional for virtual threads In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 18:05:54 GMT, Serguei Spitsyn wrote: > This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. > > The CSR is: > [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads > > No testing is needed. I will integrate this as a trivial fix. 
------------- PR: https://git.openjdk.org/jdk/pull/12604 From sspitsyn at openjdk.org Fri Feb 17 21:52:34 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 17 Feb 2023 21:52:34 GMT Subject: Integrated: 8302615: make JVMTI thread cpu time functions optional for virtual threads In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 18:05:54 GMT, Serguei Spitsyn wrote: > This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. > > The CSR is: > [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads > > No testing is needed. This pull request has now been integrated. Changeset: 6b082fb3 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/6b082fb3c658524905a9a7b33dcb58e375c95c1b Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod 8302615: make JVMTI thread cpu time functions optional for virtual threads Reviewed-by: alanb ------------- PR: https://git.openjdk.org/jdk/pull/12604 From dcubed at openjdk.org Fri Feb 17 22:28:31 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 17 Feb 2023 22:28:31 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 09:25:55 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. 
>> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes src/hotspot/share/code/codeCache.cpp line 1392: > 1390: if (nm->is_marked_for_deoptimization() && !nm->has_been_deoptimized() && nm->can_be_deoptimized()) { > 1391: nm->make_not_entrant(); > 1392: nm->make_deoptimized(); `CodeCache::make_marked_nmethods_deoptimized()` used to call `make_nmethod_deoptimized(nm)` which made a couple of checks before it called `nm->make_deoptimized()`. `make_deoptimized()` does not make these checks so why are they no longer needed? 
Here's the removed code: void CodeCache::make_nmethod_deoptimized(CompiledMethod* nm) { if (nm->is_marked_for_deoptimization() && nm->can_be_deoptimized()) { nm->make_deoptimized(); } } ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Fri Feb 17 22:31:31 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 17 Feb 2023 22:31:31 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 09:25:55 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. 
>> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes src/hotspot/share/code/nmethod.cpp line 1166: > 1164: set_deoptimized_done(); > 1165: return; > 1166: } I don't understand this check-and-bailout logic at all. If continuations are not enabled, then why does `nmethod::make_deoptimized()` do nothing? This isn't an issue introduced by your patch, but I just happened to notice when I was reading thru... ------------- PR: https://git.openjdk.org/jdk/pull/12585 From sspitsyn at openjdk.org Sat Feb 18 01:36:24 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 18 Feb 2023 01:36:24 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 05:09:25 GMT, David Holmes wrote: >> The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. >> The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. >> >> Testing: >> mach5 jobs are in progress: >> Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) > > src/hotspot/share/prims/jvmtiEnv.cpp line 952: > >> 950: } >> 951: // protect thread_oop as a safepoint can be reached in disabler destructor >> 952: self_tobj = Handle(current, thread_oop); > > If you move this after line 953 then you can delete line 932 and just have: > > Handle self_tobj = Handle(current, thread_oop); Thank you for looking at this PR, David! Please, note a disabler at L938. A safepont can be reached in its destructor. 
Also, see the comment at L952:

 937   {
 938     JvmtiVTMSTransitionDisabler disabler(true);
 939     ThreadsListHandle tlh(current);
 940
 941     err = get_threadOop_and_JavaThread(tlh.list(), thread, &java_thread, &thread_oop);
 942     if (err != JVMTI_ERROR_NONE) {
 943       return err;
 944     }
 945
 946     // Do not use JvmtiVTMSTransitionDisabler in context of self suspend to avoid deadlocks.
 947     if (java_thread != current) {
 948       err = suspend_thread(thread_oop, java_thread, /* single_suspend */ true, nullptr);
 949       return err;
 950     }
 951     // protect thread_oop as a safepoint can be reached in disabler destructor
 952     self_tobj = Handle(current, thread_oop);
 953   }

-------------

PR: https://git.openjdk.org/jdk/pull/12550

From sspitsyn at openjdk.org Sat Feb 18 01:36:25 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Sat, 18 Feb 2023 01:36:25 GMT
Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint
In-Reply-To:
References:
Message-ID:

On Sat, 18 Feb 2023 01:30:27 GMT, Serguei Spitsyn wrote:

>> src/hotspot/share/prims/jvmtiEnv.cpp line 952:
>>
>>> 950: }
>>> 951: // protect thread_oop as a safepoint can be reached in disabler destructor
>>> 952: self_tobj = Handle(current, thread_oop);
>>
>> If you move this after line 953 then you can delete line 932 and just have:
>>
>> Handle self_tobj = Handle(current, thread_oop);
>
> Thank you for looking at this PR, David!
> Please, note a disabler at L938. A safepoint can be reached in its destructor.
> Also, see the comment at L952:
>
> 937   {
> 938     JvmtiVTMSTransitionDisabler disabler(true);
> 939     ThreadsListHandle tlh(current);
> 940
> 941     err = get_threadOop_and_JavaThread(tlh.list(), thread, &java_thread, &thread_oop);
> 942     if (err != JVMTI_ERROR_NONE) {
> 943       return err;
> 944     }
> 945
> 946     // Do not use JvmtiVTMSTransitionDisabler in context of self suspend to avoid deadlocks.
> 947 if (java_thread != current) { > 948 err = suspend_thread(thread_oop, java_thread, /* single_suspend */ true, nullptr); > 949 return err; > 950 } > 951 // protect thread_oop as a safepoint can be reached in disabler destructor > 952 self_tobj = Handle(current, thread_oop); > 953 } > it means there are now many places that can block for safepoints that previously wouldn't and it is very to hard to tell if that is okay or not. We partially rely on testing and good test coverage here. ------------- PR: https://git.openjdk.org/jdk/pull/12550 From mchung at openjdk.org Sat Feb 18 01:58:29 2023 From: mchung at openjdk.org (Mandy Chung) Date: Sat, 18 Feb 2023 01:58:29 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 20:01:08 GMT, Volker Simonis wrote: > Finally, the PR fixes a regression compared to the JDK 11 behavior. This is an intended change that improves the C heap memory usage. In addition, the spec of `LambdaMetafactory` has no guarantee of the spinned classes be unloaded independent of it defining loader. This is indeed a behavioral change if `LambdaMetafactory` API is called directly instead of via `invokedynamic`. I can create a CSR to retrofit this behavioral change. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From fyang at openjdk.org Sat Feb 18 06:36:22 2023 From: fyang at openjdk.org (Fei Yang) Date: Sat, 18 Feb 2023 06:36:22 GMT Subject: RFR: 8302776: Fix typo CSR_INSTERT to CSR_INSTRET In-Reply-To: <1nO3RE1e_4wr81-UlV-WXIEERv6vyYswpijNwVYBsYA=.d71b98a8-9d84-4728-98d4-5e970b55fbaa@github.com> References: <1nO3RE1e_4wr81-UlV-WXIEERv6vyYswpijNwVYBsYA=.d71b98a8-9d84-4728-98d4-5e970b55fbaa@github.com> Message-ID: On Fri, 17 Feb 2023 15:06:59 GMT, Ludovic Henry wrote: > 8302776: Fix typo CSR_INSTERT to CSR_INSTRET Looks good and trivial. I think the title should start with "RISC-V: " to make explicit that this is a riscv-specific issue. 
------------- Marked as reviewed by fyang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12616 From jwaters at openjdk.org Sat Feb 18 07:30:27 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 18 Feb 2023 07:30:27 GMT Subject: RFR: JDK-8300080: offset_of for GCC/Clang exhibits undefined behavior and is not always a compile-time constant [v4] In-Reply-To: References: Message-ID: <_g0Y7QEFggp5tnzIncC-UYrFLevSzA1ndKWQhCUTF_c=.349a118c-8ced-453f-8749-bda880105257@github.com> On Fri, 13 Jan 2023 16:06:44 GMT, Justin King wrote: >> The implementation of `offset_of` for GCC/Clang only deals with types are aligned to 16 bytes or less, if they are more, such as `zCollectedHeap` the behavior is undefined. UBSan also suggests that `offset_of` is not always a compile time constant, as the stack trace came from the dynamic loader during library loading. This patch changes `offset_of` to use `offsetof` and disables the warning `invalid-offsetof` for the JVM. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Move attribute on lambda to correct location > > Signed-off-by: Justin King Should probably change the integration requirements to mark this as such :( ------------- PR: https://git.openjdk.org/jdk/pull/11978 From jwaters at openjdk.org Sat Feb 18 07:33:23 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 18 Feb 2023 07:33:23 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:08:34 GMT, Julian Waters wrote: >> DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. 
"Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - Error: Failed to load No problem, what should I define ARG_ERROR18 as? ------------- PR: https://git.openjdk.org/jdk/pull/12596 From alanb at openjdk.org Sat Feb 18 07:59:26 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 18 Feb 2023 07:59:26 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 17 Feb 2023 16:07:50 GMT, Patricio Chilano Mateo wrote: >> Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. >> >> There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. 
Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. >> From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. >> >> Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). >> >> I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. >> >> I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > s/passed/status Marked as reviewed by alanb (Reviewer). 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From alanb at openjdk.org Sat Feb 18 07:59:29 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 18 Feb 2023 07:59:29 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 17 Feb 2023 19:54:08 GMT, Patricio Chilano Mateo wrote: > Yes, we could do that. Although I think that checking for BoundVirtualThread_klass is more descriptive of what we are trying to do and will be less confusing when reading the code. How about checking for `(!VMContinuations && x->is_a(vmClasses::BaseVirtualThread_klass()))`? We can go with what you have for now and re-visit again if it becomes an issue. The main thing I had hoped is that the VM won't get too coupled to BoundVirtualThread as that may move or be refactored soon. ------------- PR: https://git.openjdk.org/jdk/pull/12512 From duke at openjdk.org Sat Feb 18 15:33:58 2023 From: duke at openjdk.org (Afshin Zafari) Date: Sat, 18 Feb 2023 15:33:58 GMT Subject: RFR: 8290903: Enable function warning attribute for Clang build once Clang supports merging Message-ID: The warning attribute is enabled.. ### Test mach5 tier1-5. 
------------- Commit messages: - 8290903: Enable function warning attribute for Clang build once Clang supports merging - 8290903: Enable function warning attribute for Clang build once Clang supports merging Changes: https://git.openjdk.org/jdk/pull/12634/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12634&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8290903 Stats: 6 lines in 1 file changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12634.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12634/head:pull/12634 PR: https://git.openjdk.org/jdk/pull/12634 From duke at openjdk.org Sat Feb 18 15:38:13 2023 From: duke at openjdk.org (Afshin Zafari) Date: Sat, 18 Feb 2023 15:38:13 GMT Subject: RFR: 8301117: Remove old_size param from ResizeableResourceHashtable::resize() Message-ID: `old_size` is removed from the function parameters in definitions and calls. `table_size` is used as old size. ### Test mach5 tiers 1-5. ------------- Commit messages: - 8301117: Remove old_size param from ResizeableResourceHashtable::resize() - 8301117: Remove old_size param from ResizeableResourceHashtable::resize() Changes: https://git.openjdk.org/jdk/pull/12635/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12635&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301117 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12635.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12635/head:pull/12635 PR: https://git.openjdk.org/jdk/pull/12635 From duke at openjdk.org Sat Feb 18 17:51:28 2023 From: duke at openjdk.org (liach) Date: Sat, 18 Feb 2023 17:51:28 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created 
VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. 
> > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year This "called directly" usage reminds me of another behavioral change around the time "original" access for lookup was introduced, where a library can no longer create SAM interface instances out of static methods, as the provided lookup had private access (previously full privilege) but couldn't have original access in newer Java. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From alexandr.miloslavskiy at syntevo.com Sat Feb 18 21:13:38 2023 From: alexandr.miloslavskiy at syntevo.com (Alexandr Miloslavskiy) Date: Sun, 19 Feb 2023 00:13:38 +0300 Subject: JVM crashes in JIT compiled code without hs_err_pid file Message-ID: A single customer reports frequent crashes where hs_err_pid file is not created. The JDK is Zulu17.32+13-CA (build 17.0.2+8-LTS). We have studied Windows Event log and it indeed shows numerous crashes. We have then configured WER to capture a minidump and I studied it. 
The crash occurs in JIT generated code:
----
00000000`0e11f2c1 49ba401a3f93fd7f0000 mov     r10,offset jvm+0x2d1a40 (00007ffd`933f1a40)
00000000`0e11f2cb 41ffd2               call    r10
00000000`0e11f2ce 6690                 nop
00000000`0e11f2d0 41b801000000         mov     r8d,1
00000000`0e11f2d6 448b542424           mov     r10d,dword ptr [rsp+24h]
00000000`0e11f2db 4183c230             add     r10d,30h
00000000`0e11f2df 90                   nop
00000000`0e11f2e0 443b542428           cmp     r10d,dword ptr [rsp+28h]
00000000`0e11f2e5 0f8d81160000         jge     00000000`0e12096c
00000000`0e11f2eb 418bd8               mov     ebx,r8d
00000000`0e11f2ee 448b5c2420           mov     r11d,dword ptr [rsp+20h]
00000000`0e11f2f3 488b742458           mov     rsi,qword ptr [rsp+58h]
00000000`0e11f2f8 8b4c2428             mov     ecx,dword ptr [rsp+28h]
00000000`0e11f2fc 4c8b442460           mov     r8,qword ptr [rsp+60h]
00000000`0e11f301 8bc3                 mov     eax,ebx
00000000`0e11f303 4c8b6c2468           mov     r13,qword ptr [rsp+68h]
00000000`0e11f308 43807c131070         cmp     byte ptr [r11+r10+10h],70h
00000000`0e11f30e 0f8471000000         je      00000000`0e11f385
00000000`0e11f314 418b5e2c             mov     ebx,dword ptr [r14+2Ch]
00000000`0e11f318 8b6b0c               mov     ebp,dword ptr [rbx+0Ch]
00000000`0e11f31b 3bc5                 cmp     eax,ebp
00000000`0e11f31d 666690               nop
00000000`0e11f320 0f85df0c0000         jne     00000000`0e120005
00000000`0e11f326 4c8b4c2430           mov     r9,qword ptr [rsp+30h]
00000000`0e11f32b 418b410c             mov     eax,dword ptr [r9+0Ch]
----

It crashes on this instruction:
00000000`0e11f32b 418b410c             mov     eax,dword ptr [r9+0Ch]

Because r9 is NULL. Looks like implicit null exception to me, which should have been handled by JVM.

I have then investigated if JVM's crash handler is installed properly. In the minidump, the following can be seen in 'FunctionTableStream'
* MinimumAddress=0000000005CD0000 MaximumAddress=0000000014CD0000 BaseAddress=0000000005CD0000
* * BeginAddress=0000000000000000 EndAddress=000000000F000000 UnwindData=000000000000039C

This looks like valid handler installed via 'os::register_code_area'. At least when I inspect 'os::register_code_area' on my machine, it registers very similar values.

At this point, I think I need advice.
What could go wrong in JVM so that it neither handles this exception nor crashes into hs_err_pid and instead lets Windows handle the crash?

From duke at openjdk.org Sun Feb 19 08:55:35 2023
From: duke at openjdk.org (Johannes Bechberger)
Date: Sun, 19 Feb 2023 08:55:35 GMT
Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3]
In-Reply-To:
References:
Message-ID: <9bvykV6KIxtCVdRfWa7l1gnKKw758542V6Oqe3Xs3uA=.60ddff5f-a561-4cca-884b-235ff4d2006f@github.com>

On Fri, 17 Feb 2023 14:47:56 GMT, Johannes Bechberger wrote:

>> Extends the existing AsyncGetCallTrace test case and fixes the issue.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>
> Revert previous change

Is this PR then ready?

-------------

PR: https://git.openjdk.org/jdk/pull/12535

From thomas.stuefe at gmail.com Sun Feb 19 09:43:41 2023
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sun, 19 Feb 2023 10:43:41 +0100
Subject: JVM crashes in JIT compiled code without hs_err_pid file
In-Reply-To:
References:
Message-ID:

Hi Alexandr,

my first guess would be what you also assumed, that registering code cache address ranges is flaky. Do you crash right away, or do other signals from code heap generally work (e.g. polling page faults)?

Does the UNWIND information look correct? Have you confirmed that the addresses of the faulting instruction really lie within a registered UNWIND region?

Another possibility, albeit unlikely, is that in your build the outer __try/__except maybe does not work, or is missing?

Cheers, Thomas

On Sat, Feb 18, 2023 at 10:13 PM Alexandr Miloslavskiy < alexandr.miloslavskiy at syntevo.com> wrote:

> A single customer reports frequent crashes where hs_err_pid file is not
> created.
>
> The JDK is Zulu17.32+13-CA (build 17.0.2+8-LTS).
>
> We have studied Windows Event log and it indeed shows numerous crashes.
> We have then configured WER to capture a minidump and I studied it. > > The crash occurs in JIT generated code: > ---- > 00000000`0e11f2c1 49ba401a3f93fd7f0000 mov r10,offset jvm+0x2d1a40 > (00007ffd`933f1a40) > 00000000`0e11f2cb 41ffd2 call r10 > 00000000`0e11f2ce 6690 nop > 00000000`0e11f2d0 41b801000000 mov r8d,1 > 00000000`0e11f2d6 448b542424 mov r10d,dword ptr [rsp+24h] > 00000000`0e11f2db 4183c230 add r10d,30h > 00000000`0e11f2df 90 nop > 00000000`0e11f2e0 443b542428 cmp r10d,dword ptr [rsp+28h] > 00000000`0e11f2e5 0f8d81160000 jge 00000000`0e12096c > 00000000`0e11f2eb 418bd8 mov ebx,r8d > 00000000`0e11f2ee 448b5c2420 mov r11d,dword ptr [rsp+20h] > 00000000`0e11f2f3 488b742458 mov rsi,qword ptr [rsp+58h] > 00000000`0e11f2f8 8b4c2428 mov ecx,dword ptr [rsp+28h] > 00000000`0e11f2fc 4c8b442460 mov r8,qword ptr [rsp+60h] > 00000000`0e11f301 8bc3 mov eax,ebx > 00000000`0e11f303 4c8b6c2468 mov r13,qword ptr [rsp+68h] > 00000000`0e11f308 43807c131070 cmp byte ptr [r11+r10+10h],70h > 00000000`0e11f30e 0f8471000000 je 00000000`0e11f385 > 00000000`0e11f314 418b5e2c mov ebx,dword ptr [r14+2Ch] > 00000000`0e11f318 8b6b0c mov ebp,dword ptr [rbx+0Ch] > 00000000`0e11f31b 3bc5 cmp eax,ebp > 00000000`0e11f31d 666690 nop > 00000000`0e11f320 0f85df0c0000 jne 00000000`0e120005 > 00000000`0e11f326 4c8b4c2430 mov r9,qword ptr [rsp+30h] > 00000000`0e11f32b 418b410c mov eax,dword ptr [r9+0Ch] > ---- > > It crashes on this instruction: > 00000000`0e11f32b 418b410c mov eax,dword ptr [r9+0Ch] > > Because r9 is NULL. Looks like implicit null exception to me, which > should have been handled by JVM. > > I have then investigated if JVM's crash handler is installed properly. 
> In the minidump, the following can be seen in 'FunctionTableStream'
> * MinimumAddress=0000000005CD0000 MaximumAddress=0000000014CD0000
> BaseAddress=0000000005CD0000
> * * BeginAddress=0000000000000000 EndAddress=000000000F000000
> UnwindData=000000000000039C
>
> This looks like valid handler installed via 'os::register_code_area'. At
> least when I inspect 'os::register_code_area' on my machine, it
> registers very similar values.
>
> At this point, I think I need advice. What could go wrong in JVM so that
> it neither handles this exception nor crashes into hs_err_pid and
> instead lets Windows handle the crash?
>

From alexandr.miloslavskiy at syntevo.com Sun Feb 19 11:06:15 2023
From: alexandr.miloslavskiy at syntevo.com (Alexandr Miloslavskiy)
Date: Sun, 19 Feb 2023 14:06:15 +0300
Subject: JVM crashes in JIT compiled code without hs_err_pid file
In-Reply-To:
References:
Message-ID: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com>

Hi Thomas,

Thank you for your reply!

> Do you crash right away, or do other
> signals from code heap generally work (e.g. polling page faults)?

According to the dump and the customer's description, it crashes intermittently. In the dump I have seen, the process uptime was over 30 minutes, so it must have handled a lot of exceptions already and then, all of a sudden, it crashes. According to the event log the user sent, he had 6 crashes in 14 hours.

One other thing to consider: the user says that the crashes started a few days ago. No idea what that means currently. I suspected 3rd party software, but I simply can't explain what it could be doing to trigger such symptoms.

> Does the UNWIND information look correct?

Unfortunately not seen in the dump. I requested a full dump, but that may run into confidentiality issues. Waiting for the customer's response currently.

> Have you confirmed that the addresses of the faulting instruction really lie within a registered UNWIND region?
Yes, as can be seen in my previous mail, registered region is MinimumAddress=00000000`05CD0000 MaximumAddress=00000000`14CD0000 and faulting RIP is 00000000`0e11f32b So it's in between, and it also confirms that this is Java's code cache. > Another possibility, albeit unlikely, is that in your build the outer > __try/__except maybe does not work, or is missing? That's stock Azul Zulu. I wouldn't expect such serious changes in JDK. From thomas.stuefe at gmail.com Sun Feb 19 13:30:43 2023 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sun, 19 Feb 2023 14:30:43 +0100 Subject: JVM crashes in JIT compiled code without hs_err_pid file In-Reply-To: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com> References: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com> Message-ID: Hi Alexandr, (ccing hotspot-compiler-dev since this is about code cache registration on Windows x64). at this point, I'm taking wild guesses. Of course, this is not stock OpenJDK but Zulu 17 so Azul support should look into this. Possibility 1: someone in the process may have established a competing signal handler via AddVectoredExceptionHandler. Could be some foreign native code. On Windows x64 and x86, we use Structured Exception Handling, but VEH has preference. If someone were to establish their own exception handler, that one would get called, and our signal handler would be skipped. This is just theoretical, I have not tried it. On *nix, we have safeguarding mechanism to catch such errors (Xcheck:jni), but I'm not sure they work on Windows. Possibility 2: registering code cache did not work, or there is some bug. Since the code cache is address-stable after VM init this is improbable. Note that this is also true for +UseSegmentedCodeCache - still just one fixed address range. This is getting into guessing territory and there is not much more I can contribute from outside, sorry. 
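[Editorial sketch] The range check discussed earlier in this thread (does the faulting RIP lie inside the function table region reported in the minidump?) can be reproduced with a few lines of arithmetic. The struct and function names below are hypothetical and are not Windows or HotSpot types; only the numeric values are taken from the thread. Treating the end offset as exclusive is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one dynamic function table entry as printed in the
// minidump's FunctionTableStream. Not a Windows or HotSpot type.
struct FunctionTableRange {
    std::uint64_t base_address;  // BaseAddress
    std::uint64_t begin_offset;  // BeginAddress, relative to base_address
    std::uint64_t end_offset;    // EndAddress, relative to base_address
};

// Returns true if pc falls inside [base + begin, base + end).
// The exclusive upper bound is an assumption of this sketch.
bool covers(const FunctionTableRange& r, std::uint64_t pc) {
    if (pc < r.base_address) {
        return false;
    }
    const std::uint64_t offset = pc - r.base_address;
    return offset >= r.begin_offset && offset < r.end_offset;
}

// Values copied from the thread: BaseAddress, BeginAddress, EndAddress and the
// address of the faulting instruction.
const FunctionTableRange kCodeCache{0x0000000005CD0000ULL,
                                    0x0000000000000000ULL,
                                    0x000000000F000000ULL};
const std::uint64_t kFaultingRip = 0x000000000E11F32BULL;
```

With these values, base plus end offset gives 0x14CD0000, which matches the reported MaximumAddress, and the faulting RIP sits at offset 0x0844F32B from the base, so covers(kCodeCache, kFaultingRip) holds. This only shows that the address is covered by the registered range; it says nothing about whether the registered handler was actually invoked.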
Cheers, Thomas

On Sun, Feb 19, 2023 at 12:06 PM Alexandr Miloslavskiy < alexandr.miloslavskiy at syntevo.com> wrote:

> Hi Thomas,
>
> Thank you for your reply!
>
> > Do you crash right away, or do other
> > signals from code heap generally work (e.g. polling page faults)?
>
> According to the dump and the customer's description, it crashes
> intermittently. In the dump I have seen, the process uptime was over 30
> minutes, so it must have handled a lot of exceptions already and then,
> all of a sudden, it crashes. According to the event log the user sent, he had 6
> crashes in 14 hours.
>
> One other thing to consider: the user says that the crashes started a few days
> ago. No idea what that means currently. I suspected 3rd party software,
> but I simply can't explain what it could be doing to trigger such symptoms.
>
> > Does the UNWIND information look correct?
>
> Unfortunately not seen in the dump. I requested a full dump, but that
> may run into confidentiality issues. Waiting for the customer's response
> currently.
>
> > Have you confirmed that the addresses of the faulting instruction really
> > lie within a registered UNWIND region?
>
> Yes, as can be seen in my previous mail, registered region is
> MinimumAddress=00000000`05CD0000 MaximumAddress=00000000`14CD0000
>
> and faulting RIP is
> 00000000`0e11f32b
>
> So it's in between, and it also confirms that this is Java's code cache.
>
> > Another possibility, albeit unlikely, is that in your build the outer
> > __try/__except maybe does not work, or is missing?
>
> That's stock Azul Zulu. I wouldn't expect such serious changes in JDK.
>

From alexandr.miloslavskiy at syntevo.com Sun Feb 19 15:57:21 2023
From: alexandr.miloslavskiy at syntevo.com (Alexandr Miloslavskiy)
Date: Sun, 19 Feb 2023 18:57:21 +0300
Subject: JVM crashes in JIT compiled code without hs_err_pid file
In-Reply-To:
References: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com>
Message-ID: <5cd78ffa-56da-03c5-142a-1250722fe9dd@syntevo.com>

Hi Thomas,

> someone in the process may have established a competing signal handler via AddVectoredExceptionHandler

I tested it and you were right: if a handler added via 'AddVectoredExceptionHandler()' is present, then it's called instead of the handler registered via 'RtlAddFunctionTable()'. See attached code for a demo.

I'll try to figure out if this was the case for the customer.

> Possibility 2: registering code cache did not work

Sounds unlikely, because in the dump, I can see that it's registered (when a dump is created, it also collects all function tables). Also, in such a case, the application wouldn't run longer than a second or two before crashing.

-------------- next part --------------
#include <windows.h>
#include <stdio.h>
#include <stdint.h>

#pragma comment(linker, "/SUBSYSTEM:console /ENTRY:mainCRTStartup")

// Handler reached through the function table registered with RtlAddFunctionTable().
LONG FunctionableHandler(IN PEXCEPTION_RECORD ExceptionRecord,
                         IN ULONG64 EstablisherFrame,
                         IN OUT PCONTEXT ContextRecord,
                         IN OUT PDISPATCHER_CONTEXT DispatcherContext)
{
    printf(
        "FunctionableHandler: exception %08X at %p\n",
        ExceptionRecord->ExceptionCode,
        ExceptionRecord->ExceptionAddress
    );

    TerminateProcess(GetCurrentProcess(), 0);
    return 0;
}

// Competing handler installed with AddVectoredExceptionHandler().
LONG VectoredExceptionHandler(_EXCEPTION_POINTERS *ExceptionInfo)
{
    printf(
        "VectoredExceptionHandler: exception %08X at %p\n",
        ExceptionInfo->ExceptionRecord->ExceptionCode,
        ExceptionInfo->ExceptionRecord->ExceptionAddress
    );

    TerminateProcess(GetCurrentProcess(), 0);
    return 0;
}

// Same as in OpenJDK
typedef unsigned char UBYTE;

typedef struct _UNWIND_INFO_EH_ONLY {
    UBYTE Version : 3;
    UBYTE Flags : 5;
    UBYTE SizeOfProlog;
    UBYTE CountOfCodes;
    UBYTE FrameRegister : 4;
    UBYTE FrameOffset : 4;
    union {
        OPTIONAL ULONG ExceptionHandler;
        OPTIONAL ULONG FunctionEntry;
    };
    OPTIONAL ULONG ExceptionData[1];
} UNWIND_INFO_EH_ONLY, *PUNWIND_INFO_EH_ONLY;

// Layout of the single executable page: unwind data, function table entry,
// a small trampoline to the handler, and the "JIT" code itself.
struct CodePageContents {
    UNWIND_INFO_EH_ONLY unwind;
    RUNTIME_FUNCTION runtimeFunction;
    uint8_t exceptionHandlerCode[16];
    uint8_t jitCode[64];
};

typedef void (*PlainFunction)();

PlainFunction createJitCode()
{
    const uint32_t codePageSize = 0x1000;
    CodePageContents* memoryPage = (CodePageContents*)VirtualAlloc(0, codePageSize, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);

    // Set up exception handling
    {
        UNWIND_INFO_EH_ONLY* punwind = &memoryPage->unwind;
        punwind->Version = 1;
        punwind->Flags = UNW_FLAG_EHANDLER;
        punwind->SizeOfProlog = 0;
        punwind->CountOfCodes = 0;
        punwind->FrameRegister = 0;
        punwind->FrameOffset = 0;
        // Offsets are relative to the base address passed to RtlAddFunctionTable()
        punwind->ExceptionHandler = (ULONG)((char*)memoryPage->exceptionHandlerCode - (char*)memoryPage);
        punwind->ExceptionData[0] = 0;

        RUNTIME_FUNCTION* prt = &memoryPage->runtimeFunction;
        prt->BeginAddress = 0;
        prt->EndAddress = codePageSize;
        prt->UnwindData = (DWORD)((char*)&memoryPage->unwind - (char*)memoryPage);

        RtlAddFunctionTable(prt, 1, (DWORD64)memoryPage);

        uint8_t* code = memoryPage->exceptionHandlerCode;

        // mov rax, FunctionableHandler
        code[0] = 0x48;
        code[1] = 0xb8;
        code += 2;
        *(uint64_t*)code = (uint64_t)FunctionableHandler;
        code += 8;

        // jmp rax
        code[0] = 0xff;
        code[1] = 0xe0;
    }

    // "JIT" code
    {
        uint8_t* code = memoryPage->jitCode;

        // xor eax, eax
        code[0] = 0x31;
        code[1] = 0xc0;
        code += 2;

        // mov eax,DWORD PTR [eax]  (reads address 0 and faults)
        code[0] = 0x67;
        code[1] = 0x8b;
        code[2] = 0x00;
        code += 3;
    }

    return (PlainFunction)&memoryPage->jitCode;
}

int main()
{
    PlainFunction function = createJitCode();

    // Comment/uncomment as needed
    AddVectoredExceptionHandler(TRUE, VectoredExceptionHandler);

    // Run "jit" code and crash
    function();

    return 0;
}

From alexandr.miloslavskiy at syntevo.com Sun Feb 19 18:16:30 2023
From: alexandr.miloslavskiy at syntevo.com (Alexandr Miloslavskiy)
Date: Sun, 19 Feb 2023 21:16:30 +0300
Subject: JVM crashes in JIT compiled code
without hs_err_pid file In-Reply-To: References: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com> Message-ID: <3fd60ae0-2719-018d-5679-a78d1588e2bb@syntevo.com> Thomas, > Possibility 1: someone in the process may have established a competing > signal handler via AddVectoredExceptionHandler. I checked user's dump and the list of 'AddVectoredExceptionHandler()' is empty. That is, there are no registered handlers currently. 'SetUnhandledExceptionFilter()' is default C++ filter (set by C++ CRT automatically). Overall it seems as if somehow JVM's 'RtlAddFunctionTable()' suddenly stops being called. Hopefully I can get full dump and double-check all internals. From dholmes at openjdk.org Sun Feb 19 22:10:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 19 Feb 2023 22:10:25 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:47:56 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Revert previous change I'm somewhat concerned that the code doesn't seem to have a clear notion of exactly how `unextended_sp` relates to `sp` in general. I would still be concerned that the relaxation may cause a bad value to be considered safe and so lead to a crash. But I guess we will just have to address that if it arises. Thanks for all the due-diligence on this one. 
------------- PR: https://git.openjdk.org/jdk/pull/12535 From dholmes at openjdk.org Mon Feb 20 02:04:29 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 02:04:29 GMT Subject: RFR: 8290903: Enable function warning attribute for Clang build once Clang supports merging In-Reply-To: References: Message-ID: <1Lk0s2t-Lsgd0q3ZKhinh_tnY1817ZeueryVZKQaql4=.3bccd475-fb28-4e3b-8ff0-6923f67da43d@github.com> On Sat, 18 Feb 2023 15:26:04 GMT, Afshin Zafari wrote: > The warning attribute is enabled.. > > ### Test > mach5 tier1-5. I'm a bit confused about the status here. Looking at: https://github.com/llvm/llvm-project/issues/56519 the issue is still open, not fixed. And the issue is reported as existing in version 14 so the uncommented version check seems wrong anyway. ?? ------------- PR: https://git.openjdk.org/jdk/pull/12634 From dholmes at openjdk.org Mon Feb 20 06:00:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 06:00:25 GMT Subject: RFR: 8301117: Remove old_size param from ResizeableResourceHashtable::resize() In-Reply-To: References: Message-ID: <3ZYHy2OkwNZpROjQaJ7OQ_VSBal6piDNEZpqEllUTmI=.71b11100-b0c6-470a-898a-cd3cdd50253b@github.com> On Sat, 18 Feb 2023 15:31:46 GMT, Afshin Zafari wrote: > `old_size` is removed from the function parameters in definitions and calls. > `table_size` is used as old size. > ### Test > mach5 tiers 1-5. Changes look good - thanks. Seems to me that if anyone ever passed an old_size that was not the current table-size then we would lose part of the table when we grew it! ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12635 From dholmes at openjdk.org Mon Feb 20 06:35:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 06:35:46 GMT Subject: RFR: JDK-8293313: NMT: Rework MallocLimit [v9] In-Reply-To: References: <-KHeOvjwe5Osf05XWQuO5ELZUs-ogUVj25xCouJaDaU=.b0906406-dd44-4eb4-be76-debe83867249@github.com> Message-ID: On Fri, 17 Feb 2023 07:07:11 GMT, Thomas Stuefe wrote: >> Thanks guys! >> >>> LGTM. >>> >>> Will there be a release note documenting the changed/removed XX-flags ? Maybe there should be one . >> >> That is not necessary since this was a diagnostic switch. Also, the new format is backward compatible with the old one (so no behavioral change if someone had set it) >> >> Cheers, Thomas > >> @tstuefe with regards to a release note ... while the flag is now diagnostic that is a change made in JDK 21, so in terms of users of JDK 20 migrating to JDK 21, a release note may be need to both explain the flag is now diagnostic (and subject to change without notice), and that "none" is no longer a valid value. > > I assume you mean JDK-8302129, right? The MetaspaceReclaimPolicy=none removal? > > Makes sense, I will do that. I opened https://bugs.openjdk.org/browse/JDK-8302708 to track that. Sorry @tstuefe - no idea how I got the two issues mixed up. 
:( ------------- PR: https://git.openjdk.org/jdk/pull/11371 From dholmes at openjdk.org Mon Feb 20 07:14:34 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 07:14:34 GMT Subject: RFR: JDK-8300080: offset_of for GCC/Clang exhibits undefined behavior and is not always a compile-time constant [v4] In-Reply-To: <_g0Y7QEFggp5tnzIncC-UYrFLevSzA1ndKWQhCUTF_c=.349a118c-8ced-453f-8749-bda880105257@github.com> References: <_g0Y7QEFggp5tnzIncC-UYrFLevSzA1ndKWQhCUTF_c=.349a118c-8ced-453f-8749-bda880105257@github.com> Message-ID: On Sat, 18 Feb 2023 07:27:43 GMT, Julian Waters wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Move attribute on lambda to correct location >> >> Signed-off-by: Justin King > > :( @TheShermanTanker Hotspot changes _always_ need two reviewers, unless they meet the trivial criteria. We don't generally issue the /reviewers command. ------------- PR: https://git.openjdk.org/jdk/pull/11978 From aboldtch at openjdk.org Mon Feb 20 07:15:23 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 20 Feb 2023 07:15:23 GMT Subject: RFR: 8296469: Instrument VMError::report with reentrant iteration step for register and stack printing [v9] In-Reply-To: References: Message-ID: > Add reentrant step logic to VMError::report with an inner loop which enable the logic to recover at every step of the iteration. > > Before this change, if printing one register/stack position crashes then no more registers/stack positions will be printed. > > After this change even if the VM is unstable and some registers print_location crashes the hs_err printing will recover and keep attempting to print the rest of the registers or stack values. 
>
> Enables the following
> ```C++
> REENTRANT_STEP_IF("printing register info", _verbose && _context && _thread && Universe::is_fully_initialized())
>   os::print_register_info_header(st, _context);
>
>   REENTRANT_LOOP_START(os::print_nth_register_info_max_index())
>     // decode register contents if possible
>     ResourceMark rm(_thread);
>     os::print_nth_register_info(st, REENTRANT_ITERATION_STEP, _context);
>   REENTRANT_LOOP_END
>
>   st->cr();
> ```
>
> Testing: tier 1 and compiled Linux-x64/aarch64, MacOS-x64/aarch64, Windows x64 and cross-compiled Linux-x86/riscv/arm/ppc/s390x (GHA and some local)
Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Merge remote-tracking branch 'upstream_jdk/master' into vmerror_report_register_stack_reentrant - Add test - Fix and strengthen print_stack_location - Missed variable rename - Copyright - Rework logic and use continuation state for reattempts - Merge remote-tracking branch 'upstream_jdk/master' into vmerror_report_register_stack_reentrant - Restructure os::print_register_info interface - Code syle and line length - Merge Fix - ... 
and 5 more: https://git.openjdk.org/jdk/compare/2009dc2b...2e12b4a5 ------------- Changes: https://git.openjdk.org/jdk/pull/11017/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11017&range=08 Stats: 676 lines in 19 files changed: 367 ins; 79 del; 230 mod Patch: https://git.openjdk.org/jdk/pull/11017.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11017/head:pull/11017 PR: https://git.openjdk.org/jdk/pull/11017 From thartmann at openjdk.org Mon Feb 20 07:33:24 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 20 Feb 2023 07:33:24 GMT Subject: RFR: JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 13:39:58 GMT, Tobias Holenstein wrote: > When printing bytecode and method data with `-XX:+CIPrintMethodCodes` we got some lines with missing whitespace: > ``` > 11760bci: 1895VirtualCallData count(0) nonprofiled_count(0) entries(0) > ``` > > The reason for this is the use of `st->fill_to(6);` which fills up the stream with whitespace up to length 6. In our case we have 2 spaces before and then a number >= 1000, which already adds up to 6+ characters in the stream. > > **Solution**: Always add a whitespace afterwards. Now > > > 0 bci: 2 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 48 bci: 19 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 552 bci: 109 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 7320 bci: 1207 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 11760 bci: 1895 VirtualCallData count(0) nonprofiled_count(0) entries(0) Looks good. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12591 From thartmann at openjdk.org Mon Feb 20 07:34:24 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 20 Feb 2023 07:34:24 GMT Subject: RFR: 8302146: Move TestOverloadCompileQueues.java to tier3 In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 15:08:38 GMT, Emanuel Peter wrote: > This test is expected to run for a long time, as it has to fill the compile queue. But it does not need to run at a low tier. We just want to run it occasionally. > > Moved it to `tier3`, by adding it to `hotspot_slow_compiler`, which is excluded from `tier1` and `tier2`. Looks good. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.org/jdk/pull/12617 From thartmann at openjdk.org Mon Feb 20 07:35:36 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 20 Feb 2023 07:35:36 GMT Subject: RFR: 8302144: Move ZeroTLABTest.java to tier3 In-Reply-To: References: Message-ID: <9Zu7yAXxztPf84rwNxjbzcID97P5PRVj2k7WR8iRexE=.24646832-7270-464b-a221-920ef64bfc5a@github.com> On Fri, 17 Feb 2023 16:09:31 GMT, Emanuel Peter wrote: > This is a "Hello World" test for the flags `UseTLAB` and `ZeroTLAB`, running with `-Xcomp`. It takes about 30 seconds. > This test does not need to run on `tier1`. So I pushed it out to `tier3`. > > Moved it to tier3, by adding it to hotspot_slow_compiler, which is excluded from tier1 and tier2. Looks good. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12620 From jwaters at openjdk.org Mon Feb 20 07:38:25 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 20 Feb 2023 07:38:25 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: <5yVAQiMfNMmuk5X_nunGuKCxSMdv5id1lv5grz9FhFw=.d76400d8-c35d-4286-a597-00bce215b157@github.com> On Fri, 17 Feb 2023 17:05:59 GMT, Mandy Chung wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-6 >> - Error: Failed to load > > This change is fine. > > `expandArgFile` reports `DLL_ERROR4` when it fails to read `@arg-file` which is unrelated to shared libraries. Would you mind also adding `ARG_ERROR18` to replace `DLL_ERROR4`? > @mlchung No problem, what should I define `ARG_ERROR18` as? @mlchung ? ------------- PR: https://git.openjdk.org/jdk/pull/12596 From kbarrett at openjdk.org Mon Feb 20 08:35:14 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 20 Feb 2023 08:35:14 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting Message-ID: Please review this change to the implementation of the Windows-specific option UseOSErrorReporting, toward allowing crash reporting functions to be declared noreturn. VMError::report_and_die no longer conditionally returns if the Windows-only option UseOSErrorReporting is true. The core of VM crash reporting has been renamed and privatized, and now takes an additional argument indicating whether the caller just wants reporting or also wants termination after reporting. 
The Windows top-level and unhandled exception filters call into that to request reporting, and also termination if UseOSErrorReporting is false, otherwise returning from the call and taking further action. All other callers of VM crash reporting include termination. So if we get a structured exception on Windows and the option is true then we report it and then end up walking up the exception handlers until running out, when Windows crash behavior gets a shot. This is essentially the same as before the change, just checking the option in a different place (in the Windows-specific caller, rather than in an #ifdef block in the shared code). In addition, the core VM crash reporter uses BREAKPOINT before termination if UseOSErrorReporting is true. So if we have an assertion failure we report it and then end up walking up the exception handlers (because of the breakpoint exception) until running out, when Windows crash behavior gets a shot. So if there is an attached debugger (or JIT debugging is enabled), this leads an assertion failure to hit the debugger inside the crash reporter rather than after it returned (due to the old behavior for UseOSErrorReporting) from the failed assertion (hitting the BREAKPOINT there). Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint exception, which ends up being unhandled if UseOSErrorReporting is true. There is a pre-existing bug that I'll be reporting separately. If UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. Testing: mach5 tier1-3 Manual testing with the following, to verify expected behavior that is the same as before the change. 
# -XX:ErrorHandlerTest=N
# 1: assertion failure
# 2: guarantee failure
# 14: SIGSEGV
# 15: divide by zero
path/to/bin/java \
  -XX:+UnlockDiagnosticVMOptions \
  -XX:+ErrorLogSecondaryErrorDetails \
  -XX:+UseOSErrorReporting \
  -XX:ErrorHandlerTest=1 \
  TestDebug.java

--- TestDebug.java ---
import java.lang.String;

public class TestDebug {
  static private volatile String dummy;

  public static void main(String[] args) throws Exception {
    while (true) {
      dummy = new String("foo bar");
    }
  }
}
--- end TestDebug.java ---

I need someone with a better understanding of Windows to test the interaction with Windows Error Reporting (WER). ------------- Commit messages: - refactor UseOSErrorReporting for noreturn attributes Changes: https://git.openjdk.org/jdk/pull/12651/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12651&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302798 Stats: 92 lines in 5 files changed: 75 ins; 11 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/12651.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12651/head:pull/12651 PR: https://git.openjdk.org/jdk/pull/12651 From tholenstein at openjdk.org Mon Feb 20 08:38:36 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Mon, 20 Feb 2023 08:38:36 GMT Subject: RFR: JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 17:52:08 GMT, Vladimir Kozlov wrote: >> When printing bytecode and method data with `-XX:+CIPrintMethodCodes` we got some lines with missing whitespace: >> ``` >> 11760bci: 1895VirtualCallData count(0) nonprofiled_count(0) entries(0) >> ``` >> >> The reason for this is the use of `st->fill_to(6);` which fills up the stream with whitespace up to length 6. In our case we have 2 spaces before and then a number >= 1000, which already adds up to 6+ characters in the stream. >> >> **Solution**: Always add a whitespace afterwards. 
Now >> >> >> 0 bci: 2 VirtualCallData count(0) nonprofiled_count(0) entries(0) >> 48 bci: 19 VirtualCallData count(0) nonprofiled_count(0) entries(0) >> 552 bci: 109 VirtualCallData count(0) nonprofiled_count(0) entries(0) >> 7320 bci: 1207 VirtualCallData count(0) nonprofiled_count(0) entries(0) >> 11760 bci: 1895 VirtualCallData count(0) nonprofiled_count(0) entries(0) > > Good and trivial. Thank you @vnkozlov and @TobiHartmann for the reviews! ------------- PR: https://git.openjdk.org/jdk/pull/12591 From tholenstein at openjdk.org Mon Feb 20 08:38:38 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Mon, 20 Feb 2023 08:38:38 GMT Subject: Integrated: JDK-8302656: Missing spaces in output of -XX:+CIPrintMethodCodes In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 13:39:58 GMT, Tobias Holenstein wrote: > When printing bytecode and method data with `-XX:+CIPrintMethodCodes` we got some lines with missing whitespace: > ``` > 11760bci: 1895VirtualCallData count(0) nonprofiled_count(0) entries(0) > ``` > > The reason for this is the use of `st->fill_to(6);` which fills up the stream with whitespace up to length 6. In our case we have 2 spaces before and then a number >= 1000, which already adds up to 6+ characters in the stream. > > **Solution**: Always add a whitespace afterwards. Now > > > 0 bci: 2 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 48 bci: 19 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 552 bci: 109 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 7320 bci: 1207 VirtualCallData count(0) nonprofiled_count(0) entries(0) > 11760 bci: 1895 VirtualCallData count(0) nonprofiled_count(0) entries(0) This pull request has now been integrated. 
Changeset: 743a85db Author: Tobias Holenstein URL: https://git.openjdk.org/jdk/commit/743a85db06ea245dbe6234b1840f18f8b2466e69 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8302656: Missing spaces in output of -XX:+CIPrintMethodCodes Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/12591 From rehn at openjdk.org Mon Feb 20 08:46:39 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 08:46:39 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 20:59:52 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > src/hotspot/share/code/codeCache.cpp line 1416: > >> 1414: } >> 1415: >> 1416: // Flushes compiled methods dependent on dependee > > nit typo: s/Flushes/Marks/ Fixed, thanks! > src/hotspot/share/prims/methodHandles.cpp line 959: > >> 957: { >> 958: NoSafepointVerifier nsv; >> 959: MutexLocker mu2(CodeCache_lock, Mutex::_no_safepoint_check_flag); > > I missed this naming issue earlier: s/mu2/mu/ > since we only have one Mutex here. Fixed, thanks! > src/hotspot/share/prims/methodHandles.cpp line 1336: > >> 1334: DependencyContext deps = java_lang_invoke_MethodHandleNatives_CallSiteContext::vmdependencies(context()); >> 1335: deps.remove_and_mark_for_deoptimization_all_dependents(&deopt_scope); >> 1336: // This assumed to be an 'atomic' operation by verification. > > nit typo: s/This assumed/This is assumed/ Fixed, thanks! > src/hotspot/share/prims/methodHandles.hpp line 82: > >> 80: static void clean_dependency_context(oop call_site); >> 81: >> 82: static void mark_dependent_nmethods(DeoptimizationScope* deopt_scope, Handle call_site, Handle target); > > Need to update copyright year in this file. Fixed, thanks! 
> src/hotspot/share/runtime/deoptimization.cpp line 109: > >> 107: >> 108: MutexLocker ml(CompiledMethod_lock, Mutex::_no_safepoint_check_flag); >> 109: // If there is nothing to deopt _gen is the same as comitted. > > nit typo: s/_gen/_required_gen/ Fixed, thanks! > src/hotspot/share/runtime/deoptimization.cpp line 143: > >> 141: Mutex::_no_safepoint_check_flag); >> 142: // A method marked by someone else may have a _gen lower than what we marked with. >> 143: // Therefore only store it if it's higher than _gen. > > nit typo: s/_gen/_required_gen/ > in two places Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rrich at openjdk.org Mon Feb 20 08:49:27 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 20 Feb 2023 08:49:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: References: Message-ID: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> On Fri, 17 Feb 2023 14:47:56 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Revert previous change src/hotspot/cpu/x86/frame_x86.cpp line 75: > 73: // interpreted -> interpreted calls that go through a method handle linker, > 74: // since those pop the last argument (the appendix) from the stack. > 75: if (!thread->is_in_full_stack_checked(unextended_sp)) { The condition can remain stricter. The following should work too. Could you please check? 
Suggestion: if (!thread->is_in_stack_range_incl(unextended_sp + Interpreter::stackElementSize, sp)) { ------------- PR: https://git.openjdk.org/jdk/pull/12535 From simonis at openjdk.org Mon Feb 20 08:52:35 2023 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 20 Feb 2023 08:52:35 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 21:21:56 GMT, Jorn Vernee wrote: > > (because they are all not referenced from the program any more). > > So, it sounds like this is testing a case where `LambdaMetafactory.metafactory` is being called directly? > > I'd like to point out that, while I buy that people are doing this (I see enough of this on StackOverflow), `LambdaMetafactory` is a runtime API specifically designed to support a language feature. Calling it directly is an unintended use case. Sorry, but that's a really weird argument. You can't introduce a new public API and then blame people for using it! If the API is really only intended to support the internal implementation of a language feature, then make it private and don't export it. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From simonis at openjdk.org Mon Feb 20 08:52:36 2023 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 20 Feb 2023 08:52:36 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Sat, 18 Feb 2023 01:55:52 GMT, Mandy Chung wrote: > > Finally, the PR fixes a regression compared to the JDK 11 behavior. > > This is an intended change that improves the C heap memory usage. In addition, the spec of `LambdaMetafactory` gives no guarantee that the spinned classes are unloaded independently of their defining loader. This is indeed a behavioral change if the `LambdaMetafactory` API is called directly instead of via `invokedynamic`. I can create a CSR to retrofit this behavioral change. Yes, please. 
I think that would be helpful. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From rehn at openjdk.org Mon Feb 20 08:54:36 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 08:54:36 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 21:01:47 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > src/hotspot/share/code/codeCache.hpp line 316: > >> 314: >> 315: // RedefineClasses support >> 316: // Flushing and deoptimization in case of evolution > > nit typo: s/Flushing/Marking/ Fixed with additional name changes, thanks. > src/hotspot/share/oops/instanceKlass.cpp line 1184: > >> 1182: DeoptimizationScope deopt_scope; >> 1183: { >> 1184: // Now flush all code that assume the class is not linked. > > nit typo: s/assume/assumes/ Fixed, thanks! > src/hotspot/share/prims/methodHandles.cpp line 1222: > >> 1220: MethodHandles::mark_dependent_nmethods(&deopt_scope, call_site, target); >> 1221: java_lang_invoke_CallSite::set_target(call_site(), target()); >> 1222: // This assumed to be an 'atomic' operation by verification. > > nit typo: s/This assumed/This is assumed/ Fixed, thanks! 
------------- PR: https://git.openjdk.org/jdk/pull/12585 From duke at openjdk.org Mon Feb 20 09:03:29 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:03:29 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> References: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> Message-ID: On Mon, 20 Feb 2023 08:46:43 GMT, Richard Reingruber wrote: >> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert previous change > > src/hotspot/cpu/x86/frame_x86.cpp line 75: > >> 73: // interpreted -> interpreted calls that go through a method handle linker, >> 74: // since those pop the last argument (the appendix) from the stack. >> 75: if (!thread->is_in_full_stack_checked(unextended_sp)) { > > The condition can remain stricter. The following should work too. Could you please check? > Suggestion: > > if (!thread->is_in_stack_range_incl(unextended_sp + Interpreter::stackElementSize, sp)) { The sanity test worked :) ------------- PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Mon Feb 20 09:04:25 2023 From: duke at openjdk.org (Afshin Zafari) Date: Mon, 20 Feb 2023 09:04:25 GMT Subject: RFR: 8290903: Enable function warning attribute for Clang build once Clang supports merging In-Reply-To: References: Message-ID: On Sat, 18 Feb 2023 15:26:04 GMT, Afshin Zafari wrote: > The warning attribute is enabled.. > > ### Test > mach5 tier1-5. > I'm a bit confused about the status here. Looking at: > > [llvm/llvm-project#56519](https://github.com/llvm/llvm-project/issues/56519) > > the issue is still open, not fixed. And the issue is reported as existing in version 14 so the uncommented version check seems wrong anyway. ?? 
The Fix Version of this issue is 21, so I mistakenly thought it was time to implement the required changes. I should have double-checked it first. Thanks for your time. ------------- PR: https://git.openjdk.org/jdk/pull/12634 From kbarrett at openjdk.org Mon Feb 20 09:07:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 20 Feb 2023 09:07:29 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v5] In-Reply-To: <8P_TVw6hR1Ass_t8G1ZpV44nD04zUK3vktrRJH7hBFs=.9cb55249-4db5-4245-8988-33fc0cb70693@github.com> References: <8P_TVw6hR1Ass_t8G1ZpV44nD04zUK3vktrRJH7hBFs=.9cb55249-4db5-4245-8988-33fc0cb70693@github.com> Message-ID: On Wed, 15 Feb 2023 13:52:27 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests)? >> >> * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. >> * `G1BarrierSet::write_region` gained a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is that previously, when `BarrierSet::make_parsable()` (not sure why this is called this way either) was called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. 
After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Fix compilation on osx Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.org/jdk/pull/12497 From duke at openjdk.org Mon Feb 20 09:09:48 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:09:48 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v4] In-Reply-To: References: Message-ID: > Extends the existing AsyncGetCallTrace test case and fixes the issue. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve condition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12535/files - new: https://git.openjdk.org/jdk/pull/12535/files/da706d3b..b652c2e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Mon Feb 20 09:09:49 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:09:49 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: References: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> Message-ID: 
<0OC5hKoRVRzWFcZ0vrcJyxKWfEqAueAGREA2YLuG9wM=.08b446fe-88f7-42c9-8406-2403f4835c6a@github.com> On Mon, 20 Feb 2023 09:00:28 GMT, Johannes Bechberger wrote: >> src/hotspot/cpu/x86/frame_x86.cpp line 75: >> >>> 73: // interpreted -> interpreted calls that go through a method handle linker, >>> 74: // since those pop the last argument (the appendix) from the stack. >>> 75: if (!thread->is_in_full_stack_checked(unextended_sp)) { >> >> The condition can remain stricter. The following should work too. Could you please check? >> Suggestion: >> >> if (!thread->is_in_stack_range_incl(unextended_sp + Interpreter::stackElementSize, sp)) { > > The sanity test worked :) ... and no regression in the other serviceability tests. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From rrich at openjdk.org Mon Feb 20 09:11:27 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 20 Feb 2023 09:11:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: <0OC5hKoRVRzWFcZ0vrcJyxKWfEqAueAGREA2YLuG9wM=.08b446fe-88f7-42c9-8406-2403f4835c6a@github.com> References: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> <0OC5hKoRVRzWFcZ0vrcJyxKWfEqAueAGREA2YLuG9wM=.08b446fe-88f7-42c9-8406-2403f4835c6a@github.com> Message-ID: <14-X_O88AjbgcUUxbfIXODXbmDDjFPZYABHVn4aq_vs=.866e2f92-dda3-421d-aad3-30831c51c748@github.com> On Mon, 20 Feb 2023 09:04:39 GMT, Johannes Bechberger wrote: >> The sanity test worked :) > > ... and no regression in the other serviceability tests. Probably it's better to adapt the limit though (sorry). Problem with this version is that it allows `unextended_sp` outside the address range reserved for the stack. Let's see what other reviewers think. 
Suggestion: if (!thread->is_in_stack_range_incl(unextended_sp, sp - Interpreter::stackElementSize)) { ------------- PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Mon Feb 20 09:14:25 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:14:25 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: <14-X_O88AjbgcUUxbfIXODXbmDDjFPZYABHVn4aq_vs=.866e2f92-dda3-421d-aad3-30831c51c748@github.com> References: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> <0OC5hKoRVRzWFcZ0vrcJyxKWfEqAueAGREA2YLuG9wM=.08b446fe-88f7-42c9-8406-2403f4835c6a@github.com> <14-X_O88AjbgcUUxbfIXODXbmDDjFPZYABHVn4aq_vs=.866e2f92-dda3-421d-aad3-30831c51c748@github.com> Message-ID: On Mon, 20 Feb 2023 09:08:59 GMT, Richard Reingruber wrote: >> ... and no regression in the other serviceability tests. > > Probably it's better to adapt the limit though (sorry). > Problem with this version is that it allows `unextended_sp` outside the address range reserved for the stack. Let's see what other reviewers think. > Suggestion: > > if (!thread->is_in_stack_range_incl(unextended_sp, sp - Interpreter::stackElementSize)) { Good point, you're right. I could have recognized this myself. ------------- PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Mon Feb 20 09:18:46 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:18:46 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v5] In-Reply-To: References: Message-ID: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> > Extends the existing AsyncGetCallTrace test case and fixes the issue. 
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Improve condition again ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12535/files - new: https://git.openjdk.org/jdk/pull/12535/files/b652c2e5..cd85c6a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From duke at openjdk.org Mon Feb 20 09:18:48 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Mon, 20 Feb 2023 09:18:48 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v3] In-Reply-To: References: <8kW6oyPa9PkTAOKsU71YmN4zmMi5VSrHlOAE1frAazQ=.cce2850f-4ef3-44c9-85f8-b056fe632e56@github.com> <0OC5hKoRVRzWFcZ0vrcJyxKWfEqAueAGREA2YLuG9wM=.08b446fe-88f7-42c9-8406-2403f4835c6a@github.com> <14-X_O88AjbgcUUxbfIXODXbmDDjFPZYABHVn4aq_vs=.866e2f92-dda3-421d-aad3-30831c51c748@github.com> Message-ID: On Mon, 20 Feb 2023 09:11:38 GMT, Johannes Bechberger wrote: >> Probably it's better to adapt the limit though (sorry). >> Problem with this version is that it allows `unextended_sp` outside the address range reserved for the stack. Let's see what other reviewers think. >> Suggestion: >> >> if (!thread->is_in_stack_range_incl(unextended_sp, sp - Interpreter::stackElementSize)) { > > Good point, you're right. I could have recognized this myself. Tests work as expected. 
------------- PR: https://git.openjdk.org/jdk/pull/12535 From rehn at openjdk.org Mon Feb 20 09:26:30 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 09:26:30 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 22:25:43 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > src/hotspot/share/code/codeCache.cpp line 1392: > >> 1390: if (nm->is_marked_for_deoptimization() && !nm->has_been_deoptimized() && nm->can_be_deoptimized()) { >> 1391: nm->make_not_entrant(); >> 1392: nm->make_deoptimized(); > > `CodeCache::make_marked_nmethods_deoptimized()` used to call > `make_nmethod_deoptimized(nm)` which made a couple of checks > before it called `nm->make_deoptimized()`. `make_deoptimized()` > does not make these checks so why are they no longer needed? > > Here's the removed code: > > void CodeCache::make_nmethod_deoptimized(CompiledMethod* nm) { > if (nm->is_marked_for_deoptimization() && nm->can_be_deoptimized()) { > nm->make_deoptimized(); > } > } If the method can't be deoptimized, we should not make it "not entrant". If it is already deoptimized, there is no need to try to make it "not entrant" once more. If it's not marked, we shouldn't make it "not entrant". So only if we make it "not entrant" should we also call `make_deoptimized()`. This is checked at the line just before: if (nm->is_marked_for_deoptimization() && !nm->has_been_deoptimized() && nm->can_be_deoptimized()) { nm->make_not_entrant(); nm->make_deoptimized(); } So we do not need to recheck it. But I'll add a check that we do not re-patch the nops.
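For readers following the thread, the guard discussed above can be modeled in a few lines of standalone C++. This is only a simplified sketch under stated assumptions, not HotSpot code: `NMethodModel`, its fields, the call counters, and `deoptimize_if_marked` are hypothetical stand-ins, and the sketch assumes that making an nmethod deoptimized is what flips its "has been deoptimized" state.

```cpp
#include <cassert>

// Hypothetical stand-in for an nmethod; only the three states named in the
// review discussion are modeled, plus counters to observe the calls.
struct NMethodModel {
  bool marked_for_deoptimization = false;
  bool has_been_deoptimized = false;
  bool can_be_deoptimized = true;
  int not_entrant_calls = 0;
  int deoptimized_calls = 0;

  void make_not_entrant() { ++not_entrant_calls; }

  // Performs no checks of its own (as in the discussion), so the caller's
  // guard is the only thing protecting it. The assert documents that contract.
  void make_deoptimized() {
    assert(marked_for_deoptimization && can_be_deoptimized);
    has_been_deoptimized = true;  // assumption: this is where the state flips
    ++deoptimized_calls;
  }
};

// One combined guard: make the method "not entrant" and deoptimize it only if
// it is marked, not yet deoptimized, and deoptimizable. Returns true if the
// method was processed by this call.
inline bool deoptimize_if_marked(NMethodModel& nm) {
  if (nm.marked_for_deoptimization && !nm.has_been_deoptimized &&
      nm.can_be_deoptimized) {
    nm.make_not_entrant();
    nm.make_deoptimized();
    return true;
  }
  return false;
}
```

In this model a second call on the same method is a no-op, which is the property the single combined condition is meant to provide.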
------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Mon Feb 20 09:31:31 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 09:31:31 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 22:29:06 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > src/hotspot/share/code/nmethod.cpp line 1166: > >> 1164: set_deoptimized_done(); >> 1165: return; >> 1166: } > > I don't understand this check-and-bailout logic at all. If continuations are > not enabled, then why does `nmethod::make_deoptimized()` do nothing? > This isn't an issue introduced by your patch, but I just happened to notice > when I was reading thru... We have two deopt mechanisms: * Patching the return PC in frames for nmethods that are no longer valid. This is done by handshaking all threads and walking their stacks. * Virtual thread stacks are only patched this way when mounted. If we have a million virtual threads (VTs), we can't walk the entire heap and patch the PC in all these stacks. Instead there is a nop sled after each call from an nmethod, so we patch these to jump to the deopt when returning into a deoptimized nmethod. Calls into the VM (causing the VT to pin) do not have these nop sleds, so when using Loom, both mechanisms are needed.
------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Mon Feb 20 09:37:30 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 09:37:30 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <2ZV0893_BOjAOXAvGOYYVbmRHxh3jXmnXusIiVhrgRk=.7c081275-b4e3-4173-97a0-28ecadf89152@github.com> On Mon, 20 Feb 2023 08:49:38 GMT, Volker Simonis wrote: > Sorry, but that's a really weird argument. You can't introduce a new public API and then blame people for using it! If the API is really only intended to support the internal implementation of a language feature than make it private and don't export it. The spec states: > API Note: > These linkage methods are designed to support the evaluation of lambda expressions and method references in the Java Language. The API has to be public as the call gets injected into application code. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From alanb at openjdk.org Mon Feb 20 09:42:26 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 20 Feb 2023 09:42:26 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <2ZV0893_BOjAOXAvGOYYVbmRHxh3jXmnXusIiVhrgRk=.7c081275-b4e3-4173-97a0-28ecadf89152@github.com> References: <2ZV0893_BOjAOXAvGOYYVbmRHxh3jXmnXusIiVhrgRk=.7c081275-b4e3-4173-97a0-28ecadf89152@github.com> Message-ID: On Mon, 20 Feb 2023 09:34:17 GMT, David Holmes wrote: > You can't introduce a new public API and then blame people for using it! If the API is really only intended to support the internal implementation of a language feature than make it private and don't export it. There are a number of APIs that exist to support the Java Language. LambdaMetafactory makes it clear in the first paragraph of its API docs that it supports the lambda expression and method reference expression features of the Java Language. 
If a Java compiler compiles these language constructs to invokedynamic instructions then the bootstrap methods have to be standard APIs, otherwise the class files won't work on all VM implementations. The APIs adding to j.l.runtime are similar. Maybe some advance tooling/framework might use them directly but it should be rare. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From tschatzl at openjdk.org Mon Feb 20 10:23:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:23:28 GMT Subject: RFR: 8302122: Parallelize TLAB retirement in prologue in G1 [v5] In-Reply-To: References: <8P_TVw6hR1Ass_t8G1ZpV44nD04zUK3vktrRJH7hBFs=.9cb55249-4db5-4245-8988-33fc0cb70693@github.com> Message-ID: On Mon, 20 Feb 2023 09:04:24 GMT, Kim Barrett wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix compilation on osx > > Looks good. Thanks for your review @kimbarrett @albertnetymk ------------- PR: https://git.openjdk.org/jdk/pull/12497 From tschatzl at openjdk.org Mon Feb 20 10:26:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:26:41 GMT Subject: Integrated: 8302122: Parallelize TLAB retirement in prologue in G1 In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 15:58:35 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that parallelizes TLAB retirement and log buffer flushing (which [JDK-8297611](https://bugs.openjdk.org/browse/JDK-8297611) suggests? > > * the parallelization has been split into java and non-java threads: parallelization of the latter does not make sense given how they are organized in a linked list. However non-java threads are typically very few anyway, and parallelization only starts making sense with low hundreds of threads anyway. 
> * `G1BarrierSet::write_region` added a new parameter that describes the JavaThread the write_region/invalidation (not sure why `write_region` isn't just an overload of `invalidate`) is made for. The reason is previously when `BarrierSet::make_parsable()` (not sure why this is called this way either) is called to flush out deferred card marks, the card marks generated by that were added to the *calling thread's* refinement queue. This worked because the calling thread (the worker thread/vm thread) has always been processed after all java threads (which are the only ones that can have deferred card marks). So these deferred card marks' refinement entries were put into the worker thread's refinement queue. After parallelization this is not possible any more unless we deferred non java thread log flushing until after all java threads - I did not want to do that, so the change passes that explicit java thread through to `G1BarrierSet::write_region`. I think this is clearer anyway, and corresponds to how `G1DirtyCardQueueSet::concatenate_log_and_stats` works. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 593bec68 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/593bec685eac13f02f0cbc5c0d38057a28255421 Stats: 403 lines in 17 files changed: 294 ins; 75 del; 34 mod 8302122: Parallelize TLAB retirement in prologue in G1 8297611: G1: Merge tlab and per-thread dirty card log flushing Reviewed-by: kbarrett, ayang ------------- PR: https://git.openjdk.org/jdk/pull/12497 From tschatzl at openjdk.org Mon Feb 20 10:31:57 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:31:57 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific [v2] In-Reply-To: References: Message-ID: <3k2Z1JFES-mFFr3rfbrFX_smLiA5tivN35RegWDXXrk=.8b47584e-131d-4f1a-a6c7-bbe8ff01a854@github.com> > Hi all, > > please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. Previously it had been used with CMS too, but that one's gone now. > > Testing: local compilation, gha > > Based on PR#12511 > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12531/files - new: https://git.openjdk.org/jdk/pull/12531/files/0631598e..0631598e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12531&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12531&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12531.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12531/head:pull/12531 PR: https://git.openjdk.org/jdk/pull/12531 From tschatzl at openjdk.org Mon Feb 20 10:50:13 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:50:13 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific [v3] In-Reply-To: References: Message-ID: > Hi all, > > please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. Previously it had been used with CMS too, but that one's gone now. > > Testing: local compilation, gha > > Based on PR#12511 > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains six commits: - Merge branch 'master' into 8302312-pargcrarevent-lock - rename - Improve G1MaxVerifyFailures bounds - kbarrett review, indentation - Fix compile error with G1MaxVerifyFailures - initial attempts, nothing done further cleanup more cleanup factor out changes that move the verifylivenessclosure fix compilation ------------- Changes: https://git.openjdk.org/jdk/pull/12531/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12531&range=02 Stats: 9 lines in 6 files changed: 2 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/12531.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12531/head:pull/12531 PR: https://git.openjdk.org/jdk/pull/12531 From tschatzl at openjdk.org Mon Feb 20 10:50:16 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:50:16 GMT Subject: RFR: 8302312: Make ParGCRareEvent_lock G1 specific [v2] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 07:52:31 GMT, Stefan Johansson wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - rename >> - Improve G1MaxVerifyFailures bounds >> - kbarrett review, indentation >> - Fix compile error with G1MaxVerifyFailures >> - initial attempts, nothing done >> >> further cleanup >> >> more cleanup >> >> factor out changes that move the verifylivenessclosure >> >> fix compilation > > Looks good and trivial. Thanks @kstefanj @kimbarrett for your reviews. 
------------- PR: https://git.openjdk.org/jdk/pull/12531 From tschatzl at openjdk.org Mon Feb 20 10:50:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 20 Feb 2023 10:50:18 GMT Subject: Integrated: 8302312: Make ParGCRareEvent_lock G1 specific In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 11:40:24 GMT, Thomas Schatzl wrote: > Hi all, > > please have a look at this renaming of ParGCRareEvent_lock to G1RareEvent_lock to align with it only being used in G1. Previously it had been used with CMS too, but that one's gone now. > > Testing: local compilation, gha > > Based on PR#12511 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 7c40c8af Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/7c40c8af690a238773e3070f16ec640b53581ee4 Stats: 9 lines in 6 files changed: 2 ins; 2 del; 5 mod 8302312: Make ParGCRareEvent_lock G1 specific Reviewed-by: sjohanss, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/12531 From luhenry at openjdk.org Mon Feb 20 10:51:37 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 20 Feb 2023 10:51:37 GMT Subject: Integrated: 8302776: RISC-V: Fix typo CSR_INSTERT to CSR_INSTRET In-Reply-To: <1nO3RE1e_4wr81-UlV-WXIEERv6vyYswpijNwVYBsYA=.d71b98a8-9d84-4728-98d4-5e970b55fbaa@github.com> References: <1nO3RE1e_4wr81-UlV-WXIEERv6vyYswpijNwVYBsYA=.d71b98a8-9d84-4728-98d4-5e970b55fbaa@github.com> Message-ID: On Fri, 17 Feb 2023 15:06:59 GMT, Ludovic Henry wrote: > 8302776: RISC-V: Fix typo CSR_INSTERT to CSR_INSTRET This pull request has now been integrated. 
Changeset: 303c61f3 Author: Ludovic Henry URL: https://git.openjdk.org/jdk/commit/303c61f3ca6b9cf6dd3f7dc700e9d0db04dc10e0 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8302776: RISC-V: Fix typo CSR_INSTERT to CSR_INSTRET Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/12616 From luhenry at openjdk.org Mon Feb 20 10:53:18 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 20 Feb 2023 10:53:18 GMT Subject: RFR: 8295382: Implement SHA-256 Intrinsic on RISC-V [v2] In-Reply-To: <1JWd-CDS_jpIDtfu7HJAVmvViShKzVTrCOxDVBZ9GSo=.904a8e56-794c-43d6-8448-de2a1a856f33@github.com> References: <1JWd-CDS_jpIDtfu7HJAVmvViShKzVTrCOxDVBZ9GSo=.904a8e56-794c-43d6-8448-de2a1a856f33@github.com> Message-ID: > This has been tested with patches currently being submitted to QEMU to add support for Zvkb and Zvknha extensions. > > The documentation for the Vector Crypto extension's instructions is available at https://github.com/riscv/riscv-crypto/tree/master/doc/vector/insns Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Merge branch 'master' into dev/ludovic/vector-crypto-sha - 8295382: Implement SHA-256 Intrinsic on RISC-V ------------- Changes: https://git.openjdk.org/jdk/pull/12208/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12208&range=01 Stats: 536 lines in 5 files changed: 530 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/12208.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12208/head:pull/12208 PR: https://git.openjdk.org/jdk/pull/12208 From inakonechnyy at openjdk.org Mon Feb 20 12:13:29 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Mon, 20 Feb 2023 12:13:29 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 03:13:08 GMT, David Holmes wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Address review notes >> - Correct the jtreg test - >> check stacktrace only for NoClassDefFoundError > > src/hotspot/share/classfile/javaClasses.cpp line 2783: > >> 2781: } >> 2782: >> 2783: Handle java_lang_Throwable::get_cause_simple(JavaThread* current, Handle throwable) { > > How about `get_cause_without_stacktrace`? > > The two methods should be able to share a lot of code rather than duplicating most of it. So you propose reworking a `java_lang_Throwable::get_cause_with_stack_trace` - instead of implementing a separate function `get_cause_simple`: add a parameter "Bool stacktrace_required", and manage to call Java depending on that parameter? 
------------- PR: https://git.openjdk.org/jdk/pull/12566 From inakonechnyy at openjdk.org Mon Feb 20 12:16:37 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Mon, 20 Feb 2023 12:16:37 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 03:14:12 GMT, David Holmes wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Address review notes >> - Correct the jtreg test - >> check stacktrace only for NoClassDefFoundError > > So if I understand correctly the scenario is: > > 1. During execution of clinit of class C a StackOverflowError is thrown. > 2. The VM sees the exception and creates the NoClassDefFoundError for C > 3. The VM calls `get_cause_with_stack_trace` to get the original SOE but that triggers a secondary SOE. > 4. The VM ignores the secondary SOE and the NCDFE is created without a cause. > > The fix is to retry step 3 avoiding the Java upcall and so avoiding a secondary SOE. > > Is that right? > > A few nits/suggestions below. > > Thanks @dholmes-ora Yes, you described the scenario correctly. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From lucy at openjdk.org Mon Feb 20 12:21:27 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 20 Feb 2023 12:21:27 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v4] In-Reply-To: References: Message-ID: > This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. > > This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. > > Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms.
Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. > > **Reviews and comments are very much appreciated.** > > @backwaterred could you please run some "official" s390 tests? Thanks. Lutz Schmidt has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Merge master to resolve copyright conflict - 829817: fixed typos, removed JIT_TIMER references - 8299817: Update copyright - 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls ------------- Changes: https://git.openjdk.org/jdk/pull/11967/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11967&range=03 Stats: 719 lines in 5 files changed: 509 ins; 64 del; 146 mod Patch: https://git.openjdk.org/jdk/pull/11967.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11967/head:pull/11967 PR: https://git.openjdk.org/jdk/pull/11967 From inakonechnyy at openjdk.org Mon Feb 20 12:21:29 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Mon, 20 Feb 2023 12:21:29 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 02:58:36 GMT, David Holmes wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Address review notes >> - Correct the jtreg test - >> check stacktrace only for NoClassDefFoundError > > test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 27: > >> 25: * @test >> 26: * @bug 8302491 >> 27: * @requires vm.compMode != "Xint" > > Why? The testcase is valid only if VM runs in Compiler mode. Do you think this annotator is redundant? 
------------- PR: https://git.openjdk.org/jdk/pull/12566 From rehn at openjdk.org Mon Feb 20 13:08:10 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 13:08:10 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v3] In-Reply-To: References: Message-ID: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All methods will already be marked, but the actual deoptimizing (making not entrant, patching PCs on stacks and patching post call nops) was not done yet, which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshake may take a long time to complete and Compile_lock is used for so much more than deopts, > the fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked (instead of adding Compile_lock to all places). > > This turned out to be problematic during startup; for example, the number of deopt handshakes in a jetty dry run startup went from 5 to 39. > > This fix first adds a barrier which you do not pass until the requested deopts have happened, and it coalesces the handshakes. > Secondly it moves the handshake part out of the Compile_lock where possible. > > This means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in the compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications.
> > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Review fixes 2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12585/files - new: https://git.openjdk.org/jdk/pull/12585/files/22dbc2b4..27d2dbd3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=01-02 Stats: 23 lines in 8 files changed: 6 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Mon Feb 20 13:37:40 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 20 Feb 2023 13:37:40 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Fri, 17 Feb 2023 19:58:06 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes > > src/hotspot/share/classfile/systemDictionary.cpp line 1505: > >> 1503: // Add to systemDictionary - so other classes can see it. >> 1504: // Grabs and releases SystemDictionary_lock >> 1505: update_dictionary(THREAD, k, loader_data); > > All these patterns are the same in class loading, except here we update the SystemDictionary after deoptimizing holding the Compile_lock and in your change we update it beforehand. > I was going to suggest future cleanup of moving add_to_hierarchy and the associated deopt_scope code to InstanceKlass, but now I'm not sure if this order matters. > Holding the Compile_lock while adding a class to the SystemDictionary and the code to hold the Compile_lock in ciEnv::get_klass_by_name() now doesn't make sense to me. 
Klass is read lock free in the dictionary and temporarily holding the Compile_lock doesn't seem to do what it says: > ``` > { // Grabbing the Compile_lock prevents systemDictionary updates > // during compilations. > > > At any rate, I think the SystemDictionary might need to be updated after deoptimization is done. Can you check this? Ok ------------- PR: https://git.openjdk.org/jdk/pull/12585 From jvernee at openjdk.org Mon Feb 20 14:13:27 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 20 Feb 2023 14:13:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v5] In-Reply-To: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> References: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> Message-ID: On Mon, 20 Feb 2023 09:18:46 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve condition again The new check looks good to me. I think the only potential issue is in cases where `unextended_sp == sp` and `sp - Interpreter::stackElementSize` is somehow outside of the stack. But, since this is looking at a sender, it would mean that the current frame would definitely be outside of the stack, so I think we're good. I'll submit tier 1-4 on our CI, and give a check mark when it comes back green. 
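The difference between the two suggested conditions for JDK-8302320 can be illustrated with a small standalone model. This is a hedged sketch, not HotSpot code: `StackModel`, `make_stack`, `check_plus_element`, `check_lowered_limit`, and the 8-byte `kStackElementSize` are hypothetical, and the range test assumes `is_in_stack_range_incl(adr, limit)` means "`adr` lies in `[limit, stack base)`". One deliberate deviation: where the real code would be expected to treat a limit outside the stack as an error, the model simply rejects the frame.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a downward-growing thread stack: valid addresses lie
// in [end, base), where base is the highest address (exclusive).
struct StackModel {
  std::uintptr_t base;
  std::uintptr_t end;

  bool contains(std::uintptr_t adr) const { return adr >= end && adr < base; }

  // Assumed meaning of the inclusive range check: adr in [limit, base).
  bool in_range_incl(std::uintptr_t adr, std::uintptr_t limit) const {
    return adr < base && adr >= limit;
  }
};

inline StackModel make_stack(std::uintptr_t end, std::uintptr_t base) {
  assert(end < base);
  return StackModel{base, end};
}

// Assumed size of one interpreter stack slot on a 64-bit platform.
constexpr std::uintptr_t kStackElementSize = 8;

// Shape of the [v4] suggestion: shift the address up by one element.
inline bool check_plus_element(const StackModel& s, std::uintptr_t unextended_sp,
                               std::uintptr_t sp) {
  return s.in_range_incl(unextended_sp + kStackElementSize, sp);
}

// Shape of the [v5] suggestion: lower the limit by one element instead.
// The model rejects the frame when the lowered limit leaves the stack.
inline bool check_lowered_limit(const StackModel& s, std::uintptr_t unextended_sp,
                                std::uintptr_t sp) {
  const std::uintptr_t limit = sp - kStackElementSize;
  return s.contains(limit) && s.in_range_incl(unextended_sp, limit);
}
```

With `sp` at the very end of the stack and `unextended_sp` one element below it, the first form accepts an address outside the reserved range while the second rejects it; for frames in the middle of the stack the two forms agree.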
------------- PR: https://git.openjdk.org/jdk/pull/12535 From coleenp at openjdk.org Mon Feb 20 15:00:26 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 20 Feb 2023 15:00:26 GMT Subject: RFR: 8301117: Remove old_size param from ResizeableResourceHashtable::resize() In-Reply-To: References: Message-ID: <6icBESpz5xp43cZK1QttmtKxL9Q6ow4psxWmN1OcZg8=.d65f2630-b0ff-4bea-9ab7-79d48a6dc20c@github.com> On Sat, 18 Feb 2023 15:31:46 GMT, Afshin Zafari wrote: > `old_size` is removed from the function parameters in definitions and calls. > `table_size` is used as old size. > ### Test > mach5 tiers 1-5. Looks good! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12635 From rkennke at openjdk.org Mon Feb 20 15:16:37 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 20 Feb 2023 15:16:37 GMT Subject: RFR: 8302070: Factor null-check into load_klass() calls In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 12:55:29 GMT, Roman Kennke wrote: > In several asm routines, a pattern like the following can be found: > > __ null_check(obj, oopDesc::klass_offset_in_bytes()); > __ load_klass(klass, obj, tmp); > > The null-check logically belongs into load_klass(), though. This becomes apparent in Lilliput, where the klass-offset is different, depending on whether or not Lilliput is enabled. > > Testing: > - [x] tier1 x86_64 > - [x] tier1 aarch64 Thanks! 
------------- PR: https://git.openjdk.org/jdk/pull/12473 From rkennke at openjdk.org Mon Feb 20 15:16:38 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 20 Feb 2023 15:16:38 GMT Subject: Integrated: 8302070: Factor null-check into load_klass() calls In-Reply-To: References: Message-ID: On Wed, 8 Feb 2023 12:55:29 GMT, Roman Kennke wrote: > In several asm routines, a pattern like the following can be found: > > __ null_check(obj, oopDesc::klass_offset_in_bytes()); > __ load_klass(klass, obj, tmp); > > The null-check logically belongs into load_klass(), though. This becomes apparent in Lilliput, where the klass-offset is different, depending on whether or not Lilliput is enabled. > > Testing: > - [x] tier1 x86_64 > - [x] tier1 aarch64 This pull request has now been integrated. Changeset: 7cf7e0a2 Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/7cf7e0a20b37522b0c7a97e5269bcd2eed174dbe Stats: 89 lines in 28 files changed: 41 ins; 25 del; 23 mod 8302070: Factor null-check into load_klass() calls Reviewed-by: phh, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/12473 From eosterlund at openjdk.org Mon Feb 20 15:32:13 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 20 Feb 2023 15:32:13 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers Message-ID: So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough: we need barriers for the actual copying, but in turn don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers.
------------- Commit messages: - 8302780: Add support for vectorized arraycopy GC barriers Changes: https://git.openjdk.org/jdk/pull/12670/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302780 Stats: 1168 lines in 11 files changed: 859 ins; 26 del; 283 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Mon Feb 20 15:40:28 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 20 Feb 2023 15:40:28 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. RISC-V code contributed by @yadongw - thanks for that! ------------- PR: https://git.openjdk.org/jdk/pull/12670 From lucy at openjdk.org Mon Feb 20 16:44:35 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 20 Feb 2023 16:44:35 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v4] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 12:21:27 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code.
Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge master to resolve copyright conflict > - 829817: fixed typos, removed JIT_TIMER references > - 8299817: Update copyright > - 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls Since this is an s390-only PR, the failures in the arm build and the linux-x86 tests are unrelated. ------------- PR: https://git.openjdk.org/jdk/pull/11967 From dholmes at openjdk.org Mon Feb 20 22:14:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 22:14:27 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 12:18:08 GMT, Ilarion Nakonechnyy wrote: >> test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 27: >> >>> 25: * @test >>> 26: * @bug 8302491 >>> 27: * @requires vm.compMode != "Xint" >> >> Why? > > The testcase is valid only if VM runs in Compiler mode. > Do you think this annotator is redundant? The failure scenario seems independent of compilation mode. And the testcase seems like it should also work in any compilation mode.
------------- PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Mon Feb 20 22:20:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 22:20:27 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 12:10:57 GMT, Ilarion Nakonechnyy wrote: >> src/hotspot/share/classfile/javaClasses.cpp line 2783: >> >>> 2781: } >>> 2782: >>> 2783: Handle java_lang_Throwable::get_cause_simple(JavaThread* current, Handle throwable) { >> >> How about `get_cause_without_stacktrace`? >> >> The two methods should be able to share a lot of code rather than duplicating most of it. > > So you propose reworking a `java_lang_Throwable::get_cause_with_stack_trace` - instead of implementing a separate function `get_cause_simple`: add a parameter "Bool stacktrace_required", and manage to call Java depending on that parameter? That would be one option. There is a lot of overlap with these two methods. Also I think your new method has to handle this case too: // If new_exception returns a different exception while creating the exception, return null. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From inakonechnyy at openjdk.org Mon Feb 20 23:35:26 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Mon, 20 Feb 2023 23:35:26 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 22:17:34 GMT, David Holmes wrote: > Also I think your new method has to handle this case too > / If new_exception returns a different exception while creating the exception, return null. I'm afraid that in this case ( if I return null ), the original cause again will be missed from the stacktrace. 
------------- PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Mon Feb 20 23:59:26 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 20 Feb 2023 23:59:26 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 23:32:09 GMT, Ilarion Nakonechnyy wrote: >> That would be one option. There is a lot of overlap with these two methods. Also I think your new method has to handle this case too: >> >> // If new_exception returns a different exception while creating the exception, return null. > >> Also I think your new method has to handle this case too >> / If new_exception returns a different exception while creating the exception, return null. > > I'm afraid that in this case ( if I return null ), the original cause again will be missed from the stacktrace. If `new_exception` returns a different exception then you have already lost that. The `ExceptionInInitializerError` would have been replaced by some other exception object: most likely OOME or even another SOE? ------------- PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Tue Feb 21 00:28:21 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 00:28:21 GMT Subject: RFR: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 Message-ID: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. Tested with local ARM32 build. Thanks. 
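As a small illustration of why `0` fits an `intptr_t` expression where `nullptr` does not, consider the sketch below. It is not the code from the fix: `is_unset` is a hypothetical helper, and the only assumption is that the platform provides `std::intptr_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// nullptr has its own type, std::nullptr_t, which converts implicitly to
// pointer types but not to integer types. An initialization such as
// `std::intptr_t v = nullptr;` is therefore rejected by a conforming compiler.
static_assert(!std::is_convertible<std::nullptr_t, std::intptr_t>::value,
              "nullptr does not implicitly convert to intptr_t");

// A plain integer zero converts without any cast.
static_assert(std::is_convertible<int, std::intptr_t>::value,
              "0 converts to intptr_t");

// Hypothetical helper: treats a raw integer value of zero as "no value".
inline bool is_unset(std::intptr_t raw) {
    return raw == 0;  // compare against the integer 0, not nullptr
}
```

Writing `0` keeps the expression entirely in the integer domain, so no cast is needed.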
------------- Commit messages: - 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 Changes: https://git.openjdk.org/jdk/pull/12679/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12679&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302905 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12679.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12679/head:pull/12679 PR: https://git.openjdk.org/jdk/pull/12679 From mikael at openjdk.org Tue Feb 21 00:39:34 2023 From: mikael at openjdk.org (Mikael Vidstedt) Date: Tue, 21 Feb 2023 00:39:34 GMT Subject: RFR: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 In-Reply-To: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> References: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> Message-ID: <1Y55-W9VPmBBRuE8isEfzYHCgXBcAd7tCxYRRQIH3Yo=.d4bc4ce3-41ab-4393-add6-4fb93d92e4e9@github.com> On Tue, 21 Feb 2023 00:20:51 GMT, David Holmes wrote: > Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. > > Tested with local ARM32 build. > > Thanks. Marked as reviewed by mikael (Reviewer). 
------------- PR: https://git.openjdk.org/jdk/pull/12679 From dholmes at openjdk.org Tue Feb 21 00:44:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 00:44:25 GMT Subject: RFR: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 In-Reply-To: <1Y55-W9VPmBBRuE8isEfzYHCgXBcAd7tCxYRRQIH3Yo=.d4bc4ce3-41ab-4393-add6-4fb93d92e4e9@github.com> References: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> <1Y55-W9VPmBBRuE8isEfzYHCgXBcAd7tCxYRRQIH3Yo=.d4bc4ce3-41ab-4393-add6-4fb93d92e4e9@github.com> Message-ID: On Tue, 21 Feb 2023 00:36:38 GMT, Mikael Vidstedt wrote: >> Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. >> >> Tested with local ARM32 build. >> >> Thanks. > > Marked as reviewed by mikael (Reviewer). Thanks @vidmik ! ------------- PR: https://git.openjdk.org/jdk/pull/12679 From dholmes at openjdk.org Tue Feb 21 00:50:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 00:50:24 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) Seems quite reasonable. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12550 From dholmes at openjdk.org Tue Feb 21 00:50:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 00:50:25 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Sat, 18 Feb 2023 01:33:16 GMT, Serguei Spitsyn wrote: >> Thank you for looking at this PR, David! >> Please, note a disabler at L938. A safepoint can be reached in its destructor. >> Also, see the comment at L952: >> >> 937 { >> 938 JvmtiVTMSTransitionDisabler disabler(true); >> 939 ThreadsListHandle tlh(current); >> 940 >> 941 err = get_threadOop_and_JavaThread(tlh.list(), thread, &java_thread, &thread_oop); >> 942 if (err != JVMTI_ERROR_NONE) { >> 943 return err; >> 944 } >> 945 >> 946 // Do not use JvmtiVTMSTransitionDisabler in context of self suspend to avoid deadlocks. >> 947 if (java_thread != current) { >> 948 err = suspend_thread(thread_oop, java_thread, /* single_suspend */ true, nullptr); >> 949 return err; >> 950 } >> 951 // protect thread_oop as a safepoint can be reached in disabler destructor >> 952 self_tobj = Handle(current, thread_oop); >> 953 } > >> it means there are now many places that can block for safepoints that previously wouldn't and it is very hard to tell if that is okay or not. > > We partially rely on testing and good test coverage here. Okay we can see how this goes.
------------- PR: https://git.openjdk.org/jdk/pull/12550 From martin at openjdk.org Tue Feb 21 01:16:27 2023 From: martin at openjdk.org (Martin Buchholz) Date: Tue, 21 Feb 2023 01:16:27 GMT Subject: RFR: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 In-Reply-To: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> References: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> Message-ID: On Tue, 21 Feb 2023 00:20:51 GMT, David Holmes wrote: > Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. > > Tested with local ARM32 build. > > Thanks. This fixes my build. I would have preferred using ((intptr_t)0) instead of 0. BTW: hotspot is full of type-punning code with undefined behavior. TODO: convert hotspot to use bit_cast https://en.cppreference.com/w/cpp/numeric/bit_cast ------------- Marked as reviewed by martin (Reviewer). PR: https://git.openjdk.org/jdk/pull/12679 From dholmes at openjdk.org Tue Feb 21 01:20:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 01:20:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v5] In-Reply-To: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> References: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> Message-ID: On Mon, 20 Feb 2023 09:18:46 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve condition again The latest version seems a more conservative constraint, so as long as that fixes things it seems good to me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12535 From dholmes at openjdk.org Tue Feb 21 01:25:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 01:25:33 GMT Subject: RFR: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 In-Reply-To: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> References: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> Message-ID: On Tue, 21 Feb 2023 00:20:51 GMT, David Holmes wrote: > Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. > > Tested with local ARM32 build. > > Thanks. Thanks for the review @martin-buchholz. It may be a while before we can adopt something from C++20. ------------- PR: https://git.openjdk.org/jdk/pull/12679 From dholmes at openjdk.org Tue Feb 21 01:25:34 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 01:25:34 GMT Subject: Integrated: 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 In-Reply-To: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> References: <0VwQBMeWtaJorq5TtyKHgZaFVvTzf3eqPKHhYlI7-PY=.e14cae1b-c3fd-4416-8751-27f0b76e6465@github.com> Message-ID: On Tue, 21 Feb 2023 00:20:51 GMT, David Holmes wrote: > Trivial fix that replaces `nullptr` with 0 in `intptr_t` expressions. This seems more appropriate than casting `nullptr` to `intptr_t`. > > Tested with local ARM32 build. > > Thanks. This pull request has now been integrated. 
Changeset: 91a2b5ec Author: David Holmes URL: https://git.openjdk.org/jdk/commit/91a2b5ec6f90b9895924a49319c2c6b7007d96bd Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod 8302905: arm32 Raspberry Pi OS build broken by JDK-8301494 Reviewed-by: mikael, martin ------------- PR: https://git.openjdk.org/jdk/pull/12679 From dholmes at openjdk.org Tue Feb 21 01:58:30 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 01:58:30 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. 
So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. > > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. > > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). So IIUC ... the top-level crash handlers on Windows will honour `UseOSErrorReporting` by passing `ARA:Return` if needed. 
But all other exit paths, such as `report_vm_out_of_memory` will now never return even if UOER is true, and to try and compensate for that you call `BREAKPOINT` in that case before actually dying. In terms of behavioural change I'm not at all sure what impact that might have. I wonder if we could get @mo-beck to assess this change? src/hotspot/share/utilities/vmError.cpp line 1368: > 1366: filename, lineno, size); > 1367: // Shouldn't be able to get here. Die immediately if we do. > 1368: os::die(); Should we try to report this somehow? ------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Tue Feb 21 05:16:25 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 21 Feb 2023 05:16:25 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 01:56:04 GMT, David Holmes wrote: > So IIUC ... the top-level crash handlers on Windows will honour `UseOSErrorReporting` by passing `ARA:Return` if needed. But all other exit paths, such as `report_vm_out_of_memory` will now never return even if UOER is true, and to try and compensate for that you call `BREAKPOINT` in that case before actually dying. > > In terms of behavioural change I'm not at all sure what impact that might have. What they used to do when the option is true was return and execute the BREAKPOINT immediately following the call. Now they don't return, executing BREAKPOINT instead. Note that on Windows BREAKPOINT => DebugBreak(), which raises a breakpoint structured exception, which (in the absence of an attached debugger) walks up the handler chain (trying to report along the way, but we've already reported), until becoming an unhandled exception and terminating. So I think it is the same behavior as before, just with the BREAKPOINT occurring in a slightly different place. > I wonder if we could get @mo-beck to assess this change? I've already emailed her, asking for help.
> src/hotspot/share/utilities/vmError.cpp line 1368: > >> 1366: filename, lineno, size); >> 1367: // Shouldn't be able to get here. Die immediately if we do. >> 1368: os::die(); > > Should we try to report this somehow? I don't really see how, without being significantly at risk of recursive errors. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From rrich at openjdk.org Tue Feb 21 07:15:28 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 21 Feb 2023 07:15:28 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v5] In-Reply-To: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> References: <0Q8kPIrJXF4qROlS6u3dwT0zzO6wpezV4XkG_aiQ1uc=.ec15567a-3e9a-459d-a8aa-0fe188111bd6@github.com> Message-ID: On Mon, 20 Feb 2023 09:18:46 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Improve condition again Looks good to me. Richard. ------------- Marked as reviewed by rrich (Reviewer). PR: https://git.openjdk.org/jdk/pull/12535 From epeter at openjdk.org Tue Feb 21 07:16:34 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 21 Feb 2023 07:16:34 GMT Subject: RFR: 8302146: Move TestOverloadCompileQueues.java to tier3 In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 07:32:03 GMT, Tobias Hartmann wrote: >> This test is expected to run long, as it has to fill the compile queue. But it does not need to run at a low tier. We just want to run it occasionally. >> >> Moved it to `tier3`, by adding it to `hotspot_slow_compiler`, which is excluded from `tier1` and `tier2`. > > Looks good. Thanks @TobiHartmann @vnkozlov for the reviews! 
------------- PR: https://git.openjdk.org/jdk/pull/12617 From epeter at openjdk.org Tue Feb 21 07:16:37 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 21 Feb 2023 07:16:37 GMT Subject: Integrated: 8302146: Move TestOverloadCompileQueues.java to tier3 In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 15:08:38 GMT, Emanuel Peter wrote: > This test is expected to run long, as it has to fill the compile queue. But it does not need to run at a low tier. We just want to run it occasionally. > > Moved it to `tier3`, by adding it to `hotspot_slow_compiler`, which is excluded from `tier1` and `tier2`. This pull request has now been integrated. Changeset: 17274c72 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/17274c72a962e8ee3afed72b38ed72aa20dd2ae0 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8302146: Move TestOverloadCompileQueues.java to tier3 Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/12617 From rehn at openjdk.org Tue Feb 21 08:08:00 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 21 Feb 2023 08:08:00 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. > The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. 
(instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Coleen fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12585/files - new: https://git.openjdk.org/jdk/pull/12585/files/27d2dbd3..1cbe4937 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=02-03 Stats: 61 lines in 5 files changed: 8 ins; 26 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Tue Feb 21 08:08:51 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 21 Feb 2023 08:08:51 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v3] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 13:08:10 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. 
>> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Review fixes 2 @coleenp was these the changes you were looking for? Passes t1-t7 ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Tue Feb 21 08:23:28 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 08:23:28 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 05:13:27 GMT, Kim Barrett wrote: > What they used to do when the option is true was return and execute the BREAKPOINT immmediately following the call. Sorry I'm having trouble finding that existing BREAKPOINT >> src/hotspot/share/utilities/vmError.cpp line 1368: >> >>> 1366: filename, lineno, size); >>> 1367: // Shouldn't be able to get here. Die immediately if we do. 
>>> 1368: os::die(); >> >> Should we try to report this somehow? > > I don't really see how, without being significantly at risk of recursive errors. tty->print_cr or even UL if possible? ------------- PR: https://git.openjdk.org/jdk/pull/12651 From duke at openjdk.org Tue Feb 21 08:58:50 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Tue, 21 Feb 2023 08:58:50 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v6] In-Reply-To: References: Message-ID: > Extends the existing AsyncGetCallTrace test case and fixes the issue. Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Update full name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12535/files - new: https://git.openjdk.org/jdk/pull/12535/files/cd85c6a9..42bdfd8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12535&range=04-05 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12535.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12535/head:pull/12535 PR: https://git.openjdk.org/jdk/pull/12535 From sspitsyn at openjdk.org Tue Feb 21 09:31:26 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 09:31:26 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) David, thank you for review! 
------------- PR: https://git.openjdk.org/jdk/pull/12550 From yzheng at openjdk.org Tue Feb 21 09:32:26 2023 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 21 Feb 2023 09:32:26 GMT Subject: RFR: 8302452: [JVMCI] Export _poly1305_processBlocks, JfrThreadLocal fields to JVMCI compiler. In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 15:13:05 GMT, Yudi Zheng wrote: > This PR allows JVMCI compiler intrinsics to reuse the _poly1305_processBlocks stub and to update JfrThreadLocal fields on `Thread.setCurrentThread` events. @vnkozlov could you please review this patch? Many thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12560 From rehn at openjdk.org Tue Feb 21 10:11:34 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 21 Feb 2023 10:11:34 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: On Mon, 20 Feb 2023 13:34:50 GMT, Robbin Ehn wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 1505: >> >>> 1503: // Add to systemDictionary - so other classes can see it. >>> 1504: // Grabs and releases SystemDictionary_lock >>> 1505: update_dictionary(THREAD, k, loader_data); >> >> All these patterns are the same in class loading, except here we update the SystemDictionary after deoptimizing holding the Compile_lock and in your change we update it beforehand. >> I was going to suggest future cleanup of moving add_to_hierarchy and the associated deopt_scope code to InstanceKlass, but now I'm not sure if this order matters. >> Holding the Compile_lock while adding a class to the SystemDictionary and the code to hold the Compile_lock in ciEnv::get_klass_by_name() now doesn't make sense to me. 
Klass is read lock free in the dictionary and temporarily holding the Compile_lock doesn't seem to do what it says: >> ``` >> { // Grabbing the Compile_lock prevents systemDictionary updates >> // during compilations. >> >> >> At any rate, I think the SystemDictionary might need to be updated after deoptimization is done. Can you check this? > > Ok Please see new commit, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rkennke at openjdk.org Tue Feb 21 11:10:48 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 21 Feb 2023 11:10:48 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> References: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> Message-ID: On Wed, 15 Feb 2023 14:40:20 GMT, Stefan Karlsson wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Clarify comment on arrayOopDesc::max_array_length() > > src/hotspot/share/gc/z/zObjArrayAllocator.cpp line 63: > >> 61: assert(is_aligned(base_offset, HeapWordSize), "remaining array base must be 64 bit aligned"); >> 62: >> 63: const size_t header = heap_word_size(base_offset); > > `base_offset` has already been aligned by the code above, so we don't have to call heap_word_size here and can simply divide with HeapWordSize. Regarding MemAllocator::obj_memory_range(): why do we even need to exclude array header from the bad-heap-word-check, when not zeroing? It seems reasonable to also check the header word(s) in any case. I'd just remove ObjArrayAllocator::obj_memory_range() and always use the full object range, as implemented by MemAllocator::obj_memory_range(). 
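A small aside on the quoted `heap_word_size` remark above, as a minimal sketch rather than the actual HotSpot code. It assumes 8-byte heap words and models the helper as a plain round-up division; `kHeapWordSize`, `heap_word_size_sketch` and `words_for_aligned` are hypothetical names used only for illustration. The point is that once the offset is known to be word aligned, rounding up and dividing directly give the same word count.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 8-byte heap words, as on a typical 64-bit VM.
constexpr std::size_t kHeapWordSize = 8;

// Hypothetical model of a "bytes to heap words" helper: rounds up to whole words.
inline std::size_t heap_word_size_sketch(std::size_t byte_size) {
  return (byte_size + kHeapWordSize - 1) / kHeapWordSize;
}

// For an offset that is already word aligned, plain division is sufficient.
inline std::size_t words_for_aligned(std::size_t byte_size) {
  assert(byte_size % kHeapWordSize == 0 && "offset must be word aligned");
  return byte_size / kHeapWordSize;
}
```

For an aligned value such as 16 both functions return 2; they only differ for unaligned inputs (for example 20 bytes rounds up to 3 words), which is exactly the case the preceding alignment step rules out.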
------------- PR: https://git.openjdk.org/jdk/pull/11044 From dnsimon at openjdk.org Tue Feb 21 11:18:25 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 21 Feb 2023 11:18:25 GMT Subject: RFR: 8302452: [JVMCI] Export _poly1305_processBlocks, JfrThreadLocal fields to JVMCI compiler. In-Reply-To: References: Message-ID: <5y2Kgsd_bme6GrO7elI9s-P4RG9POHutpp2S39MiOJI=.964aa5ca-2757-47a4-a798-e4a88a5faee9@github.com> On Tue, 14 Feb 2023 15:13:05 GMT, Yudi Zheng wrote: > This PR allows JVMCI compiler intrinsics to reuse the _poly1305_processBlocks stub and to update JfrThreadLocal fields on `Thread.setCurrentThread` events. Marked as reviewed by dnsimon (Committer). ------------- PR: https://git.openjdk.org/jdk/pull/12560 From thartmann at openjdk.org Tue Feb 21 11:31:00 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Tue, 21 Feb 2023 11:31:00 GMT Subject: RFR: 8289552: Make intrinsic conversions between bit representations of half precision values and floats [v13] In-Reply-To: References: <66be8SJdxPOqmqsQ1YIwS4zM4GwPerypGIf8IbfxhRs=.1d03c94a-f3e5-40ae-999e-bdd5f328170d@github.com> Message-ID: On Tue, 11 Oct 2022 17:00:53 GMT, Smita Kamath wrote: >> I started new testing. > > @vnkozlov Thank you for reviewing the patch. There is an issue with the `_floatToFloat16` intrinsic, leading to incorrect results: [JDK-8302976](https://bugs.openjdk.org/browse/JDK-8302976). @smita-kamath, could you please have a look? Thanks. ------------- PR: https://git.openjdk.org/jdk/pull/9781 From inakonechnyy at openjdk.org Tue Feb 21 11:31:31 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Tue, 21 Feb 2023 11:31:31 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 23:56:53 GMT, David Holmes wrote: >>> Also I think your new method has to handle this case too >>> / If new_exception returns a different exception while creating the exception, return null. 
>> >> I'm afraid that in this case ( if I return null ), the original cause again will be missed from the stacktrace. > > If `new_exception` returns a different exception then you have already lost that. The `ExceptionInInitializerError` would have been replaced by some other exception object: most likely OOME or even another SOE? Is it poor, to report OOME or SOE as the cause, instead of `ExceptionInInitializerError`? Especially when the first call to `get_cause_with_stack_trace` already has returned null. Actually, this is the intention of the fix - to report the SOE as the cause of NCDFE. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From inakonechnyy at openjdk.org Tue Feb 21 12:12:56 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Tue, 21 Feb 2023 12:12:56 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: > The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. > > Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. > > Jtreg for reproducing the issue also was added to the commit. > The commit was tested with tier1 tests. 
Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: - Get rid of redundant code - merge get_cause_with_stack_trace and get_cause_simple - Review corrections ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12566/files - new: https://git.openjdk.org/jdk/pull/12566/files/f2e5b362..6df8bb1a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=01-02 Stats: 52 lines in 5 files changed: 6 ins; 23 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/12566.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12566/head:pull/12566 PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Tue Feb 21 12:31:27 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 12:31:27 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v2] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 11:29:10 GMT, Ilarion Nakonechnyy wrote: >> If `new_exception` returns a different exception then you have already lost that. The `ExceptionInInitializerError` would have been replaced by some other exception object: most likely OOME or even another SOE? > > Is it poor, to report OOME or SOE as the cause, instead of `ExceptionInInitializerError`? > Especially when the first call to `get_cause_with_stack_trace` already has returned null. > Actually, this is the intention of the fix - to report the SOE as the cause of NCDFE. If an exception happens during clinit then it is caught and set as the cause of an ExceptionInInitializerError and the class is marked as erroneous. Subsequent attempts to load the class should throw NCDFE with the EIIE as the cause. Sorry I've now confused myself about exactly what the problem scenario is and how we are trying to fix it. 
The first attempt to call `get_cause_with_stack_trace` fails due to SOE - but what "cause" are we trying to get here and what exception are we trying to throw? I need to take a deeper look at this again tomorrow and get the control flow sorted out. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From jvernee at openjdk.org Tue Feb 21 12:58:27 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 21 Feb 2023 12:58:27 GMT Subject: RFR: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test [v6] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:58:50 GMT, Johannes Bechberger wrote: >> Extends the existing AsyncGetCallTrace test case and fixes the issue. > > Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: > > Update full name Testing looks good ------------- Marked as reviewed by jvernee (Reviewer). PR: https://git.openjdk.org/jdk/pull/12535 From rkennke at openjdk.org Tue Feb 21 13:17:34 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 21 Feb 2023 13:17:34 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v24] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 40 commits: - Eliminate oopDesc::header_size() - Merge branch 'master' into JDK-8139457 - Clarify comment on arrayOopDesc::max_array_length() - Remove debug output - Fix assert in collectedHeap - Add asserts in and around arrayOop::max_array_length() - Fix intendation in test - Remove stale method - Alexey's comments - Use uint64_t - ... and 30 more: https://git.openjdk.org/jdk/compare/622f5604...017a836b ------------- Changes: https://git.openjdk.org/jdk/pull/11044/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=23 Stats: 750 lines in 47 files changed: 505 ins; 146 del; 99 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From duke at openjdk.org Tue Feb 21 13:51:10 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Tue, 21 Feb 2023 13:51:10 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux [v2] In-Reply-To: References: Message-ID: > I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). > I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. > The patch (and former GCC performance regression) affects only x86_64+i686. Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: 8302191: Performance degradation for float/double modulo on Linux ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12508/files - new: https://git.openjdk.org/jdk/pull/12508/files/bf445751..d00f1a5a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12508&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12508&range=00-01 Stats: 52354 lines in 1136 files changed: 20657 ins; 15866 del; 15831 mod Patch: https://git.openjdk.org/jdk/pull/12508.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12508/head:pull/12508 PR: https://git.openjdk.org/jdk/pull/12508 From duke at openjdk.org Tue Feb 21 13:51:12 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Tue, 21 Feb 2023 13:51:12 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 06:27:38 GMT, David Holmes wrote: > adding to test/micro/org/openjdk/bench/vm/floating-point. I have used `floatingpoint` as the dash is not permitted: https://docs.oracle.com/javase/tutorial/java/package/namingpkgs.html ------------- PR: https://git.openjdk.org/jdk/pull/12508 From duke at openjdk.org Tue Feb 21 14:36:47 2023 From: duke at openjdk.org (Johannes Bechberger) Date: Tue, 21 Feb 2023 14:36:47 GMT Subject: Integrated: JDK-8302320: AsyncGetCallTrace obtains too few frames in sanity test In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 14:39:00 GMT, Johannes Bechberger wrote: > Extends the existing AsyncGetCallTrace test case and fixes the issue. This pull request has now been integrated. 
Changeset: db483a38 Author: Johannes Bechberger Committer: Jorn Vernee URL: https://git.openjdk.org/jdk/commit/db483a38a815f85bd9668749674b5f0f6e4b27b4 Stats: 39 lines in 2 files changed: 36 ins; 0 del; 3 mod 8302320: AsyncGetCallTrace obtains too few frames in sanity test Reviewed-by: jvernee, dholmes, rrich ------------- PR: https://git.openjdk.org/jdk/pull/12535 From coleenp at openjdk.org Tue Feb 21 14:48:28 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 14:48:28 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) I had a couple of nits but otherwise this looks like a good change. src/hotspot/share/prims/jvmtiEnv.cpp line 932: > 930: JavaThread* current = JavaThread::current(); > 931: HandleMark hm(current); > 932: Handle self_tobj = Handle(current, nullptr); This doesn't have to have the assignment, which the compiler will optimize away anyway. Handle self_tobj; does the same thing. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12550 From coleenp at openjdk.org Tue Feb 21 14:48:29 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 14:48:29 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:38:21 GMT, Coleen Phillimore wrote: >> The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. >> The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. 
>> >> Testing: >> mach5 jobs are in progress: >> Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) > > src/hotspot/share/prims/jvmtiEnv.cpp line 932: > >> 930: JavaThread* current = JavaThread::current(); >> 931: HandleMark hm(current); >> 932: Handle self_tobj = Handle(current, nullptr); > > This doesn't have to have the assignment, which the compiler will optimize away anyway. > > Handle self_tobj; > > does the same thing. Both local variables java_thread and thread_oop should be declared before line 939 so we don't have to worry about the oop being unhandled and java_thread is only used locally. ------------- PR: https://git.openjdk.org/jdk/pull/12550 From mbaesken at openjdk.org Tue Feb 21 14:53:10 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 21 Feb 2023 14:53:10 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks Message-ID: The cds only coding in hotspot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. However at some places INCLUDE_CDS is missing and should be added. One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros ? 
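For readers less familiar with the pattern being discussed, here is a hedged sketch of what guarding CDS-only code with the `INCLUDE_CDS` macro looks like. It is not taken from the PR: `use_shared_spaces` and `describe_archive_state` are hypothetical stand-ins, and the macro is defaulted to 1 here only so that the snippet compiles on its own (in a real build the configure flags decide its value).

```cpp
#include <cassert>
#include <string>

// Assumption: the build defines INCLUDE_CDS as 0 or 1; default to 1 for this sketch.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 1
#endif

// Hypothetical stand-in for a runtime flag such as UseSharedSpaces.
static bool use_shared_spaces = true;

// Hypothetical function: the CDS-specific branch is compiled out when INCLUDE_CDS is 0.
inline std::string describe_archive_state() {
#if INCLUDE_CDS
  if (use_shared_spaces) {
    return "shared archive mapped";
  }
#endif
  return "no shared archive";
}
```

With `INCLUDE_CDS` set to 0 the flag check disappears from the compiled code entirely, so the function always takes the fallback path regardless of the runtime flag.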
------------- Commit messages: - JDK-8302989 Changes: https://git.openjdk.org/jdk/pull/12691/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12691&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302989 Stats: 55 lines in 5 files changed: 22 ins; 4 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/12691.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12691/head:pull/12691 PR: https://git.openjdk.org/jdk/pull/12691 From coleenp at openjdk.org Tue Feb 21 14:53:26 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 14:53:26 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: <3PPkXDXtcUMLVD3xYuoFP0TobqmB1sz-piqpdqs84y0=.57906c0a-e5e8-44f4-a052-7da5ff0c639d@github.com> References: <3PPkXDXtcUMLVD3xYuoFP0TobqmB1sz-piqpdqs84y0=.57906c0a-e5e8-44f4-a052-7da5ff0c639d@github.com> Message-ID: On Wed, 15 Feb 2023 01:40:09 GMT, Ilarion Nakonechnyy wrote: >> test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 26: >> >>> 24: /** >>> 25: * @test >>> 26: * @bug 1234567 >> >> Fix the bug number. >> Do you need a @requires to run with -Xcomp? > > Do you propose to use something like > `@requires vm.compMode != "Xint"` No just leave it unspecified. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From coleenp at openjdk.org Tue Feb 21 15:09:27 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 15:09:27 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 12:12:56 GMT, Ilarion Nakonechnyy wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. 
The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception.
>>
>> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function.
>>
>> Jtreg for reproducing the issue also was added to the commit.
>> The commit was tested with tier1 tests.
>
> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision:
>
> - Get rid of redundant code -
>   merge get_cause_with_stack_trace and get_cause_simple
> - Review corrections

Changes requested by coleenp (Reviewer).

src/hotspot/share/classfile/javaClasses.cpp line 2738:

> 2736: }
> 2737:
> 2738: Handle java_lang_Throwable::get_cause(Handle throwable, bool with_stack_trace, TRAPS) {

I see, you just reversed the code so it creates the message first, then calls getStackTrace.
If you change it to not have TRAPS, and use JavaThread* current as the first parameter, then have as the first line to show that the exception isn't propagated out of this function.

    JavaThread* THREAD = current;
    create the message
    {
      call getStackTrace();
      if (PENDING_EXCEPTION) CLEAR_PENDING_EXCEPTION;
      else set_stacktrace();
      clear_backtrace();
    }
    return h_cause;

Something like that. Then the caller doesn't need to check for null or pending exception.
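To make the suggested control flow a bit more concrete, here is a standalone toy model of it. This is only a sketch under loud assumptions: `ThreadSketch`, `CauseSketch`, `fetch_stack_trace` and `get_cause_sketch` are all hypothetical and do not exist in HotSpot; the pending-exception flag merely imitates what `PENDING_EXCEPTION` and `CLEAR_PENDING_EXCEPTION` do conceptually. It shows the helper building the message first and swallowing any failure from the stack trace call, so nothing escapes to the caller.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a thread that can carry a pending exception.
struct ThreadSketch {
  bool pending_exception = false;
};

// Hypothetical stand-in for the saved cause of an initialization error.
struct CauseSketch {
  std::string message;
  bool has_stack_trace = false;
};

// Simulates the upcall that fetches the stack trace; on failure it leaves
// a pending exception on the thread (think StackOverflowError or OOME).
inline bool fetch_stack_trace(ThreadSketch* current, bool simulate_failure) {
  if (simulate_failure) {
    current->pending_exception = true;
    return false;
  }
  return true;
}

// Builds the cause. The message is created first; a failure while fetching
// the stack trace is cleared here and never propagated to the caller.
inline CauseSketch get_cause_sketch(ThreadSketch* current,
                                    const std::string& message,
                                    bool simulate_failure) {
  assert(current != nullptr);
  CauseSketch cause;
  cause.message = message;
  bool fetched = fetch_stack_trace(current, simulate_failure);
  if (current->pending_exception) {
    current->pending_exception = false;  // models clearing the pending exception
  } else {
    cause.has_stack_trace = fetched;
  }
  return cause;
}
```

In this model the caller always gets a usable cause back, with or without a stack trace, and never has to inspect the thread for a leftover exception.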
------------- PR: https://git.openjdk.org/jdk/pull/12566 From coleenp at openjdk.org Tue Feb 21 15:09:30 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 15:09:30 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 00:45:31 GMT, Coleen Phillimore wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Get rid of redundant code - >> merge get_cause_with_stack_trace and get_cause_simple >> - Review corrections > > src/hotspot/share/oops/instanceKlass.cpp line 987: > >> 985: // Retry with the simple method >> 986: cause = java_lang_Throwable::get_cause_simple(exception, THREAD); >> 987: CLEAR_PENDING_EXCEPTION; > > Reverse this: clear the pending exception first. So you could handle the case where the stack trace throws an OOM or SOE in the get_cause() function and not have the second call here. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From coleenp at openjdk.org Tue Feb 21 15:14:36 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 15:14:36 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: <3PEGJ7Y4-Co17YrM6Z4T0zt9nmQbsOyWElr4H6tQBOw=.be00c6bb-931f-4d60-ac83-8e8616cabc94@github.com> On Tue, 21 Feb 2023 08:08:00 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. 
>> >> Since a handshake may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turned out to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves the handshake part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Coleen fix I still haven't reviewed the DeoptimizationScope part but you have other reviewers for that. Just looking at the class loading code. src/hotspot/share/classfile/systemDictionary.cpp line 1614: > 1612: // before a new class is used. > 1613: > 1614: void SystemDictionary::add_to_hierarchy(InstanceKlass* k, JavaThread* current) { Since you're changing add_to_hierarchy so much, I think you might as well move it to InstanceKlass also. These changes look good. Removing the Compile_lock around dictionary update can be done later (at a lower priority). ------------- Changes requested by coleenp (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12585 From mchung at openjdk.org Tue Feb 21 17:02:28 2023 From: mchung at openjdk.org (Mandy Chung) Date: Tue, 21 Feb 2023 17:02:28 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: <5HI98VM_T336GbG8atm-0-Vuf9tqbVR9nIFGghIsdE8=.42476df6-adaf-4055-8619-c984954a7c92@github.com> On Fri, 17 Feb 2023 14:08:34 GMT, Julian Waters wrote: >> DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - Error: Failed to load What about: #define ARG_ERROR18 "Error: reading %s" ------------- PR: https://git.openjdk.org/jdk/pull/12596 From jwaters at openjdk.org Tue Feb 21 17:14:57 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 21 Feb 2023 17:14:57 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v3] In-Reply-To: References: Message-ID: > DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. 
"Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" Julian Waters has updated the pull request incrementally with two additional commits since the last revision: - expandArgFile - ARG_ERROR18 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12596/files - new: https://git.openjdk.org/jdk/pull/12596/files/5f4370f8..6b54816a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=01-02 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12596.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12596/head:pull/12596 PR: https://git.openjdk.org/jdk/pull/12596 From jwaters at openjdk.org Tue Feb 21 17:14:58 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 21 Feb 2023 17:14:58 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 14:08:34 GMT, Julian Waters wrote: >> DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - Error: Failed to load Alright, will do, just gimme a moment... ------------- PR: https://git.openjdk.org/jdk/pull/12596 From pchilanomate at openjdk.org Tue Feb 21 18:02:26 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 21 Feb 2023 18:02:26 GMT Subject: RFR: 8300575: JVMTI support when using alternative virtual thread implementation [v7] In-Reply-To: References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: <44LBvvPOCCF4QAbRg6ZPbjAy8aQg-PL7sYQ-xTWmQgo=.f4424e4a-c9fd-4599-8e07-211d0ed23581@github.com> On Sat, 18 Feb 2023 07:56:25 GMT, Alan Bateman wrote: >>> Thanks for dropping is_bound_vthread, I think it's much better now. What would you think about changing it to test the thread type is_a vmClasses::BaseVirtualThread_klass() so the VM doesn't need to know about ThreadBuilders$BoundVirtualThread. >>> >> Yes, we could do that. Although I think that checking for BoundVirtualThread_klass is more descriptive of what we are trying to do and will be less confusing when reading the code. How about checking for `(!VMContinuations && x->is_a(vmClasses::BaseVirtualThread_klass()))`? > >> Yes, we could do that. Although I think that checking for BoundVirtualThread_klass is more descriptive of what we are trying to do and will be less confusing when reading the code. How about checking for `(!VMContinuations && x->is_a(vmClasses::BaseVirtualThread_klass()))`? > > We can go with what you have for now and re-visit again if it becomes an issue. The main thing I had hoped is that the VM won't get too coupled to BoundVirtualThread as that may move or be refactored soon. Thanks for the reviews @AlanBateman, @lmesnik and @sspitsyn! 
------------- PR: https://git.openjdk.org/jdk/pull/12512 From pchilanomate at openjdk.org Tue Feb 21 18:15:26 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 21 Feb 2023 18:15:26 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) Looks good to me! src/hotspot/share/prims/jvmtiThreadState.cpp line 301: > 299: } > 300: MonitorLocker ml(JvmtiVTMSTransition_lock); > 301: Ah, before we were accessing the vthread oop fields while being blocked so this change actually fixes that. ------------- Marked as reviewed by pchilanomate (Reviewer). PR: https://git.openjdk.org/jdk/pull/12550 From simonis at openjdk.org Tue Feb 21 18:42:30 2023 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 21 Feb 2023 18:42:30 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <2nIotFMJtOU6SxkjWHI7_Q-tTVb78xaPACgQMgLFWhY=.2d075f93-bacb-4dfa-86ee-e34d3ff9921a@github.com> On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. 
If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected by when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared to JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year I hear your arguments although I don't agree :) Can we at least get consensus that the current design to create a new ClassLoaderData for each non-strongly linked Hidden Class just in order to enable simple unloading isn't the greatest design and should eventually be replaced by a more sophisticated implementation which allows non-strongly linked Hidden Classes to share a single ClassLoaderData but still enable unloading for them once they aren't referenced any more?
------------- PR: https://git.openjdk.org/jdk/pull/12493 From kbarrett at openjdk.org Tue Feb 21 19:38:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 21 Feb 2023 19:38:26 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:20:42 GMT, David Holmes wrote: > > What they used to do when the option is true was return and execute the BREAKPOINT immediately following the call. > > Sorry I'm having trouble finding that existing BREAKPOINT It's in the various assertion macros in debug.hpp, for example here: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/debug.hpp#L57-L65 These will be removed in a later change in this series, once the reporting functions are noreturn. (This PR is only one step toward making them noreturn, and not there yet because of `Debugging`.) ------------- PR: https://git.openjdk.org/jdk/pull/12651 From sspitsyn at openjdk.org Tue Feb 21 19:49:26 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 19:49:26 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) Coleen and Patricio, thank you for the review!
------------- PR: https://git.openjdk.org/jdk/pull/12550 From sspitsyn at openjdk.org Tue Feb 21 19:49:29 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 19:49:29 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:45:13 GMT, Coleen Phillimore wrote: >> src/hotspot/share/prims/jvmtiEnv.cpp line 932: >> >>> 930: JavaThread* current = JavaThread::current(); >>> 931: HandleMark hm(current); >>> 932: Handle self_tobj = Handle(current, nullptr); >> >> This doesn't have to have the assignment, which the compiler will optimize away anyway. >> >> Handle self_tobj; >> >> does the same thing. > > Both local variables java_thread and thread_oop should be declared before line 939 so we don't have to worry about the oop being unhandled and java_thread is only used locally. Thank you, Coleen. Fixed now. ------------- PR: https://git.openjdk.org/jdk/pull/12550 From sspitsyn at openjdk.org Tue Feb 21 19:49:30 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 19:49:30 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 17:57:36 GMT, Patricio Chilano Mateo wrote: >> The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. >> The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. >> >> Testing: >> mach5 jobs are in progress: >> Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) > > src/hotspot/share/prims/jvmtiThreadState.cpp line 301: > >> 299: } >> 300: MonitorLocker ml(JvmtiVTMSTransition_lock); >> 301: > > Ah, before we were accessing the vthread oop fields while being blocked so this change actually fixes that. Right. It is really hard to catch and track, so it is much better to avoid being blocked as it is dangerous. 
------------- PR: https://git.openjdk.org/jdk/pull/12550 From coleenp at openjdk.org Tue Feb 21 19:52:30 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 19:52:30 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful.
> > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year The reason that non-strongly linked classes have their own ClassLoaderData is because it implements the property that when the class loader is no longer loaded, metadata for it can be removed at once. Even though Metaspace has been redesigned to be less fragmented, this is simpler than removing selected entities inside metaspace. Replacing it with a more sophisticated design would not be an improvement. So I have to disagree with your statement above. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From rehn at openjdk.org Tue Feb 21 19:55:57 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 21 Feb 2023 19:55:57 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: <3PEGJ7Y4-Co17YrM6Z4T0zt9nmQbsOyWElr4H6tQBOw=.be00c6bb-931f-4d60-ac83-8e8616cabc94@github.com> References: <3PEGJ7Y4-Co17YrM6Z4T0zt9nmQbsOyWElr4H6tQBOw=.be00c6bb-931f-4d60-ac83-8e8616cabc94@github.com> Message-ID: On Tue, 21 Feb 2023 15:11:01 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/systemDictionary.cpp line 1614: > >> 1612: // before a new class is used. >> 1613: >> 1614: void SystemDictionary::add_to_hierarchy(InstanceKlass* k, JavaThread* current) { > > Since you're changing add_to_hierarchy so much, I think you might as well move it to InstanceKlass also. > > These changes look good. Removing the Compile_lock around dictionary update can be done later (at a lower priority). Hi Coleen, so I looked at moving it to IK. There will be some additional changes, some methods may now be private instead, etc...
You think this should be done here? ------------- PR: https://git.openjdk.org/jdk/pull/12585 From sspitsyn at openjdk.org Tue Feb 21 21:01:42 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 21:01:42 GMT Subject: RFR: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint [v2] In-Reply-To: References: Message-ID: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: addressed review comments from Coleen ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12550/files - new: https://git.openjdk.org/jdk/pull/12550/files/e408826e..244836dc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12550&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12550&range=00-01 Stats: 7 lines in 1 file changed: 2 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12550.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12550/head:pull/12550 PR: https://git.openjdk.org/jdk/pull/12550 From coleenp at openjdk.org Tue Feb 21 21:21:48 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 21 Feb 2023 21:21:48 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: <3PEGJ7Y4-Co17YrM6Z4T0zt9nmQbsOyWElr4H6tQBOw=.be00c6bb-931f-4d60-ac83-8e8616cabc94@github.com> Message-ID: On Tue, 21 Feb 2023 19:52:54 GMT, Robbin Ehn wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 1614: >> >>> 1612: // before a new class is used. 
>>> 1613: >>> 1614: void SystemDictionary::add_to_hierarchy(InstanceKlass* k, JavaThread* current) { >> >> Since you're changing add_to_hierarchy so much, I think you might as well move it to InstanceKlass also. >> >> These changes look good. Removing the Compile_lock around dictionary update can be done later (at at lower priority). > > Hi Coleen, so I looked at moving it to IK. > There will be some additional changes, some method may now be private instead, etc... > You think this should be done here? ok, maybe it's ok where it is and we can move it with a future RFE. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From sspitsyn at openjdk.org Tue Feb 21 21:25:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 21 Feb 2023 21:25:40 GMT Subject: Integrated: 8299240: rank of JvmtiVTMSTransition_lock can be safepoint In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 05:37:51 GMT, Serguei Spitsyn wrote: > The rank of JvmtiVTMSTransition_lock is better to be safepoint instead of nosafepoint. > The fix includes removal of the function `check_vthread_and_suspend_at_safepoint` which is not needed anymore. > > Testing: > mach5 jobs are in progress: > Kitchensink, tiers1-6 (all JVMTI, JDWP, JDI and JDB tests have to be included) This pull request has now been integrated. 
Changeset: 46f25250 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/46f25250bd49702fe18f9903473dc3e1cbe70f84 Stats: 55 lines in 6 files changed: 10 ins; 32 del; 13 mod 8299240: rank of JvmtiVTMSTransition_lock can be safepoint Reviewed-by: dholmes, coleenp, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/12550 From iklam at openjdk.org Tue Feb 21 22:31:37 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 21 Feb 2023 22:31:37 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <2nIotFMJtOU6SxkjWHI7_Q-tTVb78xaPACgQMgLFWhY=.2d075f93-bacb-4dfa-86ee-e34d3ff9921a@github.com> References: <2nIotFMJtOU6SxkjWHI7_Q-tTVb78xaPACgQMgLFWhY=.2d075f93-bacb-4dfa-86ee-e34d3ff9921a@github.com> Message-ID: On Tue, 21 Feb 2023 18:39:31 GMT, Volker Simonis wrote: >> Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader >> - Removed unused import of STRONG und updated copyright year > > I hear your arguments although I don't agree :) > > Can we at least get consensus on that the current design to create a new ClassLoaderData for each non-strongly linked Hidden Class just in order to enable simple unloading isn't the greatest design and should eventually be replaced by a more sophisticated implementation which allows non-strongly linked Hidden Classes to share a single ClassLoaderData but still enable unloading for them once they aren't referenced any more? @simonis I want to ask a basic question -- what is the reason for your code to call `LambdaMetafactory.metafactory()` directly? It looks like you want to implement some sort of dynamic dispatch. Can equivalent functionality be implemented by the app itself?
If there's a real need for such a style of programming, and it requires some sort of built-in support by the JDK, maybe we should have a proper API instead of piggy-backing on `LambdaMetafactory.metafactory()`. I think if you give us more background, we can make this a more productive discussion than focusing on "did we make the right decision N versions ago" without defining what "right" means :-) I would suggest re-booting this decision in the mailing lists rather than continuing in this PR. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From dholmes at openjdk.org Tue Feb 21 23:47:29 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 21 Feb 2023 23:47:29 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4.
the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year I also have to disagree with the statement. The unit of unloading is the ClassLoader. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From dholmes at openjdk.org Wed Feb 22 00:28:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 00:28:24 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:42:38 GMT, Matthias Baesken wrote: > The cds only coding in hotspot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. > However at some places INCLUDE_CDS is missing and should be added. > > One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros? The CDS related source code is not 100% compile-time conditionalized. Typically we only exclude code that can't compile without the CDS specific files being compiled. Other code won't execute due to the value of the "flags" in those cases.
If you want to 100% conditionalize CDS related code then you need more changes than just those outlined here. Are you actually trying to solve a problem or was this just an observation? ------------- PR: https://git.openjdk.org/jdk/pull/12691 From lmesnik at openjdk.org Wed Feb 22 00:51:30 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 22 Feb 2023 00:51:30 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: <3_z2mjtYQIX4PrHMjC7x3ZLqYx120f4jAug0Ba-EIrw=.75806386-3e7b-4f25-8b41-2d0d29aa281d@github.com> On Tue, 21 Feb 2023 12:12:56 GMT, Ilarion Nakonechnyy wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. >> >> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. >> >> Jtreg for reproducing the issue also was added to the commit. >> The commit was tested with tier1 tests. > > Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: > > - Get rid of redundant code - > merge get_cause_with_stack_trace and get_cause_simple > - Review corrections test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 38: > 36: public class TestNoClassDefFoundCause { > 37: > 38: static class CrashWithSOE { CrashWithSOE is confusing. The VM is not crashed but just throw SOE. Could you please rename it. 
test/hotspot/jtreg/runtime/ClassInitErrors/TestNoClassDefFoundCause.java line 65: > 63: } > 64: > 65: private static void verify_stack(Throwable e, String cause) throws Exception { Please use verifyStack instead of verify_stack. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From sviswanathan at openjdk.org Wed Feb 22 02:20:00 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 22 Feb 2023 02:20:00 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter Message-ID: Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. ------------- Commit messages: - 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter Changes: https://git.openjdk.org/jdk/pull/12704/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12704&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302976 Stats: 8 lines in 4 files changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12704.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12704/head:pull/12704 PR: https://git.openjdk.org/jdk/pull/12704 From kvn at openjdk.org Wed Feb 22 02:46:25 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 22 Feb 2023 02:46:25 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: <5DOvibCAPEcdToBLe4Fz_DHNPlsmsmvNAVoKdSsCl7Q=.0aaea866-23ef-4349-ab8c-5fed87e83961@github.com> On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression 
evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. Please add a regression test. Also consider creating copies of jdk/java/lang/Float/Binary16Conversion*.java tests in compiler/intrinsics/ and modify them to compare results from Interpreter, runtime and JITed code. We run compiler/intrinsics/ tests with different SSE and AVX settings to make sure they work in all cases. ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12704 From darcy at openjdk.org Wed Feb 22 03:49:25 2023 From: darcy at openjdk.org (Joe Darcy) Date: Wed, 22 Feb 2023 03:49:25 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. I'd like to see a more informative description of the problem: "float16 NaN values handled differently with and without intrinsification" If that is the issue reported, it may not be a problem as opposed to "incorrect value returned under Float.float16ToFloat intrinsification", etc. src/java.base/share/classes/java/lang/Float.java line 1101: > 1099: return (short)(sign_bit > 1100: | 0x7c00 // max exponent + 1 > 1101: | 0x0200 // QNaN I don't understand what is being done here. From IEEE 754-2019: "Besides issues such as byte order which affect all data interchange, certain implementation options allowed by this standard must also be considered: -
for binary formats, how signaling NaNs are distinguished from quiet NaNs - for decimal formats, whether binary or decimal encoding is used. This standard does not define how these parameters are to be communicated." The code in java.lang.Float in particular is meant to be usable on all host CPUs so architecture-specific assumptions about QNAN vs SNAN should be avoided. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From darcy at openjdk.org Wed Feb 22 04:05:27 2023 From: darcy at openjdk.org (Joe Darcy) Date: Wed, 22 Feb 2023 04:05:27 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 03:46:42 GMT, Joe Darcy wrote: > I'd like to see a more informative description of the problem: > > "float16 NaN values handled differently with and without intrinsification" > > If that is the issue reported, it may not be a problem as opposed to > > "incorrect value returned under Float.float16ToFloat intrinsification", etc. PS The detailed NaN handling is specifically done in a separate test file since the invariants that are true for the software implementation need not be true for an intrinsified hardware-based one. However, as done for the intrinsics of the transcendental methods (sin, cos, tan), if the float16 conversion intrinsics are used, they should be consistently used or not used regardless of compilation approach (interpreter, C1, C2, etc.).
HTH ------------- PR: https://git.openjdk.org/jdk/pull/12704 From dholmes at openjdk.org Wed Feb 22 04:05:30 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 04:05:30 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: 1. The interpreted Java code 2. The compiled non-intrinsic sharedRuntime code 3. The compiler intrinsic that uses a hardware instruction. Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle NaNs opaquely?
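[Editorial sketch, not part of the original message.] To make the "opaque versus raw bits" distinction concrete, here is a minimal sketch using only long-standing `java.lang.Float` methods. The class name `NaNBitsSketch` and its helpers are hypothetical. The binary16 helpers assume the convention that the patch under review also assumes, namely that bit `0x0200` (the top significand bit) marks a quiet NaN; as noted earlier in the thread, IEEE 754 leaves that encoding choice to the implementation.

```java
public class NaNBitsSketch {

    // Two float NaN encodings that differ only in their significand bits.
    static final int QUIET_BITS = 0x7fc00000;  // top significand bit set
    static final int OTHER_BITS = 0x7fa00000;  // top significand bit clear, payload non-zero

    // Opaque view: Float.isNaN and floatToIntBits do not distinguish NaNs;
    // floatToIntBits collapses every NaN to the canonical 0x7fc00000.
    static boolean sameWhenOpaque(float a, float b) {
        return Float.isNaN(a) && Float.isNaN(b)
                && Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    // Raw view: floatToRawIntBits exposes whatever bit pattern the value carries.
    static boolean sameWhenRaw(float a, float b) {
        return Float.floatToRawIntBits(a) == Float.floatToRawIntBits(b);
    }

    // binary16 NaN: all five exponent bits set and a non-zero 10-bit significand.
    static boolean isBinary16NaN(short h) {
        return (h & 0x7c00) == 0x7c00 && (h & 0x03ff) != 0;
    }

    // Assumed convention: the top significand bit distinguishes quiet NaNs.
    static boolean hasQuietBit(short h) {
        return isBinary16NaN(h) && (h & 0x0200) != 0;
    }

    public static void main(String[] args) {
        float a = Float.intBitsToFloat(QUIET_BITS);
        float b = Float.intBitsToFloat(OTHER_BITS);
        System.out.println("opaque view equal: " + sameWhenOpaque(a, b));
        // May differ by platform: some hardware quiets NaNs when values are moved.
        System.out.println("raw view equal:    " + sameWhenRaw(a, b));
        System.out.println("0x7e00 quiet NaN:  " + hasQuietBit((short) 0x7e00));
        System.out.println("0x7d00 quiet NaN:  " + hasQuietBit((short) 0x7d00));
    }
}
```

A program that only ever uses `Float.isNaN` or `floatToIntBits` cannot observe the difference being debated; only code that reads raw bits, for example NaN-boxing schemes, can.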
------------- PR: https://git.openjdk.org/jdk/pull/12704 From dholmes at openjdk.org Wed Feb 22 05:10:29 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 05:10:29 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: <0wsea_jjNWfHUDPZLbwUIwoRRzyxRTrjQQw4aFKzh8w=.40c1d1d3-d9d7-4db3-8cb9-549feaaf14f9@github.com> On Tue, 21 Feb 2023 12:12:56 GMT, Ilarion Nakonechnyy wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. >> >> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. >> >> Jtreg for reproducing the issue also was added to the commit. >> The commit was tested with tier1 tests. > > Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: > > - Get rid of redundant code - > merge get_cause_with_stack_trace and get_cause_simple > - Review corrections I've put a long comment in the JBS issue explaining the process of saving the "initialization error" in detail. Further changes in relation to secondary exception handling is needed - see below. Some other issues also raised. Thanks. src/hotspot/share/classfile/javaClasses.cpp line 2764: > 2762: h_cause->klass()->external_name()); > 2763: return Handle(); > 2764: } You need to do this whether `with_stack_trace` or not, otherwise you will install the exception from `new_exception` as the initialization error for this class, when in fact it was not the initialization error at all. 
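[Editorial sketch, not part of the original message.] For readers unfamiliar with the "initialization error" being saved here, the Java-level behavior is roughly as follows: the first use of a class whose static initializer throws produces an `ExceptionInInitializerError`, and later uses of the same class produce a `NoClassDefFoundError`. A minimal sketch, with hypothetical class names `InitErrorSketch` and `Broken`; whether the second error carries a cause depends on the JDK version, so the sketch only prints it.

```java
public class InitErrorSketch {

    // A class whose static initialization always fails.
    static class Broken {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Touches Broken and returns whatever was thrown, or null if nothing was.
    static Throwable touch() {
        try {
            int ignored = Broken.VALUE;
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        Throwable first = touch();
        Throwable second = touch();
        // First use: ExceptionInInitializerError wrapping the original exception.
        System.out.println("first:  " + first);
        System.out.println("  cause: " + first.getCause());
        // Later use: NoClassDefFoundError; the cause (if any) is what the VM saved.
        System.out.println("second: " + second);
        System.out.println("  cause: " + second.getCause());
    }
}
```

The review comments in this thread concern what the VM stores at the first failure so that the second error can report something useful.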
src/hotspot/share/classfile/javaClasses.hpp line 618: > 616: > 617: // For recreating class initialization error exceptions. > 618: static Handle get_cause(Handle throwable, bool with_stack_trace, TRAPS); Existing: this is an absolutely terrible name for this method! Given that Throwable's have a "cause" this method would be assumed to relate to getting the cause of the passed in `throwable`, but it has nothing to do with that at all! This method is about creating an `ExceptionInInitializerError` instance that includes information from `throwable` (including class name, message and stacktrace). I suggest, based on terminology in `instanceKlass` that this be called `create_initialization_error`. src/hotspot/share/oops/cpCache.cpp line 476: > 474: Symbol* message = java_lang_Throwable::detail_message(PENDING_EXCEPTION); > 475: // Try to discover the cause > 476: oop cause = java_lang_Throwable::cause(PENDING_EXCEPTION); This enhancement is unrelated to the current JBS issue. I think it a good enhancement but it should be done separately. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Wed Feb 22 05:10:31 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 05:10:31 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 15:06:42 GMT, Coleen Phillimore wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Get rid of redundant code - >> merge get_cause_with_stack_trace and get_cause_simple >> - Review corrections > > src/hotspot/share/classfile/javaClasses.cpp line 2738: > >> 2736: } >> 2737: >> 2738: Handle java_lang_Throwable::get_cause(Handle throwable, bool with_stack_trace, TRAPS) { > > I see, you just reversed the code so it creates the message first, then calls getStackTrace. 
> If you change it to not have TRAPS, and use JavaThread* current as the first parameter, then have as the first line to show that the exception isn't propagated out of this function. > JavaThread* THREAD = current; > create the message > { call getStackTrace(); > if (PENDING_EXCEPTION) CLEAR_PENDING_EXCEPTION; > else set_stacktrace(); clear_backtrace(); > } > return h_cause; > > Something like that. Then the caller doesn't need to check for null or pending exception. I agree with Coleen's suggestion, it doesn't make sense to return null and have a pending exception when the pending exception will be cleared in the caller anyway. This is a pre-existing issue but while we are here. ------------- PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Wed Feb 22 05:20:28 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 05:20:28 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 19:35:46 GMT, Kim Barrett wrote: > > > What they used to do when the option is true was return and execute the BREAKPOINT immmediately following the call. > > > > > > Sorry I'm having trouble finding that existing BREAKPOINT > > It's in the various assertion macros in debug.hpp Ah! Got it. Thanks. Ignoring misplaced guarantees, we can only hit those if UOER is true and `report_and_die` returns; so instead don't return and place the breakpoint in `report_and_die` instead. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From darcy at openjdk.org Wed Feb 22 05:24:29 2023 From: darcy at openjdk.org (Joe Darcy) Date: Wed, 22 Feb 2023 05:24:29 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 04:03:02 GMT, David Holmes wrote: > I'm also a bit concerned that we are rushing in to "fix" this. 
IIUC we have three mechanisms for implementing this functionality: > > 1. The interpreted Java code > > 2. The compiled non-intrinsic sharedRuntime code > > 3. The compiler intrinsic that uses a hardware instruction. > > > Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. > > The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle NaNs opaquely? From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short)) "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. The conversion is exact; all binary16 values can be exactly represented in float. Special cases: If the argument is zero, the result is a zero with the same sign as the argument. If the argument is infinite, the result is an infinity with the same sign as the argument. If the argument is a NaN, the result is a NaN." If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set. "NaN boxing" is a technique used to put extra information into the significand bits of a NaN and pass them around. It is consistent with the intended use of the feature by IEEE 754 and used in various language runtimes: e.g., https://piotrduperas.com/posts/nan-boxing https://leonardschuetz.ch/blog/nan-boxing/ https://anniecherkaev.com/the-secret-life-of-nan The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion.
That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From dholmes at openjdk.org Wed Feb 22 05:28:28 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 05:28:28 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. 
So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. > > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. > > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). A couple of nits/suggestions but otherwise I'm content that this is addressing the issue in as compatible a way as practical. Thanks. src/hotspot/share/utilities/vmError.cpp line 1710: > 1708: // Caller requested reporting without termination. 
> 1709: return; > 1710: } else if (after_report_action == AfterReportAction::Die) { Nit: as there are only 2 values for ARA we don't really need an `else if` here. src/hotspot/share/utilities/vmError.cpp line 1730: > 1728: } > 1729: // We REALLY better not reach here. > 1730: ShouldNotReachHere(); Maybe just `os::die()` again rather than a recursive call back into VMError code? (Not that we ever expect to execute this but ... ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12651 From mdoerr at openjdk.org Wed Feb 22 05:39:36 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 22 Feb 2023 05:39:36 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) Message-ID: Description will get added soon. ------------- Commit messages: - Initial Panama implementation. Changes: https://git.openjdk.org/jdk/pull/12708/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303040 Stats: 1973 lines in 58 files changed: 1865 ins; 1 del; 107 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From dholmes at openjdk.org Wed Feb 22 06:40:02 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 22 Feb 2023 06:40:02 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:08:00 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. 
>> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshake may take long before completion and Compile_lock is used for so much more than deopts, >> the fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked (instead of adding Compile_lock to all places). >> >> This turned out to be problematic during startup; for example, the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves the handshake part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Coleen fix A few drive-by comments on the locking. The deopt stuff itself is not something I know. Thanks. src/hotspot/share/classfile/systemDictionary.cpp line 1489: > 1487: } > 1488: > 1489: // Add the new class. We need recompile lock during update of CHA. This comment belongs after `add_to_hierarchy`. src/hotspot/share/classfile/systemDictionary.cpp line 1620: > 1618: assert(k != nullptr, "just checking"); > 1619: if (Universe::is_fully_initialized()) { > 1620: assert_locked_or_safepoint(Compile_lock); This serves no purpose given you just took out the lock. But it does raise the question as to whether the lock can/should always be taken out if called at a safepoint, or if the Universe is not yet fully initialized?
src/hotspot/share/classfile/systemDictionary.cpp line 1635: > 1633: // Note: must be done *after* linking k into the hierarchy (was bug 12/9/97) > 1634: if (Universe::is_fully_initialized()) { > 1635: CodeCache::mark_dependents_on(&deopt_scope, k); The flush comment no longer matches the code. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From fyang at openjdk.org Wed Feb 22 06:46:33 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 22 Feb 2023 06:46:33 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. Hi, I think some riscv-specific changes could be removed or postponed: 1. The vector version of BarrierSetAssembler::copy_load_at/copy_store_at should be removed as they are not implemented; 2. Far branches are not needed here for this PR. It's better for these changes to go with the final generational ZGC. And I have prepared a small cleanup for that. Could you please add it if it's good for you? Thanks.
[riscv-changes.txt](https://github.com/openjdk/jdk/files/10800754/riscv-changes.txt) ------------- PR: https://git.openjdk.org/jdk/pull/12670 From stuefe at openjdk.org Wed Feb 22 07:35:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 22 Feb 2023 07:35:32 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. 
> > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. > > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). Do we actually need UseOSErrorReporting? Honest question since I have personally never used it. Whenever I wanted to debug a problem on windows, I used the minidump write facility. 
------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Wed Feb 22 07:35:39 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 22 Feb 2023 07:35:39 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 05:20:25 GMT, David Holmes wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination. >> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. 
>> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. >> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > src/hotspot/share/utilities/vmError.cpp line 1710: > >> 1708: // Caller requested reporting without termination. >> 1709: return; >> 1710: } else if (after_report_action == AfterReportAction::Die) { > > Nit: as there are only 2 values for ARA we don't really need an `else if` here. I didn't want to add an assert. But see reply to your next comment, about the ShouldNotReachHere at the end. 
> src/hotspot/share/utilities/vmError.cpp line 1730: > >> 1728: } >> 1729: // We REALLY better not reach here. >> 1730: ShouldNotReachHere(); > > Maybe just `os::die()` again rather than a recursive call back into VMError code? (Not that we ever expect to execute this but ... This blob of code could have been equivalently written as switch (after_report_action) { case AfterReportAction::Return: return; case AfterReportAction::Die: ... breakpoint and abort ... os::die(); // should be unreachable because die is noreturn. perhaps fall through // if get here, though we're in UB territory. default: ShouldNotReachHere(); } I didn't write it that way because the Die code is a bit longer and more complicated than I like to put in a case clause. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Wed Feb 22 07:50:47 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 22 Feb 2023 07:50:47 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 07:33:07 GMT, Thomas Stuefe wrote: > Do we actually need UseOSErrorReporting? Honest question since I have personally never used it. Whenever I wanted to debug a problem on windows, I used the minidump write facility. This was recently discussed in https://bugs.openjdk.org/browse/JDK-8250782 specifically the reply from @mo-beck https://bugs.openjdk.org/browse/JDK-8250782?focusedCommentId=14553883&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14553883 So yes, we do want to keep it. 
------------- PR: https://git.openjdk.org/jdk/pull/12651 From mbaesken at openjdk.org Wed Feb 22 07:51:50 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 22 Feb 2023 07:51:50 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 00:25:31 GMT, David Holmes wrote: > Are you actually trying to solve a problem or was this just an observation? Hi David, this was about getting less code compiled into the libjvm in case cds is disabled by configure. ------------- PR: https://git.openjdk.org/jdk/pull/12691 From dnsimon at openjdk.org Wed Feb 22 09:49:58 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 22 Feb 2023 09:49:58 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: <-3z8sZMccXAOTV4KgeqrX6KQTcnSP20B70a293fhQSI=.354c1f98-c5bd-4b7e-a59a-10ff31d7d6b9@github.com> On Wed, 22 Feb 2023 05:21:48 GMT, Joe Darcy wrote: > That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. Yes, I'm under the impression that for math API methods like this, the stability of input to output must be preserved for a single JVM invocation. Or are there existing methods for which the interpreter and compiled code execution is allowed to differ? 
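The "same result for the same input within one JVM invocation" property can be probed with a simple loop. The following is only a sketch under stated assumptions: `StabilitySketch`, `FloatToBits` and `isStable` are hypothetical names, and `Float::floatToRawIntBits` is used as a stand-in conversion so the code runs on older JDKs; on JDK 20 or later one could pass `f -> Float.floatToFloat16(f)` instead. The iteration count is an arbitrary guess at "enough calls that the JIT may compile the method", not a guarantee that every execution mode is exercised.

```java
public class StabilitySketch {
    /** Hypothetical stand-in for a float-to-bits conversion under test. */
    interface FloatToBits {
        int apply(float f);
    }

    /**
     * Calls the conversion repeatedly on the same input and reports whether
     * every result matches the first one. Early calls typically run in the
     * interpreter and later ones may run compiled code.
     */
    static boolean isStable(FloatToBits conv, float input, int iterations) {
        int first = conv.apply(input);
        for (int i = 0; i < iterations; i++) {
            if (conv.apply(input) != first) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Quiet NaN with an illustrative payload; the value is an assumption.
        float nan = Float.intBitsToFloat(0x7fc00123);
        System.out.println("stable: " + isStable(Float::floatToRawIntBits, nan, 100_000));
    }
}
```

Such a loop can only show an instability; a `true` result does not prove the interpreter, the shared runtime stub and the intrinsic all agree, since it gives no control over which of them actually ran.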
------------- PR: https://git.openjdk.org/jdk/pull/12704 From simonis at openjdk.org Wed Feb 22 11:23:15 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 22 Feb 2023 11:23:15 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <9Qs4-JH8-M20G-Cu39MHZhbraXHeSGB50zskJFOJlTI=.427e4206-a66e-4aba-92d6-3ac555480daa@github.com> On Tue, 21 Feb 2023 19:49:24 GMT, Coleen Phillimore wrote: > The reason that non-strongly linked classes have their own ClassLoaderData is because it implements the property that when the class loader is no longer loaded, metadata for it can be removed at once. Even though Metaspace has been redesigned to be less fragmented, this is simpler than removing selected entities inside metaspace. Replacing it with a more sophisticated design would not be an improvement. So I have to disagree with your statement above. How could it not be an improvement to save ~600 bytes of native memory per class? Sorry, but I am starting to have real trouble following this discussion. First you don't want to change strongly linked Hidden Classes to weakly linked ones **because** of the native memory overhead and then you say that a more sophisticated implementation which will eliminate this overhead would **not** be an improvement. I'm puzzled. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From duke at openjdk.org Wed Feb 22 11:37:07 2023 From: duke at openjdk.org (Amit Kumar) Date: Wed, 22 Feb 2023 11:37:07 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v4] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 12:21:27 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code.
Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge master to resolve copyright conflict > - 829817: fixed typos, removed JIT_TIMER references > - 8299817: Update copyright > - 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls Build and Testing done for tier1 in fastdebug, slowdebug and release build on z15/s390x machine. LGTM ------------- Marked as reviewed by offamitkumar at github.com (no known OpenJDK username). PR: https://git.openjdk.org/jdk/pull/11967 From simonis at openjdk.org Wed Feb 22 13:41:46 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 22 Feb 2023 13:41:46 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <4ntc5zIvBrfwF7J3X0fWgA3n0XPc1Ih29LvVTVs6YwY=.fef99bb7-297a-4060-921d-d3798a2342b4@github.com> On Tue, 21 Feb 2023 23:44:48 GMT, David Holmes wrote: > I also have to disagree with the statement. The unit of unloading is the ClassLoader. Hidden classes are not normal classes. They are not defined, created or loaded by a class loader. The only reason why hidden classes maintain a link to a *defining class loader* is because they need it to resolve types used by the hidden class's own fields and methods.
Some references from [JEP 371: Hidden Classes](https://openjdk.org/jeps/371): > Dynamically generated classes may only be needed for a limited time, so retaining them for the lifetime of the statically generated class might unnecessarily increase memory footprint. Existing workarounds for this situation, such as per-class class loaders, are cumbersome and inefficient. > A hidden class can be unloaded when it is no longer reachable, or it can share the lifetime of a class loader so that it is unloaded only when the class loader is garbage collected > A hidden class is not created by a class loader and has only a loose connection to the class loader deemed to be its defining loader. We can turn these facts to our advantage by allowing a hidden class to be unloaded even if its notional defining loader cannot be reclaimed by the garbage collector. > By default, Lookup::defineHiddenClass will create a hidden class that can be unloaded regardless of whether its notional defining loader is still alive. That is, when all instances of the hidden class are reclaimed and the hidden class is no longer reachable, it may be unloaded even though its notional defining loader is still reachable. This behavior is useful when a language runtime creates a hidden class to serve multiple classes defined by arbitrary class loaders: The runtime will see an improvement in footprint and performance relative to both `ClassLoader::defineClass()` and Unsafe::defineAnonymousClass()` The only reason why hidden classes created by `LambdaMetaFactory` are strongly linked to a class loader (at least I haven't heard any other argument until now in this thread) is to *save native memory* and not because it is *logically required*! It's fine for me if you don't want to fix that. I can just not understand why you are all still insisting that creating a new ClassLoaderData object per hidden class is such a great decision and fixing that would not be beneficial. 
Hidden classes were designed to be light-weight and easily unloadable (see JEP references above). In the case of `LambdaMetaFactory` we unnecessarily link them strongly to a class loader just because the current implementation is too expensive otherwise. On a side note, the JDK already creates non-strongly linked hidden classes as well, e.g. for `java.lang.invoke.MethodHandles$Lookup.unreflect()`. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From simonis at openjdk.org Wed Feb 22 14:10:28 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 22 Feb 2023 14:10:28 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <4ntc5zIvBrfwF7J3X0fWgA3n0XPc1Ih29LvVTVs6YwY=.fef99bb7-297a-4060-921d-d3798a2342b4@github.com> References: <4ntc5zIvBrfwF7J3X0fWgA3n0XPc1Ih29LvVTVs6YwY=.fef99bb7-297a-4060-921d-d3798a2342b4@github.com> Message-ID: On Wed, 22 Feb 2023 13:38:57 GMT, Volker Simonis wrote: >> I also have to disagree with the statement. The unit of unloading is the ClassLoader. > >> I also have to disagree with the statement. The unit of unloading is the ClassLoader. > > Hidden classes are not normal classes. They are not defined, created or loaded by a class loader. The only reason why hidden classes maintain a link to a *defining class loader* is because they need it to resolve types used by the hidden class's own fields and methods. > > Some references from [JEP 371: Hidden Classes](https://openjdk.org/jeps/371): > >> Dynamically generated classes may only be needed for a limited time, so retaining them for the lifetime of the statically generated class might unnecessarily increase memory footprint. Existing workarounds for this situation, such as per-class class loaders, are cumbersome and inefficient. 
> >> A hidden class can be unloaded when it is no longer reachable, or it can share the lifetime of a class loader so that it is unloaded only when the class loader is garbage collected > >> A hidden class is not created by a class loader and has only a loose connection to the class loader deemed to be its defining loader. We can turn these facts to our advantage by allowing a hidden class to be unloaded even if its notional defining loader cannot be reclaimed by the garbage collector. > >> By default, Lookup::defineHiddenClass will create a hidden class that can be unloaded regardless of whether its notional defining loader is still alive. That is, when all instances of the hidden class are reclaimed and the hidden class is no longer reachable, it may be unloaded even though its notional defining loader is still reachable. This behavior is useful when a language runtime creates a hidden class to serve multiple classes defined by arbitrary class loaders: The runtime will see an improvement in footprint and performance relative to both `ClassLoader::defineClass()` and Unsafe::defineAnonymousClass()` > > The only reason why hidden classes created by `LambdaMetaFactory` are strongly linked to a class loader (at least I haven't heard any other argument until now in this thread) is to *save native memory* and not because it is *logically required*! It's fine for me if you don't want to fix that. I can just not understand why you are all still insisting that creating a new ClassLoaderData object per hidden class is such a great decision and fixing that would not be beneficial. > > Hidden classes were designed to be light-weight and easily unloadable (see JEP references above). In the case of `LambdaMetaFactory` we unnecessarily link them strongly to a class loader just because the current implementation is too expensive otherwise. On a side note, the JDK already creates non-strongly linked hidden classes as well, e.g. 
for `java.lang.invoke.MethodHandles$Lookup.unreflect()`. > @simonis I want to ask a basic question -- what is the reason for your code to call `LambdaMetafactory.metafactory()` directly? It looks like you want to implement some sort of dynamic dispatch. Can equivalent functionality be implemented by the app itself? > > If there's a real need for such a style of programming, and it requires some sort of built-in support by the JDK, maybe we should have a proper API instead of piggy-backing on `LambdaMetafactory.metafactory()`. > > I think if you give us more background, we can make this a more productive discussion than focusing on "did we make the right decision N versions ago" without defining what "right" means :-) > > I would suggest re-booting this decision in the mailing lists rather than continuing in this PR. The initial reason for this issue comes from here https://github.com/aws/aws-sdk-java-v2/issues/3701. That issue could be solved by caching. However, as I've outlined in my other answers above, I still think that one ClassLoaderData per hidden class is overly expensive and not really required. I'm fine if you don't want to fix that; I just don't understand why you think it is a great solution. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From forax at openjdk.org Wed Feb 22 14:10:35 2023 From: forax at openjdk.org (Rémi Forax) Date: Wed, 22 Feb 2023 14:10:35 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more.
Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected by when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared to JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year Volker, I think you are mixing up the lambda metafactory and hidden classes. Hidden classes were envisioned as a way to generate adapters at runtime for languages that run on the JVM. They replace VM anonymous classes, which were both unsafe and never bound to a classloader. The lambda metafactory is the entry point to create lambda proxies for Java (the language).
The current implementation is using hidden classes to represent lambda proxy. In the past, it was using VM anonymous classes that why lambdas were not tie to a classloader, the fact that the implementation was using VM anonymous class was leaking. To unload a Java class, its clasloader has to be unreachable. I see no reason this behavior should not be the same for Java lambdas classes. I believe it's what David is saying too. Now, for https://github.com/aws/aws-sdk-java-v2/issues/3701, you can fix it by using a ClassValue to cache the class and/or using a hidden class. For me, this has nothing to do with Java lambdas, i.e. nothing to do with the lambda metafactory. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From pminborg at openjdk.org Wed Feb 22 14:11:45 2023 From: pminborg at openjdk.org (Per Minborg) Date: Wed, 22 Feb 2023 14:11:45 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <_QETLZVuG6pkWyXvp1Plh3rOVjFn_sYEP09AE7MdsAE=.6f20102e-eaa0-4425-bc26-ab95ab6338ee@github.com> On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. 
> > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). In the most recent version of the Panama FFM API, any memory layout (including struct and padding layouts) are always byte aligned. 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From stuefe at openjdk.org Wed Feb 22 14:45:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 22 Feb 2023 14:45:13 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 07:47:46 GMT, Kim Barrett wrote: > > Do we actually need UseOSErrorReporting? Honest question since I have personally never used it. Whenever I wanted to debug a problem on windows, I used the minidump write facility. > > This was recently discussed in https://bugs.openjdk.org/browse/JDK-8250782 > > specifically the reply from @mo-beck https://bugs.openjdk.org/browse/JDK-8250782?focusedCommentId=14553883&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14553883 > > So yes, we do want to keep it. Okay, I see. But does it have to come with hs-err file too? Because that is the only reason for this complication. And hs-err files probably spoil the context reported to Watson, see below. I am unhappy with this patch because it "sanctifies" returning from hs-err reporting. We had long and intense discussions in the recent past where we needed to convince developers that returning from hs-err reporting is a very bad idea. I'm afraid with your patch, it won't be long before follow-up patches build upon your ARA and re-introduce this question. I'd prefer if UseOSErrorReporting, and all this code surrounding the windows exceptionhandler and CONTINUE_SEARCH, remained Windows only, and restricted to this one use case: jumping back to OS error reporting in case of an error. Your patch opens up other questions too, e.g. how is this supposed to work in the face of secondary errors? IIUC the last invocation of the exception handler would be the winning one and return its error state back to the top-level exception filter, so the error reported by Watson would be the secondary crash, not the real crash. 
I think this would happen today too, or? This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From jvernee at openjdk.org Wed Feb 22 15:20:00 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 15:20:00 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. 
the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year

I'd like to suggest using another option for turning MethodHandles into interface instances, such as `MethodHandleProxies`, or just lambda expressions which capture the particular `MethodHandle`. The latter is an option if the target type of the functional interface is statically known:

    MethodHandle mh = ...
    TargetType instance = (Widget w) -> {
        try {
            return (SomeType) mh.invokeExact(w);
        } catch(Throwable t) {
            throw new RuntimeException(t); // or something more nuanced
        }
    };

Most importantly, the above doesn't generate a new class every time; it generates just one. It can be slower in specific situations due to lack of inlining, but in the common use case it's not that bad (I've measured the extra hop through the method handle to be 4ns in the past in microbenchmarks). You could measure the impact for the particular app. (I guess the question is: do you _really_ want a new class to be generated for each MethodHandle?)
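For readers who want to try the capturing-lambda suggestion, here is a minimal, self-contained sketch. The class `MethodHandleAdapter`, its helper methods, and the choice of `ToIntFunction<String>` as the target interface are hypothetical names picked for illustration; they are not taken from the PR or from the AWS SDK. It assumes the functional interface is known at compile time, which is the precondition stated above.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.ToIntFunction;

class MethodHandleAdapter {

    // Wraps a method handle of type (String)int in a functional interface.
    // The lambda captures the handle; every call to adapt() reuses the one
    // class spun for this lambda expression and only allocates an instance.
    static ToIntFunction<String> adapt(MethodHandle mh) {
        return s -> {
            try {
                // The call-site type (String)int must match the handle exactly.
                return (int) mh.invokeExact(s);
            } catch (Throwable t) {
                throw new RuntimeException(t); // or something more nuanced
            }
        };
    }

    // Looks up a public no-argument String method returning int, e.g. "length".
    static MethodHandle stringToInt(String methodName) {
        try {
            return MethodHandles.lookup().findVirtual(
                    String.class, methodName, MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToIntFunction<String> length = adapt(stringToInt("length"));
        ToIntFunction<String> hash = adapt(stringToInt("hashCode"));

        System.out.println(length.applyAsInt("hello")); // 5
        System.out.println(hash.applyAsInt("a"));       // 97
        // Two different handles, but both adapters share one lambda class.
        System.out.println(length.getClass() == hash.getClass()); // true
    }
}
```

The last line is the point of the exercise: wrapping a second handle does not define a second class, whereas each direct `LambdaMetafactory.metafactory()` call that is not cached spins a new hidden class.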
------------- PR: https://git.openjdk.org/jdk/pull/12493 From pchilanomate at openjdk.org Wed Feb 22 15:46:41 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 22 Feb 2023 15:46:41 GMT Subject: Integrated: 8300575: JVMTI support when using alternative virtual thread implementation In-Reply-To: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> References: <7-glNqi3qp6zijEvorJSMii9pUji65gOYepL0epeQRE=.86023773-de72-4056-b6b1-a55adff14c52@github.com> Message-ID: On Fri, 10 Feb 2023 14:50:36 GMT, Patricio Chilano Mateo wrote: > Please review the following change. Some of the JVMTI functions had to be modified since they are not supported by the specs on virtual threads (StopThread, RunAgentThread, GetCurrentThreadCpuTime, GetThreadCpuTime) or shouldn't include virtual threads in the results (GetAllThreads, GetAllStackTraces, GetThreadGroupChildren). Others were modified because they should work on virtual thread but they weren't (SuspendAllVirtualThreads/ResumeAllVirtualThreads). Also support for VirtualThreadStart/VirtualThreadEnd events was added. > > There were a total of 71 tests under serviceability/jvmti/ that had the vm.continuations requirement. Of those, 24 where under serviceability/jvmti/vthread/. I was able to remove that requirement from 50 tests. Most of those tests were already passing for the alternative vthread implementation just by removing the check in jni_GetEnv for VMContinuations. > From the other 21 tests, 20 still fail with the changes either because they actually test continuations or because of different assumptions in the tests that don't hold true for the alternative vthread implementation. So for those I left the vm.continuations requirement untouched (from those 20 tests there are actually 4 tests from the thread/GetStackTrace/ family that are passing because of a bug in the test, but I can fix that in another RFE). 
The other remaining test is vthread/GetSetLocalTest/GetSetLocalTest.java which I see is problemlisted. > > Regarding variable _is_bound_vthread, although it's handy as a cache to avoid repeating the check, I mainly added it to avoid transitioning for GetCurrentThreadCpuTime (we are native and cannot access oops). > > I added new test BoundVThreadTest.java which just checks for the unsupported/supported behavior mentioned previously. For some extra basic coverage I also added a new run with -XX:-VMContinuations on tests inside the serviceability/jvmti/vthread/ directory that don't require vm.continuations anymore. I could also add that for all the other tests in serviceability/jvmti/ for which I removed the vm.continuations requirement. > > I run the patch through mach5 tiers 1-6 plus some extra local runs on the relevant tests. > > Thanks, > Patricio This pull request has now been integrated. Changeset: 83bea26d Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/83bea26df453282d46afff333e006e8f9b7fb201 Stats: 635 lines in 69 files changed: 514 ins; 77 del; 44 mod 8300575: JVMTI support when using alternative virtual thread implementation Reviewed-by: lmesnik, sspitsyn, alanb ------------- PR: https://git.openjdk.org/jdk/pull/12512 From mchung at openjdk.org Wed Feb 22 17:00:49 2023 From: mchung at openjdk.org (Mandy Chung) Date: Wed, 22 Feb 2023 17:00:49 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v3] In-Reply-To: References: Message-ID: <1k3a4hmHCfKOQC_B-pKEhTwoRM4lQ21rcEgnyMC-5FQ=.375427a9-3b6d-40e2-90fe-085ded39b1d2@github.com> On Tue, 21 Feb 2023 17:14:57 GMT, Julian Waters wrote: >> DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. 
"Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" > > Julian Waters has updated the pull request incrementally with two additional commits since the last revision: > > - expandArgFile > - ARG_ERROR18 Thanks in making the change. src/java.base/share/native/libjli/emessages.h line 60: > 58: #define ARG_ERROR16 "Error: Option %s in %s is not allowed in this context" > 59: #define ARG_ERROR17 "Error: Cannot specify main class in this context" > 60: #define ARG_ERROR18 "Error: Could not read %s" Suggestion: #define ARG_ERROR18 "Error: Failed to read %s" ------------- Marked as reviewed by mchung (Reviewer). PR: https://git.openjdk.org/jdk/pull/12596 From jvernee at openjdk.org Wed Feb 22 17:05:49 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 17:05:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. 
> > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). I will do a more thorough review soon. Some preliminary comments: > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. 
FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE. I don't have a big issue since the current approach stays in PPC-only code) > I had to make changes to shared code and code for other platforms: > > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > > > * PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > > * Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > > * Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! I think supplying the `BasicType` is fine. `VMReg` doesn't have any width information attached to it, and that's why a complementary `BasicType` is needed. I'm glad to see that you could make it work with the register masks for `VMStorage` :) WRT the extension of int -> long. This could potentially also be handled in Java by adding the conversion as a `Cast` binding variant, and then adding the widening casts in `CallArranger`. (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense). 
Since the extension seems to be a figment of the C ABI, that could be preferable, since it has the benefit of the VM code staying ABI-agnostic. This is potentially important if we want to add other ABIs in the future. But, we can also cross that bridge when we get to it (and there are probably more bridges to cross in that case too). So, up to you, really. (It's similar to the discussion surrounding floats for RISCV, if you followed that) > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Zero-length memory segments are supposed to be resized before they are written to or read from (see [Zero-length memory segments](https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/foreign/MemorySegment.html#wrapping-addresses)). We shouldn't disable the check for them, as that would have far-reaching implications for the safety design of the memory access API. Can you explain a bit more about where/why/how the issue occurs? ------------- PR: https://git.openjdk.org/jdk/pull/12708 From duke at openjdk.org Wed Feb 22 17:08:10 2023 From: duke at openjdk.org (Kasper Nielsen) Date: Wed, 22 Feb 2023 17:08:10 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: <0ZZLsRlr6IO4jL2VKAcn4geYptqUOdYPUXbd-KQiApo=.d1b3ca63-ca4a-4eef-9d5e-88cc3418ada9@github.com> On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. 
Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared the JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG und updated copyright year In all the places I've seen LambdaMetaFactory used, it is because of performance over reflection/non-static method handles. See, for example, https://www.optaplanner.org/blog/2018/01/09/JavaReflectionButMuchFaster.html. I believe Optaplanner is still using it. 
There is also quite a number of posts on StackOverflow on people trying to use LambdaMetafactory: https://stackoverflow.com/search?q=LambdaMetafactory ------------- PR: https://git.openjdk.org/jdk/pull/12493 From coleenp at openjdk.org Wed Feb 22 17:29:58 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 22 Feb 2023 17:29:58 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:08:00 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. 
>> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Coleen fix I had two small additional requests. I also reviewed the redefine classes and code cache code that I know, and it looks fine. src/hotspot/share/classfile/vmClasses.cpp line 252: > 250: Dictionary* dictionary = loader_data->dictionary(); > 251: dictionary->add_klass(THREAD, klass->name(), klass); > 252: SystemDictionary::add_to_hierarchy(klass, THREAD); One more small change, please. Can you make JavaThread* current the first parameter so this doesn't look like we're failing to throw an exception from these calls? src/hotspot/share/oops/instanceKlass.hpp line 34: > 32: #include "oops/instanceKlassFlags.hpp" > 33: #include "oops/instanceOop.hpp" > 34: #include "runtime/deoptimization.hpp" Can you just forward declare DeoptimizationScope rather than including the whole file here? ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.org/jdk/pull/12585 From jvernee at openjdk.org Wed Feb 22 17:38:55 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 17:38:55 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 17:03:16 GMT, Jorn Vernee wrote: > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) FYI: https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever an `int` is stored.
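As a small illustration of the semantics being assumed here: Java's widening conversion from `int` to `long` (the `i2l` bytecode) sign-extends, which is also what the Power `extsw` (extend sign word) instruction does. The sketch below only shows the plain Java conversion next to a zero extension for contrast; the class `SignExtension` and its method names are hypothetical, and nothing here exercises the internal `Binding` or `CallArranger` code from the linked branch.

```java
class SignExtension {

    // Widening primitive conversion int -> long; javac emits i2l,
    // which copies the sign bit into the upper 32 bits.
    static long signExtend(int value) {
        return value;
    }

    // Zero extension for contrast: the upper 32 bits are cleared.
    static long zeroExtend(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        int negative = 0x80000000; // Integer.MIN_VALUE, sign bit set

        System.out.println(Long.toHexString(signExtend(negative))); // ffffffff80000000
        System.out.println(Long.toHexString(zeroExtend(negative))); // 80000000
        // For non-negative values the two extensions agree.
        System.out.println(signExtend(42) == zeroExtend(42)); // true
    }
}
```

If an ABI expects 32-bit integers to arrive sign-extended in a 64-bit register, a cast with these semantics on the Java side produces the same register contents as extending in VM code.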
------------- PR: https://git.openjdk.org/jdk/pull/12708 From jwaters at openjdk.org Wed Feb 22 18:05:09 2023 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 22 Feb 2023 18:05:09 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v4] In-Reply-To: References: Message-ID: > DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" Julian Waters has updated the pull request incrementally with one additional commit since the last revision: src/java.base/share/native/libjli/emessages.h Co-authored-by: Mandy Chung ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12596/files - new: https://git.openjdk.org/jdk/pull/12596/files/6b54816a..d923ef45 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12596&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12596.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12596/head:pull/12596 PR: https://git.openjdk.org/jdk/pull/12596 From jwaters at openjdk.org Wed Feb 22 18:05:34 2023 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 22 Feb 2023 18:05:34 GMT Subject: RFR: 8302667: Improve message format when failing to load symbols or libraries [v2] In-Reply-To: <5HI98VM_T336GbG8atm-0-Vuf9tqbVR9nIFGghIsdE8=.42476df6-adaf-4055-8619-c984954a7c92@github.com> References: <5HI98VM_T336GbG8atm-0-Vuf9tqbVR9nIFGghIsdE8=.42476df6-adaf-4055-8619-c984954a7c92@github.com> Message-ID: On Tue, 21 Feb 2023 16:59:18 GMT, Mandy Chung wrote: >> Julian Waters has updated 
the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-6 >> - Error: Failed to load > > What about: > > > #define ARG_ERROR18 "Error: reading %s" @mlchung Thanks for the review! :) ------------- PR: https://git.openjdk.org/jdk/pull/12596 From jwaters at openjdk.org Wed Feb 22 18:05:48 2023 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 22 Feb 2023 18:05:48 GMT Subject: Integrated: 8302667: Improve message format when failing to load symbols or libraries In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 15:02:38 GMT, Julian Waters wrote: > DLL_ERROR4 is a macro expanding to an error message when a failure to load a generic item (shared libraries or an exported symbol from said libraries for example) occurs. "Error: loading:" is not a very pretty message, so this small change results in "Error: Failed to load %s" instead, which looks better and also means the message makes more sense if we want to append a reason behind as well, such as "Error: Failed to load libjvm.so because xxx" This pull request has now been integrated. 
Changeset: 8de841dd Author: Julian Waters URL: https://git.openjdk.org/jdk/commit/8de841dd19a77f9ff6273a74366c08f33e0cac94 Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod 8302667: Improve message format when failing to load symbols or libraries Reviewed-by: mchung ------------- PR: https://git.openjdk.org/jdk/pull/12596 From sviswanathan at openjdk.org Wed Feb 22 18:19:57 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 22 Feb 2023 18:19:57 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 04:03:02 GMT, David Holmes wrote: >> Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. >> The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. > > I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: > > 1. The interpreted Java code > 2. The compiled non-intrinisc sharedRuntime code > 3. The compiler intrinsic that uses a hardware instruction. > > Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. > > The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? @dholmes-ora @jddarcy @TobiHartmann @vnkozlov From @dean-long 's comment in the JBS entry, he sees the same result on AARCH64 and Intel, i.e. the output has the QNaN bit set. Please let me know if we want to proceed with this PR or if it would be good to withdraw this. 
I am open to either suggestion. Please advise. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From jvernee at openjdk.org Wed Feb 22 18:28:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 18:28:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`.
This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). src/java.base/share/classes/jdk/internal/foreign/PlatformLayouts.java line 266: > 264: * The {@code T*} native type. > 265: */ > 266: public static final ValueLayout.OfAddress C_POINTER = ValueLayout.ADDRESS.withBitAlignment(64); I think this is where the issue with the check in `MemorySegment::copy` comes from. Note how other platforms add a call to `asUnbounded` for the created layout, which makes any pointer boxed using this layout writable/readable (such as the in memory return pointer for upcalls). 
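To make the Big Endian remark quoted above concrete ("Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian"), here is a small sketch in plain Java. It is not taken from the PR and does not use the FFM API; the class name `WidthDemo` and the helper `storeLongLoadInt` are made up for illustration. It stores a 64-bit value at offset 0 and then loads only the first 32 bits under each byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the PR under review.
final class WidthDemo {
    // Store a 64-bit value at offset 0, then load only the first 32 bits.
    static int storeLongLoadInt(long value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).order(order);
        buf.putLong(0, value);
        return buf.getInt(0);
    }

    public static void main(String[] args) {
        long value = 42L;
        // Little endian: the low half of the long sits at the lowest address.
        System.out.println("little endian: "
                + storeLongLoadInt(value, ByteOrder.LITTLE_ENDIAN));
        // Big endian: the lowest address holds the high half of the long.
        System.out.println("big endian:    "
                + storeLongLoadInt(value, ByteOrder.BIG_ENDIAN));
    }
}
```

With a small value such as 42, the narrow load sees the value itself under little endian and the (zero) upper half under big endian, which is presumably why the port needs the precise width when moving values between registers and memory.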
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Wed Feb 22 18:34:36 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 22 Feb 2023 18:34:36 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Thanks for looking into this port! src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1359: > 1357: long size = elementCount * srcElementLayout.byteSize(); > 1358: srcImpl.checkAccess(srcOffset, size, true); > 1359: if (dstImpl instanceof NativeMemorySegmentImpl && dstImpl.byteSize() == 0) { As Jorn said, this change is not acceptable, as it allows bulk copy disregarding the segment real size. In such cases, the issue is always a missing unsafe resize somewhere in the linker code. 
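The objection can be illustrated with a toy model in plain Java. This is only a sketch under stated assumptions: `ToySegment`, `resized` and the two-argument `checkAccess` are hypothetical names that mimic the shape of the check quoted above, not the real `MemorySegment` implementation. It shows why a zero-length segment rejects any non-empty copy, and why giving the segment its real size up front keeps the bounds check meaningful, whereas skipping the check would not.

```java
// Hypothetical model of a bounds-checked segment; not the JDK implementation.
final class ToySegment {
    private final long byteSize;

    ToySegment(long byteSize) {
        this.byteSize = byteSize;
    }

    // Stands in for an unsafe resize done once by trusted (linker) code.
    ToySegment resized(long newByteSize) {
        return new ToySegment(newByteSize);
    }

    // Rejects any access that does not fit inside [0, byteSize).
    void checkAccess(long offset, long length) {
        if (offset < 0 || length < 0 || offset > byteSize - length) {
            throw new IndexOutOfBoundsException(
                    "offset=" + offset + ", length=" + length + ", size=" + byteSize);
        }
    }

    public static void main(String[] args) {
        ToySegment rawPointer = new ToySegment(0);
        try {
            rawPointer.checkAccess(0, 8);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("zero-length segment rejected: " + e.getMessage());
        }
        // After the resize the same access passes, and larger ones still fail.
        rawPointer.resized(8).checkAccess(0, 8);
        System.out.println("resized segment accepted an 8-byte access");
    }
}
```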
------------- PR: https://git.openjdk.org/jdk/pull/12708 From kvn at openjdk.org Wed Feb 22 19:19:48 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 22 Feb 2023 19:19:48 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: <3KyL3IP4tHCtZNzoOb3RsDuj0TsNkWBUufxwg4-YePc=.177bed30-9483-483c-90b3-0feab074ca88@github.com> On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. The proposed fix does exactly what everyone asked for: the same result from Java code (Interpreter), runtime (C++ code) and intrinsic (HW instruction). Since the HW instruction already produces QNaNs, the PR fixes only the Java code (Interpreter) and the runtime (C++) code to produce QNaNs. @TobiHartmann created a test which covers all cases and should be added to this PR. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From dlong at openjdk.org Wed Feb 22 20:51:53 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 22 Feb 2023 20:51:53 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 05:21:48 GMT, Joe Darcy wrote: >> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >> >> 1. The interpreted Java code >> 2. The compiled non-intrinisc sharedRuntime code >> 3. The compiler intrinsic that uses a hardware instruction.
>> >> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >> >> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? > >> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >> >> 1. The interpreted Java code >> >> 2. The compiled non-intrinisc sharedRuntime code >> >> 3. The compiler intrinsic that uses a hardware instruction. >> >> >> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >> >> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? > > From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short)) > > "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. The conversion is exact; all binary16 values can be exactly represented in float. Special cases: > > If the argument is zero, the result is a zero with the same sign as the argument. > If the argument is infinite, the result is an infinity with the same sign as the argument. > If the argument is a NaN, the result is a NaN. " > > If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set. 
> > "NaN boxing" is a technique used to put extra information into the significand bits a NaN and pass the around. It is consistent with the intended use of the feature by IEEE 754 and used in various language runtimes: e.g., > > https://piotrduperas.com/posts/nan-boxing > https://leonardschuetz.ch/blog/nan-boxing/ > https://anniecherkaev.com/the-secret-life-of-nan > > The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion. > > That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. We don't know that all HW will produce the same NaN "payload", right? Instead, we might need interpreter intrinsics. I assume that is how the trig functions are handled that @jddarcy mentioned. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From alexandr.miloslavskiy at syntevo.com Wed Feb 22 20:57:25 2023 From: alexandr.miloslavskiy at syntevo.com (Alexandr Miloslavskiy) Date: Wed, 22 Feb 2023 23:57:25 +0300 Subject: JVM crashes in JIT compiled code without hs_err_pid file In-Reply-To: <3fd60ae0-2719-018d-5679-a78d1588e2bb@syntevo.com> References: <65436f8c-21f0-0270-1ad8-32f72d2163df@syntevo.com> <3fd60ae0-2719-018d-5679-a78d1588e2bb@syntevo.com> Message-ID: <21075ca4-4244-3a1b-695b-e9019cc99566@syntevo.com> I noticed that a number of foreign DLLs were loaded in the process when it crashed: C:\Program Files\DGAgent\plugins\09D849B6-32D3-4A40-85EE-6B84BA29E35B\AE_MailSensor_Plugin64.dll C:\Program Files\DGAgent\plugins\09D849B6-32D3-4A40-85EE-6B84BA29E35B\ame_outlooksensor64.dll C:\Program Files\DGAgent\plugins\09D849B6-32D3-4A40-85EE-6B84BA29E35B\ame_smtpsensor64.dll C:\Program Files\DGAgent\plugins\8E4EA70A-6128-4B57-BD3F-8E9E0F0DA6BB\COM_Sensor64.dll C:\Program Files\DGAgent\plugins\8E4EA70A-6128-4B57-BD3F-8E9E0F0DA6BB\OS_Plugin64.dll C:\Program Files\FileZilla FTP Client\fzshellext_64.dll C:\ProgramData\Symantec\Symantec 
Endpoint Protection\14.3.8268.5000.105\Data\Sysfer\x64\sysfer.dll I told the user to get rid of these. He says that he put our application into the exclusion list in "Symantec Endpoint Protection" and so far there have been no crashes. If true, I still wonder how exactly it manages to break things. From kvn at openjdk.org Wed Feb 22 21:24:25 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 22 Feb 2023 21:24:25 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: <3JDw74vUUrTL3j-7TQ1It-_I_WuqcYrGtk_b7AilH74=.2038c9e0-4262-40f4-aa0d-05f1a1e95e40@github.com> On Wed, 22 Feb 2023 05:21:48 GMT, Joe Darcy wrote: >> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >> >> 1. The interpreted Java code >> 2. The compiled non-intrinisc sharedRuntime code >> 3. The compiler intrinsic that uses a hardware instruction. >> >> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >> >> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? > >> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >> >> 1. The interpreted Java code >> >> 2. The compiled non-intrinisc sharedRuntime code >> >> 3. The compiler intrinsic that uses a hardware instruction. >> >> >> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >> >> The observed behaviour may be surprising but it seems not to be a bug.
And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? > > From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short)) > > "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. The conversion is exact; all binary16 values can be exactly represented in float. Special cases: > > If the argument is zero, the result is a zero with the same sign as the argument. > If the argument is infinite, the result is an infinity with the same sign as the argument. > If the argument is a NaN, the result is a NaN. " > > If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set. > > "NaN boxing" is a technique used to put extra information into the significand bits a NaN and pass the around. It is consistent with the intended use of the feature by IEEE 754 and used in various language runtimes: e.g., > > https://piotrduperas.com/posts/nan-boxing > https://leonardschuetz.ch/blog/nan-boxing/ > https://anniecherkaev.com/the-secret-life-of-nan > > The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion. > > That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. > We don't know that all HW will produce the same NaN "payload", right? Instead, we might need interpreter intrinsics. I assume that is how the trig functions are handled that @jddarcy mentioned. Good point. We can't guarantee that all OpenJDK ports HW do the same. 
If the CPU has the corresponding instructions, we need to generate a stub during VM startup with the HW instructions and use it in all cases (or use the same instruction directly in JIT-compiled code). If the CPU does not have the instruction, we should use the runtime C++ function in all cases to be consistent. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From forax at openjdk.org Wed Feb 22 21:32:35 2023 From: forax at openjdk.org (Rémi Forax) Date: Wed, 22 Feb 2023 21:32:35 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: <0ZZLsRlr6IO4jL2VKAcn4geYptqUOdYPUXbd-KQiApo=.d1b3ca63-ca4a-4eef-9d5e-88cc3418ada9@github.com> References: <0ZZLsRlr6IO4jL2VKAcn4geYptqUOdYPUXbd-KQiApo=.d1b3ca63-ca4a-4eef-9d5e-88cc3418ada9@github.com> Message-ID: On Wed, 22 Feb 2023 17:05:00 GMT, Kasper Nielsen wrote: >> Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader >> - Removed unused import of STRONG und updated copyright year > > In all the places I've seen LambdaMetaFactory used, it is because of performance over reflection/non-static method handles. See, for example, https://www.optaplanner.org/blog/2018/01/09/JavaReflectionButMuchFaster.html. I believe Optaplanner is still using it. > > There is also quite a number of posts on StackOverflow on people trying to use LambdaMetafactory: > https://stackoverflow.com/search?q=LambdaMetafactory @kaspernielsen, I believe that now that hidden classes and class data are part of the public API, you do not need to use the lambda metafactory directly. But that requires Java 17, not 8 or 11.
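For readers who have not seen the direct use of `LambdaMetafactory` that Kasper describes, the usual pattern looks roughly like the sketch below. It is an illustration only, not code from the PR or from OptaPlanner; the class name `MetafactoryDemo` and the method `lengthFunction` are made up. It turns a method handle for `String.length()` into a `ToIntFunction<String>` so that later calls go through an ordinary interface method instead of reflection.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.ToIntFunction;

// Hypothetical demo class, not part of the PR under review.
final class MetafactoryDemo {
    @SuppressWarnings("unchecked")
    static ToIntFunction<String> lengthFunction() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Direct method handle for String.length().
            MethodHandle target = lookup.findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "applyAsInt",                                    // interface method name
                    MethodType.methodType(ToIntFunction.class),      // factory type: () -> ToIntFunction
                    MethodType.methodType(int.class, Object.class),  // erased signature of applyAsInt
                    target,                                          // implementation
                    MethodType.methodType(int.class, String.class)); // specialized signature
            // The factory takes no captured arguments and returns the function object.
            return (ToIntFunction<String>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException("could not spin the function object", t);
        }
    }

    public static void main(String[] args) {
        ToIntFunction<String> length = lengthFunction();
        System.out.println(length.applyAsInt("hidden"));
    }
}
```

Each call to `metafactory` spins a new class behind the scenes, which is why the unloading behaviour discussed in this PR matters to code that creates such function objects dynamically.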
------------- PR: https://git.openjdk.org/jdk/pull/12493 From sviswanathan at openjdk.org Wed Feb 22 21:56:28 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 22 Feb 2023 21:56:28 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: <3JDw74vUUrTL3j-7TQ1It-_I_WuqcYrGtk_b7AilH74=.2038c9e0-4262-40f4-aa0d-05f1a1e95e40@github.com> References: <3JDw74vUUrTL3j-7TQ1It-_I_WuqcYrGtk_b7AilH74=.2038c9e0-4262-40f4-aa0d-05f1a1e95e40@github.com> Message-ID: On Wed, 22 Feb 2023 21:21:42 GMT, Vladimir Kozlov wrote: >>> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >>> >>> 1. The interpreted Java code >>> >>> 2. The compiled non-intrinisc sharedRuntime code >>> >>> 3. The compiler intrinsic that uses a hardware instruction. >>> >>> >>> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >>> >>> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? >> >> From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short)) >> >> "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. The conversion is exact; all binary16 values can be exactly represented in float. Special cases: >> >> If the argument is zero, the result is a zero with the same sign as the argument. >> If the argument is infinite, the result is an infinity with the same sign as the argument. >> If the argument is a NaN, the result is a NaN. 
" >> >> If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set. >> >> "NaN boxing" is a technique used to put extra information into the significand bits a NaN and pass the around. It is consistent with the intended use of the feature by IEEE 754 and used in various language runtimes: e.g., >> >> https://piotrduperas.com/posts/nan-boxing >> https://leonardschuetz.ch/blog/nan-boxing/ >> https://anniecherkaev.com/the-secret-life-of-nan >> >> The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion. >> >> That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. > >> We don't know that all HW will produce the same NaN "payload", right? Instead, we might need interpreter intrinsics. I assume that is how the trig functions are handled that @jddarcy mentioned. > > Good point. We can't guarantee that all OpenJDK ports HW do the same. > > If CPU has corresponding instructions we need to generate a stub during VM startup with HW instructions and use it in all cases (or directly the same instruction in JIT compiled code). > If CPU does not have instruction we should use runtime C++ function in all cases to be consistent. Thanks @vnkozlov @dean-long. One last question before I withdraw the PR: As QNaN bit is supported across current architectures like x86, ARM and may be others as well for conversion, couldn't we go ahead with this PR? The architectures that behave differently could then follow the technique suggested by Vladimir Kozlov as and when they implement the intrinsic? ------------- PR: https://git.openjdk.org/jdk/pull/12704 From dcubed at openjdk.org Wed Feb 22 22:55:06 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Wed, 22 Feb 2023 22:55:06 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:08:00 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Coleen fix src/hotspot/share/classfile/systemDictionary.cpp line 1632: > 1630: k->process_interfaces(); // handle all "implements" declarations > 1631: > 1632: // Now flush all code that depended on old class hierarchy. 
nit typo: s/flush/mark/ since you've changed that elsewhere... src/hotspot/share/classfile/systemDictionaryShared.cpp line 856: > 854: // Add to class hierarchy, and do possible deoptimizations. > 855: SystemDictionary::add_to_hierarchy(loaded_lambda, THREAD); > 856: // But, do not add to dictionary. nit comment location: move above L855 like other places? ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dlong at openjdk.org Wed Feb 22 23:59:58 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 22 Feb 2023 23:59:58 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: <3JDw74vUUrTL3j-7TQ1It-_I_WuqcYrGtk_b7AilH74=.2038c9e0-4262-40f4-aa0d-05f1a1e95e40@github.com> References: <3JDw74vUUrTL3j-7TQ1It-_I_WuqcYrGtk_b7AilH74=.2038c9e0-4262-40f4-aa0d-05f1a1e95e40@github.com> Message-ID: On Wed, 22 Feb 2023 21:21:42 GMT, Vladimir Kozlov wrote: >>> I'm also a bit concerned that we are rushing in to "fix" this. IIUC we have three mechanisms for implementing this functionality: >>> >>> 1. The interpreted Java code >>> >>> 2. The compiled non-intrinisc sharedRuntime code >>> >>> 3. The compiler intrinsic that uses a hardware instruction. >>> >>> >>> Unless the hardware instructions for all relevant CPUs behave exactly the same, then I don't see how we can have parity of behaviour across these three mechanisms. >>> >>> The observed behaviour may be surprising but it seems not to be a bug. And is this even a real concern - would real programs actually need to peek at the raw bits and so see the difference, or does it suffice to handle Nan's opaquely? >> >> From the spec (https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/Float.html#float16ToFloat(short)) >> >> "Returns the float value closest to the numerical value of the argument, a floating-point binary16 value encoded in a short. 
The conversion is exact; all binary16 values can be exactly represented in float. Special cases: >> >> If the argument is zero, the result is a zero with the same sign as the argument. >> If the argument is infinite, the result is an infinity with the same sign as the argument. >> If the argument is a NaN, the result is a NaN. " >> >> If the float argument is a NaN, you are supposed to get a float16 NaN as a result -- that is all the specification requires. However, the implementation makes stronger guarantees to try to preserve some non-zero NaN significand bits if they are set. >> >> "NaN boxing" is a technique used to put extra information into the significand bits a NaN and pass the around. It is consistent with the intended use of the feature by IEEE 754 and used in various language runtimes: e.g., >> >> https://piotrduperas.com/posts/nan-boxing >> https://leonardschuetz.ch/blog/nan-boxing/ >> https://anniecherkaev.com/the-secret-life-of-nan >> >> The Java specs are careful to avoid mentioning quiet vs signaling NaNs in general discussion. >> >> That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called. > >> We don't know that all HW will produce the same NaN "payload", right? Instead, we might need interpreter intrinsics. I assume that is how the trig functions are handled that @jddarcy mentioned. > > Good point. We can't guarantee that all OpenJDK ports HW do the same. > > If CPU has corresponding instructions we need to generate a stub during VM startup with HW instructions and use it in all cases (or directly the same instruction in JIT compiled code). > If CPU does not have instruction we should use runtime C++ function in all cases to be consistent. > Thanks @vnkozlov @dean-long. 
One last question before I withdraw the PR: As QNaN bit is supported across current architectures like x86, ARM and may be others as well for conversion, couldn't we go ahead with this PR? The architectures that behave differently could then follow the technique suggested by Vladimir Kozlov as and when they implement the intrinsic? No, because it's not just the SNaN vs QNaN that is different, but also the NaN "payload" or "boxing" that is different. For example, the intrinsic gives me different results on aarch64 vs Intel with this test:

    public class Foo {
        public static float hf2f(short s) {
            return Float.float16ToFloat(s);
        }
        public static short f2hf(float f) {
            return Float.floatToFloat16(f);
        }
        public static void main(String[] args) {
            float f = Float.intBitsToFloat(0x7fc00000 | 0x2000);
            System.out.println(Integer.toHexString(f2hf(f)));
            f = Float.intBitsToFloat(0x7fc00000 | 0x20);
            System.out.println(Integer.toHexString(f2hf(f)));
            f = Float.intBitsToFloat(0x7fc00000 | 0x4);
            System.out.println(Integer.toHexString(f2hf(f)));
            f = Float.intBitsToFloat(0x7fc00000 | 0x2000 | 0x20 | 0x4);
            System.out.println(Integer.toHexString(f2hf(f)));
        }
    }

------------- PR: https://git.openjdk.org/jdk/pull/12704 From dcubed at openjdk.org Thu Feb 23 01:26:09 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 01:26:09 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:08:00 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not.
>> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Coleen fix Changes requested by dcubed (Reviewer). src/hotspot/share/oops/instanceKlass.cpp line 1184: > 1182: DeoptimizationScope deopt_scope; > 1183: { > 1184: // Now flush all code that assumes the class is not linked. nit typo: s/flush/mark/ since you've made that change elsewhere... ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 01:26:11 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Thu, 23 Feb 2023 01:26:11 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 06:26:55 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/systemDictionary.cpp line 1620: > >> 1618: assert(k != nullptr, "just checking"); >> 1619: if (Universe::is_fully_initialized()) { >> 1620: assert_locked_or_safepoint(Compile_lock); > > This serves no purpose given you just took out the lock. But it does raise the question as to whether the lock can/should always be taken out if called at a safepoint, or if the Universe is not yet fully initialized? I found one call to `add_to_hierarchy()` where we previously did not grab `Compile_lock`, in `vmClasses::resolve_shared_class()`, and that call is made with this assert in place: ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` so it looks to me like we have a locking issue there (as in too early for locks). ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 01:26:12 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 01:26:12 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 16:49:33 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/vmClasses.cpp line 252: > >> 250: Dictionary* dictionary = loader_data->dictionary(); >> 251: dictionary->add_klass(THREAD, klass->name(), klass); >> 252: SystemDictionary::add_to_hierarchy(klass, THREAD); > > One more small changes please.
Can you make JavaThread* current the first parameter so this doesn't look like we're failing to throw an exception from these calls? This call to `SystemDictionary::add_to_hierarchy()` was not previously done while grabbing the `Compile_lock` and now we'll do that. This specific call is made while this assert() is true: ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` so very early in the life of the VM. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 01:26:14 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 01:26:14 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2] In-Reply-To: References: <3lTNf7dMxXfNZlqfgtV5UHqxq4moAgrnyHCM2CtlkYI=.4df55d81-05d3-4636-bdfc-296a031aa815@github.com> Message-ID: <0_PwtvOjaxXGHyuYTolPkYCaNwTEL4rVAKsi3FwcNEY=.b49c8aeb-4721-49bc-af08-54485ba222c2@github.com> On Mon, 20 Feb 2023 09:23:19 GMT, Robbin Ehn wrote: >> src/hotspot/share/code/codeCache.cpp line 1392: >> >>> 1390: if (nm->is_marked_for_deoptimization() && !nm->has_been_deoptimized() && nm->can_be_deoptimized()) { >>> 1391: nm->make_not_entrant(); >>> 1392: nm->make_deoptimized(); >> >> `CodeCache::make_marked_nmethods_deoptimized()` used to call >> `make_nmethod_deoptimized(nm)` which made a couple of checks >> before it called `nm->make_deoptimized()`. `make_deoptimized()` >> does not make these checks so why are they no longer needed? >> >> Here's the removed code: >> >> void CodeCache::make_nmethod_deoptimized(CompiledMethod* nm) { >> if (nm->is_marked_for_deoptimization() && nm->can_be_deoptimized()) { >> nm->make_deoptimized(); >> } >> } > > If the method can't be deoptimized we should not make "not entrant". > If it is already deoptimize no need to try to make it "not entrant" once more. > If it's not marked we shouldn't make it "not entrant". 
> So only if we make it "not entrant" we should also make_deopt().
>
> This is checked at the line just before:
>
> if (nm->is_marked_for_deoptimization() && !nm->has_been_deoptimized() && nm->can_be_deoptimized()) {
>   nm->make_not_entrant();
>   nm->make_deoptimized();
> }
>
>
> So we do not need to recheck it.
>
> But I'll add a check that we do not re-patch the nops.

Thanks!

>> src/hotspot/share/code/nmethod.cpp line 1166:
>>
>>> 1164: set_deoptimized_done();
>>> 1165: return;
>>> 1166: }
>>
>> I don't understand this check-and-bailout logic at all. If continuations are
>> not enabled, then why does `nmethod::make_deoptimized()` do nothing?
>> This isn't an issue introduced by your patch, but I just happened to notice
>> when I was reading thru...
>
> We have two deopt mechanisms:
> * Patching the return PC in frames for nmethods that are no longer valid.
> Done by handshaking all threads and walking their stacks.
>
> * Virtual thread stacks are only patched this way when mounted.
> If we have a million VTs we can't walk the entire heap and patch the PC in all these stacks.
> Instead there is a nop-sledge after each call from nmethods, so we patch these to jump to the deopt
> when returning into a deopted nmethod.
> Calls into the VM (causing the VT to pin) do not have these nop-sledges, so when using Loom, both mechanisms are needed.

Thanks for the explanation.
-------------

PR: https://git.openjdk.org/jdk/pull/12585

From darcy at openjdk.org Thu Feb 23 03:12:11 2023
From: darcy at openjdk.org (Joe Darcy)
Date: Thu, 23 Feb 2023 03:12:11 GMT
Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter
In-Reply-To: <-3z8sZMccXAOTV4KgeqrX6KQTcnSP20B70a293fhQSI=.354c1f98-c5bd-4b7e-a59a-10ff31d7d6b9@github.com>
References: <-3z8sZMccXAOTV4KgeqrX6KQTcnSP20B70a293fhQSI=.354c1f98-c5bd-4b7e-a59a-10ff31d7d6b9@github.com>
Message-ID:

On Wed, 22 Feb 2023 09:46:59 GMT, Doug Simon wrote:

> > That said, I think it is reasonable on a given JVM invocation if Float.floatToFloat16(f) gave the same result for input f regardless of in what context it was called.
>
> Yes, I'm under the impression that for math API methods like this, the stability of input to output must be preserved for a single JVM invocation. Or are there existing methods for which the interpreter and compiled code execution is allowed to differ?

A similar but not exactly analogous situation occurs for the intrinsics of various java.lang.Math methods. Many of the methods of interest allow implementation flexibility, subject to various quality of implementation criteria. Those criteria bound the maximum error at a single point, phrased in terms of ulps -- units in the last place, and require semi-monotonicity -- which describes the relation of outputs of adjacent floating-point values.

Taking two compliant implementations of Math.foo(), it is not necessarily a valid implementation to switch between them because the semi-monotonicity constraint can be violated. The solution is to either always use or never use an intrinsic for Math.foo on a given JVM invocation.
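To make the semi-monotonicity point concrete, here is a toy sketch in Java. The two implementations `implA` and `implB` are hypothetical stand-ins (not real JDK intrinsics) that work on plain integers in arbitrary units, and the error bound is an assumed value. Each one is monotonic and within the bound on its own, but mixing them within one run can yield a smaller output for a larger input.

```java
// Toy illustration only: implA and implB are hypothetical stand-ins for two
// compliant implementations of the same math method (say, an interpreter
// path and an intrinsic). Values are plain ints in arbitrary units.
final class SemiMonotonicitySketch {
    static final int MAX_ERROR = 4; // assumed error bound, arbitrary units

    // The exact function being approximated: identity, which is increasing.
    static int exact(int x) { return x; }

    // Implementation A always errs high, but stays within the bound.
    static int implA(int x) { return exact(x) + MAX_ERROR; }

    // Implementation B always errs low, but stays within the bound.
    static int implB(int x) { return exact(x) - MAX_ERROR; }

    // True if results never decrease from one sample to the next.
    static boolean nonDecreasing(int[] results) {
        for (int i = 1; i < results.length; i++) {
            if (results[i] < results[i - 1]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] inputs = {10, 15, 20};
        int[] onlyA = new int[inputs.length];
        int[] onlyB = new int[inputs.length];
        int[] mixed = new int[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            onlyA[i] = implA(inputs[i]);
            onlyB[i] = implB(inputs[i]);
            // Alternate implementations, as would happen if interpreter and
            // intrinsic results were mixed within one JVM invocation.
            mixed[i] = (i % 2 == 0) ? implA(inputs[i]) : implB(inputs[i]);
        }
        System.out.println("only A monotonic: " + nonDecreasing(onlyA));
        System.out.println("only B monotonic: " + nonDecreasing(onlyB));
        System.out.println("mixed monotonic:  " + nonDecreasing(mixed));
    }
}
```

With these inputs the mixed sequence is 14, 11, 24, so the monotonicity check fails only for the mixed case.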
HTH

-------------

PR: https://git.openjdk.org/jdk/pull/12704

From kbarrett at openjdk.org Thu Feb 23 04:00:02 2023
From: kbarrett at openjdk.org (Kim Barrett)
Date: Thu, 23 Feb 2023 04:00:02 GMT
Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting
In-Reply-To:
References:
Message-ID:

On Wed, 22 Feb 2023 14:42:20 GMT, Thomas Stuefe wrote:

> Okay, I see. But does it have to come with hs-err file too? Because that is the only reason for this complication. And hs-err files probably spoil the context reported to Watson, see below.

The question of whether to create an hs_err log when +UseOSErrorReporting is out of scope for this change. The decision to do so is very old, going back to these:

https://bugs.openjdk.org/browse/JDK-4997835
RFE: crash dump will only be created when running w/ -XX:+ShowMessageBoxOnError

https://bugs.openjdk.org/browse/JDK-6227246
Improve Windows unhandled structured exception reporting

https://bugs.openjdk.org/browse/JDK-6794885
Improve Windows unhandled structured exception reporting - UseOSErrorReporting

Feel free to raise that as a separate issue.

> I am a bit unhappy with this patch because it "sanctifies" returning from hs-err reporting. We had long discussions in the recent past where we needed to convince developers that returning from hs-err reporting is a very bad idea. I'm afraid with your patch, it won't be long before follow-up patches build upon your ARA and re-introduce this question.
>
> I'd prefer if UseOSErrorReporting, and the code surrounding the windows exceptionhandler and CONTINUE_SEARCH, remained Windows only, and restricted to this one use case: jumping back to OS error reporting in case of an error.

I don't think it sanctifies anything of the sort. I agree that returning from hs-err reporting is problematic. The current +UseOSErrorReporting was only one of several mechanisms doing exactly that. I've already removed SuppressErrorAt.
I'm working on another change that modifies how the existing Debugging mechanism works.

I've included comments about how the ARA stuff and xxx_maybe_die_xxx exist only to support +UseOSErrorReporting. I could go further and add something like the following at the beginning of VMError::report_and_maybe_die_impl:

    assert(after_report_action == AfterReportAction::Die WINDOWS_ONLY(|| UseOSErrorReporting), "must terminate");

The reason for the new argument is so that the caller, which is where the knowledge of whether return or termination is desired resides, can request what it wants. I could have further uglified the code by making ARA::Return Windows-only, with more #ifdefs. I'd kind of prefer not to do that.

> Your patch opens up other questions too, e.g. how is this supposed to work in the face of secondary errors? IIUC the last invocation of the exception handler would be the winning one and return its error state back to the top-level exception filter, so the error reported by Watson would be the secondary crash, not the real crash. I think this would happen today too, or?

I think that's a correct summary, with or without this change.

> This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS.

That would be an appropriate argument to make as part of a proposal to remove the hs-err logging when +UseOSErrorReporting.
------------- PR: https://git.openjdk.org/jdk/pull/12651 From mdoerr at openjdk.org Thu Feb 23 04:37:49 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:37:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). 
We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Clean fix for NativeMemorySegmentImpl issue with byteSize 0. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12708/files - new: https://git.openjdk.org/jdk/pull/12708/files/4a5debfc..7315fd20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=00-01 Stats: 6 lines in 2 files changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:40:04 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:40:04 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> References: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> Message-ID: On Wed, 22 Feb 2023 18:31:45 GMT, Maurizio Cimadamore wrote: >> Martin Doerr has updated the pull 
request incrementally with one additional commit since the last revision: >> >> Clean fix for NativeMemorySegmentImpl issue with byteSize 0. > > src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1359: > >> 1357: long size = elementCount * srcElementLayout.byteSize(); >> 1358: srcImpl.checkAccess(srcOffset, size, true); >> 1359: if (dstImpl instanceof NativeMemorySegmentImpl && dstImpl.byteSize() == 0) { > > As Jorn said, this change is not acceptable, as it allows bulk copy disregarding the segment real size. In such cases, the issue is always a missing unsafe resize somewhere in the linker code. Removed this workaround. I'm glad to get rid of it. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:40:07 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:40:07 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> References: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> Message-ID: On Wed, 22 Feb 2023 18:23:54 GMT, Jorn Vernee wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean fix for NativeMemorySegmentImpl issue with byteSize 0. > > src/java.base/share/classes/jdk/internal/foreign/PlatformLayouts.java line 266: > >> 264: * The {@code T*} native type. >> 265: */ >> 266: public static final ValueLayout.OfAddress C_POINTER = ValueLayout.ADDRESS.withBitAlignment(64); > > I think this is where the issue with the check in `MemorySegment::copy` comes from. Note how other platforms add a call to `asUnbounded` for the created layout, which makes any pointer boxed using this layout writable/readable (such as the in memory return pointer for upcalls). Thanks for the hint! You found it pretty quickly. 
I had missed that when rebasing my early prototype. Fixed. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:47:05 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:47:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 04:37:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Clean fix for NativeMemorySegmentImpl issue with byteSize 0. > > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) > > FYI: [master...JornVernee:jdk:I2L](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L) (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever currently an `int` is `vmStore`d in the PPC CallArranger. Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? 
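For readers less familiar with the terms: the JVM's `i2l` bytecode (what a Java `(long)` cast on an `int` compiles to) sign-extends a 32-bit value to 64 bits, which is the operation the thread equates with PPC64's `extsw`. A minimal Java sketch of sign extension versus zero extension follows; the class and method names are made up for illustration and are not part of the PR.

```java
// Illustrative sketch; class and method names are hypothetical.
final class SignExtensionSketch {
    // (long) on an int compiles to the i2l bytecode: sign extension.
    static long signExtend(int value) {
        return (long) value;
    }

    // Zero extension, for contrast: the upper 32 bits are cleared.
    static long zeroExtend(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        int negative = -2;
        // Sign extension copies the sign bit into the upper 32 bits.
        System.out.println(Long.toHexString(signExtend(negative))); // fffffffffffffffe
        // Zero extension leaves the upper 32 bits as zero.
        System.out.println(Long.toHexString(zeroExtend(negative))); // fffffffe
    }
}
```

The two conversions agree for non-negative inputs and differ only when the sign bit of the 32-bit value is set, which is why the ABI has to say which one it requires.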
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:56:02 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:56:02 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> On Wed, 22 Feb 2023 17:03:16 GMT, Jorn Vernee wrote: > I will do a more thorough review soon. Thanks a lot! > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) > > This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE. I don't have a big issue since the current approach stays in PPC-only code) Maybe I need to think a bit more about it. I don't really like the extra cases for `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT`. On the other side, doing it in the CallArranger would break the design of factoring out the allocation from the binding generation. In addition, it seems like PPC64 is even more tricky than the Windows case. I need to pass 2 float arguments in a GP reg (or stack slot) plus one of these 2 floats in float register F13. I think this can get implemented more easily in the backend. Do you agree? 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 06:18:49 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 06:18:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Remove size restriction for structs. Add TODO for Big Endian. 
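The Big Endian remark in the description (storing 8 bytes and loading 4 yields a different value than on Little Endian) can be reproduced in plain Java with `java.nio.ByteBuffer`. This is only an illustration of the byte-order effect, not code from the PR; the class and method names are invented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only, not code from the PR; names are hypothetical.
final class PartialLoadSketch {
    // Store a 64-bit value at offset 0, then load only the first 4 bytes.
    static int storeLongLoadInt(long value, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES).order(order);
        buffer.putLong(0, value);
        return buffer.getInt(0);
    }

    public static void main(String[] args) {
        long value = 0x11223344L; // fits in the low 32 bits
        // Little Endian: the low-order bytes come first, so the 4-byte load sees them.
        System.out.printf("LE: 0x%08x%n", storeLongLoadInt(value, ByteOrder.LITTLE_ENDIAN));
        // Big Endian: the high-order (all zero) bytes come first.
        System.out.printf("BE: 0x%08x%n", storeLongLoadInt(value, ByteOrder.BIG_ENDIAN));
    }
}
```

On Little Endian the narrow load happens to return the stored value, while on Big Endian it returns the zero upper half, which is why the precise size has to be known when moving values between registers and memory.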
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12708/files - new: https://git.openjdk.org/jdk/pull/12708/files/7315fd20..a4d844f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=01-02 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 06:24:03 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 06:24:03 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) 
I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. I have removed the size restriction for structs. Passing a struct consisting of 1 char works (on Little Endian). However, passing a struct consisting of 3 chars doesn't (getting IndexOutOfBoundsException: Out of bound access on segment MemorySegment). Neither on PPC64, nor on x86. Is that known or should I file a bug for that? 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 07:22:08 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 07:22:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. 
>> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. 
I should add tests for the tricky corner cases like the following ones:

EXPORT struct S_FF f_S_S_FF(float p0, float p1, float p2, float p3, float p4, float p5, float p6, float p7, float p8, float p9, float p10, float p11, struct S_FF p12, float p13) {
    return p12;
}

EXPORT float f_F_S_FF(float p0, float p1, float p2, float p3, float p4, float p5, float p6, float p7, float p8, float p9, float p10, float p11, struct S_FF p12, float p13) {
    return p13;
}

EXPORT struct S_FF f_S_SSSSSSS_FF(struct S_FF p0, struct S_FF p1, struct S_FF p2, struct S_FF p3, struct S_FF p4, struct S_FF p5, struct S_FF p6, float p7) {
    return p6;
}

EXPORT float f_F_SSSSSSS_FF(struct S_FF p0, struct S_FF p1, struct S_FF p2, struct S_FF p3, struct S_FF p4, struct S_FF p5, struct S_FF p6, float p7) {
    return p7;
}

Can I add them to the existing libraries? If so, what is the correct naming scheme and what is needed to get them executed (adding the EXPORT alone is not sufficient). Or should I create a separate test for these cases? Advice will be appreciated!

-------------

PR: https://git.openjdk.org/jdk/pull/12708

From dholmes at openjdk.org Thu Feb 23 07:53:04 2023
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 23 Feb 2023 07:53:04 GMT
Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting
In-Reply-To:
References:
Message-ID:

On Thu, 23 Feb 2023 03:57:18 GMT, Kim Barrett wrote:

> Okay, I see. But does it have to come with hs-err file too?

The ability to return to the OS error reporting after generating the hs_err file is IMO an important part of this piece of functionality.

> I could go further and add something like the following at the beginning of VMError::report_and_maybe_die_impl:

That wouldn't hurt to reinforce this is for Windows-only use of UseOSErrorReporting. Your call.
------------- PR: https://git.openjdk.org/jdk/pull/12651 From dholmes at openjdk.org Thu Feb 23 08:05:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 23 Feb 2023 08:05:05 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. Changing the Java code to match the semantics of a given architecture's HW instruction risks requiring that other architectures have to implement additional code to match those semantics, potentially impacting the benefit of using an intrinsic in the first place. Consistent output across all execution contexts is certainly a desirable quality, but at what cost? ------------- PR: https://git.openjdk.org/jdk/pull/12704 From stuefe at openjdk.org Thu Feb 23 08:11:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 23 Feb 2023 08:11:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. 
The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. > > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. 
> > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). Looking through this further. Side note: I've disliked this thicket of report() functions and their wall of arguments for a long time. I think a very simple way to make this easier to read would be to capture argument groups in structures. For example, anything crash related could go into struct crash_info_t { signal, siginfo, context, crash_address } maybe even as a union with Posix/Windows flavour and native types (DWORD vs int for signal, EXCEPTION_INFO vs siginfo, CONTEXT vs ucontext). Infos for oom-errors could go into a similar structure like struct oom_info_t { size, failing_api, filename, lineno } etc. src/hotspot/share/utilities/vmError.cpp line 1374: > 1372: Thread* thread, unsigned int sig, address pc, > 1373: void* siginfo, void* context, > 1374: const char* detail_fmt, ...) { Do we really need this variant? AFAICS this is only used in one place, in vmError_windows.cpp, from the secondary crash handler. And there, we don't really need the detail format; all you feed it is an empty string. 
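The argument-grouping idea from the side note above could be sketched roughly as follows. This is only an illustration under stated assumptions: the field types, the opaque `const void*` pointers, and the `describe` helpers are hypothetical and are not HotSpot's actual declarations; only the struct names and field lists come from the note itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical grouping of the crash-related arguments. The platform-specific
// types are kept opaque here; a real version might use a union of native types.
struct crash_info_t {
  unsigned int sig;            // signal number or exception code
  const void*  siginfo;        // platform signal/exception info, opaque
  const void*  context;        // platform CPU context, opaque
  const void*  crash_address;  // faulting pc
};

// Hypothetical grouping of the out-of-memory related arguments.
struct oom_info_t {
  std::size_t size;         // requested allocation size in bytes
  const char* failing_api;  // name of the API that failed
  const char* filename;     // source file of the failing call
  int         lineno;       // source line of the failing call
};

// One overload per error kind replaces a long flat parameter list.
std::string describe(const crash_info_t& ci) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "crash: sig=%u", ci.sig);
  return buf;
}

std::string describe(const oom_info_t& oi) {
  char buf[256];
  std::snprintf(buf, sizeof(buf), "oom: %zu bytes in %s at %s:%d",
                oi.size, oi.failing_api, oi.filename, oi.lineno);
  return buf;
}
```

A caller would then pass a single `crash_info_t` or `oom_info_t` instead of a row of loosely related scalars, which is the readability gain the note is after.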
------------- PR: https://git.openjdk.org/jdk/pull/12651 From stuefe at openjdk.org Thu Feb 23 08:11:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 23 Feb 2023 08:11:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> On Thu, 23 Feb 2023 03:57:18 GMT, Kim Barrett wrote: > > Okay, I see. But does it have to come with hs-err file too? Because that is the only reason for this complication. And hs-err files probably spoil the context reported to Watson, see below. > > The question of whether to create an hs_err log when +UseOSErrorReporting is out of scope for this change. The decision to do so is very old, going back to these: > > https://bugs.openjdk.org/browse/JDK-4997835 RFE: crash dump will only be created when running w/ -XX:+ShowMessageBoxOnError > > https://bugs.openjdk.org/browse/JDK-6227246 Improve Windows unhandled structured exception reporting > > https://bugs.openjdk.org/browse/JDK-6794885 Improve Windows unhandled structured exception reporting - UseOSErrorReporting > > Feel free to raise that as a separate issue. > Thanks for digging out this history, it has been interesting reading. > > I am a bit unhappy with this patch because it "sanctifies" returning from hs-err reporting. We had long discussions in the recent past where we needed to convince developers that returning from hs-err reporting is a very bad idea. I'm afraid with your patch, it won't be long before follow-up patches build upon your ARA and re-introduce this question. > > I'd prefer if UseOSErrorReporting, and the code surrounding the windows exceptionhandler and CONTINUE_SEARCH, remained Windows only, and restricted to this one use case: jumping back to OS error reporting in case of an error. > > I don't think it santifies anything of the sort. 
I agree that returning from hs-err reporting is problematic. The current +UseOSErrorReporting was only one of several mechanisms doing exactly that. I've already removed SuppressErrorAt. I'm working on another change that modifies how the existing Debugging mechanism works. > > I've included comments about how the ARA stuff and xxx_maybe_die_xxx exist only to support +UseOSErrorReporting. I could go further and add something like the following at the beginning of VMError::report_and_maybe_die_impl: > > ``` > assert(after_report_action == AfterReportAction::Die > WINDOWS_ONLY(|| UseOSErrorReporting), > "must terminate"); > ``` > > The reason for the new argument is so that the caller, which is where the knowlege of whether return or termination is desired, can request what it wants. I could have further uglified the code by making ARA::Return Windows-only, with more #ifdefs. I'd kind of prefer not to do that. > Okay, I think above assert would be a reasonable compromise; thanks for addressing this concern. > > Your patch opens up other questions too, e.g. how is this supposed to work in the face of secondary errors? IIUC the last invocation of the exception handler would be the winning one and return its error state back to the top-level exception filter, so the error reported by Watson would be the secondary crash, not the real crash. I think this would happen today too, or? > > I think that's a correct summary, with or without this change. > > > This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS. > > That would be an appropriate argument to make as part of a proposal to remove the hs-err logging when +UseOSErrorReporting. 
Okay, we can address it separately. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Thu Feb 23 09:20:43 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 23 Feb 2023 09:20:43 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. 
> > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. > > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: check after_report_action ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12651/files - new: https://git.openjdk.org/jdk/pull/12651/files/8efb9e4e..07251af2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12651&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12651&range=00-01 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12651.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12651/head:pull/12651 PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Thu Feb 23 09:20:43 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 23 Feb 2023 09:20:43 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> References: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> Message-ID: On Thu, 23 Feb 2023 08:08:33 GMT, Thomas Stuefe wrote: > > I've included comments about how the ARA stuff and xxx_maybe_die_xxx exist only to support +UseOSErrorReporting. I could go further and add something like the following at the beginning of VMError::report_and_maybe_die_impl: > > ``` > > assert(after_report_action == AfterReportAction::Die > > WINDOWS_ONLY(|| UseOSErrorReporting), > > "must terminate"); > > ``` > > > > The reason for the new argument is so that the caller, which is where the knowlege of whether return or termination is desired, can request what it wants. I could have further uglified the code by making ARA::Return Windows-only, with more #ifdefs. I'd kind of prefer not to do that. > > Okay, I think above assert would be a reasonable compromise; thanks for addressing this concern. Added the assert. 
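For readers following along, the condition checked by that assert can be sketched in isolation. This is a minimal stand-alone sketch, not the code from the PR: the enum, the flag variable, the `WINDOWS_ONLY` macro definition, and the helper name `action_is_permitted` are stand-ins written for illustration, and HotSpot's own `assert` additionally takes a message string.

```cpp
#include <cassert>

// Illustrative stand-ins; the real declarations live in HotSpot and may differ.
enum class AfterReportAction { Die, Return };
bool UseOSErrorReporting = false;  // stand-in for the VM flag

// Expands to its argument on Windows and to nothing elsewhere.
#if defined(_WIN32)
#define WINDOWS_ONLY(code) code
#else
#define WINDOWS_ONLY(code)
#endif

// True when the requested action is allowed: termination always is, while
// returning is allowed only on Windows with UseOSErrorReporting enabled.
bool action_is_permitted(AfterReportAction action) {
  return action == AfterReportAction::Die
         WINDOWS_ONLY(|| UseOSErrorReporting);
}
```

On non-Windows builds the macro drops the second operand entirely, so only `Die` passes the check there regardless of the flag.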
------------- PR: https://git.openjdk.org/jdk/pull/12651 From duke at openjdk.org Thu Feb 23 09:34:09 2023 From: duke at openjdk.org (Kasper Nielsen) Date: Thu, 23 Feb 2023 09:34:09 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: <0ZZLsRlr6IO4jL2VKAcn4geYptqUOdYPUXbd-KQiApo=.d1b3ca63-ca4a-4eef-9d5e-88cc3418ada9@github.com> Message-ID: On Wed, 22 Feb 2023 21:29:12 GMT, Rémi Forax wrote: >> In all the places I've seen LambdaMetaFactory used, it is because of performance over reflection/non-static method handles. See, for example, https://www.optaplanner.org/blog/2018/01/09/JavaReflectionButMuchFaster.html. I believe Optaplanner is still using it. >> >> There is also quite a number of posts on StackOverflow on people trying to use LambdaMetafactory: >> https://stackoverflow.com/search?q=LambdaMetafactory > > @kaspernielsen, I believe that now that hidden class + class data are part of the public API, > you do not need to use lambda metafactory directly. But it requires Java 17 not 8 or 11. @forax I think what attracted people was that you didn't have to add any dependencies (ASM). It is also simpler than having to deal with class files. Maybe once we get the Classfile API people will reconsider. Still, LambdaMetaFactory is just a couple of lines of code, so it will probably be hard to resist for some people.
------------- PR: https://git.openjdk.org/jdk/pull/12493 From mcimadamore at openjdk.org Thu Feb 23 10:18:08 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 10:18:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> Message-ID: On Thu, 23 Feb 2023 07:19:24 GMT, Martin Doerr wrote: > > Can I add them to the existing libraries? If so, what is the correct naming scheme and what is needed to get them executed (adding the EXPORT alone is not sufficient). Or should I create a separate test for these cases? Advice will be appreciated! There are two kinds of tests for the linker - some tests (e.g. TestDowncallXYZ and TestUpcallXYZ) execute end-to-end tests with several shapes, and make sure that things work. Then there are ABI-specific tests (e.g. see TestXYZCallArranger). These latter tests are typically used to stress-test the corners of specific ABIs - and they are easier to write as you can just provide the input (some function descriptor) then test that the resulting set of bindings is the expected one. This allows for much more in-depth testing. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Thu Feb 23 10:23:05 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 10:23:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> On Thu, 23 Feb 2023 04:44:18 GMT, Martin Doerr wrote: > > Correct, `extsw` performs a `I2L` conversion.
I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? Would it be possible to generate a binding cast + move and then recognize the pattern in the HS backend and optimize it away as an `extsw`? ------------- PR: https://git.openjdk.org/jdk/pull/12708 From rehn at openjdk.org Thu Feb 23 12:50:15 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 12:50:15 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 06:24:12 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/systemDictionary.cpp line 1489: > >> 1487: } >> 1488: >> 1489: // Add the new class. We need recompile lock during update of CHA. > > This comment belongs after `add_to_hierarchy`. Not sure that makes sense. Also we have no "recompile" lock. I simply removed this comment. > src/hotspot/share/classfile/systemDictionary.cpp line 1635: > >> 1633: // Note: must be done *after* linking k into the hierarchy (was bug 12/9/97) >> 1634: if (Universe::is_fully_initialized()) { >> 1635: CodeCache::mark_dependents_on(&deopt_scope, k); > > The flush comment no longer matches the code. Fixed ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 12:50:18 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 12:50:18 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: <3PEGJ7Y4-Co17YrM6Z4T0zt9nmQbsOyWElr4H6tQBOw=.be00c6bb-931f-4d60-ac83-8e8616cabc94@github.com> Message-ID: On Tue, 21 Feb 2023 21:16:30 GMT, Coleen Phillimore wrote: >> Hi Coleen, so I looked at moving it to IK.
There will be some additional changes, some methods may now be private instead, etc... >> You think this should be done here? > > ok, maybe it's ok where it is and we can move it with a future RFE. Great ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 12:50:20 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 12:50:20 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 01:21:51 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 1620: >> >>> 1618: assert(k != nullptr, "just checking"); >>> 1619: if (Universe::is_fully_initialized()) { >>> 1620: assert_locked_or_safepoint(Compile_lock); >> >> This serves no purpose given you just took out the lock. But it does raise the question as to whether the lock can/should always be taken out if called at a safepoint, or if the Universe is not yet fully initialized? > > I found one call to `add_to_hierarchy()` where we previously did not > grab the `Compile_lock` in `vmClasses::resolve_shared_class()` and > that call is made with this assert in place: > ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` > > so it looks to me like we have a locking issue there (as in too early for locks). I think you are talking about the CDS dump process. It works fine taking the lock. I think this was just some micro-optimization. But we should not mark and do deopt; fixed that. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 12:50:23 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 12:50:23 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 22:50:30 GMT, Daniel D.
Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/systemDictionary.cpp line 1632: > >> 1630: k->process_interfaces(); // handle all "implements" declarations >> 1631: >> 1632: // Now flush all code that depended on old class hierarchy. > > nit typo: s/flush/mark/ > > since you've changed that elsewhere... Fixed ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 12:55:17 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 12:55:17 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 22:51:29 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/classfile/systemDictionaryShared.cpp line 856: > >> 854: // Add to class hierarchy, and do possible deoptimizations. >> 855: SystemDictionary::add_to_hierarchy(loaded_lambda, THREAD); >> 856: // But, do not add to dictionary. > > nit comment location: move above L855 like other places? I had already moved the other ones under instead. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From stefank at openjdk.org Thu Feb 23 13:13:45 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 23 Feb 2023 13:13:45 GMT Subject: RFR: 8302927: Fuzz.java from ProblemList-zgc.txt again Message-ID: The previous removal of Fuzz from the ProblemList got undone in a merge from JDK 20. This patch removes the test from the ProblemList again. 
------------- Commit messages: - Unproblemlist Fuzz.java from ProblemList-zgc.txt again Changes: https://git.openjdk.org/jdk/pull/12684/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12684&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302927 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12684.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12684/head:pull/12684 PR: https://git.openjdk.org/jdk/pull/12684 From sjohanss at openjdk.org Thu Feb 23 13:13:47 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 23 Feb 2023 13:13:47 GMT Subject: RFR: 8302927: Fuzz.java from ProblemList-zgc.txt again In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:10:56 GMT, Stefan Karlsson wrote: > The previous removal of Fuzz from the ProblemList got undone in a merge from JDK 20. This patch removes the test from the ProblemList again. Marked as reviewed by sjohanss (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12684 From eosterlund at openjdk.org Thu Feb 23 13:13:48 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 23 Feb 2023 13:13:48 GMT Subject: RFR: 8302927: Fuzz.java from ProblemList-zgc.txt again In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:10:56 GMT, Stefan Karlsson wrote: > The previous removal of Fuzz from the ProblemList got undone in a merge from JDK 20. This patch removes the test from the ProblemList again. Marked as reviewed by eosterlund (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12684 From rehn at openjdk.org Thu Feb 23 13:18:41 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 13:18:41 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v5] In-Reply-To: References: Message-ID: <6wpn67lL5OnsNJ02BWmKcDM8g7iCOm_OOoNRka2clDc=.468d4d8f-0fe9-44f2-ae27-1bf148ed164f@github.com> > Hi all, please consider.
> > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All methods will already be marked, but the actual deoptimization work (not entrant, patching PC on stacks and patching post call nops) was not done yet, which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshake may take a long time to complete and Compile_lock is used for so much more than deopts, > the fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked (instead of adding Compile_lock to all places). > > This turned out to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier which you do not pass until the requested deopts have happened, and it coalesces the handshakes. > Secondly it moves the handshake part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 has still not completed and I'm redoing some testing due to last minute simplifications.
> > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: More review fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12585/files - new: https://git.openjdk.org/jdk/pull/12585/files/1cbe4937..e1cd8a30 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=03-04 Stats: 24 lines in 6 files changed: 9 ins; 7 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 13:22:15 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 13:22:15 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 01:09:49 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/classfile/vmClasses.cpp line 252: >> >>> 250: Dictionary* dictionary = loader_data->dictionary(); >>> 251: dictionary->add_klass(THREAD, klass->name(), klass); >>> 252: SystemDictionary::add_to_hierarchy(klass, THREAD); >> >> One more small changes please. Can you make JavaThread* current the first parameter so this doesn't look like we're failing to throw an exception from these calls? > > This call to `SystemDictionary::add_to_hierarchy()` was not previously > done while grabbing the `Compile_lock` and now we'll do that. > > This specific call is made while this assert() is true: > ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` > > so very early in the life of the VM. Grabbing the lock should not be problematic. 
------------- PR: https://git.openjdk.org/jdk/pull/12585 From stefank at openjdk.org Thu Feb 23 14:11:15 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 23 Feb 2023 14:11:15 GMT Subject: Integrated: 8302927: Unproblemlist Fuzz.java from ProblemList-zgc.txt again In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:10:56 GMT, Stefan Karlsson wrote: > The previous removal of Fuzz from the ProblemList got undone in a merge from JDK 20. This patch removes the test from the ProblemList again. This pull request has now been integrated. Changeset: f113b04a Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/f113b04ab94c4428c62548338d31c4eb88ebdeaf Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8302927: Unproblemlist Fuzz.java from ProblemList-zgc.txt again Reviewed-by: sjohanss, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/12684 From lucy at openjdk.org Thu Feb 23 14:42:04 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 23 Feb 2023 14:42:04 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:42:38 GMT, Matthias Baesken wrote: > The cds only coding in hotspot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. > However at some places INCLUDE_CDS is missing and should be added. > > One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros ? Looks good to me. "Don't compile inactive code" is a good principle. ------------- Marked as reviewed by lucy (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12691 From jvernee at openjdk.org Thu Feb 23 14:53:09 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 14:53:09 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 04:44:18 GMT, Martin Doerr wrote: > > > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) > > > > > > FYI: [master...JornVernee:jdk:I2L](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L) (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever currently an `int` is `vmStore`d in the PPC CallArranger. > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? The design philosophy has been to put as much as possible on the Java side, and there are a few reasons for that: 1. For maintainability. Generated assembly is ultimately harder to debug, compared to Java code (especially in interpreted mode using `-Djdk.internal.foreign.*Linker.USE_SPEC=false`). (Though, there might also be some personal bias here) 2. Moving things to the Java side makes it visible to the JIT, which means it has the opportunity to be optimized away, or otherwise optimized together with the surrounding Java code, while anything put into the downcall stub is fixed. 3. If we want to intrinsify `linkToNative` in C2 later, having downcall stubs be simple and consistent across platforms makes that much easier. Anything that's special in the native code would have to be replicated by the JIT as well. So... WRT efficiency, I think it depends.
I've found in the past that adding a few more move instructions to the downcall stub didn't visibly affect performance. This might be because the CPU is good at just aliasing the registers instead of performing an actual move, or because it's just noise next to the membar we do on the return path. Ultimately, I don't think it matters much for performance, though (you could measure). I think the maintainability/future-proofing from implementing in Java is more important. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 14:59:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 14:59:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> References: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> Message-ID: On Thu, 23 Feb 2023 04:53:34 GMT, Martin Doerr wrote: > > I will do a more thorough review soon. > > Thanks a lot! > > > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > > > > > FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) > > This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE. 
I don't have a big issue since the current approach stays in PPC-only code) > > Maybe I need to think a bit more about it. I don't really like the extra cases for `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT`. On the other side, doing it in the CallArranger would break the design of factoring out the allocation from the binding generation. In addition, it seems like PPC64 is even more tricky than the Windows case. I need to pass 2 float arguments in a GP reg (or stack slot) plus one of these 2 floats in float register F13. I think this can get implemented more easily in the backend. Do you agree? I think the same arguments apply here. I'll have a more thorough look at the patch and then get back to you on this. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 15:02:06 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 15:02:06 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:21:05 GMT, Martin Doerr wrote: > I have removed the size restriction for structs. Passing a struct consisting of 1 char works (on Little Endian). However, passing a struct consisting of 3 chars doesn't (getting IndexOutOfBoundsException: Out of bound access on segment MemorySegment). Neither on PPC64, nor on x86. Is that known or should I file a bug for that? I think you might be running into: https://bugs.openjdk.org/browse/JDK-8303017 which was recently found. 
(If you have a simpler test case please add it to the JBS issue, in a comment) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From ayang at openjdk.org Thu Feb 23 15:06:06 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 23 Feb 2023 15:06:06 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp line 218: > 216: fatal("No support for 8 bytes copy"); > 217: #endif > 218: } I believe the intent would be clearer using `switch`. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 1198: > 1196: const Register saved_r15 = r9; > 1197: #ifdef _WIN64 > 1198: if (nargs >= 4) { When would `nargs` ever be `> 4`? ------------- PR: https://git.openjdk.org/jdk/pull/12670 From jvernee at openjdk.org Thu Feb 23 15:07:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 15:07:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> Message-ID: On Thu, 23 Feb 2023 10:20:31 GMT, Maurizio Cimadamore wrote: > > Correct, `extsw` performs a `I2L` conversion.
I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? > > Would it be possible to generate a binding cast + move and then recognize the pattern in the HS backend and optimize it away as an `extsw`? At the moment, there is a forced indirection between the Java code and downcall stub, so the JIT cannot do any optimizations that have to take both sides into account. The downcall stub is opaque to the JIT. (though, perhaps in the long run we can add intrinsification of `linkToNative` that can generate the code in the downcall stub as part of the JIT's own IR, which would solve this) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Thu Feb 23 16:51:08 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 16:51:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> Message-ID: <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> On Thu, 23 Feb 2023 15:02:48 GMT, Jorn Vernee wrote: > > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? > > > > > > Would it be possible to generate a binding cast + move and then recognize the pattern in the HS backend and optimize it away as an `extsw`? > > At the moment, there is a forced indirection between the Java code and downcall stub, so the JIT cannot do any optimizations that have to take both sides into account. The downcall stub is opaque to the JIT.
(though, perhaps in the long run we can add intrinsification of `linkToNative` that can generate the code in the downcall stub as part of the JIT's own IR, which would solve this) I meant generating `extsw` when emitting the stub (since when we emit the stub we can see the bindings). But I suppose the problem there is that the VM only sees low level bindings such as moves, it doesn't see bindings such as casts. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 17:14:37 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 17:14:37 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> Message-ID: On Thu, 23 Feb 2023 16:48:30 GMT, Maurizio Cimadamore wrote: > I meant generating extsw when emitting the stub (since when we emit the stub we can see the bindings). But I suppose the problem there is that the VM only sees low level bindings such as moves, it doesn't see bindings such as casts. Oh, sorry, I see what you mean now. But yeah, the VM stub only handles moves, and I think we want to keep it that way (for reasons outlined). The VM stub can be viewed as a low-level primitive that accepts a set of register/stack values, and moves them into the corresponding locations. It's not really supposed to be doing any kind of processing of the values it receives or returns. 
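To make the `I2L` versus `extsw` point in this thread concrete, here is a minimal, self-contained sketch of the widening semantics under discussion. It is not code from the PR: the class and method names (`I2LSketch`, `widen`, `signExtendLow32`) are hypothetical and chosen only for illustration. The assumption, as stated earlier in the thread, is that `extsw` has the same semantics as Java's `int` to `long` widening, i.e. sign extension of the low 32 bits.

```java
public class I2LSketch {
    // Java-side widening: an int-to-long cast compiles to the I2L bytecode,
    // which sign-extends the 32-bit value to 64 bits.
    public static long widen(int value) {
        return (long) value;
    }

    // Manual sign extension of the low 32 bits of a 64-bit value.
    // This mimics what a sign-extend-word instruction would do to a register
    // whose upper half holds arbitrary (possibly stale) bits.
    public static long signExtendLow32(long register) {
        return (register << 32) >> 32;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int v : samples) {
            // Simulate a register with garbage in the upper 32 bits.
            long dirty = 0xDEADBEEF00000000L | (v & 0xFFFFFFFFL);
            System.out.println(v + " -> " + widen(v) + " / " + signExtendLow32(dirty));
        }
    }
}
```

If the widening is expressed on the Java side like this, the value handed to the stub is already a well-formed 64-bit quantity, which fits the view of the stub as a primitive that only moves values into place.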
------------- PR: https://git.openjdk.org/jdk/pull/12708 From sviswanathan at openjdk.org Thu Feb 23 17:47:49 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 23 Feb 2023 17:47:49 GMT Subject: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of the mantissa is set to 1 for QNaNs. Thank you all very much for your valuable input. I think it is best to withdraw this PR. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From sviswanathan at openjdk.org Thu Feb 23 17:47:51 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 23 Feb 2023 17:47:51 GMT Subject: Withdrawn: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 02:08:27 GMT, Sandhya Viswanathan wrote: > Change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. > The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of the mantissa is set to 1 for QNaNs. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12704 From dcubed at openjdk.org Thu Feb 23 17:47:46 2023 From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Thu, 23 Feb 2023 17:47:46 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v5] In-Reply-To: <6wpn67lL5OnsNJ02BWmKcDM8g7iCOm_OOoNRka2clDc=.468d4d8f-0fe9-44f2-ae27-1bf148ed164f@github.com> References: <6wpn67lL5OnsNJ02BWmKcDM8g7iCOm_OOoNRka2clDc=.468d4d8f-0fe9-44f2-ae27-1bf148ed164f@github.com> Message-ID: On Thu, 23 Feb 2023 13:18:41 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. 
>> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > More review fixes Very minor nits this round. Thumbs up. Thanks for accepting all the feedback. src/hotspot/share/classfile/systemDictionary.cpp line 1610: > 1608: // ---------------------------------------------------------------------------- > 1609: // Update hierarchy. This is done before the new klass has been added to the SystemDictionary. The Compile_lock > 1610: // is held, to ensure that the compiler is not using the class hierarchy. nit: s/is held/is grabbed/ Adjust the header comment to reflect that we now grab the Compile_lock in the function rather than expecting it to be grabbed by the caller. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 17:47:49 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 17:47:49 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 12:52:10 GMT, Robbin Ehn wrote: >> src/hotspot/share/classfile/systemDictionaryShared.cpp line 856: >> >>> 854: // Add to class hierarchy, and do possible deoptimizations. >>> 855: SystemDictionary::add_to_hierarchy(loaded_lambda, THREAD); >>> 856: // But, do not add to dictionary. >> >> nit comment location: move above L855 like other places? > > I had already moved the other ones under instead. OK. Being consistent is the goal. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 17:47:54 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 17:47:54 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 01:15:54 GMT, Daniel D. 
Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/oops/instanceKlass.cpp line 1184: > >> 1182: DeoptimizationScope deopt_scope; >> 1183: { >> 1184: // Now flush all code that assumes the class is not linked. > > nit typo: s/flush/mark/ > since you've made that change elsewhere... This one hasn't been fixed yet. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Thu Feb 23 17:54:43 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 23 Feb 2023 17:54:43 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v3] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:06:49 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes 2 > > @coleenp was these the changes you were looking for? > Passes t1-t7 @robehn - I forgot to ask: what testing has been run on the more recent version(s)? ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 19:56:24 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 19:56:24 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v3] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 08:06:49 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Review fixes 2 > > @coleenp was these the changes you were looking for? > Passes t1-t7 > @robehn - I forgot to ask: what testing has been run on the more recent version(s)? All versions moving DeoptScope/Compile_lock have been tested to tier 8, assert changes and such to tier 5. 
But I notice UseVtableBasedCHA seems only be turned off in one test, so I started to go over the compiler code, which added some concerns which I need to eliminate. I'll keep fixing review issue while looking into that. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 20:06:35 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 20:06:35 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 17:08:49 GMT, Coleen Phillimore wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > src/hotspot/share/oops/instanceKlass.hpp line 34: > >> 32: #include "oops/instanceKlassFlags.hpp" >> 33: #include "oops/instanceOop.hpp" >> 34: #include "runtime/deoptimization.hpp" > > Can you just forward declare DeoptimizationScope rather than including the whole file here? Fixed, thanks! (not pushed yet) ------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 20:14:32 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 20:14:32 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: > Hi all, please consider. > > The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. > All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. > Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. > > Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. 
> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) > > This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. > > This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. > Secondly it moves handshakes part out of the Compile_lock where it is possible. > > Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. > > It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. > > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: - Comment fixes - Include/fwd fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12585/files - new: https://git.openjdk.org/jdk/pull/12585/files/e1cd8a30..e1162979 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12585&range=04-05 Stats: 8 lines in 4 files changed: 4 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12585.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12585/head:pull/12585 PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Thu Feb 23 20:14:37 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 23 Feb 2023 20:14:37 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v5] In-Reply-To: References: <6wpn67lL5OnsNJ02BWmKcDM8g7iCOm_OOoNRka2clDc=.468d4d8f-0fe9-44f2-ae27-1bf148ed164f@github.com> Message-ID: <5boxei2siCpyFq-Pno37kQMGnMQIbK94Ik6FF88gxXc=.094c3327-3d51-4142-87ca-c30dbee471bc@github.com> On Thu, 23 Feb 2023 17:26:28 GMT, Daniel D. 
Daugherty wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> More review fixes > > src/hotspot/share/classfile/systemDictionary.cpp line 1610: > >> 1608: // ---------------------------------------------------------------------------- >> 1609: // Update hierarchy. This is done before the new klass has been added to the SystemDictionary. The Compile_lock >> 1610: // is held, to ensure that the compiler is not using the class hierarchy. > > nit: s/is held/is grabbed/ > Adjust the header comment to reflect that we now grab the Compile_lock > in the function rather than expecting it to be grabbed by the caller. Fixed, thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Thu Feb 23 21:52:04 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 23 Feb 2023 21:52:04 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: <0_ydaMZuemKJh9T8ssGRKRiPkidmvcP-7EfqDossp-s=.aeb9634c-5368-49c0-a846-6f14db629c3d@github.com> On Tue, 21 Feb 2023 14:42:38 GMT, Matthias Baesken wrote: > The CDS-only code in HotSpot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. > However, in some places INCLUDE_CDS is missing and should be added. > > One question - should (in addition to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros? How much space in libjvm does this save? There is a trade-off with code readability when doing this. If the aim is to get 100% conditional code then you need to do more. Do you plan to do that or just this small tweak? ------------- PR: https://git.openjdk.org/jdk/pull/12691 From dcubed at openjdk.org Thu Feb 23 22:32:17 2023 From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Thu, 23 Feb 2023 22:32:17 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 20:14:32 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Comment fixes > - Include/fwd fixes Reviewed the v05 incremental webrev. Still good. ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Fri Feb 24 02:49:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 24 Feb 2023 02:49:14 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: <4qbrvlIuei8l6Tk5-XwIA_WxqLL3p8esF2PCWhpMjko=.035ac2e2-e68c-436f-bab9-e3afb144979c@github.com> On Thu, 23 Feb 2023 12:44:37 GMT, Robbin Ehn wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 1489: >> >>> 1487: } >>> 1488: >>> 1489: // Add the new class. We need recompile lock during update of CHA. >> >> This comment belongs after `add_to_hierarchy`. > > Not sure that makes sense. Also we have no "recompile" lock. > I plainly removed this comment. "recompile" lock may have been a poor choice of words but the comment was describing why we need the Compile_lock when updating the dictionary. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Fri Feb 24 02:54:16 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 24 Feb 2023 02:54:16 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> On Thu, 23 Feb 2023 12:46:47 GMT, Robbin Ehn wrote: >> I found one call to `add_to_hierarchy()` where we previously did not >> grab the `Compile_lock` in `vmClasses::resolve_shared_class()` and >> that call is made with this assert in place: >> ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` >> >> so it looks to me like we have a locking issue there (as in too early for locks). > > I think you are talking about the CDS dump process. It works fine taking the lock. > I think this was just some micro-optimization. But we should not mark and do deopt fixed that.
To restate: the assert_locked_or_safepoint is pointless because you just took out the lock here. The code Dan points to does not seem to be related to CDS dump:

```
bool vmClasses::resolve(vmClassID id, TRAPS) {
  InstanceKlass** klassp = &_klasses[as_int(id)];

#if INCLUDE_CDS
  if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()) {
    InstanceKlass* k = *klassp;
    assert(k->is_shared_boot_class(), "must be");

    ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data();
    resolve_shared_class(k, loader_data, Handle(), CHECK_false);
    return true;
  }
#endif // INCLUDE_CDS
```

I remain concerned this code may be executed at a safepoint and we will now try to take the Compile_lock. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Fri Feb 24 03:02:06 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 24 Feb 2023 03:02:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 09:20:43 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination.
>> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. >> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. 
>> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > check after_report_action Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12651 From mdoerr at openjdk.org Fri Feb 24 04:07:06 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 24 Feb 2023 04:07:06 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). 
>> >> Big Endian support is implemented to some extent, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both an FP reg and a GP reg or stack slot (see "no partial DW rule"). These cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as a separate RFE).
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. Some more remarks about other issues: - Uploaded my simple reproducer to [JDK-8303017](https://bugs.openjdk.org/browse/JDK-8303017) - Using oversized load / stores is problematic. Don't forget that OpenJDK still supports Big Endian platforms (AIX, s390x). - Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From dholmes at openjdk.org Fri Feb 24 04:55:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 24 Feb 2023 04:55:48 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program Message-ID: After allocating: _state = new (std::nothrow) LineNumberProgramState(_header); we need to `delete _state` before returning. Testing: tiers 1-3 Thanks. 
------------- Commit messages: - 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program Changes: https://git.openjdk.org/jdk/pull/12738/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12738&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303068 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12738.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12738/head:pull/12738 PR: https://git.openjdk.org/jdk/pull/12738 From mchung at openjdk.org Fri Feb 24 06:31:12 2023 From: mchung at openjdk.org (Mandy Chung) Date: Fri, 24 Feb 2023 06:31:12 GMT Subject: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2] In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 18:11:18 GMT, Volker Simonis wrote: >> Prior to [JDK-8239384](https://bugs.openjdk.org/browse/JDK-8239384)/[JDK-8238358](https://bugs.openjdk.org/browse/JDK-8238358) LambdaMetaFactory has created VM-anonymous classes which could easily be unloaded once they were not referenced any more. Starting with JDK 15 and the new "hidden class" based implementation, this is not the case any more, because the hidden classes will be strongly tied to their defining class loader. If this is the default application class loader, these hidden classes can never be unloaded which can easily lead to Metaspace exhaustion (see the [test case in the JBS issue](https://bugs.openjdk.org/secure/attachment/102601/LambdaClassLeak.java)). This is a regression compared to previous JDK versions which some of our applications have been affected from when migrating to JDK 17. >> >> The reason why the newly created hidden classes are strongly linked to their defining class loader is not clear to me. JDK-8239384 mentions it as an "implementation detail": >> >>> *4. 
the lambda proxy class has the strong relationship with the class loader (that will share the VM metaspace for its defining loader - implementation details)* >> >> From my current understanding the strong link between a hidden class created by `LambdaMetaFactory` and its defining class loader is not strictly required. In order to prevent potential OOMs and fix the regression compared to JDK 14 and earlier I propose to create these hidden classes without the `STRONG` option. >> >> I'll be happy to add the test case as a JTreg test to this PR if you think that would be useful. > > Volker Simonis has updated the pull request incrementally with two additional commits since the last revision: > > - Remove assertions which insist on Lambda proxy classes being strongly linked to their class loader > - Removed unused import of STRONG and updated copyright year As Remi, others and I explained, whether lambda metafactory uses strong hidden classes or VM anonymous classes (weak hidden classes) is an implementation detail. The lambda proxy class is a logical part of the caller class as it's the implementation of the lambda, i.e. strongly-linked with the defining class loader. ------------- PR: https://git.openjdk.org/jdk/pull/12493 From rehn at openjdk.org Fri Feb 24 06:50:14 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 06:50:14 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 22:29:11 GMT, Daniel D. Daugherty wrote: > Reviewed the v05 incremental webrev. Still good. Thank you! 
------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 24 07:15:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 07:15:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 04:03:53 GMT, Martin Doerr wrote: > * Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. Changing the default is fine by me. There is, AFAIK, one case that needs to read the thread state a lot and thus emits a lot of sysmembars: JFR with a very high sampling rate. Other than that there are no issues that I know about. Maybe it would be good to test at what sampling interval we notice a change? Also I think it's not the best match to keep the flag experimental when we are in some sense using it. Maybe diagnostic? ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Fri Feb 24 07:21:08 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 24 Feb 2023 07:21:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 04:03:53 GMT, Martin Doerr wrote: > * Uploaded my simple reproducer to [JDK-8303017](https://bugs.openjdk.org/browse/JDK-8303017) Thanks! > * Using oversized load / stores is problematic. Don't forget that OpenJDK still supports Big Endian platforms (AIX, s390x). You're right. I realized that it's also problematic for heap segments, for which we can't do oversized accesses. 
I have another solution that splits up the loads/stores into power-of-two sized chunks: https://github.com/openjdk/panama-foreign/compare/foreign-memaccess+abi...JornVernee:panama-foreign:OOB That patch is just a POC at this point though. Also, I don't think it works for BE at the moment (need to flip the offset for BE, I think. Just like we do in Unsafe). > * The result of `NativeCallingConvention::calling_convention` is interpreted as size, but it returns the max offset. That's off by one slot. Should I file a bug for that? (PPC64 is not affected because it doesn't use the result.) I'm not sure there's an issue there. Note that the 'max offset' is computed as `reg.offset() + reg.stack_size()`, so that should get us the size we need to allocate for the stack arguments. (e.g. 2 ints being passed at offset 0 and 4, would make max offset 4 + 4 = 8, which gives the size needed for the 2 ints). Computing the max offset instead of just summing the sizes of the stack arguments is needed since stack arguments can be sparsely placed in some cases on Mac/AArch64. > * Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. 
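A minimal sketch of the 'max offset' computation described above, assuming a simplified argument record: `StackArg`, `stack_bytes_needed` and `summed_sizes` are hypothetical names for illustration and do not correspond to the actual `VMStorage` or `NativeCallingConvention` code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one stack-passed argument.
struct StackArg {
  size_t offset;      // byte offset of the argument in the outgoing stack area
  size_t stack_size;  // number of bytes the argument occupies
};

// Bytes to reserve: the largest end offset (offset + size) over all stack arguments.
inline size_t stack_bytes_needed(const std::vector<StackArg>& args) {
  size_t max_end = 0;
  for (const StackArg& arg : args) {
    max_end = std::max(max_end, arg.offset + arg.stack_size);
  }
  return max_end;
}

// Naive alternative that just sums the sizes and therefore ignores gaps.
inline size_t summed_sizes(const std::vector<StackArg>& args) {
  size_t total = 0;
  for (const StackArg& arg : args) {
    total += arg.stack_size;
  }
  return total;
}
```

For two ints at offsets 0 and 4 both functions return 8. For a sparse layout such as a 4-byte argument at offset 0 followed by an 8-byte argument at offset 8, the max end offset is 16 while the plain sum is only 12, which is the case the max-offset approach is meant to cover.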
------------- PR: https://git.openjdk.org/jdk/pull/12708 From rehn at openjdk.org Fri Feb 24 07:43:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 07:43:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 07:17:30 GMT, Jorn Vernee wrote: > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. For reliability: in every Oracle tier 5 run we execute these two test groups: vmTestbase_nsk_jvmti and open/test/hotspot/jtreg/:hotspot_runtime, on each of linux_x64_debug, windows_x64 and linux_aarch64_debug, with sysmembar on. I picked jvmti as it is very heavy on 'JNI'. No issues have been reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java, which only exercises JNI transitions with this option. For performance, the heavy lifting is on the reader of the thread state, and the only case I have identified is JFR. I suggest changing the default now; if there are issues, we have time to revert it before RDP1. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mbaesken at openjdk.org Fri Feb 24 07:50:01 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 24 Feb 2023 07:50:01 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: <0_ydaMZuemKJh9T8ssGRKRiPkidmvcP-7EfqDossp-s=.aeb9634c-5368-49c0-a846-6f14db629c3d@github.com> References: <0_ydaMZuemKJh9T8ssGRKRiPkidmvcP-7EfqDossp-s=.aeb9634c-5368-49c0-a846-6f14db629c3d@github.com> Message-ID: On Thu, 23 Feb 2023 21:49:07 GMT, David Holmes wrote: > How much space in libjvm does this save? There is a trade-off with code readability when doing this. 
> > If the aim is to get 100% conditional code then you need to do more. Do you plan to do that or just this small tweak? Hi David, I would not use the macro for one-liners; I think that is not worth the effort. So there is no goal to get to 100%. It is more of a pragmatic approach (and for the UseSharedSpaces sections it is mostly done already). >>One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros? I wasn't sure about the DumpSharedSpaces sections; usually the UseSharedSpaces sections are guarded, with a few exceptions like those addressed in this change. ------------- PR: https://git.openjdk.org/jdk/pull/12691 From rehn at openjdk.org Fri Feb 24 08:01:14 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 08:01:14 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: <4qbrvlIuei8l6Tk5-XwIA_WxqLL3p8esF2PCWhpMjko=.035ac2e2-e68c-436f-bab9-e3afb144979c@github.com> References: <4qbrvlIuei8l6Tk5-XwIA_WxqLL3p8esF2PCWhpMjko=.035ac2e2-e68c-436f-bab9-e3afb144979c@github.com> Message-ID: On Fri, 24 Feb 2023 02:45:58 GMT, David Holmes wrote: >> Not sure that makes sense. Also we have no "recompile" lock. >> I plainly removed this comment. > > "recompile" lock may have been a poor choice of words but the comment was describing why we need the Compile_lock when updating the dictionary. I interpreted the "add class" as referring to the { add_to_hierarchy() + update_dictionary() } combo. I thought maybe there used to be a recompile? :) Let me know if I should add back something? 
------------- PR: https://git.openjdk.org/jdk/pull/12585 From rehn at openjdk.org Fri Feb 24 08:01:16 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 08:01:16 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> References: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> Message-ID: On Fri, 24 Feb 2023 02:50:55 GMT, David Holmes wrote: >> I think you are taking about CDS dump process. It works fine taking the lock. >> I think this was just some micro-optimization. But we should not mark and do deopt fixed that. > > To restate: the assert_locked_or_safepoint is pointless because you just took out the lock here. > > The code Dan points to does not seem to be related to CDS dump: > > bool vmClasses::resolve(vmClassID id, TRAPS) { > InstanceKlass** klassp = &_klasses[as_int(id)]; > > #if INCLUDE_CDS > if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()) { > InstanceKlass* k = *klassp; > assert(k->is_shared_boot_class(), "must be"); > > ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data(); > resolve_shared_class(k, loader_data, Handle(), CHECK_false); > return true; > } > #endif // INCLUDE_CDS > > I remain concerned this code may be executed at a safepoint and we will now try to take the Compile_lock. I did not find in testing any case that executed the add_to_hierarchy. I'll check more! 
------------- PR: https://git.openjdk.org/jdk/pull/12585 From rcastanedalo at openjdk.org Fri Feb 24 09:48:06 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Fri, 24 Feb 2023 09:48:06 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 29: > 27: #include "gc/shared/barrierSet.hpp" > 28: #include "gc/shared/barrierSetAssembler.hpp" > 29: #include "gc/shared/gc_globals.hpp" Unused include. 
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1831: > 1829: BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); > 1830: #if COMPILER2_OR_JVMCI > 1831: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { Suggestion: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { ------------- PR: https://git.openjdk.org/jdk/pull/12670 From mdoerr at openjdk.org Fri Feb 24 10:16:08 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 24 Feb 2023 10:16:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 07:39:15 GMT, Robbin Ehn wrote: > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. > > For reliability: in every Oracle tier 5 run we execute these two test groups: vmTestbase_nsk_jvmti and open/test/hotspot/jtreg/:hotspot_runtime, on each of linux_x64_debug, windows_x64 and linux_aarch64_debug, with sysmembar on. I picked jvmti as it is very heavy on 'JNI'. No issues have been reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java, which only exercises JNI transitions with this option. > > For performance, the heavy lifting is on the reader of the thread state, and the only case I have identified is JFR. > > I suggest changing the default now; if there are issues, we have time to revert it before RDP1. Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. 
I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From rcastanedalo at openjdk.org Fri Feb 24 10:25:04 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Fri, 24 Feb 2023 10:25:04 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1944: > 1942: BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); > 1943: #if COMPILER2_OR_JVMCI > 1944: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { Suggestion: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { ------------- PR: https://git.openjdk.org/jdk/pull/12670 From rcastanedalo at openjdk.org Fri Feb 24 10:29:02 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Fri, 24 Feb 2023 10:29:02 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been 
good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1831: > 1829: BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); > 1830: #if COMPILER2_OR_JVMCI > 1831: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { Consider extending the test similarly in `generate_disjoint_int_oop_copy()` and `generate_conjoint_int_oop_copy()`, for consistency. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 10:44:16 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 10:44:16 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v2] In-Reply-To: References: Message-ID: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: RISC-V fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12670/files - new: https://git.openjdk.org/jdk/pull/12670/files/27f30b40..676fc887 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=00-01 Stats: 61 lines in 3 files changed: 0 ins; 55 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 10:44:18 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 10:44:18 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 15:23:04 GMT, Erik Österlund wrote: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > That sounds good to me. Fixed. 
------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 10:44:20 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 10:44:20 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v2] In-Reply-To: References: Message-ID: <3y0ux-MS6EOxFUy4ulfp5n9qAhOXI4NsdapyWaD8iYs=.a222cace-f0c5-4db9-b476-1a09d04c7dc6@github.com> On Thu, 23 Feb 2023 15:00:43 GMT, Albert Mingkun Yang wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> RISC-V fixes > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 1198: > >> 1196: const Register saved_r15 = r9; >> 1197: #ifdef _WIN64 >> 1198: if (nargs >= 4) { > > When would `nargs` ever be `> 4`? It won't (yet). I was just structuring it like setup_arg_regs, where it similarly checks if nargs >= 4. But I can change both to an assert that it isn't > 4. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From rehn at openjdk.org Fri Feb 24 10:45:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 10:45:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> On Fri, 24 Feb 2023 10:12:52 GMT, Martin Doerr wrote: > > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. > > > > > > For reliability: in every Oracle tier 5 run we execute these two test groups: vmTestbase_nsk_jvmti and open/test/hotspot/jtreg/:hotspot_runtime, on each of linux_x64_debug, windows_x64 and linux_aarch64_debug, with sysmembar on. 
I picked jvmti as it is very heavy on 'JNI'. No issues have been reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java, which only exercises JNI transitions with this option. > > For performance, the heavy lifting is on the reader of the thread state, and the only case I have identified is JFR. > > I suggest changing the default now; if there are issues, we have time to revert it before RDP1. > > Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) I will be skiing next week. If you want it now, I suggest you do it; otherwise I can do it when I'm back. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From eosterlund at openjdk.org Fri Feb 24 11:04:08 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 11:04:08 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v2] In-Reply-To: References: Message-ID: <4LXoPDwwMSjRqz9uBh4mJ7CyjqiU5f1gy-Do4vJ8kYQ=.cca9644d-8d12-47a1-9225-b9b13c524ac4@github.com> On Thu, 23 Feb 2023 15:02:58 GMT, Albert Mingkun Yang wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> RISC-V fixes > > src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp line 218: > >> 216: fatal("No support for 8 bytes copy"); >> 217: #endif >> 218: } > > I believe the intent would be clearer using `switch`. I tried both, but I actually found it less clear with switch, as I then have to worry about a missing `break`, which would be accepted by the compiler and just yield subtly incorrect machine code. I'm open to doing switch though if you really do think it is clearer. What do you think? 
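A minimal sketch of the trade-off discussed above, not the actual barrier code: `MoveKind` and `move_kind_for` are hypothetical names, and the real function emits x86 instructions rather than returning an enum. Returning directly from each `case` is one way to use `switch` without any `break` that could be forgotten.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical move kinds for this sketch.
enum class MoveKind { Byte, Word, DWord, QWord, Unsupported };

// Map a copy width in bytes to a move kind. Every case returns, so there is
// no fall-through and no `break` to omit by accident.
inline MoveKind move_kind_for(size_t bytes) {
  switch (bytes) {
    case 1: return MoveKind::Byte;
    case 2: return MoveKind::Word;
    case 4: return MoveKind::DWord;
    case 8: return MoveKind::QWord;
    default: return MoveKind::Unsupported;  // the real code would treat this as unreachable
  }
}
```

Whether this reads better than an if/else chain is subjective, as the thread itself notes; the sketch only shows that the fall-through concern can be avoided structurally.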
------------- PR: https://git.openjdk.org/jdk/pull/12670 From jsjolen at openjdk.org Fri Feb 24 11:19:05 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 24 Feb 2023 11:19:05 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 04:48:53 GMT, David Holmes wrote: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. Hi David, `LineNumberProgram` is used in `elfFile.cpp:704`, is stack allocated and has exactly one entry point: `find_filename_and_line_number`. `LineNumberProgramState` is used only internally by `LineNumberProgram`. `LineNumberProgram`'s current size is 120 bytes. The size of `LineNumberProgramState` is 48 bytes. Can't we change the type of `_state` from a pointer to a plain `LineNumberProgramState`? This bloats the size up to 168 bytes, but we no longer have to care about tracking the resource or have another heap allocation call. Yes, we have to change `->` to `.`, so the change is larger, but it's conceptually simpler. FWIW, `_state` is also always re-allocated, so I don't see why we even need the `reset()` method on it, but that's a future RFE. 
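A minimal sketch of the by-value alternative suggested above, assuming simplified stand-in types: `ValueState` and `ValueProgram` are hypothetical and only mimic the relationship between `LineNumberProgramState` and `LineNumberProgram`.

```cpp
#include <cassert>

// Hypothetical stand-in for LineNumberProgramState.
struct ValueState {
  explicit ValueState(int header) : header(header), line(1) {}
  int header;
  int line;
};

// Hypothetical stand-in for LineNumberProgram holding its state by value.
class ValueProgram {
 public:
  explicit ValueProgram(int header) : _header(header), _state(header) {}

  int run(int steps) {
    _state = ValueState(_header);  // fresh state for each run, no heap allocation
    for (int i = 0; i < steps; i++) {
      _state.line++;  // member access uses `.` instead of `->`
    }
    return _state.line;
  }

 private:
  int _header;
  ValueState _state;  // embedded in the object, released together with it
};
```

The enclosing object grows by the size of the state, but there is no allocation to fail and nothing to `delete`; re-assigning a fresh value at the start of each run plays the role of a reset.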
------------- PR: https://git.openjdk.org/jdk/pull/12738 From ayang at openjdk.org Fri Feb 24 11:19:07 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 24 Feb 2023 11:19:07 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v2] In-Reply-To: <3y0ux-MS6EOxFUy4ulfp5n9qAhOXI4NsdapyWaD8iYs=.a222cace-f0c5-4db9-b476-1a09d04c7dc6@github.com> References: <3y0ux-MS6EOxFUy4ulfp5n9qAhOXI4NsdapyWaD8iYs=.a222cace-f0c5-4db9-b476-1a09d04c7dc6@github.com> Message-ID: On Fri, 24 Feb 2023 10:40:42 GMT, Erik Österlund wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 1198: >> >>> 1196: const Register saved_r15 = r9; >>> 1197: #ifdef _WIN64 >>> 1198: if (nargs >= 4) { >> >> When would `nargs` ever be `> 4`? > > It won't (yet). I was just structuring it like setup_arg_regs, where it similarly checks if nargs >= 4. But I can change both to an assert that it isn't > 4. An assert sounds good. This method (linux-part) gives me the impression that it's ensuring _four_ args are OK. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From ayang at openjdk.org Fri Feb 24 11:23:07 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 24 Feb 2023 11:23:07 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v2] In-Reply-To: <4LXoPDwwMSjRqz9uBh4mJ7CyjqiU5f1gy-Do4vJ8kYQ=.cca9644d-8d12-47a1-9225-b9b13c524ac4@github.com> References: <4LXoPDwwMSjRqz9uBh4mJ7CyjqiU5f1gy-Do4vJ8kYQ=.cca9644d-8d12-47a1-9225-b9b13c524ac4@github.com> Message-ID: <8Pv2A3qh2tBb2XIPuiAZPKa3owClFOOb1R2S5U1h0lY=.837ae993-b66e-4f7c-b062-54410d60c913@github.com> On Fri, 24 Feb 2023 11:00:55 GMT, Erik Österlund wrote: >> src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp line 218: >> >>> 216: fatal("No support for 8 bytes copy"); >>> 217: #endif >>> 218: } >> >> I believe the intent would be clearer using `switch`. 
> > I tried both, but I actually found it less clear with switch as I then have to worry about a missing break; which would be accepted by the compiler and just yield subtly incorrect machine code. I'm open to doing switch though if you really do think it is more clear then. What do you think? I think `switch` looks better; it's not that this code will be updated frequently, adding/removing cases. (Ofc, this is subjective.) switch(byte) { case 1: ...; break; case 2: ...; break; ... default: should-not-reach; } ------------- PR: https://git.openjdk.org/jdk/pull/12670 From duke at openjdk.org Fri Feb 24 12:29:05 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Fri, 24 Feb 2023 12:29:05 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux [v2] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 13:51:10 GMT, Jan Kratochvil wrote: >> I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). >> I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. >> The patch (and former GCC performance regression) affects only x86_64+i686. > > Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: > > 8302191: Performance degradation for float/double modulo on Linux I have also posted it for GCC as: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922 ------------- PR: https://git.openjdk.org/jdk/pull/12508 From eosterlund at openjdk.org Fri Feb 24 12:31:39 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 12:31:39 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v3] In-Reply-To: References: Message-ID: <9T9FOC-vcKCueshTQkRP4Nh7LCWTaQLRPFLTTvgOY6A=.1f9bc3f7-186f-414b-8128-a14a7e00d8e5@github.com> > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp Co-authored-by: Roberto Castañeda Lozano ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12670/files - new: https://git.openjdk.org/jdk/pull/12670/files/676fc887..8d1f003a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 12:31:43 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 12:31:43 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v3] In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 10:26:41 GMT, Roberto Castañeda Lozano wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Update src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp >> >> Co-authored-by: Roberto Castañeda Lozano > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1831: > >> 1829: BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); >> 1830: #if COMPILER2_OR_JVMCI >> 1831: if ((!is_oop || bs->supports_avx3_masked_arraycopy()) && VM_Version::supports_avx512vlbw() && VM_Version::supports_bmi2() && MaxVectorSize >= 32) { > > Consider extending the test similarly in `generate_disjoint_int_oop_copy()` and `generate_conjoint_int_oop_copy()`, for consistency. Good point. Will do. 
------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 12:37:38 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 12:37:38 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v4] In-Reply-To: <8Pv2A3qh2tBb2XIPuiAZPKa3owClFOOb1R2S5U1h0lY=.837ae993-b66e-4f7c-b062-54410d60c913@github.com> References: <4LXoPDwwMSjRqz9uBh4mJ7CyjqiU5f1gy-Do4vJ8kYQ=.cca9644d-8d12-47a1-9225-b9b13c524ac4@github.com> <8Pv2A3qh2tBb2XIPuiAZPKa3owClFOOb1R2S5U1h0lY=.837ae993-b66e-4f7c-b062-54410d60c913@github.com> Message-ID: On Fri, 24 Feb 2023 11:19:52 GMT, Albert Mingkun Yang wrote: >> I tried both, but I actually found it less clear with switch as I then have to worry about a missing break; which would be accepted by the compiler and just yield subtly incorrect machine code. I'm open to doing switch though if you really do think it is more clear then. What do you think? > > I think `switch` looks better; it's not that this code will be updated frequently, adding/removing cases. (Ofc, this is subjective.) > > > switch(byte) { > case 1: ...; break; > case 2: ...; break; > ... > default: should-not-reach; > } Okay. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 12:37:38 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 24 Feb 2023 12:37:38 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v4] In-Reply-To: References: Message-ID: > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Roberto and Albert suggestions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12670/files - new: https://git.openjdk.org/jdk/pull/12670/files/8d1f003a..c245c570 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=02-03 Stats: 33 lines in 3 files changed: 17 ins; 1 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 12:50:42 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 24 Feb 2023 12:50:42 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: References: Message-ID: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers.
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Move bs declaration up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12670/files - new: https://git.openjdk.org/jdk/pull/12670/files/c245c570..1feedf18 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=03-04 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Fri Feb 24 12:50:42 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 24 Feb 2023 12:50:42 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v4] In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 12:37:38 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Roberto and Albert suggestions Thanks @robcasloz, @albertnetymk and @RealFYang for the reviews. I have updated the PR based on the feedback.
------------- PR: https://git.openjdk.org/jdk/pull/12670 From coleenp at openjdk.org Fri Feb 24 13:44:09 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 24 Feb 2023 13:44:09 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 09:20:43 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination. >> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. 
>> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. >> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > check after_report_action Changes requested by coleenp (Reviewer). 
src/hotspot/share/utilities/vmError.cpp line 1718: > 1716: } else if (after_report_action == AfterReportAction::Die) { > 1717: #ifdef _WINDOWS > 1718: if (UseOSErrorReporting) { Since you already have to use UseOSErrorReporting in this code, I don't see the purpose of the parameter and ARA code. If you add a 'return' here and have the other code snippet do a return, this should be all you need. I'm not convinced that the DebugBreak here for windows is useful, since if you want to attach to a debugger, you'd use ShowMessageBoxOnError (based on my sample size of 1). But moving it here away from the vmassert macro seems fine. I tested that this works. When implementing WER, they'll want a non-empty mdmp file but that can probably be done with call to the minidump facility which can be factored out of the os::abort() call. My original interest is to get meaningful mdmp files showing the correct stack trace of the crashing thread. This doesn't break or fix that. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From coleenp at openjdk.org Fri Feb 24 13:50:27 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 24 Feb 2023 13:50:27 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: <7WgXblJjvGmhAJ8inSa76HDDLUOUUR6YX7wcy-AYFmY=.47d9d887-6a9c-4289-97c8-22423dc26af4@github.com> On Thu, 23 Feb 2023 20:14:32 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. 
>> >> Since a handshake may take a long time to complete, and Compile_lock is used for much more than deopts, >> the fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake, even when everything was already marked (rather than adding Compile_lock in all places). >> >> This turned out to be problematic during startup: for example, the number of deopt handshakes in a jetty dry-run startup went from 5 to 39. >> >> This fix first adds a barrier that a thread does not pass until the requested deopts have happened, and it coalesces the handshakes. >> Secondly, it moves the handshake part out of the Compile_lock where possible. >> >> This means we fix the performance bug and reduce the contention on Compile_lock, giving higher throughput in the compiler and in things such as class loading. >> >> It passes t1-t7 with flying colours! t8 has still not completed and I'm redoing some testing due to last-minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Comment fixes > - Include/fwd fixes This looks good to me. Thanks for your patience and explanations! ------------- Marked as reviewed by coleenp (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12585 From ayang at openjdk.org Fri Feb 24 13:53:08 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 24 Feb 2023 13:53:08 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> Message-ID: <61SKzlWm_Of64eGV1U2KxEZlF4jISWqljLGlpwoEmk8=.d8a6797b-e0b9-4434-9db0-ad7be14a7bfa@github.com> On Fri, 24 Feb 2023 12:50:42 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Move bs declaration up Marked as reviewed by ayang (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12670 From coleenp at openjdk.org Fri Feb 24 14:15:12 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 24 Feb 2023 14:15:12 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 09:20:43 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true.
>> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination. >> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. >> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. 
>> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. >> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > check after_report_action src/hotspot/os/windows/vmError_windows.cpp line 39: > 37: exceptionInfo->ExceptionRecord, > 38: exceptionInfo->ContextRecord, > 39: "%s", ""); Or if you want to make report_and_die marked noreturn, refactor it without the ARA stuff so report_and_die() is no-return, report_and_maybe_die() is return and both call the impl that uses UseOSErrorReporting. 
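A minimal sketch of the split suggested in the comment above, for illustration only. It is not the HotSpot code: `report_impl`, the `const char*` signatures, the `reports_written` counter and the plain `bool` standing in for the `-XX:+UseOSErrorReporting` flag are all assumptions made for this example. It shows one noreturn entry point and one entry point that may return, both calling a shared implementation that consults the flag.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the -XX:+UseOSErrorReporting VM flag.
static bool UseOSErrorReporting = false;

// Hypothetical counter standing in for "an hs_err report was produced".
static int reports_written = 0;

// Shared implementation: always writes the report, then tells the caller
// whether the VM itself should terminate (true) or whether the OS crash
// machinery should be given a chance (false).
static bool report_impl(const char* message) {
  assert(message != nullptr);
  std::fprintf(stderr, "fatal error: %s\n", message);
  ++reports_written;
  return !UseOSErrorReporting;
}

// Entry point for ordinary callers: reports and never returns, so it can
// carry the noreturn attribute.
[[noreturn]] void report_and_die(const char* message) {
  report_impl(message);
  std::abort();
}

// Entry point for callers such as the Windows exception filters: reports,
// and returns to the caller when OS error reporting is requested.
void report_and_maybe_die(const char* message) {
  if (report_impl(message)) {
    std::abort();
  }
}
```

With the stand-in flag set to true, `report_and_maybe_die("x")` writes the report and returns, leaving further action to its caller; with it set to false, both entry points terminate the process. Whether the noreturn path should also raise a breakpoint first, as the PR description discusses, is left out of this sketch.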
------------- PR: https://git.openjdk.org/jdk/pull/12651 From coleenp at openjdk.org Fri Feb 24 14:27:04 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 24 Feb 2023 14:27:04 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> References: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> Message-ID: On Thu, 23 Feb 2023 08:08:33 GMT, Thomas Stuefe wrote: > This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS. We had envisioned using a Windows API that was being added in maybe Vista to transmit the hs_err_pid file to WER. I think that was why we decided to generate it. Getting a real "fresh" crash site as you say would be an improvement for us as well. I don't know how to get both. 
------------- PR: https://git.openjdk.org/jdk/pull/12651 From rcastanedalo at openjdk.org Fri Feb 24 15:28:11 2023 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Fri, 24 Feb 2023 15:28:11 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> Message-ID: On Fri, 24 Feb 2023 12:50:42 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Move bs declaration up src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1223: > 1221: UnsafeCopyMemoryMark ucmm(this, !aligned, false, ucme_exit_pc); > 1222: // Copy in multi-bytes chunks > 1223: copy_bytes_forward(end_from, end_to, qword_count, rax, r10, L_copy_bytes, L_copy_8_bytes, decorators, T_BYTE); Why is `setup_arg_regs_using_thread()` not needed to free `r10` as a temporary register in `generate_[conjoint|disjoint]_[byte|short]_copy()`?
From the following comment, it sounds like it would be needed: https://github.com/openjdk/jdk/blob/7d8b8ba9c46475ededcae7db6c841b25fa83d167/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#L1193-L1195 ------------- PR: https://git.openjdk.org/jdk/pull/12670 From daniel.daugherty at oracle.com Fri Feb 24 20:11:11 2023 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Fri, 24 Feb 2023 15:11:11 -0500 Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> References: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> Message-ID: <6b583699-5d25-5017-6ee5-944f02bbe182@oracle.com> On 2/23/23 9:54 PM, David Holmes wrote: > On Thu, 23 Feb 2023 12:46:47 GMT, Robbin Ehn wrote: > >>> I found one call to `add_to_hierarchy()` where we previously did not >>> grab in `Compile_lock` in `vmClasses::resolve_shared_class()` and >>> that call is made with this assert in place: >>> ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` >>> >>> so it looks to me like we have a locking issue there (as in too early for locks). >> I think you are taking about CDS dump process. It works fine taking the lock. >> I think this was just some micro-optimization. But we should not mark and do deopt fixed that. > To restate: the assert_locked_or_safepoint is pointless because you just took out the lock here. 
> > The code Dan points to does not seem to be related to CDS dump: > > bool vmClasses::resolve(vmClassID id, TRAPS) { > InstanceKlass** klassp = &_klasses[as_int(id)]; > > #if INCLUDE_CDS > if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()) { > InstanceKlass* k = *klassp; > assert(k->is_shared_boot_class(), "must be"); > > ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data(); > resolve_shared_class(k, loader_data, Handle(), CHECK_false); > return true; > } > #endif // INCLUDE_CDS > > I remain concerned this code may be executed at a safepoint and we will now try to take the Compile_lock. > > ------------- > > PR:https://git.openjdk.org/jdk/pull/12585 For some reason these comments no longer appear in the current PR. That's probably because the code has changed relative to the original anchors for the comments so the bots don't know where in the code to attach the comments. In any case, here's the current code that we were talking about (from the v05 full webrev): -void SystemDictionary::add_to_hierarchy(InstanceKlass* k) { +void SystemDictionary::add_to_hierarchy(JavaThread* current, InstanceKlass* k) { assert(k != nullptr, "just checking"); - if (Universe::is_fully_initialized()) { - assert_locked_or_safepoint(Compile_lock); - } + assert(!SafepointSynchronize::is_at_safepoint(), "must NOT be at safepoint"); + DeoptimizationScope deopt_scope; + { + MutexLocker ml(current, Compile_lock); So the "assert_locked_or_safepoint(Compile_lock)" call has been replaced by an unconditional: assert(!SafepointSynchronize::is_at_safepoint(), "must NOT be at safepoint"); which should trip if SystemDictionary::add_to_hierarchy() is ever called from a safepoint (with non-release bits). As far as I know, this assert has not fired in Robbin's testing. Also, the code that I was talking about was not "bool vmClasses::resolve(vmClassID id, TRAPS)". 
I was talking about "vmClasses::resolve_shared_class()" which is the code that calls this assert: assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization"); I was worried about this code path because its the only place that calls add_to_hierarchy() in the original code that _did not_ grab Compile_lock. With Robbin's latest code add_to_hierarchy() does grab Compile_lock, but if that was going to be a problem, it should have shown up in testing. Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: From dcubed at openjdk.org Fri Feb 24 20:22:18 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 24 Feb 2023 20:22:18 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 20:14:32 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. >> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. 
>> Secondly, it moves the handshake part out of the Compile_lock where possible. >> >> This means we fix the performance bug and reduce the contention on Compile_lock, giving higher throughput in the compiler and in things such as class loading. >> >> It passes t1-t7 with flying colours! t8 has still not completed and I'm redoing some testing due to last-minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Comment fixes > - Include/fwd fixes I've now done another crawl-through review of v05. I don't see anything more that needs comments or changes. There was an email from @dholmes-ora that didn't show up in the "Files changed" view, with a time stamp of "On 2/23/23 9:54 PM, David Holmes wrote:", and I've replied to that email thread, but my reply may not show up here either. ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Fri Feb 24 20:22:23 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 24 Feb 2023 20:22:23 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 06:37:22 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen fix > > A few drive-by comments on the locking. The deopt stuff itself is not something I know. > > Thanks.
Here's @dholmes-ora's missing comment and my reply: On 2/23/23 9:54 PM, David Holmes wrote: > On Thu, 23 Feb 2023 12:46:47 GMT, Robbin Ehn [](mailto:rehn at openjdk.org) wrote: > >>> I found one call to `add_to_hierarchy()` where we previously did not >>> grab in `Compile_lock` in `vmClasses::resolve_shared_class()` and >>> that call is made with this assert in place: >>> ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` >>> >>> so it looks to me like we have a locking issue there (as in too early for locks). >> I think you are taking about CDS dump process. It works fine taking the lock. >> I think this was just some micro-optimization. But we should not mark and do deopt fixed that. > To restate: the assert_locked_or_safepoint is pointless because you just took out the lock here. > > The code Dan points to does not seem to be related to CDS dump: > > bool vmClasses::resolve(vmClassID id, TRAPS) { > InstanceKlass** klassp = &_klasses[as_int(id)]; > > #if INCLUDE_CDS > if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()) { > InstanceKlass* k = *klassp; > assert(k->is_shared_boot_class(), "must be"); > > ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data(); > resolve_shared_class(k, loader_data, Handle(), CHECK_false); > return true; > } > #endif // INCLUDE_CDS > > I remain concerned this code may be executed at a safepoint and we will now try to take the Compile_lock. > > ------------- > > PR: https://git.openjdk.org/jdk/pull/12585 For some reason these comments no longer appear in the current PR. That's probably because the code has changed relative to the original anchors for the comments so the bots don't know where in the code to attach the comments. 
In any case, here's the current code that we were talking about (from the v05 full webrev): -void SystemDictionary::add_to_hierarchy(InstanceKlass* k) { +void SystemDictionary::add_to_hierarchy(JavaThread* current, InstanceKlass* k) { assert(k != nullptr, "just checking"); - if (Universe::is_fully_initialized()) { - assert_locked_or_safepoint(Compile_lock); - } + assert(!SafepointSynchronize::is_at_safepoint(), "must NOT be at safepoint"); + DeoptimizationScope deopt_scope; + { + MutexLocker ml(current, Compile_lock); So the "assert_locked_or_safepoint(Compile_lock)" call has been replaced by an unconditional: assert(!SafepointSynchronize::is_at_safepoint(), "must NOT be at safepoint"); which should trip if SystemDictionary::add_to_hierarchy() is ever called from a safepoint (with non-release bits). As far as I know, this assert has not fired in Robbin's testing. Also, the code that I was talking about was not "bool vmClasses::resolve(vmClassID id, TRAPS)". I was talking about "vmClasses::resolve_shared_class()" which is the code that calls this assert: assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization"); I was worried about this code path because its the only place that calls add_to_hierarchy() in the original code that _did not_ grab Compile_lock. With Robbin's latest code add_to_hierarchy() does grab Compile_lock, but if that was going to be a problem, it should have shown up in testing. Dan ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dcubed at openjdk.org Fri Feb 24 21:23:14 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 24 Feb 2023 21:23:14 GMT Subject: RFR: 8291555: Implement alternative fast-locking scheme [v12] In-Reply-To: References: Message-ID: On Wed, 15 Feb 2023 21:15:54 GMT, Roman Kennke wrote: >> This change adds a fast-locking scheme as an alternative to the current stack-locking implementation. 
It retains the advantages of stack-locking (namely fast locking in uncontended code-paths), while avoiding the overload of the mark word. That overloading causes massive problems with Lilliput, because it means we have to check and deal with this situation when trying to access the mark-word. And because of the very racy nature, this turns out to be very complex and would involve a variant of the inflation protocol to ensure that the object header is stable. (The current implementation of setting/fetching the i-hash provides a glimpse into the complexity). >> >> What the original stack-locking does is basically to push a stack-lock onto the stack which consists only of the displaced header, and CAS a pointer to this stack location into the object header (the lowest two header bits being 00 indicate 'stack-locked'). The pointer into the stack can then be used to identify which thread currently owns the lock. >> >> This change basically reverses stack-locking: It still CASes the lowest two header bits to 00 to indicate 'fast-locked' but does *not* overload the upper bits with a stack-pointer. Instead, it pushes the object-reference to a thread-local lock-stack. This is a new structure which is basically a small array of oops that is associated with each thread. Experience shows that this array typcially remains very small (3-5 elements). Using this lock stack, it is possible to query which threads own which locks. Most importantly, the most common question 'does the current thread own me?' is very quickly answered by doing a quick scan of the array. More complex queries like 'which thread owns X?' are not performed in very performance-critical paths (usually in code like JVMTI or deadlock detection) where it is ok to do more complex operations (and we already do). The lock-stack is also a new set of GC roots, and would be scanned during thread scanning, possibly concurrently, via the normal protocols. >> >> The lock-stack is grown when needed. 
This means that we need to check for potential overflow before attempting locking. When that is the case, locking fast-paths would call into the runtime to grow the stack and handle the locking. Compiled fast-paths (C1 and C2 on x86_64 and aarch64) do this check on method entry to avoid (possibly lots) of such checks at locking sites. >> >> In contrast to stack-locking, fast-locking does *not* support recursive locking (yet). When that happens, the fast-lock gets inflated to a full monitor. It is not clear if it is worth to add support for recursive fast-locking. >> >> One trouble is that when a contending thread arrives at a fast-locked object, it must inflate the fast-lock to a full monitor. Normally, we need to know the current owning thread, and record that in the monitor, so that the contending thread can wait for the current owner to properly exit the monitor. However, fast-locking doesn't have this information. What we do instead is to record a special marker ANONYMOUS_OWNER. When the thread that currently holds the lock arrives at monitorexit, and observes ANONYMOUS_OWNER, it knows it must be itself, fixes the owner to be itself, and then properly exits the monitor, and thus handing over to the contending thread. >> >> As an alternative, I considered to remove stack-locking altogether, and only use heavy monitors. In most workloads this did not show measurable regressions. However, in a few workloads, I have observed severe regressions. All of them have been using old synchronized Java collections (Vector, Stack), StringBuffer or similar code. The combination of two conditions leads to regressions without stack- or fast-locking: 1. The workload synchronizes on uncontended locks (e.g. single-threaded use of Vector or StringBuffer) and 2. The workload churns such locks. 
IOW, uncontended use of Vector, StringBuffer, etc as such is ok, but creating lots of such single-use, single-threaded-locked objects leads to massive ObjectMonitor churn, which can lead to a significant performance impact. But alas, such code exists, and we probably don't want to punish it if we can avoid it.
>>
>> This change makes it possible to simplify (and speed up!) a lot of code:
>>
>> - The inflation protocol is no longer necessary: we can directly CAS the (tagged) ObjectMonitor pointer to the object header.
>> - Accessing the hashcode could now be done in the fastpath always, if the hashcode has been installed. Fast-locked headers can be used directly, for monitor-locked objects we can easily reach-through to the displaced header. This is safe because Java threads participate in the monitor deflation protocol. This would be implemented in a separate PR.
>>
>>
>> Testing:
>> - [x] tier1 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x +UseFastLocking
>> - [x] tier1 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier2 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier3 x86_64 x aarch64 x -UseFastLocking
>> - [x] tier4 x86_64 x aarch64 x -UseFastLocking
>> - [x] Several real-world applications have been tested with this change in tandem with Lilliput without any problems so far
>>
>> ### Performance
>>
>> #### Simple Microbenchmark
>>
>> The microbenchmark exercises only the locking primitives for monitorenter and monitorexit, without contention. The benchmark can be found [here](https://github.com/rkennke/fastlockbench). Numbers are in ns/op.
>>
>> | | x86_64 | aarch64 |
>> | -- | -- | -- |
>> | -UseFastLocking | 20.651 | 20.764 |
>> | +UseFastLocking | 18.896 | 18.908 |
>>
>>
>> #### Renaissance
>>
>> Benchmark | x86_64 stack-locking | x86_64 fast-locking | x86_64 diff | aarch64 stack-locking | aarch64 fast-locking | aarch64 diff
>> -- | -- | -- | -- | -- | -- | --
>> AkkaUct | 841.884 | 836.948 | 0.59% | 1475.774 | 1465.647 | 0.69%
>> Reactors | 11041.427 | 11181.451 | -1.25% | 11381.751 | 11521.318 | -1.21%
>> Als | 1367.183 | 1359.358 | 0.58% | 1678.103 | 1688.067 | -0.59%
>> ChiSquare | 577.021 | 577.398 | -0.07% | 986.619 | 988.063 | -0.15%
>> GaussMix | 817.459 | 819.073 | -0.20% | 1154.293 | 1155.522 | -0.11%
>> LogRegression | 598.343 | 603.371 | -0.83% | 638.052 | 644.306 | -0.97%
>> MovieLens | 8248.116 | 8314.576 | -0.80% | 7569.219 | 7646.828 | -1.01%
>> NaiveBayes | 587.607 | 581.608 | 1.03% | 541.583 | 550.059 | -1.54%
>> PageRank | 3260.553 | 3263.472 | -0.09% | 4376.405 | 4381.101 | -0.11%
>> FjKmeans | 979.978 | 976.122 | 0.40% | 774.312 | 771.235 | 0.40%
>> FutureGenetic | 2187.369 | 2183.271 | 0.19% | 2685.722 | 2689.056 | -0.12%
>> ParMnemonics | 2434.551 | 2468.763 | -1.39% | 4278.225 | 4263.863 | 0.34%
>> Scrabble | 111.882 | 111.768 | 0.10% | 151.796 | 153.959 | -1.40%
>> RxScrabble | 210.252 | 211.38 | -0.53% | 310.116 | 315.594 | -1.74%
>> Dotty | 750.415 | 752.658 | -0.30% | 1033.636 | 1036.168 | -0.24%
>> ScalaDoku | 3072.05 | 3051.2 | 0.68% | 3711.506 | 3690.04 | 0.58%
>> ScalaKmeans | 211.427 | 209.957 | 0.70% | 264.38 | 265.788 | -0.53%
>> ScalaStmBench7 | 1017.795 | 1018.869 | -0.11% | 1088.182 | 1092.266 | -0.37%
>> Philosophers | 6450.124 | 6565.705 | -1.76% | 12017.964 | 11902.559 | 0.97%
>> FinagleChirper | 3953.623 | 3972.647 | -0.48% | 4750.751 | 4769.274 | -0.39%
>> FinagleHttp | 3970.526 | 4005.341 | -0.87% | 5294.125 | 5296.224 | -0.04%
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix merge error (move done label into correct places)

This project is currently baselined on jdk-21+10-761.
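The fast-locking scheme described in the quoted PR text can be modelled in a few lines. What follows is a toy sketch, not HotSpot code: `ToyObject`, `LockStack`, `fast_lock` and `fast_unlock` are hypothetical names, the object header is reduced to one atomic word, and the tag value 01 for "unlocked" is an assumption made for illustration (the description only states that 00 means fast-locked). Contention, recursion, inflation to a monitor and the ANONYMOUS_OWNER hand-over are deliberately left out; the sketch simply reports failure where a real VM would call into the runtime.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: a one-word header whose lowest two bits are the lock tag.
constexpr std::uintptr_t kTagMask    = 0x3;
constexpr std::uintptr_t kFastLocked = 0x0;  // 00: fast-locked (as in the description)
constexpr std::uintptr_t kUnlocked   = 0x1;  // 01: unlocked (assumed for this sketch)

struct ToyObject {
    std::atomic<std::uintptr_t> header{kUnlocked};
};

// Per-thread stack of the objects this thread has fast-locked.
// std::vector stands in for "grown when needed".
class LockStack {
public:
    void push(ToyObject* o) { _elems.push_back(o); }

    void pop(ToyObject* o) {
        // This sketch assumes unlock order mirrors lock order.
        assert(!_elems.empty() && _elems.back() == o);
        _elems.pop_back();
    }

    // "Does the current thread own me?" is a linear scan of a short array.
    bool contains(const ToyObject* o) const {
        for (const ToyObject* e : _elems) {
            if (e == o) return true;
        }
        return false;
    }

private:
    std::vector<ToyObject*> _elems;
};

// Try to fast-lock: CAS the tag bits from 01 to 00 and leave the upper bits alone.
// Returns false if the object is not unlocked (contended or recursive attempt).
bool fast_lock(ToyObject& obj, LockStack& ls) {
    std::uintptr_t old_header = obj.header.load(std::memory_order_relaxed);
    if ((old_header & kTagMask) != kUnlocked) return false;
    std::uintptr_t new_header = (old_header & ~kTagMask) | kFastLocked;
    if (!obj.header.compare_exchange_strong(old_header, new_header)) return false;
    ls.push(&obj);  // ownership is recorded on the thread, not in the header
    return true;
}

// Fast-unlock: only the owner (per its lock stack) may flip the tag back to 01.
bool fast_unlock(ToyObject& obj, LockStack& ls) {
    std::uintptr_t old_header = obj.header.load(std::memory_order_relaxed);
    if ((old_header & kTagMask) != kFastLocked || !ls.contains(&obj)) return false;
    std::uintptr_t new_header = (old_header & ~kTagMask) | kUnlocked;
    if (!obj.header.compare_exchange_strong(old_header, new_header)) return false;
    ls.pop(&obj);
    return true;
}
```

Note how the header never carries a pointer in this model: the upper bits survive a lock/unlock round trip unchanged, which is the property the description relies on for reading the mark word without first stabilising it.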
------------- PR: https://git.openjdk.org/jdk/pull/10907 From inakonechnyy at openjdk.org Fri Feb 24 22:18:54 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Fri, 24 Feb 2023 22:18:54 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v4] In-Reply-To: References: Message-ID: > The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. > > Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. > > Jtreg for reproducing the issue also was added to the commit. > The commit was tested with tier1 tests. Ilarion Nakonechnyy has updated the pull request incrementally with four additional commits since the last revision: - reverting changes in cpCache.cpp - Redesigned get_cause as create_initialization_error. 
Corrected naming in testcase - jcheck corrections - Removed VM.compMode from test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12566/files - new: https://git.openjdk.org/jdk/pull/12566/files/6df8bb1a..5c9c1ffe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12566&range=02-03 Stats: 53 lines in 5 files changed: 0 ins; 16 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/12566.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12566/head:pull/12566 PR: https://git.openjdk.org/jdk/pull/12566 From inakonechnyy at openjdk.org Fri Feb 24 22:18:58 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Fri, 24 Feb 2023 22:18:58 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: <0wsea_jjNWfHUDPZLbwUIwoRRzyxRTrjQQw4aFKzh8w=.40c1d1d3-d9d7-4db3-8cb9-549feaaf14f9@github.com> References: <0wsea_jjNWfHUDPZLbwUIwoRRzyxRTrjQQw4aFKzh8w=.40c1d1d3-d9d7-4db3-8cb9-549feaaf14f9@github.com> Message-ID: On Wed, 22 Feb 2023 05:05:30 GMT, David Holmes wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Get rid of redundant code - >> merge get_cause_with_stack_trace and get_cause_simple >> - Review corrections > > src/hotspot/share/classfile/javaClasses.cpp line 2764: > >> 2762: h_cause->klass()->external_name()); >> 2763: return Handle(); >> 2764: } > > You need to do this whether `with_stack_trace` or not, otherwise you will install the exception from `new_exception` as the initialization error for this class, when in fact it was not the initialization error at all. I would suggest a correction in this place. Just because if Java is in a state, when it throws an SOE at JavaCalls::call_virtual(), the most probably that the `Exceptions::new_exception()`, which actually wraps up a call to Java, also throws an SOE. 
If I return an empty Handle(), any clues on the cause will be missed from a stacktrace, even one of my testcases shows the following results: Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class java.lang.invoke.MethodHandleImpl$AsVarargsCollector at java.sql/java.sql.DriverManager.ensureDriversInitialized(DriverManager.java:628) at java.sql/java.sql.DriverManager.getDrivers(DriverManager.java:427) at jdbc.repro.DriverA.(DriverA.java:22) at jdbc.repro.DriverRepro.main(DriverRepro.java:12) Caused by: java.lang.NoClassDefFoundError: Could not initialize class java.lang.invoke.MethodHandleImpl$AsVarargsCollector ... 4 more ------------- PR: https://git.openjdk.org/jdk/pull/12566 From inakonechnyy at openjdk.org Fri Feb 24 22:43:09 2023 From: inakonechnyy at openjdk.org (Ilarion Nakonechnyy) Date: Fri, 24 Feb 2023 22:43:09 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v3] In-Reply-To: <0wsea_jjNWfHUDPZLbwUIwoRRzyxRTrjQQw4aFKzh8w=.40c1d1d3-d9d7-4db3-8cb9-549feaaf14f9@github.com> References: <0wsea_jjNWfHUDPZLbwUIwoRRzyxRTrjQQw4aFKzh8w=.40c1d1d3-d9d7-4db3-8cb9-549feaaf14f9@github.com> Message-ID: <3kzpev86GSiw_-gugbVmoWSyzXmKcJwznPcSXoZfq9A=.a7d7ae02-1065-4a87-ac35-f7ff17a7a872@github.com> On Wed, 22 Feb 2023 04:57:52 GMT, David Holmes wrote: >> Ilarion Nakonechnyy has updated the pull request incrementally with two additional commits since the last revision: >> >> - Get rid of redundant code - >> merge get_cause_with_stack_trace and get_cause_simple >> - Review corrections > > src/hotspot/share/oops/cpCache.cpp line 476: > >> 474: Symbol* message = java_lang_Throwable::detail_message(PENDING_EXCEPTION); >> 475: // Try to discover the cause >> 476: oop cause = java_lang_Throwable::cause(PENDING_EXCEPTION); > > This enhancement is unrelated to the current JBS issue. I think it a good enhancement but it should be done separately. I removed it from this PR. 
The attached testcase doesn't cover this enhancement, but one of my other tests, which emulates SOE during the loading class, fails without SOE in a stacktrace if I remove these lines (Tested on mac os) Do you mind me asking for any clue, on how I can rework the testcase in att. for verifying enhancement in this file separately, in a separate JBS issue? ------------- PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Fri Feb 24 23:18:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 24 Feb 2023 23:18:05 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: <1KaPWs0CT6SDrY_Cv_-zqsqS4a5stxT6Qwm8QaRV9yQ=.31d69204-96a9-4f80-b4a1-f1b564838d7d@github.com> On Fri, 24 Feb 2023 11:16:23 GMT, Johan Sj?len wrote: > Can't we change the type of _state to LineNumberProgramState? I think you meant change to `LineNumberProgram` ? I'm not really familiar with this code and just want to fix the memory leak that was detected. Any redesign of the code really needs to be a separate RFE. ------------- PR: https://git.openjdk.org/jdk/pull/12738 From stuefe at openjdk.org Sat Feb 25 07:10:11 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 25 Feb 2023 07:10:11 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: <9yWBGfO0KPK5bRyBT5URw-J33GqBKx_x_uEZ-QBJW5g=.383df2a5-2027-4116-aa5b-eeef96a356ae@github.com> Message-ID: On Fri, 24 Feb 2023 14:23:48 GMT, Coleen Phillimore wrote: > > This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS. 
> > We had envisioned using a Windows API that was being added in maybe Vista to transmit the hs_err_pid file to WER. I think that was why we decided to generate it. Interesting, did you get this to work? > Getting a real "fresh" crash site as you say would be an improvement for us as well. I don't know how to get both. We could generate the minidump before generating the hs-err file by calling MiniDumpWriteDump before error reporting. But that does not include WER, of course. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From mdoerr at openjdk.org Sat Feb 25 07:55:09 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 07:55:09 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v4] In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 12:21:27 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Merge master to resolve copyright conflict > - 829817: fixed typos, removed JIT_TIMER references > - 8299817: Update copyright > - 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls Not a detailed review, but I couldn't spot anything bad.
test/hotspot/jtreg/compiler/codegen/aes/Test8299817.java line 67: > 65: public static void main(String[] args) throws Exception { > 66: if (!DEBUG_MODE) { > 67: if (!Compiler.isIntrinsicAvailable(CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION, "com.sun.crypto.provider.CounterMode", "implCrypt", byte[].class, int.class, int.class, byte[].class, int.class)) { Maybe break the large lines? ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.org/jdk/pull/11967 From mdoerr at openjdk.org Sat Feb 25 09:24:59 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 09:24:59 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible Message-ID: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. 
------------- Commit messages: - 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible Changes: https://git.openjdk.org/jdk/pull/12753/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12753&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303210 Stats: 37 lines in 11 files changed: 19 ins; 1 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/12753.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12753/head:pull/12753 PR: https://git.openjdk.org/jdk/pull/12753 From mdoerr at openjdk.org Sat Feb 25 09:40:01 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 09:40:01 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible In-Reply-To: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: <9bxo80I1q4X_kSpJGbU01O-iF046TlKbP2LnnFwvNTU=.7dded7bb-060f-46fa-9f15-bbf04c448c56@github.com> On Sat, 25 Feb 2023 09:17:57 GMT, Martin Doerr wrote: > I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. > Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. @bulasevic, @RealFYang, @offamitkumar, @shipilev: Can you please check your platforms and provide feedback? Maybe the feature can also be implemented for MacOS. I haven't checked. 
------------- PR: https://git.openjdk.org/jdk/pull/12753 From mdoerr at openjdk.org Sat Feb 25 09:43:04 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 09:43:04 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> Message-ID: On Fri, 24 Feb 2023 10:42:06 GMT, Robbin Ehn wrote: > > > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. > > > > > > > > > For reliability: On every Oracle tier 5 we run these test two groups: vmTestbase_nsk_jvmti, open/test/hotspot/jtreg/:hotspot_runtime For each of linux_x64_debug, windows_x64, linux_aarch64_debug with sysmembar on. I picked jvmti as this is very heavy on 'JNI'. No issues reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java which only JNI transisition with this options. > > > For performance, the heavy lifting is on reader of thread state and the only case I have identified is JFR. > > > I suggest changing the default now, if there is issues we have time to revert it back before RDP1. > > > > > > Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) > > I will be skiing next week, if you want it now, I suggest you do it, otherwise I can when I'm back. 
https://github.com/openjdk/jdk/pull/12753 ------------- PR: https://git.openjdk.org/jdk/pull/12708 From kbarrett at openjdk.org Sat Feb 25 12:30:06 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 25 Feb 2023 12:30:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 09:20:43 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination. >> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. 
>> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. >> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > check after_report_action I _think_ the proposed change will work essentially the same as the existing code. However, I recently found RaiseFailFastException https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception which is leading me to rethink all of this. 
Now if I can just figure out how to make sure WER is enabled. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From mdoerr at openjdk.org Sat Feb 25 16:14:10 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 16:14:10 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: > I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. > Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Fix log message for platforms which don't support it. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12753/files - new: https://git.openjdk.org/jdk/pull/12753/files/7fbf0856..0ce067a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12753&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12753&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12753.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12753/head:pull/12753 PR: https://git.openjdk.org/jdk/pull/12753 From jcking at openjdk.org Sat Feb 25 18:28:02 2023 From: jcking at openjdk.org (Justin King) Date: Sat, 25 Feb 2023 18:28:02 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: <1KaPWs0CT6SDrY_Cv_-zqsqS4a5stxT6Qwm8QaRV9yQ=.31d69204-96a9-4f80-b4a1-f1b564838d7d@github.com> References: <1KaPWs0CT6SDrY_Cv_-zqsqS4a5stxT6Qwm8QaRV9yQ=.31d69204-96a9-4f80-b4a1-f1b564838d7d@github.com> Message-ID: On Fri, 24 Feb 2023 23:15:21 GMT, David Holmes wrote: > > Can't we change the type of _state to LineNumberProgramState? > > I think you meant change to `LineNumberProgram` ? > > I'm not really familiar with this code and just want to fix the memory leak that was detected. Any redesign of the code really needs to be a separate RFE. I agree. Let's take the path of least resistance to removing the leak and follow up with simplification in another PR. 
------------- PR: https://git.openjdk.org/jdk/pull/12738 From dholmes at openjdk.org Sun Feb 26 06:06:04 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 26 Feb 2023 06:06:04 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: On Sat, 25 Feb 2023 16:14:10 GMT, Martin Doerr wrote: >> I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. >> Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix log message for platforms which don't support it. There are two steps needed here: 1. An argument/justification for moving this from being experimental to being product (even if diagnostic). 2. Enabling it by default. ------------- PR: https://git.openjdk.org/jdk/pull/12753 From jsjolen at openjdk.org Sun Feb 26 10:16:04 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 26 Feb 2023 10:16:04 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 04:48:53 GMT, David Holmes wrote: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. OK, that's fair to me, thanks for the changes. ------------- Marked as reviewed by jsjolen (Committer).
PR: https://git.openjdk.org/jdk/pull/12738 From jsjolen at openjdk.org Sun Feb 26 10:34:09 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 26 Feb 2023 10:34:09 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 04:48:53 GMT, David Holmes wrote: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. #else DWARF_LOG_DEBUG("Address: Line: Column: File:"); #endif - _state = new (std::nothrow) LineNumberProgramState(_header); + LineNumberProgramState state{_header}; + _state = &state; if (_state == nullptr) { DWARF_LOG_ERROR("Failed to create new LineNumberProgramState object"); return false; Isn't this equivalent? ------------- PR: https://git.openjdk.org/jdk/pull/12738 From jcking at openjdk.org Sun Feb 26 17:03:03 2023 From: jcking at openjdk.org (Justin King) Date: Sun, 26 Feb 2023 17:03:03 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: <7J6gOPo4RbAL75NKPRqsb1ngRZfumIlUt0RqNGpMDrk=.fdaaac3d-08f5-4157-b77e-0f47fe49a9e3@github.com> On Sun, 26 Feb 2023 10:30:51 GMT, Johan Sjölen wrote: > ``` > #else > DWARF_LOG_DEBUG("Address: Line: Column: File:"); > #endif > - _state = new (std::nothrow) LineNumberProgramState(_header); > + LineNumberProgramState state{_header}; > + _state = &state; > if (_state == nullptr) { > DWARF_LOG_ERROR("Failed to create new LineNumberProgramState object"); > return false; > ``` > > Isn't this equivalent? You'd have to clear _state before returning from the function for safety.
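The two options discussed in this exchange can be contrasted in a stand-alone sketch. `State` and `Program` below are hypothetical stand-ins for `LineNumberProgramState` and `LineNumberProgram`, and `step()` is a placeholder for the real work; none of this is the actual HotSpot code. A live-instance counter makes the leak (or its absence) observable.

```cpp
#include <new>

// Hypothetical stand-in for LineNumberProgramState; counts live instances.
struct State {
    static int live;
    explicit State(int header) : header(header) { ++live; }
    ~State() { --live; }
    int header;
};
int State::live = 0;

// Hypothetical stand-in for LineNumberProgram.
struct Program {
    int _header = 0;
    State* _state = nullptr;

    // Option 1 (the fix under review): heap-allocate, delete before returning.
    bool run_with_delete() {
        _state = new (std::nothrow) State(_header);
        if (_state == nullptr) return false;
        bool ok = step();
        delete _state;     // without this, every call leaks one State
        _state = nullptr;  // extra precaution in this sketch: no dangling pointer
        return ok;
    }

    // Option 2 (the suggestion): use a local object and point _state at it.
    // _state must be cleared before returning, because the local dies
    // with the stack frame.
    bool run_with_local() {
        State state{_header};
        _state = &state;
        bool ok = step();
        _state = nullptr;
        return ok;
    }

    // Placeholder for the code that reads through _state.
    bool step() { return _state != nullptr && _state->header == _header; }
};
```

With the local-object variant, the `_state == nullptr` check in the quoted diff could never fire, since the address of a local is never null; that check only makes sense together with the `std::nothrow` allocation.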
------------- PR: https://git.openjdk.org/jdk/pull/12738 From david.holmes at oracle.com Mon Feb 27 00:20:22 2023 From: david.holmes at oracle.com (David Holmes) Date: Mon, 27 Feb 2023 10:20:22 +1000 Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v4] In-Reply-To: <6b583699-5d25-5017-6ee5-944f02bbe182@oracle.com> References: <37uJ_WR0M5Q9Te1lWg2ecfCTGgFzzecrAPmmlnQ8vzQ=.f756a746-641e-4b29-a336-aeb549a74ff4@github.com> <6b583699-5d25-5017-6ee5-944f02bbe182@oracle.com> Message-ID: Answering via email ... On 25/02/2023 6:11 am, daniel.daugherty at oracle.com wrote: > On 2/23/23 9:54 PM, David Holmes wrote: >> On Thu, 23 Feb 2023 12:46:47 GMT, Robbin Ehn wrote: >> >>>> I found one call to `add_to_hierarchy()` where we previously did not >>>> grab in `Compile_lock` in `vmClasses::resolve_shared_class()` and >>>> that call is made with this assert in place: >>>> ```assert(!Universe::is_fully_initialized(), "We can make short cuts only during VM initialization");``` >>>> >>>> so it looks to me like we have a locking issue there (as in too early for locks). >>> I think you are taking about CDS dump process. It works fine taking the lock. >>> I think this was just some micro-optimization. But we should not mark and do deopt fixed that. >> To restate: the assert_locked_or_safepoint is pointless because you just took out the lock here. 
>> >> The code Dan points to does not seem to be related to CDS dump: >> >> bool vmClasses::resolve(vmClassID id, TRAPS) { >> InstanceKlass** klassp = &_klasses[as_int(id)]; >> >> #if INCLUDE_CDS >> if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()) { >> InstanceKlass* k = *klassp; >> assert(k->is_shared_boot_class(), "must be"); >> >> ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data(); >> resolve_shared_class(k, loader_data, Handle(), CHECK_false); >> return true; >> } >> #endif // INCLUDE_CDS >> >> I remain concerned this code may be executed at a safepoint and we will now try to take the Compile_lock. >> >> ------------- >> >> PR:https://git.openjdk.org/jdk/pull/12585 > > For some reason these comments no longer appear in the current PR. > That's probably because the code has changed relative to the original > anchors for the comments so the bots don't know where in the code to > attach the comments. Yes it gets very hard to keep track of things. > In any case, here's the current code that we were talking about (from > the v05 full webrev): > > > -void SystemDictionary::add_to_hierarchy(InstanceKlass* k) { +void > SystemDictionary::add_to_hierarchy(JavaThread* current, InstanceKlass* > k) { assert(k != nullptr, "just checking"); - if > (Universe::is_fully_initialized()) { - > assert_locked_or_safepoint(Compile_lock); - } + > assert(!SafepointSynchronize::is_at_safepoint(), "must NOT be at > safepoint"); + DeoptimizationScope deopt_scope; + { + MutexLocker > ml(current, Compile_lock); So the > "assert_locked_or_safepoint(Compile_lock)" call has been replaced by an > unconditional: assert(!SafepointSynchronize::is_at_safepoint(), "must > NOT be at safepoint"); which should trip if > SystemDictionary::add_to_hierarchy() is ever called from a safepoint > (with non-release bits). As far as I know, this assert has not fired in > Robbin's testing. Okay that partially relieves my concern. 
Obviously whomever put the original assert_or_locked did believe/expect it to be called from a safepoint. It would be good to understand if they were mistaken or whether things have changed, or whether testing simply isn't catching that case. > Also, the code that I was talking about was not "bool > vmClasses::resolve(vmClassID id, TRAPS)". I was talking about > "vmClasses::resolve_shared_class()" which is the code that calls this > assert: assert(!Universe::is_fully_initialized(), "We can make short > cuts only during VM initialization"); I was worried about this code path > because its the only place that calls add_to_hierarchy() in the original > code that _did not_ grab Compile_lock. With Robbin's latest code > add_to_hierarchy() does grab Compile_lock, but if that was going to be a > problem, it should have shown up in testing. Dan Yes and in response to your concern Robbin said that there was no issue at CDS dump time. I then showed that vmClasses::resolve called vmClasses::resolve_shared_class() not during CDS dump - hence it was potentially still a concern. David ----- From dholmes at openjdk.org Mon Feb 27 00:32:15 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 00:32:15 GMT Subject: RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v6] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 20:14:32 GMT, Robbin Ehn wrote: >> Hi all, please consider. >> >> The original issue was when thread 1 asked to deopt nmethod set X and thread 2 asked for the same or a subset of X. >> All method will already be marked, but the actual deoptimizing, not entrant, patching PC on stacks and patching post call nops, was not done yet. Which meant thread 2 could 'pass' thread 1. >> Most places did deopt under Compile_lock, thus this is not an issue, but WB and clearCallSiteContext do not. >> >> Since a handshakes may take long before completion and Compile_lock is used for so much more than deopts. 
>> The fix in https://bugs.openjdk.org/browse/JDK-8299074 instead always emitted a handshake even when everything was already marked. (instead of adding Compile_lock to all places) >> >> This turnout to be problematic in the startup, for example the number of deopt handshakes in jetty dry run startup went from 5 to 39 handshakes. >> >> This fix first adds a barrier for which you do not pass until the requested deopts have happened and it coalesces the handshakes. >> Secondly it moves handshakes part out of the Compile_lock where it is possible. >> >> Which means we fix the performance bug and we reduce the contention on Compile_lock, meaning higher throughput in compiler and things such as class-loading. >> >> It passes t1-t7 with flying colours! t8 still not completed and I'm redoing some testing due to last minute simplifications. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Comment fixes > - Include/fwd fixes In `ciEnv::register_method` we take the `Compile_lock` to ensure `add_to_hierarchy` can't run: // Prevent SystemDictionary::add_to_hierarchy from running // and invalidating our dependencies until we install this method. // No safepoints are allowed. Otherwise, class redefinition can occur in between. MutexLocker ml(Compile_lock); NoSafepointVerifier nsv; does moving the deopt outside of the `Compile_lock` affect that? ------------- PR: https://git.openjdk.org/jdk/pull/12585 From dholmes at openjdk.org Mon Feb 27 01:20:09 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 01:20:09 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux [v2] In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 13:51:10 GMT, Jan Kratochvil wrote: >> I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). >> I did regression test x86_64 OpenJDK-8. 
I will leave other regression testing on GHA. >> The patch (and former GCC performance regression) affects only x86_64+i686. > > Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: > > 8302191: Performance degradation for float/double modulo on Linux Thanks for the updates. A few minor comments. I need to look into effective testing and benchmarking for this change. src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 87: > 85: #endif //COMPILER1 > 86: > 87: JRT_LEAF(jfloat, SharedRuntime::frem(jfloat x, jfloat y)) Nit: extra space before x and y. src/hotspot/share/runtime/sharedRuntime.cpp line 236: > 234: const julong double_infinity = CONST64(0x7FF0000000000000); > 235: > 236: #ifndef X86 I wonder if the WIN64 workaround is actually needed/valid for non-X86 windows? test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java line 2: > 1: /* > 2: * Copyright (c) 2023 Oracle and/or its affiliates. All rights reserved. Oracle did not write this code so it should have your own copyright on it, unless you copied it from other OpenJDK code in which case it could have dual copyright. test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java line 9: > 7: * published by the Free Software Foundation. Oracle designates this > 8: * particular file as subject to the "Classpath" exception as provided > 9: * by Oracle in the LICENSE file that accompanied this code. I don't think the Classpath exception is relevant to this code. test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java line 44: > 42: * Java bytecode drem is defined as C fmod (not C drem==remainder). > 43: * GCC since 93ba85fdd253b4b9cf2b9e54e8e5969b1a3db098 has slow fmod(). > 44: * Testcase is based on: https://stackoverflow.com/a/55673220/2995591 That could be a problem. 
What is the license for code shown on StackOverflow? ------------- PR: https://git.openjdk.org/jdk/pull/12508 From dholmes at openjdk.org Mon Feb 27 02:10:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 02:10:05 GMT Subject: RFR: 8302491: NoClassDefFoundError omits the original cause of an error [v4] In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 22:18:54 GMT, Ilarion Nakonechnyy wrote: >> The proposed approach added a new function for getting the cause of an exception -`java_lang_Throwable::get_cause_simple `, that gets called within `InstanceKlass::add_initialization_error` if an old one `java_lang_Throwable::get_cause_with_stack_trace` didn't succeed because of an exception during the VM call. The simple function doesn't call the VM for getting a stack trace but fills in any other information about an exception. >> >> Besides that, the discovering information about an exception was added to `ConstantPoolCacheEntry::save_and_throw_indy_exc` function. >> >> Jtreg for reproducing the issue also was added to the commit. >> The commit was tested with tier1 tests. > > Ilarion Nakonechnyy has updated the pull request incrementally with four additional commits since the last revision: > > - reverting changes in cpCache.cpp > - Redesigned get_cause as create_initialization_error. > Corrected naming in testcase > - jcheck corrections > - Removed VM.compMode from test Please see [JDK-8048190](https://bugs.openjdk.org/browse/JDK-8048190) for why we do not record the original cause in the initialization error table directly. You cannot guarantee 100% to be able to save the initialization error for later use with the `NoClassDefFoundError` as that requires creating an instance of `ExceptionInInitializerError` with the desired data, to be stored in the initialization error table. There are a number of things that can go wrong trying to create the EIIE instance. The main one is that trying to get the stacktrace can throw an exception. 
So this PR should simply handle that case: ignore any exception from getting the stacktrace and save the EIIE without it. If creating the EIIE itself throws an exception then we just have to abandon the attempt to save the initialization error - as we have nothing to save. Thanks ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12566 From dholmes at openjdk.org Mon Feb 27 02:26:09 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 02:26:09 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux [v2] In-Reply-To: References: Message-ID: <4A9Uw_YcF6d1bg751wsOJJeLT1Mfx8b6Ne5MUSUvdrs=.de62d0d6-ccdb-402c-94be-800328cc622d@github.com> On Tue, 21 Feb 2023 13:51:10 GMT, Jan Kratochvil wrote: >> I have OCA already processed/approved. I am not Author but my Author request is being processed these days (sent to Rob McKenna). >> I did regression test x86_64 OpenJDK-8. I will leave other regression testing on GHA. >> The patch (and former GCC performance regression) affects only x86_64+i686. > > Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > 8302191: Performance degradation for float/double modulo on Linux Changes requested by dholmes (Reviewer). src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 97: > 95: jne 1b \n\ > 96: " > 97: :"=t"(retval) This doesn't compile on Windows c:\sb\prod\1677461706\workspace\open\src\hotspot\cpu\x86\sharedRuntime_x86.cpp(97): error C2059: syntax error: ':' c:\sb\prod\1677461706\workspace\open\src\hotspot\cpu\x86\sharedRuntime_x86.cpp(114): error C2059: syntax error: ':' I assume this asm is gcc based. 
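For illustration only, one way to keep a GCC-style `asm` block from reaching MSVC is a compiler/architecture guard with a portable fallback. This is a minimal sketch, not the code in the PR: the name `frem_sketch` is hypothetical, the x87 loop follows the commonly used `fprem` idiom, and the fallback simply calls `std::fmod`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the PR): computes the C fmod-style
// remainder of x / y. GCC-style extended asm is only used where the
// compiler and CPU support it; every other toolchain (e.g. MSVC) takes
// the std::fmod fallback instead of failing to parse the asm block.
static double frem_sketch(double x, double y) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  double retval;
  __asm__ __volatile__(
      "1: fprem            \n\t"  // st(0) = partial remainder of st(0) / st(1)
      "fnstsw %%ax         \n\t"  // copy the x87 status word into AX
      "testb $0x4, %%ah    \n\t"  // C2 flag (bit 10) set => reduction incomplete
      "jne 1b              \n\t"  // repeat until the reduction is complete
      : "=t"(retval)              // result is left on top of the x87 stack
      : "0"(x), "u"(y)            // x in st(0), y in st(1)
      : "ax", "cc");
  return retval;
#else
  return std::fmod(x, y);         // portable fallback
#endif
}
```

With a guard like this the Windows build would never see the `:"=t"(retval)` operand syntax that triggers error C2059 above.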
------------- PR: https://git.openjdk.org/jdk/pull/12508 From dholmes at openjdk.org Mon Feb 27 02:44:03 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 02:44:03 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Sun, 26 Feb 2023 10:30:51 GMT, Johan Sjölen wrote: >> After allocating: >> >> _state = new (std::nothrow) LineNumberProgramState(_header); >> >> we need to `delete _state` before returning. >> >> Testing: tiers 1-3 >> >> Thanks. > > #else > DWARF_LOG_DEBUG("Address: Line: Column: File:"); > #endif > - _state = new (std::nothrow) LineNumberProgramState(_header); > + LineNumberProgramState state{_header}; > + _state = &state; > if (_state == nullptr) { > DWARF_LOG_ERROR("Failed to create new LineNumberProgramState object"); > return false; > > > Isn't this equivalent? I'm going to have to look at this in much more detail. I thought `_state` was just locally used but it isn't. So I can't just delete it in the non-error cases, and probably need to reset to `nullptr` as well. That also prevents using the trick @jdksjolen suggested with the stack allocated instance. ------------- PR: https://git.openjdk.org/jdk/pull/12738 From dholmes at openjdk.org Mon Feb 27 04:22:23 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 04:22:23 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program [v2] In-Reply-To: References: Message-ID: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks.
David Holmes has updated the pull request incrementally with one additional commit since the last revision: Simpler, cleaner and more robust fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12738/files - new: https://git.openjdk.org/jdk/pull/12738/files/3040fab4..e213ad56 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12738&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12738&range=00-01 Stats: 5 lines in 2 files changed: 2 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12738.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12738/head:pull/12738 PR: https://git.openjdk.org/jdk/pull/12738 From dholmes at openjdk.org Mon Feb 27 04:22:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 04:22:25 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 04:48:53 GMT, David Holmes wrote: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. Okay I just pushed a much simpler, cleaner and more robust fix, by adding a destructor for `LineNumberProgram`. ------------- PR: https://git.openjdk.org/jdk/pull/12738 From dholmes at openjdk.org Mon Feb 27 04:26:15 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 04:26:15 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program [v3] In-Reply-To: References: Message-ID: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. 
David Holmes has updated the pull request incrementally with one additional commit since the last revision: Revert unnecessary change ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12738/files - new: https://git.openjdk.org/jdk/pull/12738/files/e213ad56..af294d35 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12738&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12738&range=01-02 Stats: 3 lines in 1 file changed: 1 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12738.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12738/head:pull/12738 PR: https://git.openjdk.org/jdk/pull/12738 From mdoerr at openjdk.org Mon Feb 27 04:47:03 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 27 Feb 2023 04:47:03 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: On Sun, 26 Feb 2023 06:03:42 GMT, David Holmes wrote: > There are two steps needed here: > > 1. An argument/justification that this move from being experimental to being product (even if diagnostic). > 2. Enabling it by default. Hi David, thanks for looking into this. 1. We want to support it in product JDKs. I think there has been enough testing to leave the experimental state. See https://github.com/openjdk/jdk/pull/12708#issuecomment-1442965746 2. We need it activated for customers to benefit from the performance improvement. In addition, I'd like to enable it as soon as possible in order to get additional testing before RDP1, also in EA builds. Switching it off again in case of any problems is very easy. Performance results were published here: https://github.com/openjdk/jdk/pull/10123#issuecomment-1235123012 In addition, performance measurements wrt. the foreign function interface (project Panama) have shown the memory barrier performance bottleneck. 
------------- PR: https://git.openjdk.org/jdk/pull/12753 From dholmes at openjdk.org Mon Feb 27 06:24:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 06:24:14 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:42:38 GMT, Matthias Baesken wrote: > The cds only coding in hotspot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. > However at some places INCLUDE_CDS is missing and should be added. > > One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros ? So you avoid doing e.g. if (RewriteBytecodes CDS_ONLY(&& !UseSharedSpaces && !DumpSharedSpaces)) { and CDS_ONLY(assert(!UseSharedSpaces, "UsedSharedSpaces not supported with exploded module builds");) *** but what about the code in `Metaspace::global_initialize()`? Looks the the `INCLUDE_CDS` section there needs expanding. And `Universe::genesis` is a bit inconsistent about excluding or not. And `InstanceRefKlass::update_nonstatic_oop_maps` looks like a candidate too. And realistically we shouldn't even be compiling systemDictionaryShared.cpp in a non-CDS build, but that would take a lot of additional conditionalization. I'm just not seeing that converting a handful of sites is really beneficial. The flags themselves are not normal flags defined for CDS_ONLY precisely because we wanted them available without having to conditionalize every single usage. I'm not looking to "block" this but I can't really endorse it either. *** there is probably potential for adding a `cds_assert` macro to handle a bunch of CDS-only assertions. 
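A rough sketch of what such a `cds_assert` macro could look like, purely as an assumption about its shape: the macro does not exist in HotSpot today, the standard `assert` stands in for HotSpot's own `assert(cond, msg)`, and the flag below is a local stand-in for the real `UseSharedSpaces`.

```cpp
#include <cassert>

// Assumption: INCLUDE_CDS is 1 or 0, as the build would define it.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 1
#endif

// Hypothetical macro: checks the condition only in CDS-enabled builds and
// compiles to nothing otherwise, so call sites need no CDS_ONLY wrapper.
#if INCLUDE_CDS
#define cds_assert(cond, msg) assert((cond) && (msg))
#else
#define cds_assert(cond, msg) ((void)0)
#endif

// Stand-in for the real VM flag, for illustration only.
static bool UseSharedSpaces = false;

// Example call site modelled on the exploded-module-build assertion.
static int exploded_build_check() {
  cds_assert(!UseSharedSpaces,
             "UseSharedSpaces not supported with exploded module builds");
  return 1;
}
```

A call site then reads the same whether or not CDS is built in, which is the convenience the existing `CDS_ONLY(assert(...))` pattern lacks.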
------------- PR: https://git.openjdk.org/jdk/pull/12691 From dholmes at openjdk.org Mon Feb 27 07:24:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 07:24:05 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: On Sat, 25 Feb 2023 16:14:10 GMT, Martin Doerr wrote: >> I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. >> Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix log message for platforms which don't support it. Thanks for the pointers to the other discussions. I think we need to see broad benchmarks runs on all affected platforms to ensure this isn't penalizing applications that don't actually need the fast native access. It may have been better to split this PR in two, first adding support to the platforms missing it, and then changing the flag type and state. A few comments below. Thanks src/hotspot/os/linux/systemMemoryBarrier_linux.cpp line 75: > 73: int ret = membarrier(MEMBARRIER_CMD_QUERY, 0, 0); > 74: if (ret < 0) { > 75: log_trace(os)("MEMBARRIER_CMD_QUERY unsupported"); Why trace? If this should be on by default then I would think info logging would be the right thing here e.g.: log_info(os)("Use of system memory barriers is not available: MEMBARRIER PRIVATE_EXPEDITED unsupported"); etc. 
src/hotspot/share/runtime/threads.cpp line 559: > 557: if (!SystemMemoryBarrier::initialize()) { > 558: if (!FLAG_IS_DEFAULT(UseSystemMemoryBarrier)) { > 559: warning("UseSystemMemoryBarrier specified, but not supported on this OS."); Possibly add "use -Xlog:os=info for details" (as appropriate given the logging discussion). ------------- PR: https://git.openjdk.org/jdk/pull/12753 From dholmes at openjdk.org Mon Feb 27 07:24:07 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 07:24:07 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: On Mon, 27 Feb 2023 06:56:49 GMT, David Holmes wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix log message for platforms which don't support it. > > src/hotspot/os/linux/systemMemoryBarrier_linux.cpp line 75: > >> 73: int ret = membarrier(MEMBARRIER_CMD_QUERY, 0, 0); >> 74: if (ret < 0) { >> 75: log_trace(os)("MEMBARRIER_CMD_QUERY unsupported"); > > Why trace? If this should be on by default then I would think info logging would be the right thing here e.g.: > > log_info(os)("Use of system memory barriers is not available: MEMBARRIER PRIVATE_EXPEDITED unsupported"); > > etc. Also log in the success case too, before returning. > src/hotspot/share/runtime/threads.cpp line 559: > >> 557: if (!SystemMemoryBarrier::initialize()) { >> 558: if (!FLAG_IS_DEFAULT(UseSystemMemoryBarrier)) { >> 559: warning("UseSystemMemoryBarrier specified, but not supported on this OS."); > > Possibly add "use -Xlog:os=info for details" (as appropriate given the logging discussion). Also, with the flag checking this code now looks to be in the wrong place. Flag warnings typically belong in arguments.cpp. Actually I think this code belongs in os::init() regardless.
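To make the logging suggestion concrete, here is a minimal standalone sketch of a probe that reports both the failure and the success case. It is not the HotSpot code: `probe_system_membarrier` is a hypothetical name, the returned string stands in for what `log_info(os)` would print, and the command values are copied from the membarrier(2) documentation so the sketch does not depend on the installed kernel headers.

```cpp
#include <cassert>
#include <string>
#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Command values as documented in membarrier(2).
static const long kCmdQuery = 0;
static const long kCmdPrivateExpedited = 1L << 3;
static const long kCmdRegisterPrivateExpedited = 1L << 4;

// Hypothetical probe: sets *supported and returns the message that an
// info-level log line could carry in each outcome.
static std::string probe_system_membarrier(bool* supported) {
  *supported = false;
#if defined(__linux__) && defined(__NR_membarrier)
  // MEMBARRIER_CMD_QUERY returns a bitmask of supported commands, or -1.
  long ret = syscall(__NR_membarrier, kCmdQuery, 0, 0);
  if (ret < 0) {
    return "Use of system memory barriers is not available: "
           "MEMBARRIER_CMD_QUERY unsupported";
  }
  const long needed = kCmdPrivateExpedited | kCmdRegisterPrivateExpedited;
  if ((ret & needed) != needed) {
    return "Use of system memory barriers is not available: "
           "MEMBARRIER PRIVATE_EXPEDITED unsupported";
  }
  *supported = true;
  return "System memory barriers are available: "
         "MEMBARRIER PRIVATE_EXPEDITED supported";
#else
  return "Use of system memory barriers is not available: "
         "no membarrier system call on this platform";
#endif
}
```

A caller that only warns when the flag was set explicitly could then append a hint such as "use -Xlog:os=info for details" to its warning, since the reason is always logged.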
------------- PR: https://git.openjdk.org/jdk/pull/12753 From mbaesken at openjdk.org Mon Feb 27 08:10:04 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 27 Feb 2023 08:10:04 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 06:21:14 GMT, David Holmes wrote: > So you avoid doing e.g. > > ``` > if (RewriteBytecodes CDS_ONLY(&& !UseSharedSpaces && !DumpSharedSpaces)) { > ``` > > and > > ``` > CDS_ONLY(assert(!UseSharedSpaces, "UsedSharedSpaces not supported with exploded module builds");) *** > ``` > > but what about the code in `Metaspace::global_initialize()`? Looks the the `INCLUDE_CDS` section there needs expanding. And `Universe::genesis` is a bit inconsistent about excluding or not. And `InstanceRefKlass::update_nonstatic_oop_maps` looks like a candidate too. And realistically we shouldn't even be compiling systemDictionaryShared.cpp in a non-CDS build, but that would take a lot of additional conditionalization. > Hi David, systemDictionaryShared.cpp is already not compiled in non-CDS mode, see hotspot/lib/JvmFeatures.gmk . Regarding the other places you mentioned, `InstanceRefKlass::update_nonstatic_oop_maps` , the UseSharedSpaces section there might indeed be guarded by `INCLUDE_CDS` - should I add this ? (on the other hand this section includes just asserts so I would expect that the whole code gets optimized out anyway in a product build. > I'm just not seeing that converting a handful of sites is really beneficial. The flags themselves are not normal flags defined for CDS_ONLY precisely because we wanted them available without having to conditionalize every single usage. I'm not looking to "block" this but I can't really endorse it either. > The conversion of a handful of sites is just an **addition** to the already rather common usage of the macro. Nothin new really ... 
> *** there is probably potential for adding a `cds_assert` macro to handle a bunch of CDS-only assertions. Maybe in a separate issue, what do you think ? Do you have some example places where to use it ? ------------- PR: https://git.openjdk.org/jdk/pull/12691 From stuefe at openjdk.org Mon Feb 27 08:31:07 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 27 Feb 2023 08:31:07 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: On Sat, 25 Feb 2023 12:27:36 GMT, Kim Barrett wrote: > I _think_ the proposed change will work essentially the same as the existing code. However, I recently found RaiseFailFastException https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception which is leading me to rethink all of this. Now if I can just figure out how to make sure WER is enabled. Looks promising, I was hoping there would be a solution like this. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From eosterlund at openjdk.org Mon Feb 27 08:37:08 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 27 Feb 2023 08:37:08 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> Message-ID: <6Cw7G_FRdEv2YGLGCXFqF4tRGDziP347NmgbqeUekc8=.6ea3586e-d663-423d-9e8f-5bcac25cb4e7@github.com> On Fri, 24 Feb 2023 15:24:53 GMT, Roberto Castañeda Lozano wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Move bs declaration up > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1223: > >> 1221: UnsafeCopyMemoryMark ucmm(this, !aligned, false, ucme_exit_pc); >> 1222: // Copy in multi-bytes chunks >> 1223: copy_bytes_forward(end_from, end_to, qword_count, rax, r10, L_copy_bytes,
L_copy_8_bytes, decorators, T_BYTE); > > Why is `setup_arg_regs_using_thread()` not needed to free `r10` as a temporary register in `generate_[conjoint|disjoint]_[byte|short]_copy()`? From the following comment, it sounds like it would be needed: > > https://github.com/openjdk/jdk/blob/7d8b8ba9c46475ededcae7db6c841b25fa83d167/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#L1193-L1195 We used to only have setup_arg_regs which saved windows callee saved registers in r10, which turned out to be horribly fragile (cf. https://bugs.openjdk.org/browse/JDK-8203466). The setup_arg_regs_using_thread() function was introduced specifically so that the windows callee save registers are *not* stored in r10, because it is a register you really don't want to rely on not getting scratched, when it is chosen as the rscratch1 register. Instead, callee save registers are saved in the thread memory area, which is much less fragile. That's why I can use r10. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Mon Feb 27 08:52:07 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 27 Feb 2023 08:52:07 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead.
Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extent, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). These cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case.
Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. I don't think we want this to be on by default on platforms where StoreLoad fences don't cause substantial global overheads. The benefit on such platforms is rather low, and needing the last couple of nanoseconds of transition speed, seems to not be a normal use case that default settings should optimize for. Conversely, the global synchronization can be rather intrusive, especially when it involves handshakes with N threads, and you need to perform global synchronization across the entire machine, for each thread poked. I would be much more afraid of that issue out of the box, than I would be afraid of a couple of nanoseconds slower native transitions. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From fyang at openjdk.org Mon Feb 27 08:56:08 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 27 Feb 2023 08:56:08 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> Message-ID: On Fri, 24 Feb 2023 12:50:42 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Move bs declaration up Updated riscv-specific change looks good. ------------- Marked as reviewed by fyang (Reviewer). PR: https://git.openjdk.org/jdk/pull/12670 From lucy at openjdk.org Mon Feb 27 09:03:49 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 27 Feb 2023 09:03:49 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v5] In-Reply-To: References: Message-ID: > This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. > > This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. > > Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. > > **Reviews and comments are very much appreciated.** > > @backwaterred could you please run some "official" s390 tests? Thanks.
Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: 8299817: split some excessively long lines in Test8299817.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11967/files - new: https://git.openjdk.org/jdk/pull/11967/files/0c8ced1e..6de4aef7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11967&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11967&range=03-04 Stats: 18 lines in 1 file changed: 10 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/11967.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11967/head:pull/11967 PR: https://git.openjdk.org/jdk/pull/11967 From lucy at openjdk.org Mon Feb 27 09:03:57 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 27 Feb 2023 09:03:57 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v4] In-Reply-To: References: Message-ID: On Sat, 25 Feb 2023 07:51:24 GMT, Martin Doerr wrote: >> Lutz Schmidt has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Merge master to resolve copyright conflict >> - 829817: fixed typos, removed JIT_TIMER references >> - 8299817: Update copyright >> - 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls > > test/hotspot/jtreg/compiler/codegen/aes/Test8299817.java line 67: > >> 65: public static void main(String[] args) throws Exception { >> 66: if (!DEBUG_MODE) { >> 67: if (!Compiler.isIntrinsicAvailable(CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION, "com.sun.crypto.provider.CounterMode", "implCrypt", byte[].class, int.class, int.class, byte[].class, int.class)) { > > Maybe break the large lines? Lines split. 
Thank you very much for your reviews, @TheRealMDoerr @offamitkumar @MBaesken ------------- PR: https://git.openjdk.org/jdk/pull/11967 From dholmes at openjdk.org Mon Feb 27 09:15:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 09:15:14 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: <-oG_7BAygWyAL0mDPAyiUEjg7bp501RSvqUPWiunuB4=.1238a230-212a-470e-bab9-12fa04719432@github.com> On Mon, 27 Feb 2023 08:06:56 GMT, Matthias Baesken wrote: > systemDictionaryShared.cpp is already not compiled in non-CDS mode, see hotspot/lib/JvmFeatures.gmk Sorry I must have misread something. > The conversion of a handful of sites is just an addition to the already rather common usage of the macro. Nothin new really ... Nothing new but it doesn't really add anything either; especially when there remains so much non-conditional usage. > Maybe in a separate issue, that do you think ? Do you have some example places where to use it ? Up to you. 
These are what I spotted mainly: ./share/oops/instanceClassLoaderKlass.hpp: InstanceClassLoaderKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } ./share/oops/instanceMirrorKlass.hpp: InstanceMirrorKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } ./share/oops/arrayKlass.hpp: ArrayKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for cds"); } ./share/oops/constantPool.hpp: ConstantPool() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } ./share/oops/instanceRefKlass.hpp: InstanceRefKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } ./share/oops/instanceStackChunkKlass.hpp: InstanceStackChunkKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } ./share/oops/klass.hpp: Klass() : _kind(KlassKind(-1)) { assert(DumpSharedSpaces || UseSharedSpaces, "only for cds"); } ./share/oops/instanceKlass.hpp: InstanceKlass() { assert(DumpSharedSpaces || UseSharedSpaces, "only for CDS"); } But looking closer these are all in constructors that should be CDS_ONLY. ------------- PR: https://git.openjdk.org/jdk/pull/12691 From jsjolen at openjdk.org Mon Feb 27 09:35:08 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 27 Feb 2023 09:35:08 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program [v3] In-Reply-To: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> References: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> Message-ID: On Mon, 27 Feb 2023 04:26:15 GMT, David Holmes wrote: >> After allocating: >> >> _state = new (std::nothrow) LineNumberProgramState(_header); >> >> we need to `delete _state` before returning. >> >> Testing: tiers 1-3 >> >> Thanks. 
> > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Revert unnecessary change Re-approving ------------- Marked as reviewed by jsjolen (Committer). PR: https://git.openjdk.org/jdk/pull/12738 From iklam at openjdk.org Mon Feb 27 10:11:06 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 27 Feb 2023 10:11:06 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Tue, 21 Feb 2023 14:42:38 GMT, Matthias Baesken wrote: > The cds only coding in hotspot is usually guarded with the INCLUDE_CDS macro so that it can be removed at compile time in case the correct configure flags are set. > However at some places INCLUDE_CDS is missing and should be added. > > One question - should (additionally to the UseSharedSpaces code section) the DumpSharedSpaces code sections be guarded as well with INCLUDE_CDS macros ? It may be more profitable (and less work) if we change these variables to `const` when CDS is disabled: extern bool DumpSharedSpaces; extern bool DynamicDumpSharedSpaces; extern bool RequireSharedSpaces; extern "C" { // Make sure UseSharedSpaces is accessible to the serviceability agent. extern JNIEXPORT jboolean UseSharedSpaces; } But some changes may be needed in SA. 
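The idea above, making the CDS flags `const` when CDS is compiled out, can be illustrated with a minimal sketch. It is a simplification under stated assumptions: the flags are plain `bool` here (the real `UseSharedSpaces` is an exported `jboolean`), `INCLUDE_CDS` is defaulted locally rather than set by the build, and `shared_class_count()` is a hypothetical function, not HotSpot code.

```cpp
// Hypothetical stand-in for the build-time switch; in HotSpot the real
// INCLUDE_CDS value comes from the build system.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 0
#endif

#if INCLUDE_CDS
// CDS built in: the flags are ordinary runtime variables.
bool UseSharedSpaces = false;
bool DumpSharedSpaces = false;
#else
// CDS compiled out: constant flags let the compiler fold guarded code away
// without an #if at every use site.
const bool UseSharedSpaces = false;
const bool DumpSharedSpaces = false;
#endif

// Illustrative function only: the guarded branch is dead code when the
// flags are compile-time constants.
int shared_class_count() {
  if (UseSharedSpaces || DumpSharedSpaces) {
    return 42;  // placeholder for CDS-only work
  }
  return 0;
}
```

With this shape, existing `if (UseSharedSpaces)` tests would not need individual `INCLUDE_CDS` guards, which seems to be the "less work" part of the suggestion; the Serviceability Agent caveat mentioned above is not addressed by the sketch.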
------------- PR: https://git.openjdk.org/jdk/pull/12691 From rcastanedalo at openjdk.org Mon Feb 27 10:21:11 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Mon, 27 Feb 2023 10:21:11 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: <6Cw7G_FRdEv2YGLGCXFqF4tRGDziP347NmgbqeUekc8=.6ea3586e-d663-423d-9e8f-5bcac25cb4e7@github.com> References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> <6Cw7G_FRdEv2YGLGCXFqF4tRGDziP347NmgbqeUekc8=.6ea3586e-d663-423d-9e8f-5bcac25cb4e7@github.com> Message-ID: On Mon, 27 Feb 2023 08:34:35 GMT, Erik Österlund wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1223: >> >>> 1221: UnsafeCopyMemoryMark ucmm(this, !aligned, false, ucme_exit_pc); >>> 1222: // Copy in multi-bytes chunks >>> 1223: copy_bytes_forward(end_from, end_to, qword_count, rax, r10, L_copy_bytes, L_copy_8_bytes, decorators, T_BYTE); >> >> Why is `setup_arg_regs_using_thread()` not needed to free `r10` as a temporary register in `generate_[conjoint|disjoint]_[byte|short]_copy()`? From the following comment, it sounds like it would be needed: >> >> https://github.com/openjdk/jdk/blob/7d8b8ba9c46475ededcae7db6c841b25fa83d167/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#L1193-L1195 > > We used to only have setup_arg_regs which saved windows callee saved registers in r10, which turned out to be horribly fragile (cf. https://bugs.openjdk.org/browse/JDK-8203466). The setup_arg_regs_using_thread() function was introduced specifically so that the windows callee save registers are *not* stored in r10, because it is a register you really don't want to rely on not getting scratched, when it is chosen as the rscratch1 register. Instead, callee save registers are saved in the thread memory area, which is much less fragile. That's why I can use r10. Thanks for the explanation, Erik!
If I got it right, in this patch, implementations of `copy_[load|store]_at()` can only use `r10` safely (`Register tmp`, or `Register tmp2` for vector `copy_store_at()`) if `BasicType type == T_OBJECT`, because only in this context they will be called between `setup_arg_regs_using_thread()` and `restore_arg_regs_using_thread()` (as opposed to regular `setup_arg_regs()`/`restore_arg_regs()`). If this is correct and expected, I suggest to document it somewhere, e.g. as a comment in the declaration of `copy_[load|store]_at()`, to avoid misinterpretations. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From kbarrett at openjdk.org Mon Feb 27 10:23:20 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:23:20 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting [v2] In-Reply-To: References: Message-ID: <5b4uPq85mcZFPpETxZJu32nvM4dKf27GbqbmZN9pKJQ=.a565bed4-c01b-4579-90ca-b718d9eb9193@github.com> On Thu, 23 Feb 2023 09:20:43 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The core of VM crash reporting has been renamed and privatized, and now takes >> an additional argument indicating whether the caller just wants reporting or >> also wants termination after reporting. The Windows top-level and unhandled >> exception filters call into that to request reporting, and also termination if >> UseOSErrorReporting is false, otherwise returning from the call and taking >> further action. All other callers of VM crash reporting include termination. 
>> >> So if we get a structured exception on Windows and the option is true then we >> report it and then end up walking up the exception handlers until running out, >> when Windows crash behavior gets a shot. This is essentially the same as >> before the change, just checking the option in a different place (in the >> Windows-specific caller, rather than in an #ifdef block in the shared code). >> >> In addition, the core VM crash reporter uses BREAKPOINT before termination if >> UseOSErrorReporting is true. So if we have an assertion failure we report it >> and then end up walking up the exception handlers (because of the breakpoint >> exception) until running out, when Windows crash behavior gets a shot. >> >> So if there is an attached debugger (or JIT debugging is enabled), this leads >> an assertion failure to hit the debugger inside the crash reporter rather than >> after it returned (due to the old behavior for UseOSErrorReporting) from the >> failed assertion (hitting the BREAKPOINT there). >> >> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint >> exception, which ends up being unhandled if UseOSErrorReporting is true. >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify expected behavior that is the >> same as before the change. 
>> >> # -XX:ErrorHandlerTest=N >> # 1: assertion failure >> # 2: guarantee failure >> # 14: SIGSEGV >> # 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> I need someone with a better understanding of Windows to test the interaction >> with Windows Error Reporting (WER). > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > check after_report_action Withdrawn - superseded by https://github.com/openjdk/jdk/pull/12759 ------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Mon Feb 27 10:23:22 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:23:22 GMT Subject: Withdrawn: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The core of VM crash reporting has been renamed and privatized, and now takes > an additional argument indicating whether the caller just wants reporting or > also wants termination after reporting. The Windows top-level and unhandled > exception filters call into that to request reporting, and also termination if > UseOSErrorReporting is false, otherwise returning from the call and taking > further action. 
All other callers of VM crash reporting include termination. > > So if we get a structured exception on Windows and the option is true then we > report it and then end up walking up the exception handlers until running out, > when Windows crash behavior gets a shot. This is essentially the same as > before the change, just checking the option in a different place (in the > Windows-specific caller, rather than in an #ifdef block in the shared code). > > In addition, the core VM crash reporter uses BREAKPOINT before termination if > UseOSErrorReporting is true. So if we have an assertion failure we report it > and then end up walking up the exception handlers (because of the breakpoint > exception) until running out, when Windows crash behavior gets a shot. > > So if there is an attached debugger (or JIT debugging is enabled), this leads > an assertion failure to hit the debugger inside the crash reporter rather than > after it returned (due to the old behavior for UseOSErrorReporting) from the > failed assertion (hitting the BREAKPOINT there). > > Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint > exception, which ends up being unhandled if UseOSErrorReporting is true. > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify expected behavior that is the > same as before the change. 
> > # -XX:ErrorHandlerTest=N > # 1: assertion failure > # 2: guarantee failure > # 14: SIGSEGV > # 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > I need someone with a better understanding of Windows to test the interaction > with Windows Error Reporting (WER). This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12651 From kbarrett at openjdk.org Mon Feb 27 10:26:49 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:26:49 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting Message-ID: Please review this change to the implementation of the Windows-specific option UseOSErrorReporting, toward allowing crash reporting functions to be declared noreturn. VMError::report_and_die no longer conditionally returns if the Windows-only option UseOSErrorReporting is true. The Windows-only sections of report_and_die now call RaiseFailFastException (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), which immediately invokes WER (Windows Error Reporting) if it is enabled, without executing structured exception handlers. If WER is not enabled, it just immediately terminates the program. Thus, we no longer return to walk up the structured exception handler chain to pop out at the top as unhandled in order to invoke WER. This permits declaring report_and_die as [[noreturn]], once some functions from the os class are also so declared.
Also adding that attribute as appropriate to other functions in the os class. This of course assumes the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). There is a pre-existing bug that I'll be reporting separately. If UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. Testing: mach5 tier1-3 Manual testing with the following, to verify desired behavior. -XX:ErrorHandlerTest=N 1: assertion failure 2: guarantee failure 14: SIGSEGV 15: divide by zero path/to/bin/java \ -XX:+UnlockDiagnosticVMOptions \ -XX:+ErrorLogSecondaryErrorDetails \ -XX:+UseOSErrorReporting \ -XX:ErrorHandlerTest=1 \ TestDebug.java --- TestDebug.java --- import java.lang.String; public class TestDebug { static private volatile String dummy; public static void main(String[] args) throws Exception { while (true) { dummy = new String("foo bar"); } } } --- end TestDebug.java --- The state of WER can be examined and modified using Power Shell commands {Get,Enable,Disable}-WindowsErrorReporting. The state of reporting WER captured errors can be examined and modified using Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] With Report Problems off, reports are placed in c:\ProgramData\Microsoft\Windows\WER\ReportArchive I verified that executing the above test with WER enabled adds an entry in that directory, but not when it's disabled. Also nothing is added there when the test is run with -XX:-UseOSErrorReporting. 
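As a rough illustration of what declaring a crash reporter `[[noreturn]]` buys, here is a minimal portable sketch. It is not the HotSpot implementation: `report_and_die_sketch` and `parse_digit` are hypothetical names, and `std::abort()` merely stands in for whatever platform-specific termination path the real code takes.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical reporter standing in for a crash-reporting function that
// never returns to its caller.
[[noreturn]] void report_and_die_sketch(const char* message) {
  std::fprintf(stderr, "fatal: %s\n", message);
  std::abort();  // stand-in for the platform-specific termination path
}

// Because the reporter is [[noreturn]], the compiler knows control cannot
// fall off the end of this function, so no dummy return value is needed
// and no missing-return warning should be issued.
int parse_digit(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  }
  report_and_die_sketch("not a digit");
}
```

If the reporter could conditionally return, as the old UseOSErrorReporting behavior allowed, the attribute would be incorrect, which is why the conditional return has to go first.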
------------- Commit messages: - failfast Changes: https://git.openjdk.org/jdk/pull/12759/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12759&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302798 Stats: 48 lines in 5 files changed: 22 ins; 2 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/12759.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12759/head:pull/12759 PR: https://git.openjdk.org/jdk/pull/12759 From kbarrett at openjdk.org Mon Feb 27 10:49:34 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:49:34 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 17:26:57 GMT, Kim Barrett wrote: >> Ah, I see, thanks for the clarification behind the location of the attribute changing where it applies to, that was something I'd forgotten. I'd argue that eventually biting the bullet and changing the attribute positions that we already have under macros (can be done in one fell swoop honestly) is a net gain since they now have well defined behaviour according to C++ and we know what they apply to exactly, but I don't know how much you'd agree on that point >> >> Speaking of moving to C++17 though, what about C++20? It's the C++ version that Visual C++ gives access to the noinline and forceinline attributes which are guaranteed to not inline or always inline, as opposed to what we have now, which globalDefinitions_visCPP notes that we are uncertain whether they actually work, and while it's probably farfetched to jump an entire version of C++ just for that, I found it pretty interesting and wondered what you'd have to say about that > > Moving to a newer C++ language version requires compiler support, for all > supported OpenJDK ports (*). Only recently has the xlclang compiler used by > the aix-ppc port had a version that supported C++17. (I think it was released > last summer/fall.) 
I don't know when they might release a version supporting > C++20. Use of that new version is being investigated by the port maintainers. > I've done a little bit of peeking ahead to C++17 as well, and haven't yet run > into any problems with the Oracle-supported ports (which is what I expected). > > (*) Non-open licensee ports also need to be considered, though I'm not sure > what the requirements are there. I know of one such that didn't yet have > complete C++14 support when we adopted it for OpenJDK. > I've no idea how standard attributes and gcc `__attribute__` or msvc `__declspec` interact syntactically. My hope would be that they can interleave, but I have no idea whether that's true or not. I guess some research is needed, though I think it doesn't matter for the current set. (I doubt anyone is going to declare a function both noreturn and always-inline, for example.) I did some experimenting, and it looks like they interleave without any problems. And I even found a potential use-case for noreturn + always-inline. :) ------------- PR: https://git.openjdk.org/jdk/pull/12507 From kbarrett at openjdk.org Mon Feb 27 10:53:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:53:29 GMT Subject: RFR: 8274400: HotSpot Style Guide should permit use of alignof [v7] In-Reply-To: References: <3IdeZGaGcFB1-DbVugn0P9v8vRQMLURzHFRSpAm46NA=.09bf4dd8-b652-49c3-b9a5-ad3f58dfbcc3@github.com> Message-ID: On Tue, 24 Jan 2023 09:21:37 GMT, Julian Waters wrote: >> The alignof operator was added by C++11. It returns the alignment for the given type. Various metaprogramming usages exist, in particular when using std::aligned_storage. Use of this operator should be permitted in HotSpot code. > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Oof This has been open and stable for a while. Anyone else? 
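For readers unfamiliar with the operator under discussion, the following is a minimal sketch of a typical `alignof` use: sizing and aligning raw storage for a later placement-new. `Payload`, `PayloadStorage`, and `round_trip_tag` are hypothetical names chosen for illustration, and the sketch uses `alignas` directly rather than `std::aligned_storage`.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical payload type used only for illustration.
struct Payload {
  double value;
  int tag;
};

// Raw buffer whose size and alignment are suitable for one Payload.
struct PayloadStorage {
  alignas(alignof(Payload)) unsigned char bytes[sizeof(Payload)];
};

static_assert(alignof(PayloadStorage) == alignof(Payload),
              "storage must be aligned like Payload");

// Constructs a Payload inside the buffer, reads a field back, destroys it.
int round_trip_tag(int tag) {
  PayloadStorage storage;
  Payload* p = new (storage.bytes) Payload{1.5, tag};
  // The buffer address satisfies the alignment the type requires.
  assert(reinterpret_cast<std::uintptr_t>(p) % alignof(Payload) == 0);
  int result = p->tag;
  p->~Payload();
  return result;
}
```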
------------- PR: https://git.openjdk.org/jdk/pull/11761 From kbarrett at openjdk.org Mon Feb 27 10:54:22 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 10:54:22 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. 
A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This has been open and stable for a while. Anyone else? ------------- PR: https://git.openjdk.org/jdk/pull/12507 From tschatzl at openjdk.org Mon Feb 27 11:09:12 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 27 Feb 2023 11:09:12 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program [v3] In-Reply-To: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> References: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> Message-ID: On Mon, 27 Feb 2023 04:26:15 GMT, David Holmes wrote: >> After allocating: >> >> _state = new (std::nothrow) LineNumberProgramState(_header); >> >> we need to `delete _state` before returning. >> >> Testing: tiers 1-3 >> >> Thanks. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Revert unnecessary change Marked as reviewed by tschatzl (Reviewer). 
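The leak pattern quoted above (allocate with `new (std::nothrow)`, then return without `delete`) can be sketched as follows. This is a hypothetical reduction, not the actual DwarfFile code: `ProgramState` and `run_program` are invented names, and the live-instance counter exists only to make the cleanup observable.

```cpp
#include <new>

// Hypothetical state object; counts live instances so cleanup is visible.
struct ProgramState {
  static int live;
  ProgramState() { ++live; }
  ~ProgramState() { --live; }
};
int ProgramState::live = 0;

// Hypothetical runner loosely modeled on the pattern described above.
bool run_program(bool fail_midway) {
  ProgramState* state = new (std::nothrow) ProgramState();
  if (state == nullptr) {
    return false;  // allocation failed, nothing to free
  }
  bool ok = !fail_midway;  // placeholder for the real work
  delete state;            // single exit after allocation frees on every path
  return ok;
}
```

Funneling both the success and the failure outcome through one `delete` before the return is one way to make sure no early exit skips the cleanup.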
------------- PR: https://git.openjdk.org/jdk/pull/12738 From eosterlund at openjdk.org Mon Feb 27 11:22:09 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 27 Feb 2023 11:22:09 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v5] In-Reply-To: References: <5RGCEzj3v0gBoof-SP1GKTWeg3_Bxp9PO1ESBzMjkss=.6fc143f9-e9e5-498c-9550-f302d6d89213@github.com> <6Cw7G_FRdEv2YGLGCXFqF4tRGDziP347NmgbqeUekc8=.6ea3586e-d663-423d-9e8f-5bcac25cb4e7@github.com> Message-ID: On Mon, 27 Feb 2023 10:18:22 GMT, Roberto Castañeda Lozano wrote: >> We used to only have setup_arg_regs which saved windows callee saved registers in r10, which turned out to be horribly fragile (cf. https://bugs.openjdk.org/browse/JDK-8203466). The setup_arg_regs_using_thread() function was introduced specifically so that the windows callee save registers are *not* stored in r10, because it is a register you really don't want to rely on not getting scratched, when it is chosen as the rscratch1 register. Instead, callee save registers are saved in the thread memory area, which is much less fragile. That's why I can use r10. > > Thanks for the explanation, Erik! If I got it right, in this patch, implementations of `copy_[load|store]_at()` can only use `r10` safely (`Register tmp`, or `Register tmp2` for vector `copy_store_at()`) if `BasicType type == T_OBJECT`, because only in this context they will be called between `setup_arg_regs_using_thread()` and `restore_arg_regs_using_thread()` (as opposed to regular `setup_arg_regs()`/`restore_arg_regs()`). If this is correct and expected, I suggest to document it somewhere, e.g. as a comment in the declaration of `copy_[load|store]_at()`, to avoid misinterpretations. Good point. I'll add a comment.
------------- PR: https://git.openjdk.org/jdk/pull/12670 From eosterlund at openjdk.org Mon Feb 27 11:28:40 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 27 Feb 2023 11:28:40 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: References: Message-ID: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> > So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12670/files - new: https://git.openjdk.org/jdk/pull/12670/files/1feedf18..8d2b8f41 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12670&range=04-05 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12670.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12670/head:pull/12670 PR: https://git.openjdk.org/jdk/pull/12670 From jwaters at openjdk.org Mon Feb 27 11:49:12 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 27 Feb 2023 11:49:12 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code.
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Marked as reviewed by jwaters (Committer). 
Not a HotSpot Group member, but have my thumbs up anyway haha ------------- PR: https://git.openjdk.org/jdk/pull/12507 From jwaters at openjdk.org Mon Feb 27 11:49:14 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 27 Feb 2023 11:49:14 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 10:46:17 GMT, Kim Barrett wrote: >> Moving to a newer C++ language version requires compiler support, for all >> supported OpenJDK ports (*). Only recently has the xlclang compiler used by >> the aix-ppc port had a version that supported C++17. (I think it was released >> last summer/fall.) I don't know when they might release a version supporting >> C++20. Use of that new version is being investigated by the port maintainers. >> I've done a little bit of peeking ahead to C++17 as well, and haven't yet run >> into any problems with the Oracle-supported ports (which is what I expected). >> >> (*) Non-open licensee ports also need to be considered, though I'm not sure >> what the requirements are there. I know of one such that didn't yet have >> complete C++14 support when we adopted it for OpenJDK. > >> I've no idea how standard attributes and gcc `__attribute__` or msvc `__declspec` interact syntactically. My hope would be that they can interleave, but I have no idea whether that's true or not. I guess some research is needed, though I think it doesn't matter for the current set. (I doubt anyone is going to declare a function both noreturn and always-inline, for example.) > > I did some experimenting, and it looks like they interleave without any problems. And I even found a potential > use-case for noreturn + always-inline. :) Ah, that's interesting. If you don't mind, what would said combination end up doing? I'm a little curious :P It's frustrating that C++20 is still going to have to wait for xlc though. 
Guess we'll have to wait for longer before we can use those guaranteed inline/noinline Visual C++ attributes :( ------------- PR: https://git.openjdk.org/jdk/pull/12507 From lucy at openjdk.org Mon Feb 27 11:50:18 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 27 Feb 2023 11:50:18 GMT Subject: RFR: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls [v5] In-Reply-To: References: Message-ID: <2ePC020NWUVyA4CrP1NirZAZ5UXZRV5VhYkdWReWaAQ=.46649f78-6f94-40f4-80c2-6873120b637c@github.com> On Mon, 27 Feb 2023 09:03:49 GMT, Lutz Schmidt wrote: >> This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. >> >> This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. >> >> Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. >> >> **Reviews and comments are very much appreciated.** >> >> @backwaterred could you please run some "official" s390 tests? Thanks. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > 8299817: split some excessively long lines in Test8299817.java Build and test failures seem unrelated to this s390-only fix.
------------- PR: https://git.openjdk.org/jdk/pull/11967 From dholmes at openjdk.org Mon Feb 27 12:24:08 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 12:24:08 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 10:19:50 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The Windows-only sections of report_and_die now call RaiseFailFastException > (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), > which immediately invokes WER (Windows Error Reporting) if it is enabled, > without executing structured exception handlers. If WER is not enabled, it > just immediately terminates the program. Thus, we no longer return to walk up > the structured exception handler chain to pop out at the top as unhandled in > order to invoke WER. > > This permits declaring report_and_die as [[noreturn]], once some functions > from the os class are also so declared. Also adding that attribute as > appropriate to other functions in the os class. This of course assumes > the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify desired behavior.
> > -XX:ErrorHandlerTest=N > 1: assertion failure > 2: guarantee failure > 14: SIGSEGV > 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > The state of WER can be examined and modified using Power Shell commands > {Get,Enable,Disable}-WindowsErrorReporting. > > The state of reporting WER captured errors can be examined and modified using > Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] > > With Report Problems off, reports are placed in > c:\ProgramData\Microsoft\Windows\WER\ReportArchive > > I verified that executing the above test with WER enabled adds an entry in > that directory, but not when it's disabled. Also nothing is added there when > the test is run with -XX:-UseOSErrorReporting. So ... this approach skips the VM error reporting and goes straight to WER? ------------- PR: https://git.openjdk.org/jdk/pull/12759 From dholmes at openjdk.org Mon Feb 27 12:31:04 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 12:31:04 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 12:20:49 GMT, David Holmes wrote: > So ... this approach skips the VM error reporting and goes straight to WER? Strike that - no it doesn't. Okay I have to admit I'm not at all clear on what the existing unwinding process actually looks like, and how the new fix affects that. 
------------- PR: https://git.openjdk.org/jdk/pull/12759 From jwaters at openjdk.org Mon Feb 27 13:05:11 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 27 Feb 2023 13:05:11 GMT Subject: RFR: 8274400: HotSpot Style Guide should permit use of alignof [v7] In-Reply-To: References: <3IdeZGaGcFB1-DbVugn0P9v8vRQMLURzHFRSpAm46NA=.09bf4dd8-b652-49c3-b9a5-ad3f58dfbcc3@github.com> Message-ID: On Mon, 27 Feb 2023 10:50:43 GMT, Kim Barrett wrote: > This has been open and stable for a while. Anyone else? Also if there is anyone else coming to review this, consider looking at https://github.com/openjdk/jdk/pull/11431 as well, it's also been collecting dust for quite a while too ;-; ------------- PR: https://git.openjdk.org/jdk/pull/11761 From chagedorn at openjdk.org Mon Feb 27 13:08:08 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 27 Feb 2023 13:08:08 GMT Subject: RFR: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program [v3] In-Reply-To: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> References: <1FV7AUQOJ7E7rWIrHXw-TVvouTWxDnr4v_miwkYYtKE=.47c0dcbc-3d58-4431-8830-aa279e63605a@github.com> Message-ID: On Mon, 27 Feb 2023 04:26:15 GMT, David Holmes wrote: >> After allocating: >> >> _state = new (std::nothrow) LineNumberProgramState(_header); >> >> we need to `delete _state` before returning. >> >> Testing: tiers 1-3 >> >> Thanks. > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Revert unnecessary change Looks good, thanks for fixing this! ------------- Marked as reviewed by chagedorn (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12738 From kvn at openjdk.org Mon Feb 27 18:31:15 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 27 Feb 2023 18:31:15 GMT Subject: RFR: 8274400: HotSpot Style Guide should permit use of alignof [v7] In-Reply-To: References: <3IdeZGaGcFB1-DbVugn0P9v8vRQMLURzHFRSpAm46NA=.09bf4dd8-b652-49c3-b9a5-ad3f58dfbcc3@github.com> Message-ID: On Tue, 24 Jan 2023 09:21:37 GMT, Julian Waters wrote: >> The alignof operator was added by C++11. It returns the alignment for the given type. Various metaprogramming usages exist, in particular when using std::aligned_storage. Use of this operator should be permitted in HotSpot code. > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Oof Approved. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.org/jdk/pull/11761 From kvn at openjdk.org Mon Feb 27 19:04:25 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 27 Feb 2023 19:04:25 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. 
There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Approved. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.org/jdk/pull/12507 From rkennke at openjdk.org Mon Feb 27 20:17:33 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 27 Feb 2023 20:17:33 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v25] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
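The offset arithmetic in the description above can be sketched as follows. This is only an illustrative model, not the HotSpot implementation: `align_up`, `word_aligned_base`, and `element_aligned_base` are hypothetical names, and the 8-byte HeapWord and 4-byte length field are assumptions for a 64-bit VM.

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Assumptions for this sketch: 64-bit VM, 4-byte array length field.
const std::size_t kHeapWordSize    = 8;
const std::size_t kLengthFieldSize = 4;

// Current behaviour as described: the first element always starts
// on a HeapWord boundary after the length field.
inline std::size_t word_aligned_base(std::size_t length_offset) {
  return align_up(length_offset + kLengthFieldSize, kHeapWordSize);
}

// Behaviour modeled for the proposal: the first element only needs
// the alignment of its own element type.
inline std::size_t element_aligned_base(std::size_t length_offset,
                                        std::size_t element_size) {
  return align_up(length_offset + kLengthFieldSize, element_size);
}
```

With the length field at offset 8, as in the Lilliput layout mentioned above, `word_aligned_base(8)` yields 16 while `element_aligned_base(8, 4)` yields 12, which is the gap between offsets 12 and 16. For 8-byte elements both functions agree, so nothing changes for those arrays in this model.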
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge remote-tracking branch 'origin/JDK-8139457' into JDK-8139457 - Some renames ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/017a836b..7d1daeab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=23-24 Stats: 15 lines in 2 files changed: 2 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From dholmes at openjdk.org Mon Feb 27 20:54:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 20:54:25 GMT Subject: RFR: 8274400: HotSpot Style Guide should permit use of alignof [v7] In-Reply-To: References: <3IdeZGaGcFB1-DbVugn0P9v8vRQMLURzHFRSpAm46NA=.09bf4dd8-b652-49c3-b9a5-ad3f58dfbcc3@github.com> Message-ID: On Tue, 24 Jan 2023 09:21:37 GMT, Julian Waters wrote: >> The alignof operator was added by C++11. It returns the alignment for the given type. Various metaprogramming usages exist, in particular when using std::aligned_storage. Use of this operator should be permitted in HotSpot code. > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Oof Marked as reviewed by dholmes (Reviewer). 
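As a rough illustration of the kind of use the alignof proposal has in mind, the sketch below reserves raw storage for a type and constructs an object in it. `Payload`, `RawStorage`, and `is_aligned_for` are hypothetical names invented for this example; it uses `alignas`/`alignof` directly rather than `std::aligned_storage`, and it is not taken from HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical payload type, used only for illustration.
struct Payload {
  double weight;
  char tag;
};

// True if pointer `p` is a multiple of `alignment`.
inline bool is_aligned_for(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Raw, uninitialized storage that is large enough and suitably
// aligned to hold exactly one T.
template <typename T>
struct RawStorage {
  alignas(alignof(T)) unsigned char bytes[sizeof(T)];

  // Copy-construct a T inside the buffer with placement new.
  T* construct(const T& value) {
    assert(is_aligned_for(bytes, alignof(T)));
    return ::new (static_cast<void*>(bytes)) T(value);
  }
};
```

A caller might write `RawStorage<Payload> slot; Payload* p = slot.construct(Payload{1.5, 'x'});`. Because `Payload` is trivially destructible, no explicit destructor call is needed in this sketch; a type with a non-trivial destructor would need one.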
------------- PR: https://git.openjdk.org/jdk/pull/11761 From rkennke at openjdk.org Mon Feb 27 20:54:58 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 27 Feb 2023 20:54:58 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v26] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Fix comment in s390 - Rename header_size* -> base_offset* in arm ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/7d1daeab..60787563 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=24-25 Stats: 7 lines in 3 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From dholmes at openjdk.org Mon Feb 27 21:42:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 21:42:38 GMT Subject: RFR: 8303068: Memory leak in 
DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Sun, 26 Feb 2023 10:30:51 GMT, Johan Sjölén wrote: >> After allocating: >> >> _state = new (std::nothrow) LineNumberProgramState(_header); >> >> we need to `delete _state` before returning. >> >> Testing: tiers 1-3 >> >> Thanks. > > #else > DWARF_LOG_DEBUG("Address: Line: Column: File:"); > #endif > - _state = new (std::nothrow) LineNumberProgramState(_header); > + LineNumberProgramState state{_header}; > + _state = &state; > if (_state == nullptr) { > DWARF_LOG_ERROR("Failed to create new LineNumberProgramState object"); > return false; > > > Isn't this equivalent? Thanks for the reviews @jdksjolen , @tschatzl and @chhagedorn ! ------------- PR: https://git.openjdk.org/jdk/pull/12738 From dholmes at openjdk.org Mon Feb 27 21:42:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 27 Feb 2023 21:42:41 GMT Subject: Integrated: 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program In-Reply-To: References: Message-ID: On Fri, 24 Feb 2023 04:48:53 GMT, David Holmes wrote: > After allocating: > > _state = new (std::nothrow) LineNumberProgramState(_header); > > we need to `delete _state` before returning. > > Testing: tiers 1-3 > > Thanks. This pull request has now been integrated. 
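The two options discussed in this thread (heap allocation with an explicit delete, or a stack object as suggested in review) can be contrasted in a small sketch. `StateSketch`, `live_states`, and both functions are hypothetical stand-ins; the real `LineNumberProgramState` and `run_line_number_program` are not reproduced here.

```cpp
#include <cassert>
#include <new>

// Counts live StateSketch objects so a leak would be observable.
static int live_states = 0;

// Hypothetical stand-in for LineNumberProgramState.
struct StateSketch {
  explicit StateSketch(int header) : header(header) { ++live_states; }
  ~StateSketch() { --live_states; assert(live_states >= 0); }
  int header;
};

// Heap variant: new (std::nothrow) returns nullptr on allocation
// failure, so the null check is meaningful, but the object must be
// deleted before every subsequent return.
bool run_with_heap_state(int header) {
  StateSketch* state = new (std::nothrow) StateSketch(header);
  if (state == nullptr) {
    return false;
  }
  bool ok = state->header >= 0;  // placeholder for the real work
  delete state;                  // without this, the object leaks
  return ok;
}

// Stack variant, in the spirit of the review suggestion: the
// destructor runs on every return path and no null check applies.
bool run_with_stack_state(int header) {
  StateSketch state{header};
  return state.header >= 0;      // placeholder for the real work
}
```

In this sketch `live_states` is back at zero after either function returns. One difference worth noting is that the stack variant cannot fail to allocate, so a null check on its address would be dead code.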
Changeset: f7f10367 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/f7f10367b2169f9e10f79b430acb450aabb5dcb6 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8303068: Memory leak in DwarfFile::LineNumberProgram::run_line_number_program Reviewed-by: jsjolen, tschatzl, chagedorn ------------- PR: https://git.openjdk.org/jdk/pull/12738 From coleenp at openjdk.org Mon Feb 27 22:00:28 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 27 Feb 2023 22:00:28 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: <6VenuisSudRu06LiySSGslFxVuvMh9GpY1elHkahExU=.5b06f61d-7635-48b0-b139-da102ceb2fcf@github.com> On Mon, 27 Feb 2023 10:19:50 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The Windows-only sections of report_and_die now call RaiseFailFastException > (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), > which immediately invokes WER (Windows Error Reporting) if it is enabled, > without executing structured exception handlers. If WER is not enabled, it > just immediately terminates the program. Thus, we no longer return to walk up > the structured exception handler chain to pop out at the top as unhandled in > order to invoke WER. > > This permits declaring report_and_die as [[noreturn]], once some functions > from the os class are also so declared. Also adding that attribute as > appropriate to other functions in the os class. This of course assumes > the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). > > There is a pre-existing bug that I'll be reporting separately. 
If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify desired behavior. > > -XX:ErrorHandlerTest=N > 1: assertion failure > 2: guarantee failure > 14: SIGSEGV > 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > The state of WER can be examined and modified using Power Shell commands > {Get,Enable,Disable}-WindowsErrorReporting. > > The state of reporting WER captured errors can be examined and modified using > Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] > > With Report Problems off, reports are placed in > c:\ProgramData\Microsoft\Windows\WER\ReportArchive > > I verified that executing the above test with WER enabled adds an entry in > that directory, but not when it's disabled. Also nothing is added there when > the test is run with -XX:-UseOSErrorReporting. This is a nice solution and thank you for going the extra step of verifying that the code supports WER. src/hotspot/share/utilities/vmError.hpp line 180: > 178: > 179: [[noreturn]] > 180: ATTRIBUTE_PRINTF(6, 0) I thought you were going to have to take BREAKPOINT out of vmassert in order to make report_and_die [[noreturn]] ? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12759 From kbarrett at openjdk.org Mon Feb 27 23:14:06 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 23:14:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: <6VenuisSudRu06LiySSGslFxVuvMh9GpY1elHkahExU=.5b06f61d-7635-48b0-b139-da102ceb2fcf@github.com> References: <6VenuisSudRu06LiySSGslFxVuvMh9GpY1elHkahExU=.5b06f61d-7635-48b0-b139-da102ceb2fcf@github.com> Message-ID: On Mon, 27 Feb 2023 21:56:59 GMT, Coleen Phillimore wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The Windows-only sections of report_and_die now call RaiseFailFastException >> (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), >> which immediately invokes WER (Windows Error Reporting) if it is enabled, >> without executing structured exception handlers. If WER is not enabled, it >> just immediately terminates the program. Thus, we no longer return to walk up >> the structured exception handler chain to pop out at the top as unhandled in >> order to invoke WER. >> >> This permits declaring report_and_die as [[noreturn]], once some functions >> from the os class are also so declared. Also adding that attribute as >> appropriate to other functions in the os class. This of course assumes >> the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify desired behavior. 
>> >> -XX:ErrorHandlerTest=N >> 1: assertion failure >> 2: guarantee failure >> 14: SIGSEGV >> 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> The state of WER can be examined and modified using Power Shell commands >> {Get,Enable,Disable}-WindowsErrorReporting. >> >> The state of reporting WER captured errors can be examined and modified using >> Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] >> >> With Report Problems off, reports are placed in >> c:\ProgramData\Microsoft\Windows\WER\ReportArchive >> >> I verified that executing the above test with WER enabled adds an entry in >> that directory, but not when it's disabled. Also nothing is added there when >> the test is run with -XX:-UseOSErrorReporting. > > src/hotspot/share/utilities/vmError.hpp line 180: > >> 178: >> 179: [[noreturn]] >> 180: ATTRIBUTE_PRINTF(6, 0) > > I thought you were going to have to take BREAKPOINT out of vmassert in order to make report_and_die [[noreturn]] ? vmassert and friends are not noreturn yet, because of `Debugging`. That's the next change in my local queue. 
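To illustrate why a `[[noreturn]]` error reporter matters to callers, here is a minimal portable sketch. `fatal_sketch`, `Unit`, and `unit_size` are hypothetical names; the sketch simply aborts and does not model `VMError::report_and_die`, WER, or `RaiseFailFastException`.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a fatal-error reporter. The attribute
// tells the compiler that control never comes back to the caller.
[[noreturn]] void fatal_sketch(const char* message) {
  assert(message != nullptr);
  std::fprintf(stderr, "fatal: %s\n", message);
  std::abort();
}

enum class Unit { Bytes, Words };

// Every enumerator returns from inside the switch. Because
// fatal_sketch is [[noreturn]], no dummy return statement is needed
// after it to satisfy "control reaches end of non-void function"
// style diagnostics.
int unit_size(Unit unit) {
  switch (unit) {
    case Unit::Bytes: return 1;
    case Unit::Words: return 8;
  }
  fatal_sketch("unknown Unit");
}
```

If the reporter could conditionally return, as the old UseOSErrorReporting path did according to the description above, the attribute could not be applied and callers like `unit_size` would need an unreachable fallback return.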
------------- PR: https://git.openjdk.org/jdk/pull/12759 From kbarrett at openjdk.org Mon Feb 27 23:35:06 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 27 Feb 2023 23:35:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 10:19:50 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The Windows-only sections of report_and_die now call RaiseFailFastException > (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), > which immediately invokes WER (Windows Error Reporting) if it is enabled, > without executing structured exception handlers. If WER is not enabled, it > just immediately terminates the program. Thus, we no longer return to walk up > the structured exception handler chain to pop out at the top as unhandled in > order to invoke WER. > > This permits declaring report_and_die as [[noreturn]], once some functions > from the os class are also so declared. Also adding that attribute as > appropriate to other functions in the os class. This of course assumes > the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. > > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify desired behavior. 
> > -XX:ErrorHandlerTest=N > 1: assertion failure > 2: guarantee failure > 14: SIGSEGV > 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > The state of WER can be examined and modified using Power Shell commands > {Get,Enable,Disable}-WindowsErrorReporting. > > The state of reporting WER captured errors can be examined and modified using > Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] > > With Report Problems off, reports are placed in > c:\ProgramData\Microsoft\Windows\WER\ReportArchive > > I verified that executing the above test with WER enabled adds an entry in > that directory, but not when it's disabled. Also nothing is added there when > the test is run with -XX:-UseOSErrorReporting. src/hotspot/share/utilities/vmError.cpp line 1448: > 1446: if (UseOSErrorReporting && log_done) { > 1447: raise_fail_fast(_siginfo, _context); > 1448: } I think this bit might not be needed anymore. I think the reason for the old code here is to deal with the repeated calls to report_and_die as we walk up the Windows exception filter list and return because of UseOSErrorReporting. We don't do that anymore. I think we could instead now not do anything here and treat repeated calls "normally", reporting them as such and eventually dying, either via Windows fail-fast or abort. 
------------- PR: https://git.openjdk.org/jdk/pull/12759 From dholmes at openjdk.org Tue Feb 28 01:18:07 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 28 Feb 2023 01:18:07 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: On Sat, 25 Feb 2023 16:14:10 GMT, Martin Doerr wrote: >> I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. >> Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix log message for platforms which don't support it. I'm running a swag of our benchmarks on linux-x64 and linux-aarch64 with the flag simply set to true. Preliminary results show a big hit on startup benchmarks on both platforms! ------------- PR: https://git.openjdk.org/jdk/pull/12753 From kbarrett at openjdk.org Tue Feb 28 02:03:38 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 28 Feb 2023 02:03:38 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute [v2] In-Reply-To: References: Message-ID: > Please review this change to permit the use of noreturn attributes in HotSpot > code. 
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into style-guide-noreturn - style guide permits noreturn attributes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12507/files - new: https://git.openjdk.org/jdk/pull/12507/files/421a1681..37b2bccb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12507&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12507&range=00-01 Stats: 59158 lines in 1499 files changed: 24690 ins; 17229 del; 17239 mod Patch: https://git.openjdk.org/jdk/pull/12507.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12507/head:pull/12507 PR: https://git.openjdk.org/jdk/pull/12507 From kbarrett at openjdk.org Tue Feb 28 02:03:38 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 28 Feb 2023 02:03:38 GMT Subject: RFR: 8302124: HotSpot Style Guide should permit noreturn attribute [v2] In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 19:01:42 GMT, Vladimir Kozlov wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into style-guide-noreturn >> - style guide permits noreturn attributes > > Approved. Thanks for reviews, and final approval from @vnkozlov . ------------- PR: https://git.openjdk.org/jdk/pull/12507 From kbarrett at openjdk.org Tue Feb 28 02:03:39 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 28 Feb 2023 02:03:39 GMT Subject: Integrated: 8302124: HotSpot Style Guide should permit noreturn attribute In-Reply-To: References: Message-ID: <1dCSfX2nLc4fAmGjrMtNnIsr5zWl2-mBp5lzZiNK9VE=.cef1eb24-5206-4eeb-9245-d714f1a750d4@github.com> On Fri, 10 Feb 2023 05:26:38 GMT, Kim Barrett wrote: > Please review this change to permit the use of noreturn attributes in HotSpot > code. 
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf > https://en.cppreference.com/w/cpp/language/attributes/noreturn > > This will permit such changes as marking failed assertions and the like as > noreturn, permitting the compiler to generate better code in some cases. This > has benefits even in product builds, since some of the relevant checks occur > in product code (such as `guarantee`, `fatal`, &etc). It also provides a > solution to the problem described in JDK-8294031, where the potential > continued execution from a failed assertion leads to compiler warnings. > > The change is written in such a way that it should be easy to add the > appropriate text for new attributes in the future. There have been > discussions of adopting C++17, which adds several attributes. > > The change to the Style Guide is forward looking, toward a time when more > attributes are available due to the adoption of a newer language standard than > the current C++14. It is written in such a way that it should be easy to add > the appropriate text for new attributes. > > Testing: > I have a prototype of making HotSpot assertions noreturn, which has been run > through mach5 tier1-8 for all Oracle-supported platforms. > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Friday 24-Feb-2023 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This pull request has now been integrated. 
Changeset: 14a014d4 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/14a014d4303efda69bd0acbdf71139534601b49a Stats: 64 lines in 2 files changed: 52 ins; 12 del; 0 mod 8302124: HotSpot Style Guide should permit noreturn attribute Reviewed-by: dcubed, iveresov, dholmes, tschatzl, jwaters, kvn ------------- PR: https://git.openjdk.org/jdk/pull/12507 From mdoerr at openjdk.org Tue Feb 28 02:56:02 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 28 Feb 2023 02:56:02 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Mon, 27 Feb 2023 08:49:18 GMT, Erik Österlund wrote: > I don't think we want this to be on by default on platforms where StoreLoad fences don't cause substantial global overheads. The benefit on such platforms is rather low, and needing the last couple of nanoseconds of transition speed, seems to not be a normal use case that default settings should optimize for. Conversely, the global synchronization can be rather intrusive, especially when it involves handshakes with N threads, and you need to perform global synchronization across the entire machine, for each thread poked. I would be much more afraid of that issue out of the box, than I would be afraid of a couple of nanoseconds slower native transitions. Hi Erik, StoreLoad fences cause substantial overhead on any multi-socket system including x86_64. The benefit may be small on single-socket systems, but can the VM distinguish? We are currently looking for benchmarks which show a negative effect of enabling it. Seems like the SPEC benchmarks don't care about it. Note that we typically use only one membarrier syscall when we handshake all threads. If you know any workload which suffers, would be great to know. David is currently also checking benchmarks. 
We should discuss further details in https://github.com/openjdk/jdk/pull/12753. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Tue Feb 28 05:26:22 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 28 Feb 2023 05:26:22 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v3] In-Reply-To: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: > I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. > Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Improve logging. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12753/files - new: https://git.openjdk.org/jdk/pull/12753/files/0ce067a9..4660291f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12753&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12753&range=01-02 Stats: 6 lines in 3 files changed: 1 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/12753.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12753/head:pull/12753 PR: https://git.openjdk.org/jdk/pull/12753 From mdoerr at openjdk.org Tue Feb 28 05:33:04 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 28 Feb 2023 05:33:04 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: <_buE64iH4bc2BZFEe5_0WgyfVW1sxzXms4yXkPisCwc=.5dcc456d-d0ac-409e-b3b4-7222c8aea350@github.com> On Tue, 28 Feb 2023 01:15:13 GMT, David Holmes wrote: > I'm running a swag of our benchmarks on linux-x64 and linux-aarch64 with the flag simply set to true. Preliminary results show a big hit on startup benchmarks on both platforms! Interesting. I had only checked large benchmarks before. I don't know why startup is impacted. There were also concerns when many single thread handshakes are used. Maybe you're right and I should only make the feature available on all platforms for now (and keep it experimental and switched off). 
------------- PR: https://git.openjdk.org/jdk/pull/12753 From dholmes at openjdk.org Tue Feb 28 06:31:03 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 28 Feb 2023 06:31:03 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v2] In-Reply-To: <_buE64iH4bc2BZFEe5_0WgyfVW1sxzXms4yXkPisCwc=.5dcc456d-d0ac-409e-b3b4-7222c8aea350@github.com> References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> <_buE64iH4bc2BZFEe5_0WgyfVW1sxzXms4yXkPisCwc=.5dcc456d-d0ac-409e-b3b4-7222c8aea350@github.com> Message-ID: On Tue, 28 Feb 2023 05:29:51 GMT, Martin Doerr wrote: > There were also concerns when many single thread handshakes are used. Ah! The startup issues may be related to JDK-8300926. I'll need to apply that change and rerun them. ------------- PR: https://git.openjdk.org/jdk/pull/12753 From dholmes at openjdk.org Tue Feb 28 06:31:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 28 Feb 2023 06:31:05 GMT Subject: RFR: 8303210: [linux, Windows] Enable UseSystemMemoryBarrier by default if possible [v3] In-Reply-To: References: <9eZo1xYNGhjMSC9lDXKtkO1eyU_H-Veuh1AeP3CPKbg=.69b6c4ff-a3ef-439f-8468-21fec9de1825@github.com> Message-ID: <5fRZQL_fESuAYk3W5hdNEMl8dtcAPrWMUardVietcgI=.c550b7c9-5586-4d1e-83bf-2e0d67d00784@github.com> On Tue, 28 Feb 2023 05:26:22 GMT, Martin Doerr wrote: >> I'd like to enable UseSystemMemoryBarrier by default on supported Operating Systems in order to improve performance of thread state transitions (I/O, JNI, foreign function calls, JIT compiler threads, etc.). See JBS issue for more details. >> Unfortunately, the feature was not yet implemented on all platforms. I added the code, but need the platform maintainers to check if it can be used reliably (and ideally if the performance improves). It's easy to switch it off again in case of problems. 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Improve logging. There is also a 7.9% regression on AArch64 with the Renaissance-Scala-Kmeans benchmark. ------------- PR: https://git.openjdk.org/jdk/pull/12753 From stuefe at openjdk.org Tue Feb 28 07:42:03 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 28 Feb 2023 07:42:03 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 10:19:50 GMT, Kim Barrett wrote: > Please review this change to the implementation of the Windows-specific option > UseOSErrorReporting, toward allowing crash reporting functions to be declared > noreturn. VMError::report_and_die no longer conditionally returns if the > Windows-only option UseOSErrorReporting is true. > > The Windows-only sections of report_and_die now call RaiseFailFastException > (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), > which immediately invokes WER (Windows Error Reporting) if it is enabled, > without executing any structured exception handler. If WER is not enabled, it > just immediately terminates the program. Thus, we no longer return to walk up > the structured exception handler chain to pop out at the top as unhandled in > order to invoke WER. > > This permits declaring report_and_die as [[noreturn]], once some functions > from the os class are also so declared. Also adding that attribute as > appropriate to other functions in the os class. This of course assumes > the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). > > There is a pre-existing bug that I'll be reporting separately. If > UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an > empty .mdmp file. We shouldn't create that file when UseOSErrorReporting.
> > Testing: > mach5 tier1-3 > > Manual testing with the following, to verify desired behavior. > > -XX:ErrorHandlerTest=N > 1: assertion failure > 2: guarantee failure > 14: SIGSEGV > 15: divide by zero > path/to/bin/java \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+ErrorLogSecondaryErrorDetails \ > -XX:+UseOSErrorReporting \ > -XX:ErrorHandlerTest=1 \ > TestDebug.java > > --- TestDebug.java --- > import java.lang.String; > public class TestDebug { > static private volatile String dummy; > public static void main(String[] args) throws Exception { > while (true) { > dummy = new String("foo bar"); > } > } > } > --- end TestDebug.java --- > > The state of WER can be examined and modified using Power Shell commands > {Get,Enable,Disable}-WindowsErrorReporting. > > The state of reporting WER captured errors can be examined and modified using > Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] > > With Report Problems off, reports are placed in > c:\ProgramData\Microsoft\Windows\WER\ReportArchive > > I verified that executing the above test with WER enabled adds an entry in > that directory, but not when it's disabled. Also nothing is added there when > the test is run with -XX:-UseOSErrorReporting. Very nice. I like that the control flow gets closer to Posix now. Note that I think this also solves another subtle bug: Before, if UseOSErrorReporting was set but outside code had set a vectored exception handler (VEH), that handler would be called instead of WER. Cheers, Thomas ------------- Changes requested by stuefe (Reviewer). 
PR: https://git.openjdk.org/jdk/pull/12759 From stuefe at openjdk.org Tue Feb 28 07:42:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 28 Feb 2023 07:42:06 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 23:31:49 GMT, Kim Barrett wrote: >> Please review this change to the implementation of the Windows-specific option >> UseOSErrorReporting, toward allowing crash reporting functions to be declared >> noreturn. VMError::report_and_die no longer conditionally returns if the >> Windows-only option UseOSErrorReporting is true. >> >> The Windows-only sections of report_and_die now call RaiseFailFastException >> (https://learn.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-raisefailfastexception), >> which immediately invokes WER (Windows Error Reporting) if it is enabled, >> without executing any structured exception handler. If WER is not enabled, it >> just immediately terminates the program. Thus, we no longer return to walk up >> the structured exception handler chain to pop out at the top as unhandled in >> order to invoke WER. >> >> This permits declaring report_and_die as [[noreturn]], once some functions >> from the os class are also so declared. Also adding that attribute as >> appropriate to other functions in the os class. This of course assumes >> the use of [[noreturn]] in HotSpot code is approved (JDK-8302124). >> >> There is a pre-existing bug that I'll be reporting separately. If >> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an >> empty .mdmp file. We shouldn't create that file when UseOSErrorReporting. >> >> Testing: >> mach5 tier1-3 >> >> Manual testing with the following, to verify desired behavior.
>> >> -XX:ErrorHandlerTest=N >> 1: assertion failure >> 2: guarantee failure >> 14: SIGSEGV >> 15: divide by zero >> path/to/bin/java \ >> -XX:+UnlockDiagnosticVMOptions \ >> -XX:+ErrorLogSecondaryErrorDetails \ >> -XX:+UseOSErrorReporting \ >> -XX:ErrorHandlerTest=1 \ >> TestDebug.java >> >> --- TestDebug.java --- >> import java.lang.String; >> public class TestDebug { >> static private volatile String dummy; >> public static void main(String[] args) throws Exception { >> while (true) { >> dummy = new String("foo bar"); >> } >> } >> } >> --- end TestDebug.java --- >> >> The state of WER can be examined and modified using Power Shell commands >> {Get,Enable,Disable}-WindowsErrorReporting. >> >> The state of reporting WER captured errors can be examined and modified using >> Control Panel > Security and Maintenance > Maintenance : Report Problems [on,off] >> >> With Report Problems off, reports are placed in >> c:\ProgramData\Microsoft\Windows\WER\ReportArchive >> >> I verified that executing the above test with WER enabled adds an entry in >> that directory, but not when it's disabled. Also nothing is added there when >> the test is run with -XX:-UseOSErrorReporting. > > src/hotspot/share/utilities/vmError.cpp line 1448: > >> 1446: if (UseOSErrorReporting && log_done) { >> 1447: raise_fail_fast(_siginfo, _context); >> 1448: } > > I think this bit might not be needed anymore. I think the reason for the old code here is to deal with > the repeated calls to report_and_die as we walk up the Windows exception filter list and return because > of UseOSErrorReporting. We don't do that anymore. I think we could instead now not do anything here > and treat repeated calls "normally", reporting them as such and eventually dying, either via Windows > fail-fast or abort. I think we should remove this part. 
Otherwise concurrent threads running into faults cause the VM to end abruptly while the primary thread is in the last stages of error reporting: after log_done is set but before os::die is called (we do some things here: JFR shutdown, NMT statistics, compiler replay file writing). In fact, this may be an existing bug which could explain missing/corrupt replay files or JFR files if UseOSErrorReporting is set. ------------- PR: https://git.openjdk.org/jdk/pull/12759 From kbarrett at openjdk.org Tue Feb 28 07:51:04 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 28 Feb 2023 07:51:04 GMT Subject: RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting In-Reply-To: References: Message-ID: On Tue, 28 Feb 2023 07:20:00 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/vmError.cpp line 1448: >> >>> 1446: if (UseOSErrorReporting && log_done) { >>> 1447: raise_fail_fast(_siginfo, _context); >>> 1448: } >> >> I think this bit might not be needed anymore. I think the reason for the old code here is to deal with >> the repeated calls to report_and_die as we walk up the Windows exception filter list and return because >> of UseOSErrorReporting. We don't do that anymore. I think we could instead now not do anything here >> and treat repeated calls "normally", reporting them as such and eventually dying, either via Windows >> fail-fast or abort. > > I think we should remove this part. Otherwise concurrent threads running into faults cause the VM to end abruptly while the primary thread is in the last stages of error reporting: after log_done is set but before os::die is called (we do some things here: JFR shutdown, NMT statistics, compiler replay file writing). > > In fact, this may be an existing bug which could explain missing/corrupt replay files or JFR files if UseOSErrorReporting is set. Yeah, I was approaching the same conclusion. I'll remove it.
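For readers following the noreturn side of this thread, here is a minimal sketch of what the attribute buys at a call site. It is not HotSpot code: `report_and_abort` and `checked_divide` are hypothetical names made up for the illustration, and the sketch only assumes a C++11 (or later) compiler.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a fatal-error reporter (NOT HotSpot's
// VMError::report_and_die). The attribute promises that control never
// returns to the caller.
[[noreturn]] void report_and_abort(const char* message) {
  std::fprintf(stderr, "fatal: %s\n", message);
  std::abort();
}

// Because report_and_abort is [[noreturn]], the compiler knows control
// cannot fall off the end of this function after the failing branch, so no
// dummy return statement is needed to silence a missing-return warning.
int checked_divide(int numerator, int denominator) {
  if (denominator != 0) {
    return numerator / denominator;
  }
  report_and_abort("division by zero");
}
```

Without the attribute, a compiler would typically warn that `checked_divide` can reach its end without returning a value, which appears to be the kind of warning the JDK-8294031 reference earlier in the thread is about.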
------------- PR: https://git.openjdk.org/jdk/pull/12759 From aph at openjdk.org Tue Feb 28 10:37:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 28 Feb 2023 10:37:13 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: <2g2QqIWkC2a7QyVzPkX-QW3LcMrti6-AMwuFPqe752o=.c63ae268-69be-4770-b642-544633c1cf8f@github.com> On Mon, 27 Feb 2023 11:28:40 GMT, Erik ?sterlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > > Erik ?sterlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 720: > 718: t4 = r7, t5 = r11, t6 = r12, t7 = r13; > 719: const Register stride = r14; > 720: const Register gct1 = r8, gct2 = r9, gct3 = r10; Please don't alias rscratch1 and rscratch2. Many macros use them, and this is a bug waiting to happen. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From duke at openjdk.org Tue Feb 28 11:18:49 2023 From: duke at openjdk.org (Afshin Zafari) Date: Tue, 28 Feb 2023 11:18:49 GMT Subject: RFR: 8292059: Do not inline InstanceKlass::allocate_instance() Message-ID: The inline and not-inline versions of the method is stress tested to compare the performance difference. The statistics are drawn in the following chart. 
![chart_2](https://user-images.githubusercontent.com/4697012/221837134-8745dbe3-279c-47f8-ad0f-88897bc8c3cc.png) ------------- Commit messages: - 8292059: Do not inline InstanceKlass::allocate_instance() Changes: https://git.openjdk.org/jdk/pull/12782/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12782&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8292059 Stats: 24 lines in 2 files changed: 12 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12782.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12782/head:pull/12782 PR: https://git.openjdk.org/jdk/pull/12782 From rkennke at openjdk.org Tue Feb 28 11:26:21 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 28 Feb 2023 11:26:21 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> References: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> Message-ID: On Wed, 15 Feb 2023 14:32:45 GMT, Stefan Karlsson wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Clarify comment on arrayOopDesc::max_array_length() > > src/hotspot/share/gc/shared/jvmFlagConstraintsGC.cpp line 340: > >> 338: return JVMFlag::VIOLATES_CONSTRAINT; >> 339: } >> 340: if (value > ((uint64_t)ThreadLocalAllocBuffer::max_size() * HeapWordSize)) { > > Was this change needed because _filler_array_max_size was changed and max_tlab_size uses it? Could you motivate why this is needed or add an assert that we don't overflow size_t. > > I see that we have another place where we don't guard against a size_t overflow: > > static size_t max_size_in_bytes() { return max_size() * BytesPerWord; } > > If need to care about overflow in MinTLABSizeConstraintFunc, don't we need to care about it in usages of max_size_in_bytes()? 
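The overflow question quoted above can be made concrete with a small sketch. The names below (`kBytesPerWord`, `exceeds_limit_bytes`, `exceeds_limit_words`, `kWrappingWords`) are hypothetical and only mimic the shape of the HotSpot code under review; the word size of 8 is an assumption for a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HotSpot's BytesPerWord (assumed 64-bit build).
constexpr std::size_t kBytesPerWord = 8;

// Byte-based check: the multiplication is done in size_t and silently wraps
// around when max_words * kBytesPerWord exceeds SIZE_MAX.
constexpr bool exceeds_limit_bytes(std::size_t value_in_bytes, std::size_t max_words) {
  return value_in_bytes > max_words * kBytesPerWord;
}

// Round a byte count up to whole words without any multiplication.
constexpr std::size_t bytes_to_words_round_up(std::size_t bytes) {
  return bytes / kBytesPerWord + (bytes % kBytesPerWord != 0 ? 1 : 0);
}

// Word-based check: converts the value down to words instead of converting
// the limit up to bytes, so nothing can overflow.
constexpr bool exceeds_limit_words(std::size_t value_in_bytes, std::size_t max_words) {
  return bytes_to_words_round_up(value_in_bytes) > max_words;
}

// A word count whose size in bytes is exactly SIZE_MAX + 1, i.e. wraps to 0.
constexpr std::size_t kWrappingWords = SIZE_MAX / kBytesPerWord + 1;
```

With a limit of `kWrappingWords`, the byte-based check sees a limit of 0 bytes and rejects even a 4096-byte value, while the word-based check accepts it. Whether the real constraint function should compare in words is for the PR to decide; this only illustrates the arithmetic.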
Using the new filler_array_max_size we can end up exactly at SIZE_MAX+1, which would wrap-around and become 0 and make the comparison fail. So yeah, maybe the new value should be reconsidered to ensure not overflowing when multiplying with BytesPerWord. (I'm wondering if it might be easier to hardcode a table of array-max-sizes depending on elem-type and LP64...) The only use of max_size_in_bytes() can easily be converted to use words instead of bytes. And I guess that the comparison in the constraint function can be changed to use words as well, so as to avoid the overflow. Would that address your concern? ------------- PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Tue Feb 28 11:33:16 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 28 Feb 2023 11:33:16 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v23] In-Reply-To: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> References: <0-lVZpSsP2FUCAiK6Oo2CrI9FNIg-vyq0G5mdoEh1JQ=.50c51610-7efd-4752-8ea5-b8da19f7cce8@github.com> Message-ID: On Wed, 15 Feb 2023 13:34:16 GMT, Stefan Karlsson wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Clarify comment on arrayOopDesc::max_array_length() > > src/hotspot/cpu/arm/c1_MacroAssembler_arm.cpp line 166: > >> 164: void C1_MacroAssembler::allocate_array(Register obj, Register len, >> 165: Register tmp1, Register tmp2, Register tmp3, >> 166: int header_size_in_bytes, int element_size, > > Other platforms call this parameter base_offset_in_bytes. I looked at the code below, but couldn't convince myself that the MinOjbAlignmentInBytesMask usages were correct. Maybe they are because this is 32-bit code, but could be worth taking a closer look. The use of MinObjAlignmentInBytes(Mask) below is not correct, afaict. 
It aligns-up the object size if either the header/base-offset is not aligned at 8 or if the element-size is not aligned at 8. I can kinda understand aligning up if the header-size is not 8-bytes-aligned (although, with this change this should only be needed for long or double element types), but what is the point of aligning-up the object size for, e.g., int[] or byte[]? I am not even sure why this currently works; as far as I can tell, this would tend to over-allocate arrays, and this should throw off size calculations. *scratches head*. In any case, the fact that we are having this discussion and can't even figure out what is going on indicates that this is some seriously nasty code that should be rewritten to be more reasonable and understandable. Unfortunately I don't have access to arm hardware, and I wouldn't want to do it blindly. IIRC, @shipilev made the changes for this PR and verified that it does at least work well-enough to pass some sanity testing. I would suggest that either somebody (@shipilev?) could check this soon, or I file a follow-up bug to address this ASAP. ------------- PR: https://git.openjdk.org/jdk/pull/11044 From redestad at openjdk.org Tue Feb 28 11:41:01 2023 From: redestad at openjdk.org (Claes Redestad) Date: Tue, 28 Feb 2023 11:41:01 GMT Subject: RFR: 8292059: Do not inline InstanceKlass::allocate_instance() In-Reply-To: References: Message-ID: On Tue, 28 Feb 2023 11:11:54 GMT, Afshin Zafari wrote: > The inline and not-inline versions of the method is stress tested to compare the performance difference. The statistics are drawn in the following chart. > ![chart_2](https://user-images.githubusercontent.com/4697012/221837134-8745dbe3-279c-47f8-ad0f-88897bc8c3cc.png) That's an odd graph. If everything is measuring total time (in the different phases) then a bar chart with bars originating from zero would be less confusing. Labeling units of measure would also help. What workloads have been used to collect the statistics?
------------- PR: https://git.openjdk.org/jdk/pull/12782 From rkennke at openjdk.org Tue Feb 28 11:48:59 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 28 Feb 2023 11:48:59 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v27] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Protect against overflow when dealing with TLAB::max_size() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/60787563..129400cb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=25-26 Stats: 16 lines in 5 files changed: 0 ins; 1 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From duke at openjdk.org Tue Feb 28 12:15:02 2023 From: duke at openjdk.org (Afshin Zafari) Date: Tue, 28 Feb 2023 12:15:02 GMT Subject: RFR: 8292059: Do not inline 
InstanceKlass::allocate_instance() In-Reply-To: References: Message-ID: On Tue, 28 Feb 2023 11:11:54 GMT, Afshin Zafari wrote: > The inline and not-inline versions of the method is stress tested to compare the performance difference. The statistics are drawn in the following chart. > ![chart (2)](https://user-images.githubusercontent.com/4697012/221848555-2884313e-9d26-41c9-a265-3f1ce295b17b.png) > > The vertical axis is in milliseconds. > New chart is uploaded. Vertical axis unit is milliseconds. ------------- PR: https://git.openjdk.org/jdk/pull/12782 From rcastanedalo at openjdk.org Tue Feb 28 12:31:09 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Tue, 28 Feb 2023 12:31:09 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: <44kCeVWFrSGQEcehVV5G_EowMs26zWcuv5KmDbYv9Mc=.f49cb445-de78-4932-aefc-1839fd0be611@github.com> On Mon, 27 Feb 2023 11:28:40 GMT, Erik ?sterlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. 
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 1270: > 1268: Label copy4, copy8, copy16, copy32, copy80, copy_big, finish; > 1269: const Register t2 = r5, t3 = r6, t4 = r7, t5 = r11; > 1270: const Register t6 = r12, t7 = r13, t8 = r14, t9 = r15; I find the usage of r15 in `copy_memory` a bit confusing. If I get it right, it is used 1. as a temporary register (aliased as `t9`) for 16-bytes copying (L1407-1437), 2. as a temporary register (in its raw form, i.e. not via `t9`) passed explicitly to `copy_memory_small` (L1518-1548), and 3. as a "special" register passed implicitly to the generated `copy_longs` stubs (L1557-1574). I think it would be clearer if `t9` was used instead of `r15` for case 2) and then perhaps a comment was added before calling the generated `copy_longs` stubs mentioning that `t9` cannot be used from then on because it aliases with `r15`. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From rcastanedalo at openjdk.org Tue Feb 28 12:36:14 2023 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Tue, 28 Feb 2023 12:36:14 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: On Mon, 27 Feb 2023 11:28:40 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers.
To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. > > Erik ?sterlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment Looks good to me! (my comment about clarifying the use of `r15` in aarch64 should be interpreted as a suggestion, not a request). ------------- Marked as reviewed by rcastanedalo (Reviewer). PR: https://git.openjdk.org/jdk/pull/12670 From duke at openjdk.org Tue Feb 28 12:44:09 2023 From: duke at openjdk.org (Jan Kratochvil) Date: Tue, 28 Feb 2023 12:44:09 GMT Subject: RFR: 8302191: Performance degradation for float/double modulo on Linux [v2] In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 01:02:54 GMT, David Holmes wrote: >> Jan Kratochvil has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: >> >> 8302191: Performance degradation for float/double modulo on Linux > > test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java line 44: > >> 42: * Java bytecode drem is defined as C fmod (not C drem==remainder). >> 43: * GCC since 93ba85fdd253b4b9cf2b9e54e8e5969b1a3db098 has slow fmod(). >> 44: * Testcase is based on: https://stackoverflow.com/a/55673220/2995591 > > That could be a problem. What is the license for code shown on StackOverflow? A nice catch. I have tried to get a compatible license, we'll see: https://github.com/cirosantilli/java-cheat/pull/3 ------------- PR: https://git.openjdk.org/jdk/pull/12508 From redestad at openjdk.org Tue Feb 28 12:53:07 2023 From: redestad at openjdk.org (Claes Redestad) Date: Tue, 28 Feb 2023 12:53:07 GMT Subject: RFR: 8292059: Do not inline InstanceKlass::allocate_instance() In-Reply-To: References: Message-ID: On Tue, 28 Feb 2023 11:11:54 GMT, Afshin Zafari wrote: > The inline and not-inline versions of the method is stress tested to compare the performance difference. The statistics are drawn in the following chart. 
> ![chart (2)](https://user-images.githubusercontent.com/4697012/221848555-2884313e-9d26-41c9-a265-3f1ce295b17b.png) > > The vertical axis is in milliseconds. So if I'm reading this correctly then by removing the inlining we see a reduction in performance on some GC metrics on some unspecified workload by 5-10%? I'd say that's a good data point in favor of keeping this as-is, unless there's data to show that there are compelling improvements elsewhere (for example binary size, compilation times). @iklam might want to comment on this, but it seems to me this might be a poor trade-off. ------------- PR: https://git.openjdk.org/jdk/pull/12782 From lucy at openjdk.org Tue Feb 28 12:55:20 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Tue, 28 Feb 2023 12:55:20 GMT Subject: Integrated: 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls In-Reply-To: References: Message-ID: On Thu, 12 Jan 2023 14:29:34 GMT, Lutz Schmidt wrote: > This PR addresses an issue in the AES-CTR mode intrinsic on s390. When a message is ciphered in multiple, small (< 16 bytes) segments, the result is incorrect. > > This is not just a band-aid fix. The issue was taken as a chance to restructure the code. Though still complicated, it is now easier to read and (hopefully) understand. > > Except for the new jtreg test, the changes are purely s390. There are no side effects on other platforms. Issue-specific tests pass. Other tests are in progress. I will update this PR once they are complete. > > **Reviews and comments are very much appreciated.** > > @backwaterred could you please run some "official" s390 tests? Thanks. This pull request has now been integrated.
Changeset: e144783e Author: Lutz Schmidt URL: https://git.openjdk.org/jdk/commit/e144783eb2d2a4437d0f992c964e34a932bfa54b Stats: 729 lines in 5 files changed: 519 ins; 64 del; 146 mod 8299817: [s390] AES-CTR mode intrinsic fails with multiple short update() calls Reviewed-by: mbaesken, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/11967 From mbaesken at openjdk.org Tue Feb 28 13:05:06 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 28 Feb 2023 13:05:06 GMT Subject: RFR: JDK-8302989: Add missing INCLUDE_CDS checks In-Reply-To: References: Message-ID: On Mon, 27 Feb 2023 10:08:43 GMT, Ioi Lam wrote: > It may be more profitable (and less work) if we change these variables to `const` when CDS is disabled: > > ``` > extern bool DumpSharedSpaces; > extern bool DynamicDumpSharedSpaces; > extern bool RequireSharedSpaces; > extern "C" { > // Make sure UseSharedSpaces is accessible to the serviceability agent. > extern JNIEXPORT jboolean UseSharedSpaces; > } > ``` > > But some changes may be needed in SA. Hi Ioi, do you think this should be done for all 4 bools ? Btw. when adding a const to UseSharedSpaces I run into something like this (when compiling with gcc, seems there is some issue when const is used together with JNIEXPORT , any ideas why ? 
` globalDefinitions.cpp:50:26: error: 'visibility' attribute ignored [-Werror=attributes]` ------------- PR: https://git.openjdk.org/jdk/pull/12691 From aph at openjdk.org Tue Feb 28 13:43:18 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 28 Feb 2023 13:43:18 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: On Mon, 27 Feb 2023 11:28:40 GMT, Erik ?sterlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers. 
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 837: > 835: bs_asm->copy_load_at(_masm, decorators, type, 16, > 836: t6, t7, Address(__ pre(s, 8 * unit)), > 837: gct1); Suggestion:
```
} else {
  for (int i = 0; i < 8; i += 2) {
    bs_asm->copy_store_at(_masm, decorators, type, 16,
                          Address(d, i * 2 * unit), t[i], t[i+1],
                          gct1, gct2, gct3);
    bs_asm->copy_load_at(_masm, decorators, type, 16,
                         t[i], t[i+1], Address(s, i * 2 * unit),
                         gct1);
  }
}
```
------------- PR: https://git.openjdk.org/jdk/pull/12670 From aph at openjdk.org Tue Feb 28 13:48:20 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 28 Feb 2023 13:48:20 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: On Mon, 27 Feb 2023 11:28:40 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 726: > 724: assert_different_registers(rscratch1, rscratch2, t0, t1, t2, t3, t4, t5, t6, t7); > 725: assert_different_registers(s, d, count, rscratch1, rscratch2); > 726: Suggestion:
```
static const FloatRegister vt[] = {v0, v1, v2, v3};
static const Register t[] = {t0, t1, t2, t3, t4, t5, t6, t7};
```
src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 812: > 810: bs_asm->copy_load_at(_masm, decorators, type, 32, > 811: v2, v3, Address(__ pre(s, 8 * unit)), > 812: gct1, gct2, gcvt1); Suggestion:
```
for (int i = 0; i < 4; i += 2) {
  bs_asm->copy_store_at(_masm, decorators, type, 32,
                        Address(d, i * 4 * unit), vt[i], vt[i+1],
                        gct1, gct2, gct3, gcvt1, gcvt2, gcvt3);
  bs_asm->copy_load_at(_masm, decorators, type, 32,
                       vt[i], vt[i+1], Address(s, i * 4 * unit),
                       gct1, gct2, gcvt1);
}
```
------------- PR: https://git.openjdk.org/jdk/pull/12670 From aph at openjdk.org Tue Feb 28 13:51:11 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 28 Feb 2023 13:51:11 GMT Subject: RFR: 8302780: Add support for vectorized arraycopy GC barriers [v6] In-Reply-To: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> References: <47e5gQJkZn7ldGv3n2cyQoZSAT9YpWWvIwXnbxhdGuQ=.643a4652-204b-4689-85a4-9378e8a587b3@github.com> Message-ID: On Mon, 27 Feb 2023 11:28:40 GMT, Erik Österlund wrote: >> So far, the arraycopy stubs have performed some kind of bulk pre/post barriers for arraycopy, which have been good enough, and allowed the copying itself to be done with plain loads and stores. For generational ZGC, this approach is not good enough, and we need barriers for the actual copying, but instead don't need the pre/post barriers. To prepare the JVM for generational ZGC, we need to add an API for arraycopy barriers.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Add comment src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 812: > 810: bs_asm->copy_load_at(_masm, decorators, type, 32, > 811: v2, v3, Address(__ pre(s, 8 * unit)), > 812: gct1, gct2, gcvt1); All this extreme cut-and-paste manual unrolling is very hard to read, maintain, and review. I wasn't going to say anything, because it's Erik, and what do I know! But I must push back here, this is too much. Please consider these style changes. ------------- PR: https://git.openjdk.org/jdk/pull/12670 From alanb at openjdk.org Tue Feb 28 14:31:22 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 28 Feb 2023 14:31:22 GMT Subject: RFR: 8302615: make JVMTI thread cpu time functions optional for virtual threads In-Reply-To: References: Message-ID: On Thu, 16 Feb 2023 18:05:54 GMT, Serguei Spitsyn wrote: > This is a minor JVM TI spec update for two timer functions `GetCurrentThreadCpuTime` and `GetThreadCpuTime` to allow implementations to support virtual threads. > > The CSR is: > [JDK-8302616](https://bugs.openjdk.org/browse/JDK-8302616): make JVMTI thread cpu time functions optional for virtual threads > > No testing is needed. I've just spotted that we missed some text: GetCurrentThreadCpuTime has "The current thread may not be a virtual thread". GetThreadCpuTime has "The thread may not be a virtual thread". The changes that have been integrated allow these functions to support CPU time measurement so I think we will need to adjust both of these sentences in a follow up issue. ------------- PR: https://git.openjdk.org/jdk/pull/12604 From yzheng at openjdk.org Tue Feb 28 16:13:45 2023 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 28 Feb 2023 16:13:45 GMT Subject: RFR: 8302452: [JVMCI] Export _poly1305_processBlocks, JfrThreadLocal fields to JVMCI compiler.
In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 15:13:05 GMT, Yudi Zheng wrote: > This PR allows JVMCI compiler intrinsics to reuse the _poly1305_processBlocks stub and to update JfrThreadLocal fields on `Thread.setCurrentThread` events. @dean-long could you please review this patch? thanks! ------------- PR: https://git.openjdk.org/jdk/pull/12560 From rkennke at openjdk.org Tue Feb 28 18:02:13 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 28 Feb 2023 18:02:13 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v28] In-Reply-To: References: Message-ID: <6R4wpZhfXCzH96dUBOzn3FWKsTR1ZtI4-FZcNG8s4js=.77602f48-3f1e-462a-8c58-2da4e83fba41@github.com> > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
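As a rough illustration of the gap described in the quoted text, the sketch below computes where the first element of a `byte[]` would land. The class and method names are hypothetical, and the length-field offsets are assumptions made for the example: offset 16 assumes a 64-bit JVM with an 8-byte mark word and an uncompressed 8-byte class pointer, and offset 8 is the Lilliput layout mentioned in the description. Nothing here is read from a running JVM.

```java
// Illustrative arithmetic only: header sizes are assumptions, not values queried from the JVM.
final class ArrayBaseSketch {
    static final int LENGTH_FIELD_BYTES = 4;

    // Round offset up to the next multiple of alignment (alignment must be a power of two).
    static int alignUp(int offset, int alignment) {
        return (offset + alignment - 1) & -alignment;
    }

    // Offset of the first element, given where the length field starts
    // and the alignment required for the elements.
    static int baseOffset(int lengthOffset, int alignment) {
        return alignUp(lengthOffset + LENGTH_FIELD_BYTES, alignment);
    }

    public static void main(String[] args) {
        int wordSize = 8;
        // Assumed layout with -XX:-UseCompressedClassPointers: length field at offset 16.
        System.out.println("word-aligned base:    " + baseOffset(16, wordSize));
        System.out.println("element-aligned base: " + baseOffset(16, 1));
        // Layout described for Lilliput: length field at offset 8.
        System.out.println("Lilliput, word-aligned base: " + baseOffset(8, wordSize));
    }
}
```

Under these assumptions a `byte[]` carries 4 unused bytes per array (offsets 20 to 24, or 12 to 16) whenever the base is aligned to a word rather than to the element size.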
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Initialize gap between array-length and first element ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/129400cb..572de842 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=26-27 Stats: 30 lines in 2 files changed: 14 ins; 14 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From jvernee at openjdk.org Tue Feb 28 21:06:23 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 28 Feb 2023 21:06:23 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). 
>> >> Big Endian support is implemented to some extent, but is not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both an FP reg and a GP reg or stack slot (see "no partial DW rule"). These cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE).
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. This looks good overall I think, though I'll stick to my previous suggestion to try and move more logic into Java. Also, I recommend adding a test in `java/foreign/callarranger` similar to the tests already found there. They call the CallGenerator directly and then check the binding recipe. This could prove useful if we need to refactor shared code for whatever reason, since those tests run on (almost) every platform, so can be used to do some basic sanity checking. src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 133: > 131: Register callerSP = R2, // C/C++ uses R2 as TOC, but we can reuse it here > 132: tmp = R11_scratch1, // same as shuffle_reg > 133: call_target_address = R12_scratch2; // same as _abi._scratch2 (ABIv2 requires this reg!) Do I understand correctly that the ABI requires the register to be used for the call to be `R12`? How does that make a difference? I guess in some cases the callee might want to know the address through which it is called? (so it looks at `R12`) src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 154: > 152: // (abi_reg_args is abi_minframe plus space for 8 argument register spill slots) > 153: assert(_abi._shadow_space_bytes == frame::abi_minframe_size, "expected space according to ABI"); > 154: int allocated_frame_size = frame::abi_minframe_size + MAX2(_input_registers.length(), 8) * BytesPerWord; This is hard-coding an assumption about the ABI that's being called. Ok for now. If it needs to be addressed in the future, it could be done by adding another field to `ABIDescriptor` like `min_stack_arg_bytes`, or something like that (which is set to zero for other ABIs). It seems to be different from `shadow_space` since it's also used by the caller to put stack arguments. 
src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 343: > 341: > 342: __ flush(); > 343: // Disassembler::decode((u_char*)start, (u_char*)__ pc(), tty); Leftover commented code? (note that the stub can also be disassembled with `-Xlog:foreign+downcall=trace` now) src/hotspot/cpu/ppc/foreignGlobals_ppc.cpp line 229: > 227: > 228: void ArgumentShuffle::pd_generate(MacroAssembler* masm, VMStorage tmp, int in_stk_bias, int out_stk_bias, const StubLocations& locs) const { > 229: Register callerSP = as_Register(tmp); // preset It looks like `tmp` is being used to hold the caller's SP. I'm guessing this can not be computed the same way as we do on x86 and aarch64? (based on `RBP`, `RFP_BIAS`) If you want, you could add another register argument to `pd_generate` that is just invalid/unused on other platforms. That way you could use `tmp` for the shuffling instead of having to go through the stack. (looks like `R0` is already used in some cases as a temp register) src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 137: > 135: ArgumentShuffle arg_shuffle(in_sig_bt, total_in_args, out_sig_bt, total_out_args, &in_conv, &out_conv, shuffle_reg); > 136: // The Java call uses the JIT ABI, but we also call C. > 137: int out_arg_area = MAX2(frame::jit_out_preserve_size + arg_shuffle.out_arg_bytes(), (int)frame::abi_reg_args_size); We need `frame::abi_reg_args_size` since we call `on_entry`/`on_exit` which require the stack space I guess? src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 240: > 238: __ ld(call_target_address, in_bytes(Method::from_compiled_offset()), R19_method); > 239: __ mtctr(call_target_address); > 240: __ bctrl(); Ok, I see. I guess there is some special purpose register called `CTR` which we are moving to for `bctrl` here. Does ABIv2 require that move to always come from `R12`? (from the comment in downcallLinker). (I'm trying to understand the requirements for possibly tweaking shared code). 
src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 347: > 345: FunctionDescriptor* fd = (FunctionDescriptor*)fd_addr; > 346: fd->set_entry(fd_addr + sizeof(FunctionDescriptor)); > 347: #endif Had to do a double take. Looks like we're not the only one who are using the name `FunctionDescriptor` :) src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 356: > 354: } > 355: #endif > 356: //blob->print_on(tty); Leftover commented code? src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 68: > 66: public abstract class CallArranger { > 67: // Linux PPC64 Little Endian uses ABI v2. > 68: private static final boolean useABIv2 = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; Now that I'm here. This could be a potentially interesting case for having 2 subclasses of CallArranger: one for `useABIv2 == true` and one for `false`. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 81: > 79: new VMStorage[] { f1, f2, f3, f4, f5, f6, f7, f8 }, // FP output > 80: new VMStorage[] { r0, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12 }, // volatile GP > 81: new VMStorage[] { f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13 }, // volatile FP Note that argument registers are assumed volatile, so they don't have to be duplicated here. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 286: > 284: // "no partial DW rule": Mark first stack slot to get filled. > 285: // Note: Can only happen with forArguments = true. > 286: VMStorage overlappingReg = null; `overlappingReg` is initialized along all branches, so it's not needed to assign `null` here. 
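The remark about the redundant `null` can be illustrated in isolation. The sketch below is not taken from CallArranger.java; the class, method, and string values are hypothetical and only show that Java's definite-assignment analysis accepts a local variable that is assigned on every branch before it is read.

```java
// Hypothetical example: not the PPC64 CallArranger code.
final class DefiniteAssignmentSketch {
    static String pick(boolean useRegister) {
        final String chosen;      // no "= null" initializer needed
        if (useRegister) {
            chosen = "register";
        } else {
            chosen = "stack";
        }
        // The compiler has verified that chosen is assigned on every path to this point.
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(pick(true));
        System.out.println(pick(false));
    }
}
```

Leaving out the initializer also lets the variable be `final`, and the compiler then reports an error if a later change adds a branch that forgets to assign it.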
src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 293: > 291: } else { > 292: overlappingReg = new VMStorage(StorageType.STACK_AND_FLOAT, > 293: (short) STACK_SLOT_SIZE, (int) stackOffset - 4); I think you could remove the mixed VMStorage types here relatively easily by returning a `VMStorage[][]`, where each element is a single element array, but then for the `needOverlapping` case add another element to the array for the extra store. Then when unboxing a `STRUCT_HFA`, `dup` the result of the `bufferLoad` and then do 2 `vmStore`s (one for each element). For boxing, you could just ignore the extra storage, and just `vmLoad` the first one (or, whichever one you like :)) src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/TypeClass.java line 66: > 64: } > 65: > 66: static boolean isHomogeneousFloatAggregate(MemoryLayout type, boolean useABIv2) { Note that we had to make some changes to this routine on AArch64, since it didn't properly account for nested structs/unions and arrays. See: https://github.com/openjdk/panama-foreign/pull/780 Just as a heads up, in case PPC needs changes too. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/linux/LinuxPPC64CallArranger.java line 33: > 31: * PPC64 CallArranger specialized for Linux ABI. > 32: */ > 33: public class LinuxPPC64CallArranger extends CallArranger { I don't really see the point in having a separate subclass with `CallArranger` being abstract, unless you are planning to add other implementations later? (edit: see also later comment in CallArranger https://github.com/openjdk/jdk/pull/12708#discussion_r1120753657) ------------- PR: https://git.openjdk.org/jdk/pull/12708