From kbarrett at openjdk.java.net Fri Oct 1 00:29:13 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 00:29:13 GMT Subject: RFR: 8274322: Problems with oopDesc construction [v3] In-Reply-To: References: Message-ID: > Please review this change to the default constructor for markWord and > associated "change" to construction of oopDesc. > > The current code never invokes the constructor for oopDesc or any of its > derived classes. For that to be permissible according to the Standard, > those classes must be trivially default constructible. And for that to be > the case, the markWord default constructor must be trivial. > > This change consists of three parts. > > (1) The markWord default constructor is changed to be trivial, so the > default constructors for oopDesc and classes derived from it will also be > trivial. It wasn't previously trivial because the mechanism for making it so > (a default definition) is a C++11 feature that wasn't yet supported when the > previous constructor was defined. > > (2) This change also adds static asserts to verify the relevant classes have > trivial default constructors, to prevent later changes from unintentionally > breaking this. > > (3) This change also makes oopDesc noncopyable, to prevent inadvertent usage > of these operations that don't make any sense. > > A different approach would be to always use placement new with an > appropriate constructor to perform the initialization, perhaps encapsulated > in factory functions. I did some exploration in that direction. It's a much > larger and more complex change, though the final behavior (use constructors > for initialization) is simpler. > > Testing: > tier1 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: - Merge branch 'master' into oopdesc_construct - explicitly default markWord special functions - dholmes review - improve oopDesc and markWord construction ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5729/files - new: https://git.openjdk.java.net/jdk/pull/5729/files/5197b2c6..07c0d159 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5729&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5729&range=01-02 Stats: 6100 lines in 249 files changed: 4307 ins; 1092 del; 701 mod Patch: https://git.openjdk.java.net/jdk/pull/5729.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5729/head:pull/5729 PR: https://git.openjdk.java.net/jdk/pull/5729 From kbarrett at openjdk.java.net Fri Oct 1 00:29:16 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 00:29:16 GMT Subject: RFR: 8274322: Problems with oopDesc construction [v2] In-Reply-To: References: Message-ID: On Thu, 30 Sep 2021 11:18:51 GMT, Stefan Karlsson wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> dholmes review > > src/hotspot/share/oops/markWord.hpp line 79: > >> 77: >> 78: // It is critical for performance that this class be trivially >> 79: // destructable, copyable, and assignable. > > Given the comment, would it make sense to also explicitly mark them as `= default`? Yes. That would make it hard for someone to break the restriction without reading the comment. 
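A minimal sketch of the idea discussed above, assuming C++11 and using hypothetical stand-in classes (`MarkWordSketch` and `HeaderSketch` are illustrative names, not the HotSpot sources): explicitly defaulting the special member functions keeps them trivial, static asserts guard that property against later changes, and deleting the copy operations makes the header type noncopyable.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a mark word: a single machine word.
class MarkWordSketch {
  std::uintptr_t _value;

public:
  // Explicitly defaulted special member functions remain trivial.
  // A user-provided "MarkWordSketch() {}" would not be trivial.
  MarkWordSketch() = default;
  MarkWordSketch(const MarkWordSketch&) = default;
  MarkWordSketch& operator=(const MarkWordSketch&) = default;
  ~MarkWordSketch() = default;

  explicit MarkWordSketch(std::uintptr_t value) : _value(value) {}
  std::uintptr_t value() const { return _value; }
};

// Hypothetical stand-in for an object header that embeds a mark word.
class HeaderSketch {
  MarkWordSketch _mark;

public:
  HeaderSketch() = default;
  // Copying a header makes no sense in this sketch, so forbid it.
  HeaderSketch(const HeaderSketch&) = delete;
  HeaderSketch& operator=(const HeaderSketch&) = delete;

  void set_mark(MarkWordSketch m) { _mark = m; }
  MarkWordSketch mark() const { return _mark; }
};

// Compile-time guards: a later edit that breaks triviality fails the build.
static_assert(std::is_trivially_default_constructible<MarkWordSketch>::value,
              "mark word must be trivially default constructible");
static_assert(std::is_trivially_copyable<MarkWordSketch>::value,
              "mark word must be trivially copyable");
static_assert(std::is_trivially_destructible<MarkWordSketch>::value,
              "mark word must be trivially destructible");
static_assert(std::is_trivially_default_constructible<HeaderSketch>::value,
              "header must be trivially default constructible");
static_assert(!std::is_copy_constructible<HeaderSketch>::value,
              "header must not be copyable");

// Stores a value in a header and reads it back.
std::uintptr_t round_trip(std::uintptr_t bits) {
  HeaderSketch h;                    // trivial construction: _mark is not yet initialized
  h.set_mark(MarkWordSketch(bits));  // initialize before the first read
  return h.mark().value();
}

// Small runtime sanity check of the sketch.
void sanity_check() {
  assert(round_trip(42) == 42);
}
```

With the guards in place, replacing `= default` by a hand-written empty constructor in either class would be reported at compile time rather than relying on someone reading the comment.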
------------- PR: https://git.openjdk.java.net/jdk/pull/5729 From kbarrett at openjdk.java.net Fri Oct 1 00:29:19 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 00:29:19 GMT Subject: Integrated: 8274322: Problems with oopDesc construction In-Reply-To: References: Message-ID: On Tue, 28 Sep 2021 03:12:52 GMT, Kim Barrett wrote: > Please review this change to the default constructor for markWord and > associated "change" to construction of oopDesc. > > The current code never invokes the constructor for oopDesc or any of its > derived classes. For that to be permissible according to the Standard, > those classes must be trivially default constructible. And for that to be > the case, the markWord default constructor must be trivial. > > This change consists of three parts. > > (1) The markWord default constructor is changed to be trivial, so the > default constructors for oopDesc and classes derived from it will also be > trivial. It wasn't previously trivial because the mechanism for making it so > (a default definition) is a C++11 feature that wasn't yet supported when the > previous constructor was defined. > > (2) This change also adds static asserts to verify the relevant classes have > trivial default constructors, to prevent later changes from unintentionally > breaking this. > > (3) This change also makes oopDesc noncopyable, to prevent inadvertent usage > of these operations that don't make any sense. > > A different approach would be to always use placement new with an > appropriate constructor to perform the initialization, perhaps encapsulated > in factory functions. I did some exploration in that direction. It's a much > larger and more complex change, though the final behavior (use constructors > for initialization) is simpler. > > Testing: > tier1 This pull request has now been integrated. 
Changeset: 2e690ba8 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/2e690ba8bda30902f1188cabad63fb60f4eb828f Stats: 35 lines in 5 files changed: 31 ins; 0 del; 4 mod 8274322: Problems with oopDesc construction Reviewed-by: dholmes, stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/5729 From kbarrett at openjdk.java.net Fri Oct 1 00:32:40 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 00:32:40 GMT Subject: RFR: 8272807: Permit use of memory concurrent with pretouch [v2] In-Reply-To: References: Message-ID: On Tue, 21 Sep 2021 23:01:08 GMT, Kim Barrett wrote: >> Note that this PR replaces the withdrawn https://github.com/openjdk/jdk/pull/5215. >> >> Please review this change which adds os::touch_memory, which is similar to >> os::pretouch_memory but allows concurrent access to the memory while it is >> being touched. This is accomplished by using an atomic add of zero as the >> operation for touching the memory, ensuring the virtual location is backed >> by physical memory while not changing any values being read or written by >> other threads. >> >> While I was there, fixed some other lurking issues in os::pretouch_memory. >> There was a potential overflow in the iteration that has been fixed. And if >> the range arguments weren't page aligned then the last page might not get >> touched. The latter was even mentioned in the function's description. Both >> of those have been fixed by careful alignment and some extra checks. The >> resulting code is a little more complicated, but more robust and complete. >> >> Similarly added TouchTask, which is similar to PretouchTask. Again here, >> there is some cleaning up to avoid potential overflows and such. >> >> - The chunk size is computed using the page size after possible adjustment >> for UseTransparentHugePages. We want a chunk size that reflects the actual >> number of touches that will be performed. 
>> >> - The chunk claim is now done using a CAS that won't exceed the range end. >> The old atomic-fetch-and-add and check the result, which is performed by >> each worker thread, could lead to overflow. The old code has a test for >> overflow, but since pointer-arithmetic overflow is UB that's not reliable. >> >> - The old calculation of num_chunks for parallel touching could also >> potentially overflow. >> >> Testing: >> mach5 tier1-3 > > Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into touch_memory > - simplify touch_impl, using conditional on bool arg rather than template specialization > - touch task > - add touch_memory Withdrawing this. Other folks don't like various aspects of it, and I only went down this path due to unexpected performance problems with https://github.com/openjdk/jdk/pull/5215. Revisiting that, it might be possible to deal with the performance problems after all. ------------- PR: https://git.openjdk.java.net/jdk/pull/5353 From kbarrett at openjdk.java.net Fri Oct 1 00:32:41 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 00:32:41 GMT Subject: Withdrawn: 8272807: Permit use of memory concurrent with pretouch In-Reply-To: References: Message-ID: On Thu, 2 Sep 2021 18:33:56 GMT, Kim Barrett wrote: > Note that this PR replaces the withdrawn https://github.com/openjdk/jdk/pull/5215. > > Please review this change which adds os::touch_memory, which is similar to > os::pretouch_memory but allows concurrent access to the memory while it is > being touched. This is accomplished by using an atomic add of zero as the > operation for touching the memory, ensuring the virtual location is backed > by physical memory while not changing any values being read or written by > other threads. 
> > While I was there, fixed some other lurking issues in os::pretouch_memory. > There was a potential overflow in the iteration that has been fixed. And if > the range arguments weren't page aligned then the last page might not get > touched. The latter was even mentioned in the function's description. Both > of those have been fixed by careful alignment and some extra checks. The > resulting code is a little more complicated, but more robust and complete. > > Similarly added TouchTask, which is similar to PretouchTask. Again here, > there is some cleaning up to avoid potential overflows and such. > > - The chunk size is computed using the page size after possible adjustment > for UseTransparentHugePages. We want a chunk size that reflects the actual > number of touches that will be performed. > > - The chunk claim is now done using a CAS that won't exceed the range end. > The old atomic-fetch-and-add and check the result, which is performed by > each worker thread, could lead to overflow. The old code has a test for > overflow, but since pointer-arithmetic overflow is UB that's not reliable. > > - The old calculation of num_chunks for parallel touching could also > potentially overflow. > > Testing: > mach5 tier1-3 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/5353 From kbarrett at openjdk.java.net Fri Oct 1 05:17:42 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 05:17:42 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 Message-ID: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Please review this change to the linux-aarch64 port to support relaxed atomic add operations. Both ldxr/stxr and LSE-based implementations are provided. The appropriate one to use is selected at runtime, using the existing infrastructure for that selection. 
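To illustrate what a relaxed atomic add means at the source level, here is a rough sketch in portable C++ using `std::atomic`. It is not the aarch64 assembly from this change; the function names are hypothetical, and `std::memory_order_seq_cst` is used only as an approximate stand-in for a conservative ordering, on the assumption that the two are comparable for the purpose of the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Adds n to the counter with no ordering constraints beyond atomicity
// of the read-modify-write itself. Returns the updated value.
std::size_t add_relaxed(std::atomic<std::size_t>& counter, std::size_t n) {
  return counter.fetch_add(n, std::memory_order_relaxed) + n;
}

// Adds n with sequentially consistent ordering, which also orders the
// operation with respect to surrounding memory accesses.
std::size_t add_conservative(std::atomic<std::size_t>& counter, std::size_t n) {
  return counter.fetch_add(n, std::memory_order_seq_cst) + n;
}

// Small runtime sanity check of the sketch.
void sanity_check() {
  std::atomic<std::size_t> counter{0};
  assert(add_relaxed(counter, 1) == 1);
  assert(add_conservative(counter, 1) == 2);
}
```

Both functions compute the same result; they differ only in the ordering guarantees requested, which is where a relaxed variant can save the cost of barriers when the caller does not need them.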
Testing: mach5 tier1-3 Some hand-checked performance and functional testing that both LSE and non-LSE implementations work, and that the relaxed operations are faster than the conservative default. ------------- Commit messages: - relaxed atomic add Changes: https://git.openjdk.java.net/jdk/pull/5785/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5785&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274615 Stats: 64 lines in 4 files changed: 51 ins; 0 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/5785.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5785/head:pull/5785 PR: https://git.openjdk.java.net/jdk/pull/5785 From mdoerr at openjdk.java.net Fri Oct 1 08:24:40 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 1 Oct 2021 08:24:40 GMT Subject: RFR: 8274550: c2i entry barriers read int as long on PPC In-Reply-To: References: Message-ID: On Thu, 30 Sep 2021 14:15:08 GMT, Martin Doerr wrote: > `_keep_alive` is an int. We shouldn't use a 64 bit load. Thanks for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/5776 From mdoerr at openjdk.java.net Fri Oct 1 08:24:41 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 1 Oct 2021 08:24:41 GMT Subject: Integrated: 8274550: c2i entry barriers read int as long on PPC In-Reply-To: References: Message-ID: On Thu, 30 Sep 2021 14:15:08 GMT, Martin Doerr wrote: > `_keep_alive` is an int. We shouldn't use a 64 bit load. This pull request has now been integrated. 
Changeset: 5e4b514e Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/5e4b514e6e7e1b9f51fac1983b6c12a988f7f5a8 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8274550: c2i entry barriers read int as long on PPC Reviewed-by: eosterlund, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/5776 From kbarrett at openjdk.java.net Fri Oct 1 09:20:32 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 09:20:32 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected In-Reply-To: References: Message-ID: <1SM9sQDIn0W_7gccgFQVQnZz6NdOaDtMOHjOksNWjUk=.4870c0af-281f-4d06-a2d4-818a08c4d534@github.com> On Wed, 29 Sep 2021 16:53:43 GMT, Aleksey Shipilev wrote: > Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. > > Additional testing: > - [x] Affected test skipped with Shenandoah now > - [x] Affected test still passes with G1 > - [x] Affected test still passes default GC Changes requested by kbarrett (Reviewer). test/jdk/jdk/jfr/event/gc/detailed/TestGCLockerEvent.java line 29: > 27: * @key jfr > 28: * @requires vm.hasJFR > 29: * @requires vm.gc == "G1" | vm.gc == null I think this should instead be `@requires vm.gc.G1`. The null check is wrong if running with a build that doesn't include G1 and the default GC is being used. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From shade at openjdk.java.net Fri Oct 1 09:23:33 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 1 Oct 2021 09:23:33 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected In-Reply-To: <1SM9sQDIn0W_7gccgFQVQnZz6NdOaDtMOHjOksNWjUk=.4870c0af-281f-4d06-a2d4-818a08c4d534@github.com> References: <1SM9sQDIn0W_7gccgFQVQnZz6NdOaDtMOHjOksNWjUk=.4870c0af-281f-4d06-a2d4-818a08c4d534@github.com> Message-ID: <3rhqtGKoMQEOn0MTa9nKInTeycoEHd27lpeWs7yTeKw=.91dc70b1-5e42-4b59-9b52-07819112c71d@github.com> On Fri, 1 Oct 2021 09:17:28 GMT, Kim Barrett wrote: >> Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. >> >> Additional testing: >> - [x] Affected test skipped with Shenandoah now >> - [x] Affected test still passes with G1 >> - [x] Affected test still passes default GC > > test/jdk/jdk/jfr/event/gc/detailed/TestGCLockerEvent.java line 29: > >> 27: * @key jfr >> 28: * @requires vm.hasJFR >> 29: * @requires vm.gc == "G1" | vm.gc == null > > I think this should instead be `@requires vm.gc.G1`. The null check is wrong if running with a build that doesn't include G1 and the default GC is being used. True, it does look suspicious. I matched what other tests in the same folder are doing. Maybe we should fix them all at once in the followup PR? 
------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From kbarrett at openjdk.java.net Fri Oct 1 09:33:32 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 09:33:32 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected In-Reply-To: <3rhqtGKoMQEOn0MTa9nKInTeycoEHd27lpeWs7yTeKw=.91dc70b1-5e42-4b59-9b52-07819112c71d@github.com> References: <1SM9sQDIn0W_7gccgFQVQnZz6NdOaDtMOHjOksNWjUk=.4870c0af-281f-4d06-a2d4-818a08c4d534@github.com> <3rhqtGKoMQEOn0MTa9nKInTeycoEHd27lpeWs7yTeKw=.91dc70b1-5e42-4b59-9b52-07819112c71d@github.com> Message-ID: On Fri, 1 Oct 2021 09:20:35 GMT, Aleksey Shipilev wrote: >> test/jdk/jdk/jfr/event/gc/detailed/TestGCLockerEvent.java line 29: >> >>> 27: * @key jfr >>> 28: * @requires vm.hasJFR >>> 29: * @requires vm.gc == "G1" | vm.gc == null >> >> I think this should instead be `@requires vm.gc.G1`. The null check is wrong if running with a build that doesn't include G1 and the default GC is being used. > > True, it does look suspicious. I matched what other tests in the same folder are doing. Maybe we should fix them all at once in the followup PR? I don't see a reason to change this test from being wrong to being differently wrong. ------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From shade at openjdk.java.net Fri Oct 1 09:40:55 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 1 Oct 2021 09:40:55 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected [v2] In-Reply-To: References: Message-ID: > Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. 
> > Additional testing: > - [x] Affected test skipped with Shenandoah now > - [x] Affected test still passes with G1 > - [x] Affected test still passes default GC Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Proper requires ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5756/files - new: https://git.openjdk.java.net/jdk/pull/5756/files/bf70f058..90efd821 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5756&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5756&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5756.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5756/head:pull/5756 PR: https://git.openjdk.java.net/jdk/pull/5756 From shade at openjdk.java.net Fri Oct 1 09:40:56 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 1 Oct 2021 09:40:56 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected [v2] In-Reply-To: References: <1SM9sQDIn0W_7gccgFQVQnZz6NdOaDtMOHjOksNWjUk=.4870c0af-281f-4d06-a2d4-818a08c4d534@github.com> <3rhqtGKoMQEOn0MTa9nKInTeycoEHd27lpeWs7yTeKw=.91dc70b1-5e42-4b59-9b52-07819112c71d@github.com> Message-ID: On Fri, 1 Oct 2021 09:30:09 GMT, Kim Barrett wrote: >> True, it does look suspicious. I matched what other tests in the same folder are doing. Maybe we should fix them all at once in the followup PR? > > I don't see a reason to change this test from being wrong to being differently wrong. Yeah, you are right. Changed, re-tested. See new commit. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From kbarrett at openjdk.java.net Fri Oct 1 10:36:29 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 1 Oct 2021 10:36:29 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected [v2] In-Reply-To: References: Message-ID: On Fri, 1 Oct 2021 09:40:55 GMT, Aleksey Shipilev wrote: >> Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. >> >> Additional testing: >> - [x] Affected test skipped with Shenandoah now >> - [x] Affected test still passes with G1 >> - [x] Affected test still passes default GC > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Proper requires Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5756 From tschatzl at openjdk.java.net Fri Oct 1 14:00:34 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 1 Oct 2021 14:00:34 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected [v2] In-Reply-To: References: Message-ID: On Fri, 1 Oct 2021 09:40:55 GMT, Aleksey Shipilev wrote: >> Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. >> >> Additional testing: >> - [x] Affected test skipped with Shenandoah now >> - [x] Affected test still passes with G1 >> - [x] Affected test still passes default GC > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Proper requires Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5756 From never at openjdk.java.net Fri Oct 1 17:10:28 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Fri, 1 Oct 2021 17:10:28 GMT Subject: RFR: 8218885: Restore pop_frame and force_early_return functionality for Graal In-Reply-To: References: Message-ID: On Wed, 22 Sep 2021 05:40:40 GMT, Tom Rodriguez wrote: > This logic no longer seems to be necessary since the adjustCompilationLevel callback has been removed. I don't see any easy way to run the suggested command at the moment through mach5. We've already pushed this change into our JDK17 repos and once we have a snapshot GraalVM based on that I could do it, but it will probably take at least 2-3 weeks for all those bits to be promoted. Why are we running the jck jvmti tests on this since they weren't failing originally? The problem test list from the commit http://hg.openjdk.java.net/jdk/jdk12/rev/f5fd8eefae0f is all nsk and com.sun.jdi tests. I ran those locally with jtreg using a GraalVM built from a patched JDK and they all passed. Is that sufficient? ------------- PR: https://git.openjdk.java.net/jdk/pull/5625 From kvn at openjdk.java.net Fri Oct 1 17:28:33 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 1 Oct 2021 17:28:33 GMT Subject: RFR: 8218885: Restore pop_frame and force_early_return functionality for Graal In-Reply-To: References: Message-ID: On Wed, 22 Sep 2021 05:40:40 GMT, Tom Rodriguez wrote: > This logic no longer seems to be necessary since the adjustCompilationLevel callback has been removed. Yes, running locally with GraalVM is fine. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5625 From john.r.rose at oracle.com Fri Oct 1 18:35:08 2021 From: john.r.rose at oracle.com (John Rose) Date: Fri, 1 Oct 2021 18:35:08 +0000 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: <20210930140335.648146897@eggemoggin.niobe.net> References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: I second Mark's request. The OpenJDK mailing lists (like this one) and the cr.ojn pages are the official central stable record of our shared engineering notes and conversations about how we are evolving our platform. (For those few itching to have a side-conversation about all the gritty process-related reasons why that is so, I'm not your guy. But it is so.) Quick comments: Thank you. This is a very good line of research. It feels good to have other folks in other teams caring about it. The paper touched on the factors which limit EA, notably the path-independent approximation, and (most of all) the inlining decisions. I like to think of the cumulative result of the inlining decisions (of a particular compile task) as the "inlining horizon" that the task works within. Anything that happens inside the horizon can be represented accurately by the IR of the task. Anything outside the horizon has to be characterized with approximations, usually worst-case approximations. It matters where you place the horizon, and what assumptions you can make on values that come in across or go out across the horizon. One thing that I like to think about is what are the "hard cases" that force us to make weak assumptions about what happens outside the inlining horizon. Quantitative corpus analyses can sometimes be tuned to help us understand which "hard cases" are preventing optimization. Characterizing those hard cases is important because it focuses our attention where to make changes. 
Still more, because we are redefining Java as we go along, we can redefine the rules of Java so that the hard cases get less hard, not because we are more clever about what we do inside the horizon (or expanding the horizon) but because we change the rules in our favor to limit what can happen outside the horizon. Valhalla is where those rules change. Valhalla is intended to deeply change the results of EA. Not to make EA unnecessary, by any means, but rather to focus EA better on some remaining "hard cases" while allowing EA to Two (not the only two) of the bad things that can happen outside of the inlining horizon are field changes and identity reads. When a field changes outside the horizon, the IR has to model the new value somehow. We have changed the rules here by introducing final fields. In the future we will perhaps tighten the rules for "trusting" final fields not to change (some say we don't need that) and perhaps allow "frozen arrays" which provably cannot be mutated even outside an inlining horizon. Identity reads are nasty little things. Often you don't care about the precise identity of an object, just its contents. For example, Integer.valueOf(x) or List.of(x,y) have an identity, but you are not supposed to depend on it. (Don't rely on `x==y`, use `x.equals(y)`, etc.) We have written down some rules about value-based classes[1] which try to tell the programmer how to stay away from identity reads, but there are no "teeth" in that document. As a result, an object which crosses into or out of an inlining horizon may or may not have the same identity as any other object crossing the horizon (if the contents are the same, of course). [1]: https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html Valhalla puts teeth into those rules, by defining a robust type of value-based class which actively obscures its identity. (As a corollary of that it cannot be used to post side effects to its fields. 
Maybe via fields of mutable objects referred to by its fields, but not directly in its own fields.) I expect that your numbers would change greatly if there were routine use of Valhalla primitive objects wherever VBC's are in use today. There would still be non-primitive objects passing through the inlining horizon, and it would still be incrementally profitable to track them and estimate their behavior, but the profit margin would be elsewhere. This is what winning looks like, actually: If we are having a hard time playing today's EA game, we turn the tables over and start a new EA game we can win at. This can happen because we don't just write the JVM optimizer, we also write the JVM rules. Here's a way to model what Valhalla objects can do, relative to our sense of today's EA optimizations. At the IR level, you can "split" an object (just like you "split" a register allocation) by recopying it at all uses. If it were a regular Java object, you would be breaking its identity up into a bunch of path-dependent identities. If there are identity reads and/or state changes in the "split" object, you have just broken the user's program. But, if you can get away with it, you have also made your worries disappear about what can happen over the horizon, because "bad stuff" happens only to the "split" copies which have disappeared over the horizon. The local "split" copies are still sitting there in your IR graph, undisturbed. Valhalla allows (even encourages) the optimizer to do this splitting, routinely and frequently. Our prototype also records "souvenirs" of where each split came from, in case we wish to re-box on the heap; then we can reuse a souvenir if one is present. But the canonical form (apart from the souvenir) is just the tuple of field values, with no object identity (or header) in sight. (Headers are, in a sense, where object identity has its "primary residence". 
The GC uses a header to forward stateful objects accurately, and the header is also used to manage identity reads for identityHashCode and synchronization.) The design of Valhalla is informed, in part, by our understanding of the pitfalls of EA, especially with respect to the inlining horizon. We are designing our way out of a trap that prevents us from doing the EA we wish to do. Your work will get better with Valhalla; I hope it can also help us predict the effects of Valhalla. - John On Sep 30, 2021, at 2:03 PM, mark.reinhold at oracle.com wrote: > > 2021/9/30 13:51:32 -0700, divino.cesar at microsoft.com: >> I've spent the past few weeks investigating the C2 Escape Analysis >> implementation with the goal of identifying which part(s) of it would benefit >> the most from a contribution. >> >> As a conclusion to that investigation, I wrote a report where I list the most >> evident points, accompanied with a _preliminary_ quantitative analysis of how >> effective the current implementation is for finding opportunities for Scalar >> Replacement. >> >> ... >> >> Here is the link to the document: >> https://gist.github.com/JohnTortugo/c2607821202634a6509ec3c321ebf370 > > Thanks for writing this up! > > For IP clarity, could you please post a copy of this document either to > this mailing list or to your directory on cr.openjdk.java.net [1]? > > - Mark > > > [1] https://openjdk.java.net/guide/codeReview.html From Divino.Cesar at microsoft.com Fri Oct 1 18:37:53 2021 From: Divino.Cesar at microsoft.com (Cesar Soares Lucas) Date: Fri, 1 Oct 2021 18:37:53 +0000 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: <20210930140335.648146897@eggemoggin.niobe.net> References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: Hi Mark, apologies for the delay in getting back to this. As you know, I was having some issues uploading the document. That being said, now I was able to upload it. 
You can access it here: http://cr.openjdk.java.net/~cslucas/escape-analysis/EscapeAnalysis.html http://cr.openjdk.java.net/~cslucas/escape-analysis/EscapeAnalysis.md Regards, Cesar From: mark.reinhold at oracle.com Sent: September 30, 2021 2:03 PM To: Cesar Soares Lucas Cc: hotspot-dev at openjdk.java.net ; Brian Stafford ; Martijn Verburg Subject: Re: RFC - Improving C2 Escape Analysis 2021/9/30 13:51:32 -0700, divino.cesar at microsoft.com: > I've spent the past few weeks investigating the C2 Escape Analysis > implementation with the goal of identifying which part(s) of it would benefit > the most from a contribution. > > As a conclusion to that investigation, I wrote a report where I list the most > evident points, accompanied with a _preliminary_ quantitative analysis of how > effective the current implementation is for finding opportunities for Scalar > Replacement. > > ... > > Here is the link to the document: > https://gist.github.com/JohnTortugo/c2607821202634a6509ec3c321ebf370 Thanks for writing this up! For IP clarity, could you please post a copy of this document either to this mailing list or to your directory on cr.openjdk.java.net [1]? 
- Mark [1] https://openjdk.java.net/guide/codeReview.html From mark.reinhold at oracle.com Fri Oct 1 18:39:40 2021 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Fri, 01 Oct 2021 11:39:40 -0700 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: <20211001113940.357926487@eggemoggin.niobe.net> 2021/10/1 11:37:53 -0700, divino.cesar at microsoft.com: > Hi Mark, > > apologies for the delay in getting back to this. As you know, I was > having some issues uploading the document. That being said, now I was > able to upload it. You can access it here: > > http://cr.openjdk.java.net/~cslucas/escape-analysis/EscapeAnalysis.html > http://cr.openjdk.java.net/~cslucas/escape-analysis/EscapeAnalysis.md Great -- thanks! - Mark From dlong at openjdk.java.net Fri Oct 1 20:34:29 2021 From: dlong at openjdk.java.net (Dean Long) Date: Fri, 1 Oct 2021 20:34:29 GMT Subject: RFR: 8218885: Restore pop_frame and force_early_return functionality for Graal In-Reply-To: References: Message-ID: On Wed, 22 Sep 2021 05:40:40 GMT, Tom Rodriguez wrote: > This logic no longer seems to be necessary since the adjustCompilationLevel callback has been removed. Marked as reviewed by dlong (Reviewer). Make sure to test with -Xcomp. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5625 From sspitsyn at openjdk.java.net Fri Oct 1 20:56:30 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Fri, 1 Oct 2021 20:56:30 GMT Subject: RFR: 8218885: Restore pop_frame and force_early_return functionality for Graal In-Reply-To: References: Message-ID: On Wed, 22 Sep 2021 05:40:40 GMT, Tom Rodriguez wrote: > This logic no longer seems to be necessary since the adjustCompilationLevel callback has been removed. Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5625 From sspitsyn at openjdk.java.net Fri Oct 1 20:56:30 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Fri, 1 Oct 2021 20:56:30 GMT Subject: RFR: 8218885: Restore pop_frame and force_early_return functionality for Graal In-Reply-To: References: Message-ID: On Fri, 1 Oct 2021 17:07:23 GMT, Tom Rodriguez wrote: > Why are we running the jck jvmti tests on this since they weren't failing originally? The problem test list from the commit http://hg.openjdk.java.net/jdk/jdk12/rev/f5fd8eefae0f is all nsk and com.sun.jdi tests. The reason for disabling PopFrame and ForceEarlyReturn with Graal was the failing jck tests. They are not JTreg tests; they are handled differently and have no ProblemList. I'd suggest integrating your fix now and dealing with this potential problem when Graal is back.
------------- PR: https://git.openjdk.java.net/jdk/pull/5625 From github.com+2249648+johntortugo at openjdk.java.net Sat Oct 2 00:14:43 2021 From: github.com+2249648+johntortugo at openjdk.java.net (John Tortugo) Date: Sat, 2 Oct 2021 00:14:43 GMT Subject: RFR: 8267265: Use new IR Test Framework to create tests for C2 IGV transformations [v4] In-Reply-To: References: <8Ce6bZtHwGEw8_wXZz4ak3obprd1YmZDi4cItcXB4bA=.a7162709-7aad-4709-a585-d2391392f49b@github.com> Message-ID: <841LgVYYimcsO_aZRCDU_6EBvNVXXeGbVkGfJ6gAQrg=.5978eea7-4753-465d-a627-ec1625a504ce@github.com> On Thu, 16 Sep 2021 07:40:58 GMT, Christian Hagedorn wrote: >> John Tortugo has updated the pull request incrementally with 146 additional commits since the last revision: >> >> - Fix merge mistake. >> - Merge branch 'jdk-8267265' of https://github.com/JohnTortugo/jdk into jdk-8267265 >> - Addressing PR feedback: move tests to other directory, add custom tests, add tests for other optimizations, rename some tests. >> - 8273197: ProblemList 2 jtools tests due to JDK-8273187 >> 8273198: ProblemList java/lang/instrument/BootClassPath/BootClassPathTest.sh due to JDK-8273188 >> >> Reviewed-by: naoto >> - 8262186: Call X509KeyManager.chooseClientAlias once for all key types >> >> Reviewed-by: xuelei >> - 8273186: Remove leftover comment about sparse remembered set in G1 HeapRegionRemSet >> >> Reviewed-by: ayang >> - 8273169: java/util/regex/NegativeArraySize.java failed after JDK-8271302 >> >> Reviewed-by: jiefu, serb >> - 8273092: Sort classlist in JDK image >> >> Reviewed-by: redestad, ihse, dfuchs >> - 8273144: Remove unused top level "Sample Collection Set Candidates" logging >> >> Reviewed-by: iwalulya, ayang >> - 8262095: NPE in Flow$FlowAnalyzer.visitApply: Cannot invoke getThrownTypes because tree.meth.type is null >> >> Co-authored-by: Jan Lahoda >> Co-authored-by: Vicente Romero >> Reviewed-by: jlahoda >> - ... 
and 136 more: https://git.openjdk.java.net/jdk/compare/ac430bf7...463102e2 > > test/hotspot/jtreg/compiler/c2/irTests/AddINodeIdealizationTests.java line 40: > >> 38: >> 39: @Test >> 40: @IR(failOn = {IRNode.LOAD, IRNode.STORE, IRNode.MUL, IRNode.DIV, IRNode.SUB}) > > In this test and all the following ones (including the other files), I think you can remove unrelated `failOn` regexes on operations that are not part of the test. For example, in this test you can safely remove `IRNode.MUL, DIV, and SUB`. Do you think I can remove the "LOAD" and "STORE" as well? ------------- PR: https://git.openjdk.java.net/jdk/pull/5135 From xliu at openjdk.java.net Mon Oct 4 06:39:31 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Mon, 4 Oct 2021 06:39:31 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. 
> > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... > 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) Xin Liu has updated the pull request incrementally with one additional commit since the last revision: Cleanup: remove a statment of debugging. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5590/files - new: https://git.openjdk.java.net/jdk/pull/5590/files/451cdda3..a9d439ad Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5590/head:pull/5590 PR: https://git.openjdk.java.net/jdk/pull/5590 From coleenp at openjdk.java.net Mon Oct 4 06:51:50 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 4 Oct 2021 06:51:50 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex Message-ID: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. The transformation in this change is as follows: nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) leaf => nonleaf - Many of these locks were top level locks leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock leaf-n => nonleaf-n The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. This has been tested with tier1-8, and retesting tier1-3 locally in progress. ------------- Commit messages: - Fix lock name. 
- 8273917: Remove 'leaf' ranking for Mutex Changes: https://git.openjdk.java.net/jdk/pull/5801/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8273917 Stats: 139 lines in 11 files changed: 54 ins; 23 del; 62 mod Patch: https://git.openjdk.java.net/jdk/pull/5801.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5801/head:pull/5801 PR: https://git.openjdk.java.net/jdk/pull/5801 From coleenp at openjdk.java.net Mon Oct 4 06:51:52 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 4 Oct 2021 06:51:52 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex In-Reply-To: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: <7xeQ9UDvFVktUSHsIWJf0j47Mp4b9E48qM-np75wjqw=.46ddd7cd-f9b0-4ed8-9ce8-cfe611918a63@github.com> On Sun, 3 Oct 2021 21:08:51 GMT, Coleen Phillimore wrote: > This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. > > The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. > > The transformation in this change is as follows: > nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) > leaf => nonleaf - Many of these locks were top level locks > leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock > leaf-n => nonleaf-n > > The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. 
> > This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. > > This has been tested with tier1-8, and retesting tier1-3 locally in progress. src/hotspot/share/oops/methodData.cpp line 1210: > 1208: : _method(method()), > 1209: // Holds Compile_lock > 1210: _extra_data_lock(Mutex::nonleaf-2, "MDOExtraData_lock", Mutex::_safepoint_check_always), MDO lock (nonleaf-2) is taken when the Compile_lock is held (nonleaf-1) which is taken while the MethodCompileQueue_lock (nonleaf) is held. ------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From simonis at openjdk.java.net Mon Oct 4 07:51:27 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Mon, 4 Oct 2021 07:51:27 GMT Subject: RFR: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow [v3] In-Reply-To: References: Message-ID: > Currently, if running with `-XX:-OmitStackTraceInFastThrow`, C2 has no possibility to create implicit exceptions like AIOOBE, NullPointerExceptions, etc. in compiled code. This means that such methods will always be deoptimized and re-executed in the interpreter if such exceptions are happening. > > If implicit exceptions are used for normal control flow, that can have a dramatic impact on performance. A prominent example for such code is [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274): > > public static boolean isAlpha(int c) { > try { > return IS_ALPHA[c]; > } catch (ArrayIndexOutOfBoundsException ex) { > return false; > } > } > > > ### Solution > > Instead of deoptimizing and resorting to the interpreter, we can generate code which allocates and initializes the corresponding exceptions right in compiled code. 
This results in a ten-times performance improvement for the above code:
>
> -XX:-OmitStackTraceInFastThrow -XX:-OptimizeImplicitExceptions
> Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
> ImplicitExceptions.bench                     0.0  avgt    5      1.430 ±    0.353  ns/op
> ImplicitExceptions.bench                    0.33  avgt    5   3563.038 ±   77.358  ns/op
> ImplicitExceptions.bench                    0.66  avgt    5   8609.693 ± 1205.104  ns/op
> ImplicitExceptions.bench                    1.00  avgt    5  12842.401 ± 1022.728  ns/op
>
> -XX:-OmitStackTraceInFastThrow -XX:+OptimizeImplicitExceptions
> Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
> ImplicitExceptions.bench                     0.0  avgt    5      1.432 ±    0.352  ns/op
> ImplicitExceptions.bench                    0.33  avgt    5    355.723 ±   16.641  ns/op
> ImplicitExceptions.bench                    0.66  avgt    5    887.068 ±  166.728  ns/op
> ImplicitExceptions.bench                    1.00  avgt    5   1274.418 ±   88.235  ns/op
>
> ### Implementation details
>
> - The new optimization is guarded by the option `OptimizeImplicitExceptions` which is on by default.
> - In `GraphKit::builtin_throw()` we can't simply use `CallGenerator::for_direct_call()` to create a `DirectCallGenerator` for the call to the exception's `<init>` function because `DirectCallGenerator` assumes in various places that calls are only issued at `invoke*` bytecodes. This is not true in general for bytecode which can cause an implicit exception.
> - Instead, we manually wire up the call based on the code in `DirectCallGenerator::generate()`.
> - We use a trick similar to the one for method handle intrinsics, where the callee from the bytecode is replaced by a direct call and this fact is recorded in the call's `_override_symbolic_info` field. For calling constructors of implicit exceptions I've introduced the new field `_implicit_exception_init`. This field is also used in various assertions to prevent queries for the bytecode's symbolic method information which doesn't exist because we're not at an `invoke*` bytecode at the place where we generate the call.
> - The PR contains a micro-benchmark which compares the old and the new implementation for [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274). Except for the trivial case where the exception probability is 0 (i.e. no exceptions are happening at all) the new implementation is about 10 times faster. Volker Simonis has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Fix special case where we're creating an implicit exception for a regular invoke* bytecode - Minor updates as requested by @TheRealMDoerr - 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow ------------- Changes: https://git.openjdk.java.net/jdk/pull/5488/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5488&range=02 Stats: 208 lines in 10 files changed: 200 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/5488.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5488/head:pull/5488 PR: https://git.openjdk.java.net/jdk/pull/5488 From chagedorn at openjdk.java.net Mon Oct 4 08:38:10 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 4 Oct 2021 08:38:10 GMT Subject: RFR: 8267265: Use new IR Test Framework to create tests for C2 IGV transformations [v4] In-Reply-To: <841LgVYYimcsO_aZRCDU_6EBvNVXXeGbVkGfJ6gAQrg=.5978eea7-4753-465d-a627-ec1625a504ce@github.com> References: <8Ce6bZtHwGEw8_wXZz4ak3obprd1YmZDi4cItcXB4bA=.a7162709-7aad-4709-a585-d2391392f49b@github.com> <841LgVYYimcsO_aZRCDU_6EBvNVXXeGbVkGfJ6gAQrg=.5978eea7-4753-465d-a627-ec1625a504ce@github.com> Message-ID: <5ELOnKEloqyJk_gHgnvU2zJWDhxtjkByCQZgPLCHptA=.f26a20ac-fc1f-4a58-91dd-7a02994a5e7d@github.com> On Sat, 2 Oct 2021 00:11:47 GMT, John Tortugo wrote: >> test/hotspot/jtreg/compiler/c2/irTests/AddINodeIdealizationTests.java line 40: >> >>> 38: >>> 
39: @Test >>> 40: @IR(failOn = {IRNode.LOAD, IRNode.STORE, IRNode.MUL, IRNode.DIV, IRNode.SUB}) >> >> In this test and all the following ones (including the other files), I think you can remove unrelated `failOn` regexes on operations that are not part of the test. For example, in this test you can safely remove `IRNode.MUL, DIV, and SUB`. > > Do you think I can remove the "LOAD" and "STORE" as well? Yes, I think you can remove them, too, as long as there are no fields involved. ------------- PR: https://git.openjdk.java.net/jdk/pull/5135 From shade at openjdk.java.net Mon Oct 4 12:26:08 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 4 Oct 2021 12:26:08 GMT Subject: RFR: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected [v2] In-Reply-To: References: Message-ID: <8rsKPdZ0tR4JdWDcR09uDujl-GrFLPAFgp-5sLP8Ch8=.2dbd88c4-44be-4909-9ed6-2585e534c6cc@github.com> On Fri, 1 Oct 2021 09:40:55 GMT, Aleksey Shipilev wrote: >> Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. Other tests in the same directory do exactly that. >> >> Additional testing: >> - [x] Affected test skipped with Shenandoah now >> - [x] Affected test still passes with G1 >> - [x] Affected test still passes default GC > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Proper requires Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From shade at openjdk.java.net Mon Oct 4 12:26:09 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 4 Oct 2021 12:26:09 GMT Subject: Integrated: 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected In-Reply-To: References: Message-ID: On Wed, 29 Sep 2021 16:53:43 GMT, Aleksey Shipilev wrote: > Simple test bug, need to check if G1 is enabled, or it is a default GC before asking `othervm` with `-XX:+UseG1GC`. 
Other tests in the same directory do exactly that. > > Additional testing: > - [x] Affected test skipped with Shenandoah now > - [x] Affected test still passes with G1 > - [x] Affected test still passes default GC This pull request has now been integrated. Changeset: 0828273b Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/0828273b898cca5368344e75f1c3f4c3a29dde80 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8274521: jdk/jfr/event/gc/detailed/TestGCLockerEvent.java fails when other GC is selected Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/5756 From coleenp at openjdk.java.net Mon Oct 4 14:10:29 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 4 Oct 2021 14:10:29 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v2] In-Reply-To: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: > This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. > > The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. > > The transformation in this change is as follows: > nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) > leaf => nonleaf - Many of these locks were top level locks > leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock > leaf-n => nonleaf-n > > The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. 
> > This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. > > This has been tested with tier1-8, and retesting tier1-3 locally in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix Minimal build. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5801/files - new: https://git.openjdk.java.net/jdk/pull/5801/files/34c43ec0..4596df6f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5801.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5801/head:pull/5801 PR: https://git.openjdk.java.net/jdk/pull/5801 From Divino.Cesar at microsoft.com Mon Oct 4 21:32:37 2021 From: Divino.Cesar at microsoft.com (Cesar Soares Lucas) Date: Mon, 4 Oct 2021 21:32:37 +0000 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: Hi John, Thank you for the feedback and for clarifying how Valhalla will impact Escape Analysis and Scalar Replacement in C2. I'm excited to know that it will bring so much goodness! I think it's safe to say that the Inlining horizon and the presence of control/data-flow merge points in the compilation unit are the major hurdles to the effectiveness of the current EA and dependent optimizations. I understand that Valhalla (Inline Types) will help tremendously with the inlining horizon aspect of the problem. How much of a problem is control-flow/data-flow merge points in a world where Inline Types are in place? From my (external) point of view, I can imagine that this will still be a problem.
If so, it looks to me that this is an issue that once solved would benefit not only the current EA implementation but also Valhalla. What do you [all] think? You mention about the "remaining [EA] hard cases once Inline Types are in place". Can you please expand a little bit on that? Thank you again for taking the time to read the report and provide your perspective! Regards, Cesar From: John Rose Sent: October 1, 2021 11:35 AM To: Mark Reinhold Cc: Cesar Soares Lucas ; hotspot-dev at openjdk.java.net ; Brian Stafford ; Martijn Verburg Subject: Re: RFC - Improving C2 Escape Analysis I second Mark's request. The OpenJDK mailing lists (like this one) and the cr.ojn pages are the official central stable record of our shared engineering notes and conversations about how we are evolving our platform. (For those few itching to have a side-conversation about all the gritty process-related reasons why that is so, I'm not your guy. But it is so.) Quick comments: Thank you. This is a very good line of research. It feels good to have other folks in other teams caring about it. The paper touched on the factors which limit EA, notably the path-independent approximation, and (most of all) the inlining decisions. I like to think of the cumulative result of the inlining decisions (of a particular compile task) as the "inlining horizon" that the task works within. Anything that happens inside the horizon can be represented accurately by the IR of the task. Anything outside the horizon has to be characterized with approximations, usually worst-case approximations. It matters where you place the horizon, and what assumptions you can make on values that come in across or go out across the horizon. One thing that I like to think about is what are the "hard cases" that force us to make weak assumptions about what happens outside the inlining horizon. Quantitative corpus analyses can sometimes be tuned to help us understand which "hard cases" are preventing optimization.
Characterizing those hard cases is important because it focuses our attention on where to make changes. Still more, because we are redefining Java as we go along, we can redefine the rules of Java so that the hard cases get less hard, not because we are more clever about what we do inside the horizon (or expanding the horizon) but because we change the rules in our favor to limit what can happen outside the horizon. Valhalla is where those rules change. Valhalla is intended to deeply change the results of EA. Not to make EA unnecessary, by any means, but rather to focus EA better on some remaining "hard cases" while allowing EA to Two (not the only two) of the bad things that can happen outside of the inlining horizon are field changes and identity reads. When a field changes outside the horizon, the IR has to model the new value somehow. We have changed the rules here by introducing final fields. In the future we will perhaps tighten the rules for "trusting" final fields not to change (some say we don't need that) and perhaps allow "frozen arrays" which provably cannot be mutated even outside an inlining horizon. Identity reads are nasty little things. Often you don't care about the precise identity of an object, just its contents. For example, Integer.valueOf(x) or List.of(x,y) have an identity, but you are not supposed to depend on it. (Don't rely on `x==y`, use `x.equals(y)`, etc.) We have written down some rules about value-based classes[1] which try to tell the programmer how to stay away from identity reads, but there are no "teeth" in that document. As a result, an object which crosses into or out of an inlining horizon may or may not have the same identity as any other object crossing the horizon (if the contents are the same, of course).
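A minimal Java sketch of the identity-read hazard described above. The class name `IdentityReadDemo` and its two helper methods are hypothetical, chosen only for illustration; the only library calls used are `Integer.valueOf`, `List.of`, and `Object.equals`. The results of the `==` comparisons are deliberately not stated, since they can depend on caching and on how the objects were produced, which is the reason code should not depend on them.

```java
import java.util.List;

final class IdentityReadDemo {
    // Content comparison: well defined for value-based classes.
    static boolean sameContents(Object a, Object b) {
        return a.equals(b);
    }

    // Identity read: for value-based classes such as Integer, or the lists
    // returned by List.of, a program should not rely on this result.
    static boolean sameIdentity(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        Integer x = Integer.valueOf(100_000);
        Integer y = Integer.valueOf(100_000);
        List<Integer> p = List.of(1, 2);
        List<Integer> q = List.of(1, 2);

        // Contents are equal in both cases.
        System.out.println("Integer contents equal: " + sameContents(x, y));
        System.out.println("List contents equal:    " + sameContents(p, q));

        // Identity is unspecified: the output of these two lines may vary.
        System.out.println("Integer same identity:  " + sameIdentity(x, y));
        System.out.println("List same identity:     " + sameIdentity(p, q));
    }
}
```

Code that only ever calls something like `sameContents` leaves an optimizer free to recopy or rebuffer such objects; a single call like `sameIdentity` is an identity read that removes that freedom.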
[1]: https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.oracle.com%2Fjavase%2F8%2Fdocs%2Fapi%2Fjava%2Flang%2Fdoc-files%2FValueBased.html&data=04%7C01%7Cdivino.cesar%40microsoft.com%7C36b5dce31fce47979a1a08d9850a39f5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637687101252541730%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=KA8Buahfp6YwXHxGJoc738RHPG%2FoLf27MDbBIoQeR5c%3D&reserved=0 Valhalla puts teeth into those rules, by defining a robust type of value-based class which actively obscures its identity. (As a corollary of that it cannot be used to post side effects to its fields. Maybe via fields of mutable objects referred to by its fields, but not directly in its own fields.) I expect that your numbers would change greatly if there were routine use of Valhalla primitive objects wherever VBC's are in use today. There would still be non-primitive objects passing through the inlining horizon, and it would still be incrementally profitable to track them and estimate their behavior, but the profit margin would be elsewhere. This is what winning looks like, actually: If we are having a hard time playing today's EA game, we turn the tables over and start a new EA game we can win at. This can happen because we don't just write the JVM optimizer, we also write the JVM rules. Here's a way to model what Valhalla objects can do, relative to our sense of today's EA optimizations. At the IR level, you can "split" an object (just like you "split" a register allocation) by recopying it at all uses. If it were a regular Java object, you would be breaking its identity up into a bunch of path-dependent identities. If there are identity reads and/or state changes in the "split" object, you have just broken the user's program. But, if you can get away with it, you have also made your worries disappear about what can happen over the horizon, because "bad stuff" happens only to the "split"
copies which have disappeared over the horizon. The local "split" copies are still sitting there in your IR graph, undisturbed. Valhalla allows (even encourages) the optimizer to do this splitting, routinely and frequently. Our prototype also records "souvenirs" of where each split came from, in case we wish to re-box on the heap; then we can reuse a souvenir if one is present. But the canonical form (apart from the souvenir) is just the tuple of field values, with no object identity (or header) in sight. (Headers are, in a sense, where object identity has its "primary residence". The GC uses a header to forward stateful objects accurately, and the header is also used to manage identity reads for identityHashCode and synchronization.) The design of Valhalla is informed, in part, by our understanding of the pitfalls of EA, especially with respect to the inlining horizon. We are designing our way out of a trap that prevents us from doing the EA we wish to do. Your work will get better with Valhalla; I hope it can also help us predict the effects of Valhalla. -- John On Sep 30, 2021, at 2:03 PM, mark.reinhold at oracle.com wrote: > > 2021/9/30 13:51:32 -0700, divino.cesar at microsoft.com: >> I've spent the past few weeks investigating the C2 Escape Analysis >> implementation with the goal of identifying which part(s) of it would benefit >> the most from a contribution. >> >> As a conclusion to that investigation, I wrote a report where I list the most >> evident points, accompanied with a _preliminary_ quantitative analysis of how >> effective the current implementation is for finding opportunities for Scalar >> Replacement. >> >> ...
>> >> Here is the link to the document: >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgist.github.com%2FJohnTortugo%2Fc2607821202634a6509ec3c321ebf370&data=04%7C01%7Cdivino.cesar%40microsoft.com%7C36b5dce31fce47979a1a08d9850a39f5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637687101252541730%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=KjGpgxl1L1PZngcwtFuNOlgDlGsMiBxzHbu8J8h0mkc%3D&reserved=0 > > Thanks for writing this up! > > For IP clarity, could you please post a copy of this document either to > this mailing list or to your directory on cr.openjdk.java.net [1]? > > - Mark > > > [1] https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fopenjdk.java.net%2Fguide%2FcodeReview.html&data=04%7C01%7Cdivino.cesar%40microsoft.com%7C36b5dce31fce47979a1a08d9850a39f5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637687101252541730%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=yrXC%2F8%2BRuL5Jo245KrNK%2FAaYJW%2FWLCOng2rSRRx8TWE%3D&reserved=0 From john.r.rose at oracle.com Mon Oct 4 22:20:02 2021 From: john.r.rose at oracle.com (John Rose) Date: Mon, 4 Oct 2021 22:20:02 +0000 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: On Oct 4, 2021, at 2:32 PM, Cesar Soares Lucas wrote: > > ... > I think it's safe to say that the Inlining horizon and the presence of > control/data-flow merge points in the compilation unit are the major hurdles to > the effectiveness of the current EA and dependent optimizations. I understand > that Valhalla (Inline Types) will help tremendously with the inlining horizon > aspect of the problem. How much of a problem is control-flow/data-flow merge > points in a world where Inline Types are in place? There's an easy thought experiment or two here: Suppose a class V is known to be identity-insensitive.
This means a physical copy of V can be recopied into another location (on the heap or even registers or stack) with no violation of semantics. If there is some use u(v) of a V-value v, we can pass the physical representation of v through a copy operation c(v) which preserves the value but physically rebuffers it somewhere else, so that the new graph is u(c(v)). We call this "splitting the use" u(v). Experiment 1: At the horizon of the current task, split all incoming and outgoing uses of type V. Check how this transform affects your favorite EA. I think you suddenly don't care about identity escapes of those values, beyond the horizon, because they take up new identities there. Experiment 2: In the current task, split all uses of type V that touch phis. (Where there are already copies, of the form c(v), use the identity c(c(v)) = c(v).) Again, check how this affects your favorite flow-sensitive EA. Now I think you have decoupled questions of identity from the control flow graph. The limiting case is to treat values of type V as always splitting. And you probably don't ever need explicit c-nodes. The c notation helps to explain how V-type references behave differently from normal references, in classic EA algorithms. > From my (external) point of > view, I can imagine that this will still be a problem. What's an example of such a problem, that remains after uses are split with the c-operator? (There's an issue of removing needless splits, but if the c-operator is never materialized, then the problem reduces, I think, to tracking "souvenirs" of last-known bufferings of given value bundles.) > If so, it looks to me > that this is an issue that once solved would benefit not only the current EA > implementation but also Valhalla. What do you [all] think? > > You mention about the "remaining [EA] hard cases once Inline Types are in > place". Can you please expand a little bit on that? Oops, I left a sentence incomplete there.
It should have been simply:

Not to make EA unnecessary, by any means, but rather to focus EA better on some remaining "hard cases".

The hard cases continue to be behavior of objects outside of the inlining horizon that might disrupt optimistic optimizations inside the horizon. A couple of examples:

A. We might wish to use thread-local allocation for "confined" objects which never leave the thread, but which need to be accessed as if they were on the heap. (Stack allocation is a *special case* of thread-local allocation: It requires an extra assurance that the object lives only during the lifetime of a stack frame. BTW I think concurrent cross-thread access to thread-local objects is likely to be very buggy, which is why I draw the line at confinement, not the different line of "during the *global* lifetime of a stack frame".)

B. We might wish to vectorize over data sources and sinks assuming (after a check perhaps) that the memory streams are non-overlapping with writes we are vectorizing (a "noalias" condition).

A hard case of A. is when code might let a mostly-confined reference escape to somewhere another thread can see it. There are many ways to build a fence around this, such as static inter-task analysis beyond the horizon (what is called "interprocedural", but the JIT task not the procedure is the key unit). Or GC barriers that detect escapes dynamically.

A hard case of B. is similar, but you might test for different conditions, trying to prove that distinct alias categories stay distinct beyond the horizon. Actually, I don't have any good suggestions for that, other than the usual technique of predicating on a non-overlap condition inside the horizon, before a loop that needs the condition.

Even if we use only static inter-task techniques to analyze and summarize the global access to references (passed to and from method calls and data structures), there is probably always a dynamic component.
At least, if you add a new subclass you probably have to adjoin the summaries of its overriding methods, if you are relying on summaries of related methods; in the worst case you might have to retract optimizations in running code (as we do when we de-opt when a devirtualization is no longer correct).

Turning that around, there might be cases where a reference *might* escape but it dynamically *does not*. (Example: A so-far-never-taken path makes a reference escape, where the other paths keep it confined.) We might impose dynamic checks on methods whose behavior we have summarized (for the benefit of compile tasks which do not inline them), so that the summary can be sharper, at the cost of doing de-opt if the sharper summary is no longer valid.

There's a frontier here, of more and more clever and aggressive and speculative optimizations we can do. I think it helps to have them operate on fewer (larger?) objects. So I guess what I'm hoping for here is that if many small objects are made identity-free, then we can better use limited resources of time, space, and engineering cleverness to track the remaining objects whose access we wish to optimize.

> Thank you again for taking the time to read the report and provide your
> perspective!

My pleasure!

From dholmes at openjdk.java.net Tue Oct 5 01:48:08 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 5 Oct 2021 01:48:08 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4]
In-Reply-To: <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com>
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com>
Message-ID:

On Mon, 4 Oct 2021 06:39:31 GMT, Xin Liu wrote:

>> This patch allows the custom commands of OnError to attach to HotSpot itself.
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. 
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup: remove a statment of debugging. Hi Xin, A couple of minor nits below but otherwise this seems okay now. Thanks, David src/hotspot/share/runtime/mutexLocker.cpp line 376: > 374: } > 375: > 376: void unlock_locks_on_error(JavaThread* thread) { Nit: s/thread/current/ as we must be applying this to the current thread. src/hotspot/share/runtime/mutexLocker.hpp line 176: > 174: void print_owned_locks_on_error(outputStream* st); > 175: // Unlock all Mutex/Monitors currently owned by a JavaThread. > 176: void unlock_locks_on_error(JavaThread* t); Nit: s/t/current/ as we must be applying this to the current thread. Also as a matter of style the parameter names in the declaration should match those in the definition. test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 26: > 24: /* > 25: * @test TestOutOfMemoryErrorFromNIO > 26: * @bug 8155004 8273608 This shouldn't reference 8155004 as it makes no sense to claim to be a regression test for an issue that did not generate a code change. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5590 From mdoerr at openjdk.java.net Tue Oct 5 10:36:19 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 5 Oct 2021 10:36:19 GMT Subject: RFR: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers Message-ID: The default implementation of BarrierSetAssembler::resolve_jobject should be generic and support all GCs. Single GCs can still implement an optimized version. ------------- Commit messages: - 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers Changes: https://git.openjdk.java.net/jdk/pull/5821/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5821&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274770 Stats: 34 lines in 3 files changed: 31 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5821.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5821/head:pull/5821 PR: https://git.openjdk.java.net/jdk/pull/5821 From coleenp at openjdk.java.net Tue Oct 5 13:31:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 5 Oct 2021 13:31:10 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v2] In-Reply-To: References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: On Mon, 4 Oct 2021 14:10:29 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. 
>> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix Minimal build. According to my logging, the shenandoah locks don't have any dependencies when running runThese with shenandoah, but could someone confirm this is ok to make them nonleaf? @zhengyu123 ? 
$ grep -r ShenandoahAllocFailureGC_lock
38_ShenandoahAllocFailureGC_lock:def(ShenandoahAllocFailureGC_lock , PaddedMonitor, nonleaf, _safepoint_check_always, true);
$ grep -r ShenandoahRequestedGC_lock
38_ShenandoahRequestedGC_lock:def(ShenandoahRequestedGC_lock , PaddedMonitor, nonleaf, _safepoint_check_always, true);

_wait_monitor (should rename to a consistent lock name) depends on PeriodicTask_lock:

$ grep -r PeriodicTask_lock
38_PeriodicTask_lock:def(PeriodicTask_lock , PaddedMonitor, nonleaf, _safepoint_check_always, true);
37__wait_monitor:def(_wait_monitor , PaddedMonitor, PeriodicTask_lock, _safepoint_check_always, true);

-------------

PR: https://git.openjdk.java.net/jdk/pull/5801

From coleenp at openjdk.java.net Tue Oct 5 14:35:31 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 5 Oct 2021 14:35:31 GMT
Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3]
In-Reply-To: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com>
References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com>
Message-ID: <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com>

> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always.
>
> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank.
> > The transformation in this change is as follows: > nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) > leaf => nonleaf - Many of these locks were top level locks > leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock > leaf-n => nonleaf-n > > The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. > > This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. > > This has been tested with tier1-8, and retesting tier1-3 locally in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Adjust some nonleaf-2 ranks. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5801/files - new: https://git.openjdk.java.net/jdk/pull/5801/files/4596df6f..bc4022c6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=01-02 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5801.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5801/head:pull/5801 PR: https://git.openjdk.java.net/jdk/pull/5801 From mdoerr at openjdk.java.net Tue Oct 5 15:39:08 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 5 Oct 2021 15:39:08 GMT Subject: RFR: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow [v3] In-Reply-To: References: Message-ID: On Mon, 4 Oct 2021 07:51:27 GMT, Volker Simonis wrote: >> Currently, if running with `-XX:-OmitStackTraceInFastThrow`, C2 has no possibility to create implicit exceptions like AIOOBE, NullPointerExceptions, etc. in compiled code. 
This means that such methods will always be deoptimized and re-executed in the interpreter if such exceptions are happening.
>>
>> If implicit exceptions are used for normal control flow, that can have a dramatic impact on performance. A prominent example for such code is [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274):
>>
>>     public static boolean isAlpha(int c) {
>>         try {
>>             return IS_ALPHA[c];
>>         } catch (ArrayIndexOutOfBoundsException ex) {
>>             return false;
>>         }
>>     }
>>
>>
>> ### Solution
>>
>> Instead of deoptimizing and resorting to the interpreter, we can generate code which allocates and initializes the corresponding exceptions right in compiled code. This results in a ten-times performance improvement for the above code:
>>
>>     -XX:-OmitStackTraceInFastThrow -XX:-OptimizeImplicitExceptions
>>     Benchmark                  (exceptionProbability)  Mode  Cnt      Score      Error  Units
>>     ImplicitExceptions.bench                      0.0  avgt    5      1.430 ±    0.353  ns/op
>>     ImplicitExceptions.bench                     0.33  avgt    5   3563.038 ±   77.358  ns/op
>>     ImplicitExceptions.bench                     0.66  avgt    5   8609.693 ± 1205.104  ns/op
>>     ImplicitExceptions.bench                     1.00  avgt    5  12842.401 ± 1022.728  ns/op
>>
>>     -XX:-OmitStackTraceInFastThrow -XX:+OptimizeImplicitExceptions
>>     Benchmark                  (exceptionProbability)  Mode  Cnt      Score      Error  Units
>>     ImplicitExceptions.bench                      0.0  avgt    5      1.432 ±    0.352  ns/op
>>     ImplicitExceptions.bench                     0.33  avgt    5    355.723 ±   16.641  ns/op
>>     ImplicitExceptions.bench                     0.66  avgt    5    887.068 ±  166.728  ns/op
>>     ImplicitExceptions.bench                     1.00  avgt    5   1274.418 ±   88.235  ns/op
>>
>>
>> ### Implementation details
>>
>> - The new optimization is guarded by the option `OptimizeImplicitExceptions` which is on by default.
>> - In `GraphKit::builtin_throw()` we can't simply use `CallGenerator::for_direct_call()` to create a `DirectCallGenerator` for the call to the exception's `<init>` function because `DirectCallGenerator` assumes in various places that calls are only issued at `invoke*` bytecodes. This is not true in general for bytecode which can cause an implicit exception.
>> - Instead, we manually wire up the call based on the code in `DirectCallGenerator::generate()`.
>> - We use a trick similar to the one for method handle intrinsics, where the callee from the bytecode is replaced by a direct call and this fact is recorded in the call's `_override_symbolic_info` field. For calling constructors of implicit exceptions I've introduced the new field `_implicit_exception_init`. This field is also used in various assertions to prevent queries for the bytecode's symbolic method information which doesn't exist because we're not at an `invoke*` bytecode at the place where we generate the call.
>> - The PR contains a micro-benchmark which compares the old and the new implementation for [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274). Except for the trivial case where the exception probability is 0 (i.e. no exceptions are happening at all) the new implementation is about 10 times faster.
>
> Volker Simonis has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
>
> - Fix special case where we're creating an implicit exception for a regular invoke* bytecode
> - Minor updates as requested by @TheRealMDoerr
> - 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow

Thanks for the update! New version looks good to me and our tests with disabled OmitStackTraceInFastThrow have passed.
I think you should choose a couple of jtreg tests and add an additional run with OmitStackTraceInFastThrow disabled. What do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/5488 From coleenp at openjdk.java.net Tue Oct 5 15:59:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 5 Oct 2021 15:59:12 GMT Subject: RFR: 8264707: HotSpot Style Guide should permit use of lambda [v2] In-Reply-To: References: Message-ID: <_-oWSNfnIGybhYO8pO5MxwXlAS9ulUZDx8peHJQkzFk=.8f4e8fb3-616f-423c-b391-319b12b6ff57@github.com> On Sat, 25 Sep 2021 00:14:50 GMT, Kim Barrett wrote: >> src/hotspot/share/compiler/compilerOracle.cpp line 787: >> >>> 785: >>> 786: char* original_line = os::strdup(line, mtInternal); >>> 787: auto g = make_guard([&] { os::free(original_line); }); >> >> I have to admit that this usage is very mysterious. I know you're going to take it out but using lambdas for something that's really hard to parse doesn't seem like a win. Does save lines of code though. > > I think a different approach may be possible, that would have usage like this: > > char* original_line = os::strdup(line, mtInternal); > ScopeGuard guard([&] { os::free(original_line); }); > > This would use something like (or maybe directly use) a solution to the type-erased lambda capturing problem a la std::function. Yes, this second usage looks like the typical RAII object that is familiar in the code. 
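
For readers following the scope-guard discussion above, here is a minimal sketch of the idea in portable C++14. It is not HotSpot's actual helper: `ScopeGuard`, `make_guard`, and `parse_copy` are hypothetical names used for illustration, and `std::malloc`/`std::free` stand in for `os::strdup`/`os::free`. The `make_guard` factory lets the lambda's type be deduced; the direct `ScopeGuard guard([&] { ... });` spelling suggested in the thread would additionally need type erasure or C++17 class template argument deduction.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <utility>

// Hypothetical RAII helper: runs the stored callable when the guard
// goes out of scope, on every exit path of the enclosing block.
template <typename F>
class ScopeGuard {
  F    _f;
  bool _active;
public:
  explicit ScopeGuard(F f) : _f(std::move(f)), _active(true) {}
  // Move support so the guard can be returned from make_guard() in C++14;
  // the moved-from guard is disarmed so the callable runs only once.
  ScopeGuard(ScopeGuard&& other)
    : _f(std::move(other._f)), _active(other._active) {
    other._active = false;
  }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;
  ~ScopeGuard() { if (_active) _f(); }
};

// Factory that deduces the lambda type.
template <typename F>
ScopeGuard<F> make_guard(F f) { return ScopeGuard<F>(std::move(f)); }

// Illustrative stand-in for the compilerOracle.cpp usage: copy the line,
// work on the copy, and free it no matter which return is taken.
// free_count is incremented each time the guard releases the copy.
int parse_copy(const char* line, int* free_count) {
  size_t len = std::strlen(line);
  char* original_line = static_cast<char*>(std::malloc(len + 1));
  std::memcpy(original_line, line, len + 1);
  auto g = make_guard([&] { std::free(original_line); ++*free_count; });

  if (original_line[0] == '#') {
    return 0;  // early return: the guard still frees the copy
  }
  // The return value is computed before the guard's destructor runs.
  return static_cast<int>(std::strlen(original_line));
}
```

The design point is the one raised in the review: the cleanup is stated once, next to the allocation, and reads like any other RAII object rather than being repeated before each return.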
------------- PR: https://git.openjdk.java.net/jdk/pull/5144 From xliu at openjdk.java.net Tue Oct 5 16:57:13 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 5 Oct 2021 16:57:13 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Tue, 5 Oct 2021 01:42:10 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup: remove a statment of debugging. > > test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 26: > >> 24: /* >> 25: * @test TestOutOfMemoryErrorFromNIO >> 26: * @bug 8155004 8273608 > > This shouldn't reference 8155004 as it makes no sense to claim to be a regression test for an issue that did not generate a code change. yes, I see. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From eosterlund at openjdk.java.net Tue Oct 5 17:01:13 2021 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 5 Oct 2021 17:01:13 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3] In-Reply-To: <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> Message-ID: On Tue, 5 Oct 2021 14:35:31 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. 
>> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. >> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Adjust some nonleaf-2 ranks. I think this is a step in the right direction, and I heard great things will come in the next patch. Thanks for reducing the changes to nonleaf - 2. Go Coleen, go! Also looks good. ------------- Marked as reviewed by eosterlund (Reviewer). 
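
As a rough illustration of the relative ranking idea in this change, the sketch below models a lock whose rank is defined in terms of the lock held while acquiring it. This is a simplified model, not HotSpot's `Mutex` code: `RankedLock`, `def_below`, and `LockStack` are hypothetical names, the numeric rank values are invented, and the rule that a thread may only acquire a lock ranked strictly below every lock it already holds is an assumption made for the sketch.

```cpp
#include <cassert>
#include <vector>

// Invented numeric values; only their ordering matters in this sketch.
enum Rank : int { nosafepoint = 10, nonleaf = 20 };

// Hypothetical, minimal lock descriptor: a name plus a rank.
struct RankedLock {
  const char* name;
  int         rank;
};

// Relative ranking: define a lock in terms of the lock that is held while
// this one is acquired, so it ranks just below that lock.
inline RankedLock def_below(const char* name, const RankedLock& held) {
  return RankedLock{name, held.rank - 1};
}

// Per-thread record of owned locks, used only to check rank ordering.
class LockStack {
  std::vector<const RankedLock*> _owned;
public:
  // Assumed rule: the new lock must rank strictly below all owned locks.
  bool can_acquire(const RankedLock& l) const {
    for (const RankedLock* o : _owned) {
      if (l.rank >= o->rank) return false;
    }
    return true;
  }
  bool acquire(const RankedLock& l) {
    if (!can_acquire(l)) return false;  // would be a rank-order violation
    _owned.push_back(&l);
    return true;
  }
  void release_last() { _owned.pop_back(); }
};
```

With this model, a lock defined via `def_below("_wait_monitor", PeriodicTask_lock)` can be taken while `PeriodicTask_lock` is held, while the reverse order is rejected.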
PR: https://git.openjdk.java.net/jdk/pull/5801 From stuefe at openjdk.java.net Tue Oct 5 18:45:13 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 5 Oct 2021 18:45:13 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Mon, 4 Oct 2021 06:39:31 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... 
>> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup: remove a statment of debugging. Hi Xin, nice work. Good test. Some nitpicking inline. About the test, I still would prefer very much to limit this workaround to OOM errors only in VMError, but David had a different opinion, so I won't press the point. However, if we do this for normal crashes+OnError too, I'd like to see a test for that (you can simply reuse your test, add a second `@run` section and just start the VM with `-XX:ErrorHandlerTest=14 -XX:OnError=... -version`). Of course, that one would require `vm.debug`. Cheers, Thoams src/hotspot/share/runtime/mutexLocker.cpp line 376: > 374: } > 375: > 376: void unlock_locks_on_error(JavaThread* thread) { Does this have to live in mutexLocker.cpp? Seems to be highly specific to error handling and would make more sense as a function local static in vmError.cpp. src/hotspot/share/utilities/vmError.cpp line 1320: > 1318: // Otherwise, it may end up with a deadlock when dcmds try to synchronize all Java threads > 1319: // at safepoints. 
> 1320: bool transition_into_native() {

Since this is a highly specific workaround for one specific problem (forking off jcmd which will attach back to us with a sub command which needs a safepoint) I would suggest giving this function a more specific name, e.g. "prepare_jcmd_workaround" or "prepare_fork_and_exec" (though the latter is misleading since this is needed only for the jcmd-attaches-to-me case).

For these cases, I also prefer the JBS issue number mentioned in a comment. It's helpful when one looks at the code, and one does not have to dig through the change history.

test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25:

> 23:
> 24: /*
> 25: * @test TestOutOfMemoryErrorFromNIO

I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. You would not even need a main function for that, since the VM would not come up but throw an OOM right away.

Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd"

test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 59:

> 57: StringBuilder after = new StringBuilder(jcmd);
> 58: after.append(" %p");
> 59: after.append(" GC.heap_dump a.hprof");

Could probably be simplified to:

    String jcmd = JDKToolFinder.getJDKTool("jcmd");
    String before = jcmd + " %p Thread.print";
    String after = jcmd + " %p GC.heap_dump a.hprof";

test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 69:

> 67: "-XX:+UnlockDiagnosticVMOptions",
> 68: "-XX:AbortVMOnException=java.lang.OutOfMemoryError",
> 69: "-XX:OnError=" + cmds,

Please specify a (small) heap size.

test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 84:

> 82: # -XX:OnError="echo Test1 Succeeded"
> 83: # Executing /bin/sh -c "echo Test1 Succeeded" ...
> 84: */

Can you make this a line comment (`//`)?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5590

From stuefe at openjdk.java.net Tue Oct 5 18:45:13 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 5 Oct 2021 18:45:13 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v2]
In-Reply-To:
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <8LmYbwFdozu6SSdD6j0sE4dKDiu19bvqExkE-TR_4YA=.5f7caeab-e42e-4815-bab0-6da3ca0ae514@github.com>
Message-ID: <4URcCMaCGRzf0PK4VClt5f0cztoqrm6BD5tMCPvTW2A=.f1501d74-2bc6-47b4-8325-8cf897a0613e@github.com>

On Wed, 29 Sep 2021 05:37:31 GMT, David Holmes wrote:

>> This logic is for release build.
>>
>> Yes, we don't need them in debug build. We should have released all owning mutexes above.
>> It's no-op in debug build because owner() should be NULL. `thread` isn't NULL in this function.
>>
>> Here is current bookkeeping. From a thread, we can't find all owning mutexes. @pchilano said we can try _mutex_array in release build. I think the idea is that we are supposed to capture failure in debug build.
>>
>> | class       | debug                      | release      |
>> |-------------|----------------------------|--------------|
>> | Thread      | _owned_locks (linked list) | N/A          |
>> | Mutex       | _owner                     | _owner       |
>> | MutexLocker | _mutex_array               | _mutex_array |
>>
>> Shall we fix _mutex_array in release? I think we can change it to GrowableArray and keep track all mutexes in runtime.
>
> My point is that the code to process the `_mutex_array` should be in a `#else` so that we don't build it or use for debug builds. So a debug build uses `owned_locks()` while a release build uses `_mutex_array`.

Sorry if I missed part of the conversation, but why two different implementations? Why not use _mutex_array always? I dislike having different code in debug and release builds.
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From Divino.Cesar at microsoft.com Tue Oct 5 19:30:41 2021 From: Divino.Cesar at microsoft.com (Cesar Soares Lucas) Date: Tue, 5 Oct 2021 19:30:41 +0000 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: Hi John, Thank you for answering my questions and providing some examples of "hard-cases"! I'm going over all this with my teammates and I'll come back to you as soon as possible. Regards, Cesar ________________________________ From: John Rose Sent: October 4, 2021 3:20 PM To: Cesar Soares Lucas Cc: Mark Reinhold ; hotspot-dev at openjdk.java.net ; Brian Stafford ; Martijn Verburg Subject: Re: [External] : Re: RFC - Improving C2 Escape Analysis On Oct 4, 2021, at 2:32 PM, Cesar Soares Lucas wrote: > > ... > I think it's safe to say that the Inlining horizon and the presence of > control/data-flow merge points in the compilation unit are the major hurdles to > the effectiveness of the current EA and dependent optimizations. I understand > that Valhalla (Inline Types) will help tremendously with the inlining horizon > aspect of the problem. How much of a problem is control-flow/data-flow merge > points in a world where Inline Types are in place? There?s an easy thought experiment or two here: Suppose a class V is known to be identity-insensitive. This means a physical copy of V can be recopied into another location (on the heap or even registers or stack) with no violation of semantics. If there is some use u(v) of a V-value v, we can pass the physical representation of v through an copy operation c(v) which preserves the value but physically rebuffers it somewhere else, so that the new graph is u(c(v)). We call this ?splitting the use? u(v). Experiment 1: At the horizon of the current task, split all incoming and outgoing uses of type V. Check how this transform affects your favorite EA. 
I think you suddenly don?t care about identity escapes of those values, beyond the horizon, because they take up new identities there. Experiment 2: In the current task, split all uses of type V that touch phis. (Where there are already copies, of the form c(v), use the identity c(c(v)) = c(v).) Again, check how this affects your favorite flow-sensitive EA. Now I think you have decoupled questions of identity from the control flow graph. The limiting case is to treat values of type V as always splitting. And you probably don?t ever need explicit c-nodes. The c notation helps to explain how V-type references behave differently from normal references, in classic EA algorithms. > From my (external) point of > view, I can imagine that this will still be a problem. What?s an example of such a problem, that remains after uses are split with the c-operator? (There?s an issue of removing needless splits, but if the c-operator is never materialized, then the problem reduces, I think, to tracking ?souvenirs? of last-known bufferings of given value bundles.) > If so, it looks to me > that this is an issue that once solved would benefit not only the current EA > implementation but also Valhalla. What do you [all] think? > > You mention about the "remaining [EA] hard cases once Inline Types are in > place". Can you please expand a little bit on that? Oops, I left a sentence incomplete there. It should have been simply: Not to make EA unnecessary, by any means, but rather to focus EA better on some remaining ?hard cases?. The hard cases continue to be behavior of objects outside of the inlining horizon that might disrupt optimistic optimizations inside the horizon. A couple of examples: A. We might wish to use thread-local allocation for ?confined? objects which never leave the thread, but which need to be accessed as if they were on the heap. 
(Stack allocation is a *special case* of thread-local allocation: It requires an extra assurance that the object lives only during the lifetime of a stack frame. BTW I think concurrent cross-thread access to thread-local objects is likely to be very buggy, which is why I draw the line at confinement, not the different line of "during the *global* lifetime of a stack frame".)

B. We might wish to vectorize over data sources and sinks assuming (after a check perhaps) that the memory streams are non-overlapping with writes we are vectorizing (a "noalias" condition).

A hard case of A. is when code might let a mostly-confined reference escape to somewhere another thread can see it. There are many ways to build a fence around this, such as static inter-task analysis beyond the horizon (what is called "interprocedural", but the JIT task not the procedure is the key unit). Or GC barriers that detect escapes dynamically.

A hard case of B. is similar, but you might test for different conditions, trying to prove that distinct alias categories stay distinct beyond the horizon. Actually, I don't have any good suggestions for that, other than the usual technique of predicating on a non-overlap condition inside the horizon, before a loop that needs the condition.

Even if we use only static inter-task techniques to analyze and summarize the global access to references (passed to and from method calls and data structures), there is probably always a dynamic component. At least, if you add a new subclass you probably have to adjoin the summaries of its overriding methods, if you are relying on summaries of related methods; in the worst case you might have to retract optimizations in running code (as we do when we de-opt when a devirtualization is no longer correct).

Turning that around, there might be cases where a reference *might* escape but it dynamically *does not*. (Example: A so-far-never-taken path makes a reference escape, where the other paths keep it confined.)
We might impose dynamic checks on methods whose behavior we have summarized (for the benefit of compile tasks which do not inline them), so that the summary can be sharper, at the cost of doing de-opt if the sharper summary is no longer valid.

There's a frontier here, of more and more clever and aggressive and speculative optimizations we can do. I think it helps to have them operate on fewer (larger?) objects. So I guess what I'm hoping for here is that if many small objects are made identity-free, then we can better use limited resources of time, space, and engineering cleverness to track the remaining objects whose access we wish to optimize.

> Thank you again for taking the time to read the report and provide your
> perspective!

My pleasure!

From neliasso at openjdk.java.net Tue Oct 5 19:44:12 2021
From: neliasso at openjdk.java.net (Nils Eliasson)
Date: Tue, 5 Oct 2021 19:44:12 GMT
Subject: RFR: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow [v3]
In-Reply-To: References: Message-ID:

On Mon, 4 Oct 2021 07:51:27 GMT, Volker Simonis wrote:

>> Currently, if running with `-XX:-OmitStackTraceInFastThrow`, C2 has no possibility to create implicit exceptions like AIOOBE, NullPointerExceptions, etc. in compiled code. This means that such methods will always be deoptimized and re-executed in the interpreter if such exceptions are happening.
>>
>> If implicit exceptions are used for normal control flow, that can have a dramatic impact on performance.
A prominent example for such code is [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274):
>>
>>     public static boolean isAlpha(int c) {
>>         try {
>>             return IS_ALPHA[c];
>>         } catch (ArrayIndexOutOfBoundsException ex) {
>>             return false;
>>         }
>>     }
>>
>>
>> ### Solution
>>
>> Instead of deoptimizing and resorting to the interpreter, we can generate code which allocates and initializes the corresponding exceptions right in compiled code. This results in a ten-times performance improvement for the above code:
>>
>> -XX:-OmitStackTraceInFastThrow -XX:-OptimizeImplicitExceptions
>> Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
>> ImplicitExceptions.bench                     0.0  avgt    5      1.430 ±    0.353  ns/op
>> ImplicitExceptions.bench                    0.33  avgt    5   3563.038 ±   77.358  ns/op
>> ImplicitExceptions.bench                    0.66  avgt    5   8609.693 ± 1205.104  ns/op
>> ImplicitExceptions.bench                    1.00  avgt    5  12842.401 ± 1022.728  ns/op
>>
>> -XX:-OmitStackTraceInFastThrow -XX:+OptimizeImplicitExceptions
>> Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
>> ImplicitExceptions.bench                     0.0  avgt    5      1.432 ±    0.352  ns/op
>> ImplicitExceptions.bench                    0.33  avgt    5    355.723 ±   16.641  ns/op
>> ImplicitExceptions.bench                    0.66  avgt    5    887.068 ±  166.728  ns/op
>> ImplicitExceptions.bench                    1.00  avgt    5   1274.418 ±   88.235  ns/op
>>
>>
>> ### Implementation details
>>
>> - The new optimization is guarded by the option `OptimizeImplicitExceptions` which is on by default.
>> - In `GraphKit::builtin_throw()` we can't simply use `CallGenerator::for_direct_call()` to create a `DirectCallGenerator` for the call to the exception's `<init>` function because `DirectCallGenerator` assumes in various places that calls are only issued at `invoke*` bytecodes. This is not true in general for bytecode which can cause an implicit exception.
>> - Instead, we manually wire up the call based on the code in `DirectCallGenerator::generate()`. >> - We use a similar trick like for method handle intrinsics where the callee from the bytecode is replaced by a direct call and this fact is recorded in the call's `_override_symbolic_info` field. For calling constructors of implicit exceptions I've introduced the new field `_implicit_exception_init`. This field is also used in various assertions to prevent queries for the bytecode's symbolic method information which doesn't exist because we're not at an `invoke*` bytecode at the place where we generate the call. >> - The PR contains a micro-benchmark which compares the old and the new implementation for [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274). Except for the trivial case where the exception probability is 0 (i.e. no exceptions are happening at all) the new implementation is about 10 times faster. > > Volker Simonis has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Fix special case where we're creating an implicit exception for a regular invoke* bytecode > - Minor updates as requested by @TheRealMDoerr > - 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow In the benchmark you are testing a case where the exception isn't used - will the allocation be eliminated then? In addition to the benchmark you have provided, I would like to see functional tests for both when the allocation is used, and when it isn't. Have you looked at the new IR Test Framework? ------------- Changes requested by neliasso (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5488 From coleenp at openjdk.java.net Tue Oct 5 19:49:14 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 5 Oct 2021 19:49:14 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3] In-Reply-To: <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> Message-ID: On Tue, 5 Oct 2021 14:35:31 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. >> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Adjust some nonleaf-2 ranks. Thank you Erik! 
------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From coleenp at openjdk.java.net Tue Oct 5 20:00:13 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 5 Oct 2021 20:00:13 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v2] In-Reply-To: <4URcCMaCGRzf0PK4VClt5f0cztoqrm6BD5tMCPvTW2A=.f1501d74-2bc6-47b4-8325-8cf897a0613e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <8LmYbwFdozu6SSdD6j0sE4dKDiu19bvqExkE-TR_4YA=.5f7caeab-e42e-4815-bab0-6da3ca0ae514@github.com> <4URcCMaCGRzf0PK4VClt5f0cztoqrm6BD5tMCPvTW2A=.f1501d74-2bc6-47b4-8325-8cf897a0613e@github.com> Message-ID: On Tue, 5 Oct 2021 18:14:07 GMT, Thomas Stuefe wrote: >> My point is that the code to process the `_mutex_array` should be in a `#else` so that we don't build it or use for debug builds. So a debug build uses `owned_locks()` while a release build uses `_mutex_array`. > > Sorry if I missed part of the conversation, but why two different implementations? Why not use _mutex_array always? I dislike having different code in debug and release builds. https://bugs.openjdk.java.net/browse/JDK-8274794 I also think we should have _owned_locks available in product. I experimented with adding a mutex_array when we create mutex but this requires synchronization on mutex_array, with potentially a mutex, especially if it's a growable array. Also, we create and delete Mutex so the array would have to allow deletion. Just making owned_locks available in product would be really nice, so I filed a bug. 
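
The trade-off described above (a global array of mutexes needs its own synchronization, a per-thread list does not) can be illustrated with a minimal sketch. This is not HotSpot code: `TrackedMutex`, `owned_locks` and `unlock_all_owned` are hypothetical names, and `std::mutex` stands in for the VM's own Mutex. The only point is that each thread links and unlinks entries on its own list, so the bookkeeping itself needs no lock.

```cpp
#include <cassert>
#include <mutex>

class TrackedMutex;

// Head of the calling thread's list of mutexes it currently owns
// (hypothetical name, loosely modelled on the per-thread list discussed above).
thread_local TrackedMutex* owned_locks = nullptr;

class TrackedMutex {
  std::mutex    _m;
  TrackedMutex* _next = nullptr;  // next mutex owned by the same thread
  const char*   _name;

public:
  explicit TrackedMutex(const char* name) : _name(name) {}
  const char* name() const { return _name; }
  TrackedMutex* next() const { return _next; }

  void lock() {
    _m.lock();
    // Only the new owner touches its own list, so no extra synchronization.
    _next = owned_locks;
    owned_locks = this;
  }

  void unlock() {
    // Unlink from the owner's list; locks may be released in any order.
    TrackedMutex** link = &owned_locks;
    while (*link != nullptr && *link != this) {
      link = &(*link)->_next;
    }
    assert(*link == this && "unlock by a thread that does not own the mutex");
    *link = _next;
    _next = nullptr;
    _m.unlock();
  }
};

// Release everything the current thread still holds, for example before
// running an error handler. Returns the number of mutexes released.
int unlock_all_owned() {
  int released = 0;
  while (owned_locks != nullptr) {
    owned_locks->unlock();
    released++;
  }
  return released;
}
```

A thread that locks `a`, `b` and `c` and then unlocks `b` is left with the list `c -> a`; calling `unlock_all_owned()` releases both. Creating or destroying a mutex never touches shared bookkeeping, which is the deletion problem mentioned for the array approach.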
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From lkorinth at openjdk.java.net Tue Oct 5 20:39:30 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 5 Oct 2021 20:39:30 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: > The basic problem is that we are relying on undefined behaviour, as documented in the code: > > // This whole business of passing information from ResourceObj::operator new > // to the ResourceObj constructor via fields in the "object" is technically UB. > // But it seems to work within the limitations of HotSpot usage (such as no > // multiple inheritance) with the compilers and compiler options we're using. > // And it gives some possibly useful checking for misuse of ResourceObj. > > > I am removing the undefined behaviour by passing the type of allocation through a thread local variable. > > This solution has some advantages: > 1) it is not UB > 2) it is simpler and easier to understand > 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8) > 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work. > > When doing the change, I also updated `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret. > > I also disallow to "fake" the memory type by explicitly calling `ResourceObj::set_allocation_type`. > > This forced me to change two places that is faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because the old naming of `allocated_on_stack()`. > > I have also tried to update the comments. 
In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete object you allocate with new. > > Testing on debug build tier1-3 > Testing on release build tier1 Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: Use a thread local buffer so that the compiler might reorder operator new. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5387/files - new: https://git.openjdk.java.net/jdk/pull/5387/files/d8acedb0..c9d45906 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5387&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5387&range=01-02 Stats: 91 lines in 5 files changed: 46 ins; 27 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/5387.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5387/head:pull/5387 PR: https://git.openjdk.java.net/jdk/pull/5387 From burban at openjdk.java.net Tue Oct 5 20:41:25 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 5 Oct 2021 20:41:25 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler Message-ID: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. 
The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404

I haven't seen this particular instance causing any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above.

Output of `-XX:+PrintInterpreter` before this change:

----------------------------------------------------------------------
method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes
--------------------------------------------------------------------------------
0x0000000138809b00: ldr x2, [x12, #16]
0x0000000138809b04: ldrh w2, [x2, #44]
0x0000000138809b08: add x24, x20, x2, uxtx #3
0x0000000138809b0c: sub x24, x24, #0x8
[...]
0x0000000138809fa4: stp x16, x17, [sp, #128]
0x0000000138809fa8: stp x18, x19, [sp, #144]
0x0000000138809fac: stp x20, x21, [sp, #160]
[...]
0x0000000138809fc0: stp x30, xzr, [sp, #240]
0x0000000138809fc4: mov x0, x28
  ;; 0x10864ACCC
0x0000000138809fc8: mov x9, #0xaccc // #44236
0x0000000138809fcc: movk x9, #0x864, lsl #16
0x0000000138809fd0: movk x9, #0x1, lsl #32
0x0000000138809fd4: blr x9
0x0000000138809fd8: ldp x2, x3, [sp, #16]
[...]
0x0000000138809ff4: ldp x16, x17, [sp, #128]
0x0000000138809ff8: ldp x18, x19, [sp, #144]
0x0000000138809ffc: ldp x20, x21, [sp, #160]

After:

----------------------------------------------------------------------
method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes
--------------------------------------------------------------------------------
0x0000000108e4db00: ldr x2, [x12, #16]
0x0000000108e4db04: ldrh w2, [x2, #44]
0x0000000108e4db08: add x24, x20, x2, uxtx #3
0x0000000108e4db0c: sub x24, x24, #0x8
[...]
0x0000000108e4dfa4: stp x16, x17, [sp, #128]
0x0000000108e4dfa8: stp x19, x20, [sp, #144]
0x0000000108e4dfac: stp x21, x22, [sp, #160]
[...]
0x0000000108e4dfbc: stp x29, x30, [sp, #224]
0x0000000108e4dfc0: mov x0, x28
  ;; 0x107E4A06C
0x0000000108e4dfc4: mov x9, #0xa06c // #41068
0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16
0x0000000108e4dfcc: movk x9, #0x1, lsl #32
0x0000000108e4dfd0: blr x9
0x0000000108e4dfd4: ldp x2, x3, [sp, #16]
[...]
0x0000000108e4dff0: ldp x16, x17, [sp, #128]
0x0000000108e4dff4: ldp x19, x20, [sp, #144]
0x0000000108e4dff8: ldp x21, x22, [sp, #160]
[...]

-------------

Commit messages:
 - JDK-8274795: avoid spilling and restoring r18 in macro assembler

Changes: https://git.openjdk.java.net/jdk/pull/5828/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8274795
Stats: 10 lines in 2 files changed: 8 ins; 0 del; 2 mod
Patch: https://git.openjdk.java.net/jdk/pull/5828.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/5828/head:pull/5828

PR: https://git.openjdk.java.net/jdk/pull/5828

From lkorinth at openjdk.java.net Tue Oct 5 20:57:07 2021
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Tue, 5 Oct 2021 20:57:07 GMT
Subject: RFR: 8269537: memset() is called after operator new [v3]
In-Reply-To: References: Message-ID:

On Sun, 26 Sep 2021 04:50:07 GMT, Kim Barrett wrote:

>> Thanks Ioi for making me adding the assert!!!
The sequencing of the allocation function and the arguments to the constructor is not what I thought, so my "solution" is not working. I am unsure how to resolve this in a good way. We could probably have a small thread local collection of (type, address) pairs, but I wonder if it is not better removing this debug information altogether. It is to my knowledge only used in GrowableArray (to limit the type of its internal allocations) and to hinder delete on resource allocated objects. > > A collection of (type, address) pairs is still problematic. It still requires assuming that the address of the derived object is the same as the address of the ResourceObj subobject, which isn't guaranteed, though the current mechanism also depends on that being true. I think there might be other places in HotSpot where we're making that assumption too, unfortunately. I used your idea of using a triplet to be sure that the derived object is within the created allocation. This solution can still fail if using multiple inheritance (no regression from today) or if allocation is done recursively (but then an assert will trigger). I think this is better than what we have today. If someone wants to remove this debug information in the future I would not argue against it. In the latest update I changed two classes that faked to be `ResourceObj`. I made them proper `ResourceObj`. I also made `allocation_type` a uint8_t and changed the code a bit so that `operator new` does not call `operator new`. I also made `BufferSize` 5. This could be anything >= 2 for clang at the moment, an assert will trigger if this size is too small. I have run the tests on tier1-tier3 on linux-x64, linux-x64-debug, linux-x86-debug, macosx-x64-debug, windows-x64-debug. 
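
For readers following the thread, the triplet idea can be sketched roughly as below. This is an illustration under assumptions, not the code in the PR: `AllocType`, `PendingAllocs`, `Tracked` and `Node` are hypothetical names, `malloc` stands in for the real allocators, and only `BufferSize` = 5 is taken from the mail. `operator new` records (address, size, type) in a small thread-local buffer, and the base-class constructor looks up the record whose range contains `this`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for ResourceObj's allocation kinds.
enum class AllocType : std::uint8_t { StackOrEmbedded, ResourceArea, CHeap };

// One pending allocation: start address, size in bytes, and kind.
struct AllocRecord {
  std::uintptr_t base;
  std::size_t    size;
  AllocType      type;
};

// Small per-thread log of allocations whose constructors have not run yet.
class PendingAllocs {
  static constexpr int BufferSize = 5;  // value mentioned in the mail
  AllocRecord _records[BufferSize];
  int _count = 0;

public:
  void push(void* p, std::size_t size, AllocType type) {
    assert(_count < BufferSize && "pending allocation buffer too small");
    _records[_count++] =
        AllocRecord{reinterpret_cast<std::uintptr_t>(p), size, type};
  }

  // If addr lies inside a recorded allocation, consume that record and
  // return its type. Otherwise the object did not come from our operator
  // new, so it is treated as stack-allocated or embedded.
  AllocType take(const void* addr) {
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(addr);
    for (int i = _count - 1; i >= 0; i--) {
      const AllocRecord& r = _records[i];
      if (a >= r.base && a - r.base < r.size) {
        const AllocType type = r.type;
        _records[i] = _records[--_count];  // order of records is irrelevant
        return type;
      }
    }
    return AllocType::StackOrEmbedded;
  }
};

thread_local PendingAllocs pending_allocs;

class Tracked {
  AllocType _type;

public:
  // The constructor never reads uninitialized object memory; it only asks
  // the thread-local log whether `this` falls inside a pending allocation.
  Tracked() : _type(pending_allocs.take(this)) {}
  AllocType allocation_type() const { return _type; }

  static void* operator new(std::size_t size, AllocType type) {
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    pending_allocs.push(p, size, type);
    return p;
  }
  static void operator delete(void* p) { std::free(p); }
  // Matching placement form, called only if a constructor throws.
  static void operator delete(void* p, AllocType) { std::free(p); }
};

struct Node : Tracked {
  int value = 0;
};
```

With this sketch, `Node n;` reports `StackOrEmbedded`, while `new (AllocType::CHeap) Node()` reports `CHeap`. Matching on an address range rather than an exact address is what tolerates a base subobject that does not start at the allocation's first byte, and the fixed-size buffer with its assert corresponds to the recursive-allocation limit described above. A constructor that throws would leave a stale record in this simplified version.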
------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From github.com+2249648+johntortugo at openjdk.java.net Tue Oct 5 23:12:09 2021 From: github.com+2249648+johntortugo at openjdk.java.net (John Tortugo) Date: Tue, 5 Oct 2021 23:12:09 GMT Subject: RFR: 8267265: Use new IR Test Framework to create tests for C2 IGV transformations [v4] In-Reply-To: References: <8Ce6bZtHwGEw8_wXZz4ak3obprd1YmZDi4cItcXB4bA=.a7162709-7aad-4709-a585-d2391392f49b@github.com> Message-ID: On Fri, 17 Sep 2021 10:23:51 GMT, Christian Hagedorn wrote: >> John Tortugo has updated the pull request incrementally with 146 additional commits since the last revision: >> >> - Fix merge mistake. >> - Merge branch 'jdk-8267265' of https://github.com/JohnTortugo/jdk into jdk-8267265 >> - Addressing PR feedback: move tests to other directory, add custom tests, add tests for other optimizations, rename some tests. >> - 8273197: ProblemList 2 jtools tests due to JDK-8273187 >> 8273198: ProblemList java/lang/instrument/BootClassPath/BootClassPathTest.sh due to JDK-8273188 >> >> Reviewed-by: naoto >> - 8262186: Call X509KeyManager.chooseClientAlias once for all key types >> >> Reviewed-by: xuelei >> - 8273186: Remove leftover comment about sparse remembered set in G1 HeapRegionRemSet >> >> Reviewed-by: ayang >> - 8273169: java/util/regex/NegativeArraySize.java failed after JDK-8271302 >> >> Reviewed-by: jiefu, serb >> - 8273092: Sort classlist in JDK image >> >> Reviewed-by: redestad, ihse, dfuchs >> - 8273144: Remove unused top level "Sample Collection Set Candidates" logging >> >> Reviewed-by: iwalulya, ayang >> - 8262095: NPE in Flow$FlowAnalyzer.visitApply: Cannot invoke getThrownTypes because tree.meth.type is null >> >> Co-authored-by: Jan Lahoda >> Co-authored-by: Vicente Romero >> Reviewed-by: jlahoda >> - ... 
and 136 more: https://git.openjdk.java.net/jdk/compare/ac430bf7...463102e2 > > test/hotspot/jtreg/compiler/c2/irTests/MulINodeIdealizationTests.java line 45: > >> 43: //Checks Max(a,b) * min(a,b) => a*b >> 44: public int excludeMaxMin(int x, int y){ >> 45: return Math.max(x, y) * Math.min(x, y); > > `Math.min/max()` is intrinsified and HotSpot generates `CMove` nodes (see `LibraryCallKit::generate_min_max()`) for them. But it looks like `MulNode::Ideal` misses this check for `CMove` nodes. That could be done in a separate RFE (and then this test could be improved to check if the `CMove` node was removed). > > Anyways, min/max nodes are mainly used for loop limit computations, so it's harder to test this transformation in an easy way. Created this work item to address the suggestion: https://bugs.openjdk.java.net/browse/JDK-8274799 ------------- PR: https://git.openjdk.java.net/jdk/pull/5135 From david.holmes at oracle.com Wed Oct 6 00:12:02 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 6 Oct 2021 10:12:02 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v2] In-Reply-To: <4URcCMaCGRzf0PK4VClt5f0cztoqrm6BD5tMCPvTW2A=.f1501d74-2bc6-47b4-8325-8cf897a0613e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <8LmYbwFdozu6SSdD6j0sE4dKDiu19bvqExkE-TR_4YA=.5f7caeab-e42e-4815-bab0-6da3ca0ae514@github.com> <4URcCMaCGRzf0PK4VClt5f0cztoqrm6BD5tMCPvTW2A=.f1501d74-2bc6-47b4-8325-8cf897a0613e@github.com> Message-ID: On 6/10/2021 4:45 am, Thomas Stuefe wrote: > On Wed, 29 Sep 2021 05:37:31 GMT, David Holmes wrote: > >>> This logic is for release build. >>> >>> Yes, we don't need them in debug build. We should have released all owning mutexes above. >>> It's no-op in debug build because owner() should be NULL. `thread` isn't NULL in this function. >>> >>> Here is current bookkeeping. From a thread, we can't find all owning mutexes. 
@pchilano said we can try _mutex_array in release build. I think the idea is that we are supposed to capture failure in debug build.
>>>
>>> | class       | debug                      | release      |
>>> |-------------|----------------------------|--------------|
>>> | Thread      | _owned_locks (linked list) | N/A          |
>>> | Mutex       | _owner                     | _owner       |
>>> | MutexLocker | _mutex_array               | _mutex_array |
>>>
>>> Shall we fix _mutex_array in release? I think we can change it to GrowableArray and keep track of all mutexes at runtime.
>>
>> My point is that the code to process the `_mutex_array` should be in a `#else` so that we don't build it or use for debug builds. So a debug build uses `owned_locks()` while a release build uses `_mutex_array`.
>
> Sorry if I missed part of the conversation, but why two different implementations? Why not use _mutex_array always? I dislike having different code in debug and release builds.

The debug code tracks all owned mutex/monitors. The _mutex_array is only a subset of mutex/monitors that may exist.

David

> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/5590
>

From xliu at openjdk.java.net Wed Oct 6 00:25:11 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Wed, 6 Oct 2021 00:25:11 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4]
In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID:

On Tue, 5 Oct 2021 18:30:31 GMT, Thomas Stuefe wrote:

>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Cleanup: remove a statment of debugging.
> > test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 59: > >> 57: StringBuilder after = new StringBuilder(jcmd); >> 58: after.append(" %p"); >> 59: after.append(" GC.heap_dump a.hprof"); > > Could be probably simplified to: > > String jcmd = JDKToolFinder.getJDKTool("jcmd"); > String before = jcmd + " %p Thread.print"; > String after = jcmd + " %p GC.heap_dump a.hprof; make sense. it's better. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xliu at openjdk.java.net Wed Oct 6 00:29:09 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 00:29:09 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Tue, 5 Oct 2021 18:07:35 GMT, Thomas Stuefe wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup: remove a statment of debugging. > > src/hotspot/share/runtime/mutexLocker.cpp line 376: > >> 374: } >> 375: >> 376: void unlock_locks_on_error(JavaThread* thread) { > > Does this have to live in mutexLocker.cpp? Seems to be highly specific to error handling and would make more sense as a function local static in vmError.cpp. `unlock_locks_on_error` refers to `_mutex_array`, which is a static variable. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Wed Oct 6 00:33:07 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 6 Oct 2021 00:33:07 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3] In-Reply-To: <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> Message-ID: On Tue, 5 Oct 2021 14:35:31 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. >> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Adjust some nonleaf-2 ranks. 
Hi Coleen, A minor comment below but otherwise this all seems okay. Thanks, David src/hotspot/share/runtime/mutex.cpp line 300: > 298: > 299: assert(_rank >= 0, "Bad lock rank: %s", name); > 300: assert(_rank <= nonleaf, "Bad lock rank: %s", name); Can we just combine these into a single range check? And really the assert message should include the bad rank value. And probably this basic range check should be first. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5801 From xliu at openjdk.java.net Wed Oct 6 00:41:06 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 00:41:06 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: <-uSqg_bPi7OTAJmFAdLptIrI7ypU1wG5CNJ7ttJJCuM=.763d47d7-934d-4cb3-9df4-496644ba50a5@github.com> On Tue, 5 Oct 2021 18:17:40 GMT, Thomas Stuefe wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup: remove a statment of debugging. > > src/hotspot/share/utilities/vmError.cpp line 1320: > >> 1318: // Otherwise, it may end up with a deadlock when dcmds try to synchronize all Java threads >> 1319: // at safepoints. >> 1320: bool transition_into_native() { > > Since this is a highly specific workaround for one specific problem (forking off jcmd which will attach back to us with a sub command which needs a safe point) I would suggest naming this function somewhat more specific, e.g. "prepare_jcmd_workaround" or "prepare_fork_and_exec" (though the latter is misleading since this is needed only for the jcmd-attaches-to-me case). > > For these cases, I also prefer the JBS issue number mentioned in a comment. 
Its helpful when one looks at the code, and one does not have to dig through the change history. I don't like workaround. I intend to do thing right. In my understanding, fork_and_exec() is going to step into Native, isn't it? I would like to clean up [prepare_for_emergency_dump](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#L426) in next step. It's easier to maintain than just iterating locks manually. that's why I would like to have a natural name. How about `VMError::transition_java_current_to_native_from_vm()`? ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From coleenp at openjdk.java.net Wed Oct 6 01:22:24 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 01:22:24 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v4] In-Reply-To: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: > This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. > > The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. > > The transformation in this change is as follows: > nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) > leaf => nonleaf - Many of these locks were top level locks > leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock > leaf-n => nonleaf-n > > The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. 
> > This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. > > This has been tested with tier1-8, and retesting tier1-3 locally in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Combine bad rank range checks. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5801/files - new: https://git.openjdk.java.net/jdk/pull/5801/files/bc4022c6..bd6ff180 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5801&range=02-03 Stats: 5 lines in 1 file changed: 2 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5801.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5801/head:pull/5801 PR: https://git.openjdk.java.net/jdk/pull/5801 From coleenp at openjdk.java.net Wed Oct 6 01:22:25 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 01:22:25 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3] In-Reply-To: <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> Message-ID: On Tue, 5 Oct 2021 14:35:31 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. 
>> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Adjust some nonleaf-2 ranks. Thank you for reviewing, David. ------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From coleenp at openjdk.java.net Wed Oct 6 01:22:26 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 01:22:26 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v3] In-Reply-To: References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> <7Q5YHVtzbMG9Hj1y19E81HPARccq0RxaNwWqX05-fkE=.612b2d93-42f3-4df3-bd66-7b97bf66dad6@github.com> Message-ID: On Wed, 6 Oct 2021 00:26:24 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Adjust some nonleaf-2 ranks. > > src/hotspot/share/runtime/mutex.cpp line 300: > >> 298: >> 299: assert(_rank >= 0, "Bad lock rank: %s", name); >> 300: assert(_rank <= nonleaf, "Bad lock rank: %s", name); > > Can we just combine these into a single range check? And really the assert message should include the bad rank value. 
And probably this basic range check should be first. I combined the check and moved it up for now. In my next change that makes Rank into an enum class, the range check is different because you can't add to nonleaf which is going to be renamed 'safepoint' rank. ------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From xliu at openjdk.java.net Wed Oct 6 01:39:12 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 01:39:12 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Tue, 5 Oct 2021 18:27:45 GMT, Thomas Stuefe wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup: remove a statment of debugging. > > test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25: > >> 23: >> 24: /* >> 25: * @test TestOutOfMemoryErrorFromNIO > > I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. > > You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. You would not even need a main function for that, since the VM would not come up but throw a OOM right away, > > Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd" I found HotSpot is very subtle when it comes to `MaxMetaspaceSize` and `-XX:AbortVMOnException=java.lang.OutOfMemoryError` On my host, the watershed is 3M. If I give it 2M, it will ignore `AbortVMOnException`. java -XX:MaxMetaspaceSize=2M -XX:AbortVMOnException=java.lang.OutOfMemoryError Error occurred during initialization of VM OutOfMemoryError: Metaspace Flip code blocks here can solve this problem. is it intentional? 
diff --git a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp index 74b5d1a0b30..efe430f7d3c 100644 --- a/src/hotspot/share/memory/metaspace.cpp +++ b/src/hotspot/share/memory/metaspace.cpp @@ -974,15 +974,16 @@ void Metaspace::report_metadata_oome(ClassLoaderData* loader_data, size_t word_s space_string); } - if (!is_init_completed()) { - vm_exit_during_initialization("OutOfMemoryError", space_string); - } - if (out_of_compressed_class_space) { THROW_OOP(Universe::out_of_memory_error_class_metaspace()); } else { THROW_OOP(Universe::out_of_memory_error_metaspace()); } + + if (!is_init_completed()) { + vm_exit_during_initialization("OutOfMemoryError", space_string); + } + } ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Wed Oct 6 01:40:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 6 Oct 2021 01:40:06 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v4] In-Reply-To: References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: On Wed, 6 Oct 2021 01:22:24 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. >> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. 
>> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Combine bad rank range checks. Thanks for that tweak. David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5801 From david.holmes at oracle.com Wed Oct 6 01:47:44 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 6 Oct 2021 11:47:44 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: <1ff32378-87b6-8205-2ea9-6cd3f84facfc@oracle.com> On 6/10/2021 11:39 am, Xin Liu wrote: > On Tue, 5 Oct 2021 18:27:45 GMT, Thomas Stuefe wrote: > >>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Cleanup: remove a statment of debugging. >> >> test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25: >> >>> 23: >>> 24: /* >>> 25: * @test TestOutOfMemoryErrorFromNIO >> >> I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. >> >> You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. 
You would not even need a main function for that, since the VM would not come up but throw a OOM right away, >> >> Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd" > > I found HotSpot is very subtle when it comes to `MaxMetaspaceSize` and `-XX:AbortVMOnException=java.lang.OutOfMemoryError` > > On my host, the watershed is 3M. If I give it 2M, it will ignore `AbortVMOnException`. > > > java -XX:MaxMetaspaceSize=2M -XX:AbortVMOnException=java.lang.OutOfMemoryError > Error occurred during initialization of VM > OutOfMemoryError: Metaspace > > > Flip code blocks here can solve this problem. is it intentional? Yes it is intentional. If init has not completed you probably can't throw exceptions yet. David ----- > > diff --git a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp > index 74b5d1a0b30..efe430f7d3c 100644 > --- a/src/hotspot/share/memory/metaspace.cpp > +++ b/src/hotspot/share/memory/metaspace.cpp > @@ -974,15 +974,16 @@ void Metaspace::report_metadata_oome(ClassLoaderData* loader_data, size_t word_s > space_string); > } > > - if (!is_init_completed()) { > - vm_exit_during_initialization("OutOfMemoryError", space_string); > - } > - > if (out_of_compressed_class_space) { > THROW_OOP(Universe::out_of_memory_error_class_metaspace()); > } else { > THROW_OOP(Universe::out_of_memory_error_metaspace()); > } > + > + if (!is_init_completed()) { > + vm_exit_during_initialization("OutOfMemoryError", space_string); > + } > + > } > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From xliu at openjdk.java.net Wed Oct 6 02:02:07 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 02:02:07 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> 
<_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Tue, 5 Oct 2021 18:42:05 GMT, Thomas Stuefe wrote: > However, if we do this for normal crashes+OnError too, I'd like to see a test for that (you can simply reuse your test, add a second `@run` section and just start the VM with `-XX:ErrorHandlerTest=14 -XX:OnError=... -version`). Of course, that one would require `vm.debug`. Actually, `-XX:+ErrorHandlerTest=14` is my first attempt. it turns out difficult to build a testcase which reliably reproduces this bug on all different platforms. I don't want to increase the burden of long-term operations. In this case, ErrorHandleTest triggers `VMError::controlled_crash` early. I have seen that it's earlier than AttachListener. As a result, jcmd can't surely attach HotSpot. For normal crashes+OnError scenario, I think `runtime/ErrorHandling/TestOnError.java` has covered it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xxinliu at amazon.com Wed Oct 6 02:03:45 2021 From: xxinliu at amazon.com (Liu, Xin) Date: Tue, 5 Oct 2021 19:03:45 -0700 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: <1ff32378-87b6-8205-2ea9-6cd3f84facfc@oracle.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> <1ff32378-87b6-8205-2ea9-6cd3f84facfc@oracle.com> Message-ID: <61830d1c-d716-c607-ed86-d01b3194a0bc@amazon.com> hi, David, Thanks for pointing it out. Understood. -XX:MaxMetaspaceSize is neat, but it's still subject to initialization process. I tried different approaches to throw OOME in the test, so far, I think throwing from NIO is the most reliable one. thanks, --lx On 10/5/21 6:47 PM, David Holmes wrote: > CAUTION: This email originated from outside of the organization. 
Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On 6/10/2021 11:39 am, Xin Liu wrote: >> On Tue, 5 Oct 2021 18:27:45 GMT, Thomas Stuefe wrote: >> >>>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >>>> >>>> Cleanup: remove a statment of debugging. >>> >>> test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25: >>> >>>> 23: >>>> 24: /* >>>> 25: * @test TestOutOfMemoryErrorFromNIO >>> >>> I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. >>> >>> You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. You would not even need a main function for that, since the VM would not come up but throw a OOM right away, >>> >>> Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd" >> >> I found HotSpot is very subtle when it comes to `MaxMetaspaceSize` and `-XX:AbortVMOnException=java.lang.OutOfMemoryError` >> >> On my host, the watershed is 3M. If I give it 2M, it will ignore `AbortVMOnException`. >> >> >> java -XX:MaxMetaspaceSize=2M -XX:AbortVMOnException=java.lang.OutOfMemoryError >> Error occurred during initialization of VM >> OutOfMemoryError: Metaspace >> >> >> Flip code blocks here can solve this problem. is it intentional? > > Yes it is intentional. If init has not completed you probably can't > throw exceptions yet. 
> > David > ----- > >> >> diff --git a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp >> index 74b5d1a0b30..efe430f7d3c 100644 >> --- a/src/hotspot/share/memory/metaspace.cpp >> +++ b/src/hotspot/share/memory/metaspace.cpp >> @@ -974,15 +974,16 @@ void Metaspace::report_metadata_oome(ClassLoaderData* loader_data, size_t word_s >> space_string); >> } >> >> - if (!is_init_completed()) { >> - vm_exit_during_initialization("OutOfMemoryError", space_string); >> - } >> - >> if (out_of_compressed_class_space) { >> THROW_OOP(Universe::out_of_memory_error_class_metaspace()); >> } else { >> THROW_OOP(Universe::out_of_memory_error_metaspace()); >> } >> + >> + if (!is_init_completed()) { >> + vm_exit_during_initialization("OutOfMemoryError", space_string); >> + } >> + >> } >> >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/5590 >> From xliu at openjdk.java.net Wed Oct 6 06:11:08 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 06:11:08 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: On Tue, 5 Oct 2021 01:38:12 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup: remove a statment of debugging. > > src/hotspot/share/runtime/mutexLocker.cpp line 376: > >> 374: } >> 375: >> 376: void unlock_locks_on_error(JavaThread* thread) { > > Nit: s/thread/current/ as we must be applying this to the current thread. Makes sense. I will make them consistent. It should be `thread`, because it is not necessarily current().
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From vkempik at openjdk.java.net Wed Oct 6 06:14:06 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Wed, 6 Oct 2021 06:14:06 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: On Tue, 5 Oct 2021 20:33:31 GMT, Bernhard Urban-Forster wrote: > r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. > > The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: > > https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 > > I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. 
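As a side illustration of the register mask mentioned in the quoted description (`0x7fff_ffff` == `r0-r30`), the following is a minimal C++ sketch of how clearing bit 18 removes r18 from such a spill set. It is not the MacroAssembler code from the PR; every name in it (`kAllGpRegs`, `toy_spill_mask_without_r18`, and so on) is hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: these helpers are hypothetical and are NOT part of
// HotSpot's MacroAssembler API.

// Bits 0..30 set, one bit per general-purpose register r0..r30.
constexpr uint32_t kAllGpRegs = 0x7fffffffu;

// r18 is the platform register discussed in the thread.
constexpr unsigned kPlatformReg = 18;

// Returns the r0..r30 mask with the r18 bit cleared.
constexpr uint32_t toy_spill_mask_without_r18() {
  return kAllGpRegs & ~(uint32_t{1} << kPlatformReg);
}

// True if register number `reg` is selected in `mask`.
constexpr bool toy_mask_contains(uint32_t mask, unsigned reg) {
  return ((mask >> reg) & 1u) != 0;
}

// Counts how many registers a mask selects.
inline int toy_count_regs(uint32_t mask) {
  int count = 0;
  while (mask != 0) {
    count += static_cast<int>(mask & 1u);
    mask >>= 1;
  }
  return count;
}
```

In this model the full mask selects 31 registers and the reduced mask selects 30. That appears consistent with the quoted `-XX:+PrintInterpreter` output, where the last store changes from `stp x30, xzr` at offset 240 to `stp x29, x30` at offset 224.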
> > Output of `-XX:+PrintInterpreter` before this change: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes > -------------------------------------------------------------------------------- > 0x0000000138809b00: ldr x2, [x12, #16] > 0x0000000138809b04: ldrh w2, [x2, #44] > 0x0000000138809b08: add x24, x20, x2, uxtx #3 > 0x0000000138809b0c: sub x24, x24, #0x8 > [...] > 0x0000000138809fa4: stp x16, x17, [sp, #128] > 0x0000000138809fa8: stp x18, x19, [sp, #144] > 0x0000000138809fac: stp x20, x21, [sp, #160] > [...] > 0x0000000138809fc0: stp x30, xzr, [sp, #240] > 0x0000000138809fc4: mov x0, x28 > ;; 0x10864ACCC > 0x0000000138809fc8: mov x9, #0xaccc // #44236 > 0x0000000138809fcc: movk x9, #0x864, lsl #16 > 0x0000000138809fd0: movk x9, #0x1, lsl #32 > 0x0000000138809fd4: blr x9 > 0x0000000138809fd8: ldp x2, x3, [sp, #16] > [...] > 0x0000000138809ff4: ldp x16, x17, [sp, #128] > 0x0000000138809ff8: ldp x18, x19, [sp, #144] > 0x0000000138809ffc: ldp x20, x21, [sp, #160] > > > After: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes > > -------------------------------------------------------------------------------- > 0x0000000108e4db00: ldr x2, [x12, #16] > 0x0000000108e4db04: ldrh w2, [x2, #44] > 0x0000000108e4db08: add x24, x20, x2, uxtx #3 > 0x0000000108e4db0c: sub x24, x24, #0x8 > [...] > 0x0000000108e4dfa4: stp x16, x17, [sp, #128] > 0x0000000108e4dfa8: stp x19, x20, [sp, #144] > 0x0000000108e4dfac: stp x21, x22, [sp, #160] > [...] 
> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] > 0x0000000108e4dfc0: mov x0, x28 > ;; 0x107E4A06C > 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 > 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 > 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 > 0x0000000108e4dfd0: blr x9 > 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] > [...] > 0x0000000108e4dff0: ldp x16, x17, [sp, #128] > 0x0000000108e4dff4: ldp x19, x20, [sp, #144] > 0x0000000108e4dff8: ldp x21, x22, [sp, #160] > [...] Skip printing R18 in MacroAssembler::debug64() too as the R18 is missing in regs array after this patch ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From xliu at openjdk.java.net Wed Oct 6 06:30:36 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 6 Oct 2021 06:30:36 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v5] In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. 
> > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... > 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) Xin Liu has updated the pull request incrementally with one additional commit since the last revision: Update per reviewers' feedbacks. 1. rename the test to TestOnErrorWithSelfAttachingJCmd.java. 2. rename java_current_transition_into_native() and add the static qualifier. 3. make unlock_locks_on_error more general. ensure formal name is consistent with argument name. 
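The deadlock and the fix described in this PR summary can be illustrated with a toy model. This is a minimal sketch, not HotSpot code: it assumes, as the description suggests, that a safepoint cannot begin while any thread is still in VM state, and that a thread in Native state does not hold it up. All type and function names below are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy model, not HotSpot code: all names below are hypothetical.
enum class ToyThreadState { in_vm, in_native, blocked };

struct ToyThread {
  ToyThreadState state;
};

// In this model a safepoint can begin only when no thread is still
// executing VM code; threads in native or blocked state count as safe.
inline bool toy_safepoint_can_begin(const std::vector<ToyThread>& threads) {
  for (const ToyThread& t : threads) {
    if (t.state == ToyThreadState::in_vm) {
      return false;
    }
  }
  return true;
}

// Models the fix: the reporting thread marks itself as native before it
// forks the OnError command and waits for it to finish.
inline void toy_transition_to_native(ToyThread& t) {
  t.state = ToyThreadState::in_native;
}
```

In the failing scenario, the thread that runs the OnError command stays in VM state while it waits for `jcmd`, and `jcmd` in turn waits for a safepoint that this thread never reaches. Marking the thread as native first breaks that cycle in the model.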
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5590/files - new: https://git.openjdk.java.net/jdk/pull/5590/files/a9d439ad..2877637e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=03-04 Stats: 192 lines in 5 files changed: 91 ins; 95 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/5590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5590/head:pull/5590 PR: https://git.openjdk.java.net/jdk/pull/5590 From stuefe at openjdk.java.net Wed Oct 6 06:42:13 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 6 Oct 2021 06:42:13 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: <-uSqg_bPi7OTAJmFAdLptIrI7ypU1wG5CNJ7ttJJCuM=.763d47d7-934d-4cb3-9df4-496644ba50a5@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> <-uSqg_bPi7OTAJmFAdLptIrI7ypU1wG5CNJ7ttJJCuM=.763d47d7-934d-4cb3-9df4-496644ba50a5@github.com> Message-ID: On Wed, 6 Oct 2021 00:37:41 GMT, Xin Liu wrote: > I don't like workaround. I intend to do thing right. In my understanding, fork_and_exec() is going to step into Native, isn't it? The problem I have with this is that force-unlocking all mutexes changes the state and possibly destroys evidence in core files. For instance, let's say you have a crash in a thread owning a piece of memory, and unlocking all mutexes unlocks the mutex guarding this memory, lets other threads access and change the memory. The core file will reflect this. And this will be almost impossible to prove too. I really would prefer to limit this workaround to the specific use case where it is needed. 
> > I would like to clean up [prepare_for_emergency_dump](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#L426) in next step. It's easier to maintain than just iterating locks manually. that's why I would like to have a natural name. How about `VMError::transition_java_current_to_native_from_vm()`? Does that run before or after the hs-err file is written? > > However, if we do this for normal crashes+OnError too, I'd like to see a test for that (you can simply reuse your test, add a second `@run` section and just start the VM with `-XX:ErrorHandlerTest=14 -XX:OnError=... -version`). Of course, that one would require `vm.debug`. > > Actually, `-XX:+ErrorHandlerTest=14` is my first attempt. it turns out difficult to build a testcase which reliably reproduces this bug on all different platforms. I don't want to increase the burden of long-term operations. > > In this case, ErrorHandleTest triggers `VMError::controlled_crash` early. I have seen that it's earlier than AttachListener. As a result, jcmd can't surely attach HotSpot. > > For normal crashes+OnError scenario, I think `runtime/ErrorHandling/TestOnError.java` has covered it. +1 As I wrote, I think this seems like a separate issue. OnError should always work, also if VM has an error in init phase. >> test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25: >> >>> 23: >>> 24: /* >>> 25: * @test TestOutOfMemoryErrorFromNIO >> >> I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. >> >> You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. 
You would not even need a main function for that, since the VM would not come up but throw a OOM right away, >> >> Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd" > > I found HotSpot is very subtle when it comes to `MaxMetaspaceSize` and `-XX:AbortVMOnException=java.lang.OutOfMemoryError` > > On my host, the watershed is 3M. If I give it 2M, it will ignore `AbortVMOnException`. > > > java -XX:MaxMetaspaceSize=2M -XX:AbortVMOnException=java.lang.OutOfMemoryError > Error occurred during initialization of VM > OutOfMemoryError: Metaspace > > > Flip code blocks here can solve this problem. is it intentional? > > > diff --git a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp > index 74b5d1a0b30..efe430f7d3c 100644 > --- a/src/hotspot/share/memory/metaspace.cpp > +++ b/src/hotspot/share/memory/metaspace.cpp > @@ -974,15 +974,16 @@ void Metaspace::report_metadata_oome(ClassLoaderData* loader_data, size_t word_s > space_string); > } > > - if (!is_init_completed()) { > - vm_exit_during_initialization("OutOfMemoryError", space_string); > - } > - > if (out_of_compressed_class_space) { > THROW_OOP(Universe::out_of_memory_error_class_metaspace()); > } else { > THROW_OOP(Universe::out_of_memory_error_metaspace()); > } > + > + if (!is_init_completed()) { > + vm_exit_during_initialization("OutOfMemoryError", space_string); > + } > + > } I don't think you can flip it. Throwing the exception may involve further metaspace allocations, which gives you a circle. But for OnError this should not matter. OnError should also work for errors during initialization (whether the subsequent command, e.g. jcmd, succeeds in accessing the half-inited VM is another story). I think this is a separate issue: make OnError work when the VM bails out during initialization. 
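The ordering argument above (check for incomplete initialization before throwing) can be sketched with a small model. It is only an illustration under one stated assumption taken from the comment: the throw path may itself need metadata space. The types and functions are hypothetical and do not correspond to the real `Metaspace::report_metadata_oome`.

```cpp
#include <cassert>

// Toy model, not HotSpot code: all names below are hypothetical.
enum class ToyOutcome { exit_during_init, exception_thrown, recursion_limit_hit };

struct ToyVm {
  bool init_completed;
  int  free_words;  // remaining metadata space in this model
};

// Models a metadata allocation that fails when space is exhausted.
inline bool toy_allocate(ToyVm& vm, int words) {
  if (vm.free_words < words) {
    return false;
  }
  vm.free_words -= words;
  return true;
}

// Models reporting an out-of-memory condition.
// check_init_first == true  : test initialization state before throwing.
// check_init_first == false : the "flipped" order proposed in the diff.
inline ToyOutcome toy_report_oome(ToyVm& vm, bool check_init_first, int depth = 0) {
  const int kMaxDepth = 8;  // guard so the model terminates
  if (depth >= kMaxDepth) {
    return ToyOutcome::recursion_limit_hit;
  }
  if (check_init_first && !vm.init_completed) {
    return ToyOutcome::exit_during_init;
  }
  // Assumption from the review comment: throwing may need more metadata
  // space, so a failed allocation re-enters the reporting path.
  if (!toy_allocate(vm, 1)) {
    return toy_report_oome(vm, check_init_first, depth + 1);
  }
  return ToyOutcome::exception_thrown;
}
```

With the check first, an exhausted VM that is still initializing exits cleanly in the model. With the flipped order, the same VM keeps re-entering the reporting path until the artificial depth guard stops it.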
-------------

PR: https://git.openjdk.java.net/jdk/pull/5590

From aph at openjdk.java.net Wed Oct 6 08:04:06 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 6 Oct 2021 08:04:06 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler
In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Tue, 5 Oct 2021 20:33:31 GMT, Bernhard Urban-Forster wrote:

> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
>
> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here:
>
> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404
>
> I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above.
>
> Output of `-XX:+PrintInterpreter` before this change:
>
> ----------------------------------------------------------------------
> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes
> --------------------------------------------------------------------------------
> 0x0000000138809b00: ldr x2, [x12, #16]
> 0x0000000138809b04: ldrh w2, [x2, #44]
> 0x0000000138809b08: add x24, x20, x2, uxtx #3
> 0x0000000138809b0c: sub x24, x24, #0x8
> [...]
> 0x0000000138809fa4: stp x16, x17, [sp, #128]
> 0x0000000138809fa8: stp x18, x19, [sp, #144]
> 0x0000000138809fac: stp x20, x21, [sp, #160]
> [...]
> 0x0000000138809fc0: stp x30, xzr, [sp, #240]
> 0x0000000138809fc4: mov x0, x28
> ;; 0x10864ACCC
> 0x0000000138809fc8: mov x9, #0xaccc // #44236
> 0x0000000138809fcc: movk x9, #0x864, lsl #16
> 0x0000000138809fd0: movk x9, #0x1, lsl #32
> 0x0000000138809fd4: blr x9
> 0x0000000138809fd8: ldp x2, x3, [sp, #16]
> [...]
> 0x0000000138809ff4: ldp x16, x17, [sp, #128]
> 0x0000000138809ff8: ldp x18, x19, [sp, #144]
> 0x0000000138809ffc: ldp x20, x21, [sp, #160]
>
>
> After:
>
> ----------------------------------------------------------------------
> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes
>
> --------------------------------------------------------------------------------
> 0x0000000108e4db00: ldr x2, [x12, #16]
> 0x0000000108e4db04: ldrh w2, [x2, #44]
> 0x0000000108e4db08: add x24, x20, x2, uxtx #3
> 0x0000000108e4db0c: sub x24, x24, #0x8
> [...]
> 0x0000000108e4dfa4: stp x16, x17, [sp, #128]
> 0x0000000108e4dfa8: stp x19, x20, [sp, #144]
> 0x0000000108e4dfac: stp x21, x22, [sp, #160]
> [...]
> 0x0000000108e4dfbc: stp x29, x30, [sp, #224]
> 0x0000000108e4dfc0: mov x0, x28
> ;; 0x107E4A06C
> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068
> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16
> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32
> 0x0000000108e4dfd0: blr x9
> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16]
> [...]
> 0x0000000108e4dff0: ldp x16, x17, [sp, #128]
> 0x0000000108e4dff4: ldp x19, x20, [sp, #144]
> 0x0000000108e4dff8: ldp x21, x22, [sp, #160]
> [...]

Better to delete `pusha()` and `popa()`, and just use `push_call_clobbered_registers()`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From vkempik at openjdk.java.net Wed Oct 6 08:30:07 2021
From: vkempik at openjdk.java.net (Vladimir Kempik)
Date: Wed, 6 Oct 2021 08:30:07 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Wed, 6 Oct 2021 08:00:49 GMT, Andrew Haley wrote:

> Better to delete `pusha()` and `popa()`, and just use `push_call_clobbered_registers()`.

Is it possible then to predict what to print in the debug64 call?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From aph at openjdk.java.net Wed Oct 6 08:45:08 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 6 Oct 2021 08:45:08 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler
In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Tue, 5 Oct 2021 20:33:31 GMT, Bernhard Urban-Forster wrote:

> r18 should not be used as it is reserved as platform register.
> [...]

Please leave register printing alone. Do push r18. Don't pop r18. I suggest

    pop(RegSet::range(r0, r17));
    #ifdef R18_RESERVED
    ldp(zr, r19, Address(__ post(sp, 2 * wordSize)));
    pop(RegSet::range(r20, r30));
    #else
    pop(RegSet::range(r18, r30));
    #endif

This is explicit, and doesn't touch anything unnecessarily.
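To make the register-set arithmetic in this thread concrete, here is a minimal sketch of the `0x7fff_ffff` (== `r0-r30`) bit pattern and of what excluding r18 does to it. It is a hypothetical bitmask model, not HotSpot's `RegSet`; the helper names `range_mask`, `without` and `count` are assumptions made for illustration only.

```cpp
#include <cstdint>

// Hypothetical model: bit n set means "register rn is in the set".
// This is NOT HotSpot's RegSet class, only an illustration of the masks.

// Mask with bits lo..hi (inclusive) set; requires lo <= hi <= 31.
constexpr std::uint32_t range_mask(unsigned lo, unsigned hi) {
  return (hi >= 31 ? ~0u : ((1u << (hi + 1)) - 1u)) & ~((1u << lo) - 1u);
}

// Remove a single register from a mask.
constexpr std::uint32_t without(std::uint32_t mask, unsigned reg) {
  return mask & ~(1u << reg);
}

// Number of registers in a mask.
constexpr unsigned count(std::uint32_t mask) {
  unsigned n = 0;
  for (; mask != 0; mask &= mask - 1) ++n;  // clears the lowest set bit
  return n;
}

constexpr std::uint32_t all_gprs = range_mask(0, 30);     // r0-r30
constexpr std::uint32_t no_r18   = without(all_gprs, 18); // r0-r17, r19-r30

static_assert(all_gprs == 0x7fffffffu, "r0-r30 is the pattern from the PR");
static_assert(no_r18 == 0x7ffbffffu, "bit 18 cleared");
static_assert(no_r18 == (range_mask(0, 17) | range_mask(19, 30)),
              "same set as the two explicit ranges");
static_assert(count(all_gprs) == 31, "odd count");
static_assert(count(no_r18) == 30, "even count");
```

The counts match the listings quoted in this thread: 31 registers is an odd number, so the "before" output pads the last pair with `xzr` (`stp x30, xzr`), whereas 30 registers pair up evenly (`stp x29, x30`).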
------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From burban at openjdk.java.net Wed Oct 6 08:48:11 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Wed, 6 Oct 2021 08:48:11 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler In-Reply-To: References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: On Wed, 6 Oct 2021 06:10:50 GMT, Vladimir Kempik wrote: > Skip printing R18 in MacroAssembler::debug64() too as the R18 is missing in regs array after this patch it's not, right? This PR changes `::pusha()`, but `generate_verify_oop()` uses `::push` and defines its own regset: https://github.com/openjdk/jdk/blob/df125f680b6a4517109be80512a113064ca6281d/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L599-L610 That said, I think `::debug64()` shouldn't touch `regs[30]` and `regs[31]`, but this is for another time. ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From burban at openjdk.java.net Wed Oct 6 09:03:29 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Wed, 6 Oct 2021 09:03:29 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v2] In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: > r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. 
>
> [...]
Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision:

  remove pusha/popa, be explicit in the template generator

Restore looks like this now:
```
0x0000000106e4dfcc: movk x9, #0x5e4, lsl #16
0x0000000106e4dfd0: movk x9, #0x1, lsl #32
0x0000000106e4dfd4: blr x9
0x0000000106e4dfd8: ldp x2, x3, [sp, #16]
0x0000000106e4dfdc: ldp x4, x5, [sp, #32]
0x0000000106e4dfe0: ldp x6, x7, [sp, #48]
0x0000000106e4dfe4: ldp x8, x9, [sp, #64]
0x0000000106e4dfe8: ldp x10, x11, [sp, #80]
0x0000000106e4dfec: ldp x12, x13, [sp, #96]
0x0000000106e4dff0: ldp x14, x15, [sp, #112]
0x0000000106e4dff4: ldp x16, x17, [sp, #128]
0x0000000106e4dff8: ldp x0, x1, [sp], #144
0x0000000106e4dffc: ldp xzr, x19, [sp], #16
0x0000000106e4e000: ldp x22, x23, [sp, #16]
0x0000000106e4e004: ldp x24, x25, [sp, #32]
0x0000000106e4e008: ldp x26, x27, [sp, #48]
0x0000000106e4e00c: ldp x28, x29, [sp, #64]
0x0000000106e4e010: ldp x30, xzr, [sp, #80]
0x0000000106e4e014: ldp x20, x21, [sp], #96
0x0000000106e4e018: ldur x12, [x29, #-24]
0x0000000106e4e01c: ldr x22, [x12, #16]
0x0000000106e4e020: add x22, x22, #0x30
0x0000000106e4e024: ldr x8, [x28, #8]
```

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5828/files
  - new: https://git.openjdk.java.net/jdk/pull/5828/files/7c5ebae7..84c5c902

Webrevs:
  - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=01
  - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=00-01

Stats: 31 lines in 3 files changed: 8 ins; 21 del; 2 mod
Patch: https://git.openjdk.java.net/jdk/pull/5828.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/5828/head:pull/5828

PR: https://git.openjdk.java.net/jdk/pull/5828

From burban at openjdk.java.net Wed Oct 6 09:03:29 2021
From: burban at openjdk.java.net (Bernhard Urban-Forster)
Date: Wed, 6 Oct 2021 09:03:29 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler
In-Reply-To:
<3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Tue, 5 Oct 2021 20:33:31 GMT, Bernhard Urban-Forster wrote:

> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
>
> [...]

Thanks for the review Andrew.
Restore looks like this now:
```
0x0000000106e4dfcc: movk x9, #0x5e4, lsl #16
0x0000000106e4dfd0: movk x9, #0x1, lsl #32
0x0000000106e4dfd4: blr x9
0x0000000106e4dfd8: ldp x2, x3, [sp, #16]
0x0000000106e4dfdc: ldp x4, x5, [sp, #32]
0x0000000106e4dfe0: ldp x6, x7, [sp, #48]
0x0000000106e4dfe4: ldp x8, x9, [sp, #64]
0x0000000106e4dfe8: ldp x10, x11, [sp, #80]
0x0000000106e4dfec: ldp x12, x13, [sp, #96]
0x0000000106e4dff0: ldp x14, x15, [sp, #112]
0x0000000106e4dff4: ldp x16, x17, [sp, #128]
0x0000000106e4dff8: ldp x0, x1, [sp], #144
0x0000000106e4dffc: ldp xzr, x19, [sp], #16
0x0000000106e4e000: ldp x22, x23, [sp, #16]
0x0000000106e4e004: ldp x24, x25, [sp, #32]
0x0000000106e4e008: ldp x26, x27, [sp, #48]
0x0000000106e4e00c: ldp x28, x29, [sp, #64]
0x0000000106e4e010: ldp x30, xzr, [sp, #80]
0x0000000106e4e014: ldp x20, x21, [sp], #96
0x0000000106e4e018: ldur x12, [x29, #-24]
0x0000000106e4e01c: ldr x22, [x12, #16]
0x0000000106e4e020: add x22, x22, #0x30
0x0000000106e4e024: ldr x8, [x28, #8]
```

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From aph at openjdk.java.net Wed Oct 6 09:29:08 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 6 Oct 2021 09:29:08 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v2]
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Wed, 6 Oct 2021 09:03:29 GMT, Bernhard Urban-Forster wrote:

>> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
>>
>> [...]
>
> Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision:
>
>   remove pusha/popa, be explicit in the template generator
>
> Restore looks like this now:
> [...]

Sorry, stupid mistake: the pop is done in reverse, of course! Please try this (still untested) code.

    #ifdef R18_RESERVED
    pop(RegSet::range(r20, r30));
    ldp(zr, r19, Address(__ post(sp, 2 * wordSize)));
    #else
    pop(RegSet::range(r18, r30));
    #endif
    pop(RegSet::range(r0, r17));

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From lkorinth at openjdk.java.net Wed Oct 6 09:50:08 2021
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Wed, 6 Oct 2021 09:50:08 GMT
Subject: RFR: 8269537: memset() is called after operator new [v3]
In-Reply-To:
References:
Message-ID: <4aTOtDeBp6oPKbZwwW16Fp6YnuhxAh3vJvM4VxgZRQI=.22d277b7-f7e2-4a4b-8121-4c303fb4dfdf@github.com>

On Tue, 5 Oct 2021 20:39:30 GMT, Leo Korinth wrote:

>> The basic problem is that we are relying on undefined behaviour, as documented in the code:
>>
>> // This whole business of passing information from ResourceObj::operator new
>> // to the ResourceObj constructor via fields in the "object" is technically UB.
>> // But it seems to work within the limitations of HotSpot usage (such as no
>> // multiple inheritance) with the compilers and compiler options we're using.
>> // And it gives some possibly useful checking for misuse of ResourceObj.
>>
>>
>> I am removing the undefined behaviour by passing the type of allocation through a thread local variable.
>>
>> This solution has some advantages:
>> 1) it is not UB
>> 2) it is simpler and easier to understand
>> 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8)
>> 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work.
>>
>> When doing the change, I also updated `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret.
>>
>> I also disallow to "fake" the memory type by explicitly calling `ResourceObj::set_allocation_type`.
>> >> This forced me to change two places that is faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because the old naming of `allocated_on_stack()`. >> >> I have also tried to update the comments. In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete object you allocate with new. >> >> Testing on debug build tier1-3 >> Testing on release build tier1 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > Use a thread local buffer so that the compiler might reorder operator new. I should probably remove MemRegionClosureRO and let DirtyCardToOopClosure directly inherit ResourceObj instead. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From coleenp at openjdk.java.net Wed Oct 6 12:18:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 12:18:11 GMT Subject: Integrated: 8273917: Remove 'leaf' ranking for Mutex In-Reply-To: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: On Sun, 3 Oct 2021 21:08:51 GMT, Coleen Phillimore wrote: > This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. > > The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. 
> > The transformation in this change is as follows: > nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) > leaf => nonleaf - Many of these locks were top level locks > leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock > leaf-n => nonleaf-n > > The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. > > This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. > > This has been tested with tier1-8, and retesting tier1-3 locally in progress. This pull request has now been integrated. Changeset: b8af6a9b Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/b8af6a9bfb28aaf0fea0cfdaba13236dc8cbaa3a Stats: 141 lines in 11 files changed: 54 ins; 24 del; 63 mod 8273917: Remove 'leaf' ranking for Mutex Reviewed-by: eosterlund, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From coleenp at openjdk.java.net Wed Oct 6 12:18:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 12:18:10 GMT Subject: RFR: 8273917: Remove 'leaf' ranking for Mutex [v4] In-Reply-To: References: <0u1_bu5EunfDFYz-AMKxs-wOVO0-HwpYpe8YoSXBwtM=.e25a1229-de90-4d15-9e89-2f20d6ccf40b@github.com> Message-ID: On Wed, 6 Oct 2021 01:22:24 GMT, Coleen Phillimore wrote: >> This change removes 'leaf' ranking. The previous change for JDK-8273915 divided the 'leaf' ranked locks that didn't safepoint check into the rank 'nosafepoint', so all the 'leaf' ranking locks left were safepoint_check_always. >> >> The rank 'nonleaf' (to be renamed 'safepoint' in the next change) is the *top* mutex rank. 
>> >> The transformation in this change is as follows: >> nonleaf+n => nonleaf - Generally these 'nonleaf' mutex were top level locks) >> leaf => nonleaf - Many of these locks were top level locks >> leaf => nonleaf-2 - Assuming that they were 'leaf' and 2 levels less than some existing nonleaf lock >> leaf-n => nonleaf-n >> >> The new mutex rankings reflect their rankings based on my logging, except for a couple shenandoah locks which I didn't observe, so I made them nonleaf-2. >> >> This change also introduces a relative mutex ranking macro, so that a Mutex/Monitor can be defined in terms of a mutex that it holds while trying to acquire it. So these relative mutex are moved to the end of the init function in mutexLocker.cpp. >> >> This has been tested with tier1-8, and retesting tier1-3 locally in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Combine bad rank range checks. Thanks for the reviews Erik and David. ------------- PR: https://git.openjdk.java.net/jdk/pull/5801 From david.holmes at oracle.com Wed Oct 6 12:36:14 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 6 Oct 2021 22:36:14 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> Message-ID: <063d9647-862c-7905-2638-56603c2b1b3e@oracle.com> On 6/10/2021 4:11 pm, Xin Liu wrote: > On Tue, 5 Oct 2021 01:38:12 GMT, David Holmes wrote: > >>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Cleanup: remove a statment of debugging. 
>> >> src/hotspot/share/runtime/mutexLocker.cpp line 376: >> >>> 374: } >>> 375: >>> 376: void unlock_locks_on_error(JavaThread* thread) { >> >> Nit: s/thread/current/ as we must be applying this to the current thread. > > make sense. I will make them consistent. > Should be `thread` because it's not necessarily be current(). Yes it has to be current - you can't unlock a lock that belongs to another thread! David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From david.holmes at oracle.com Wed Oct 6 12:53:54 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 6 Oct 2021 22:53:54 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v4] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_XakYRQVR8S2LYHZgtIidUTTJ5B1LvHQuo8ZWKu4Gjs=.c9d8c696-7478-485c-8df7-788aa1b38bcc@github.com> <-uSqg_bPi7OTAJmFAdLptIrI7ypU1wG5CNJ7ttJJCuM=.763d47d7-934d-4cb3-9df4-496644ba50a5@github.com> Message-ID: <7e066f89-416c-fb29-66dc-8b8f34002a02@oracle.com> On 6/10/2021 4:42 pm, Thomas Stuefe wrote: > On Wed, 6 Oct 2021 00:37:41 GMT, Xin Liu wrote: > >> I don't like workaround. I intend to do thing right. In my understanding, fork_and_exec() is going to step into Native, isn't it? > > The problem I have with this is that force-unlocking all mutexes changes the state and possibly destroys evidence in core files. > > For instance, let's say you have a crash in a thread owning a piece of memory, and unlocking all mutexes unlocks the mutex guarding this memory, lets other threads access and change the memory. The core file will reflect this. And this will be almost impossible to prove too. > > I really would prefer to limit this workaround to the specific use case where it is needed. I admit that the more I think about this the more I think unlocking the mutexes is the wrong thing to do. 
It introduces non-determinism to the crash and as Thomas says the state can change radically by the time the core dump is taken - thus defeating the whole point of the core dump being useful to analyse the crash. That said I don't see how this can be limited to the specific usecase of "jcmd %p" ?? I think a different tack is needed here. We want the thread calling fork_and_exec to be seen as safepoint-safe, so perhaps the simplest hack is to switch to _thread_blocked (in a safe manner). David ----- >> >> I would like to clean up [prepare_for_emergency_dump](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#L426) in next step. It's easier to maintain than just iterating locks manually. that's why I would like to have a natural name. How about `VMError::transition_java_current_to_native_from_vm()`? > > Does that run before or after the hs-err file is written? > > >>> However, if we do this for normal crashes+OnError too, I'd like to see a test for that (you can simply reuse your test, add a second `@run` section and just start the VM with `-XX:ErrorHandlerTest=14 -XX:OnError=... -version`). Of course, that one would require `vm.debug`. >> >> Actually, `-XX:+ErrorHandlerTest=14` is my first attempt. it turns out difficult to build a testcase which reliably reproduces this bug on all different platforms. I don't want to increase the burden of long-term operations. >> >> In this case, ErrorHandleTest triggers `VMError::controlled_crash` early. I have seen that it's earlier than AttachListener. As a result, jcmd can't surely attach HotSpot. >> >> For normal crashes+OnError scenario, I think `runtime/ErrorHandling/TestOnError.java` has covered it. > > +1 > > As I wrote, I think this seems like a separate issue. OnError should always work, also if VM has an error in init phase. 
> >>> test/hotspot/jtreg/runtime/ErrorHandling/TestOutOfMemoryErrorFromNIO.java line 25: >>> >>>> 23: >>>> 24: /* >>>> 25: * @test TestOutOfMemoryErrorFromNIO >>> >>> I would rename the test. The point is not to test OOM from NIO, but to test XX:OnError with jcmd which attaches to the parent. >>> >>> You could do the same with any OOM, right? E.g. just start the VM with a very small MaxMetaspaceSize. You would not even need a main function for that, since the VM would not come up but throw a OOM right away, >>> >>> Proposal (but feel free to come up with something better): "TestOnErrorWithSelfAttachingJCmd" >> >> I found HotSpot is very subtle when it comes to `MaxMetaspaceSize` and `-XX:AbortVMOnException=java.lang.OutOfMemoryError` >> >> On my host, the watershed is 3M. If I give it 2M, it will ignore `AbortVMOnException`. >> >> >> java -XX:MaxMetaspaceSize=2M -XX:AbortVMOnException=java.lang.OutOfMemoryError >> Error occurred during initialization of VM >> OutOfMemoryError: Metaspace >> >> >> Flip code blocks here can solve this problem. is it intentional? >> >> >> diff --git a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp >> index 74b5d1a0b30..efe430f7d3c 100644 >> --- a/src/hotspot/share/memory/metaspace.cpp >> +++ b/src/hotspot/share/memory/metaspace.cpp >> @@ -974,15 +974,16 @@ void Metaspace::report_metadata_oome(ClassLoaderData* loader_data, size_t word_s >> space_string); >> } >> >> - if (!is_init_completed()) { >> - vm_exit_during_initialization("OutOfMemoryError", space_string); >> - } >> - >> if (out_of_compressed_class_space) { >> THROW_OOP(Universe::out_of_memory_error_class_metaspace()); >> } else { >> THROW_OOP(Universe::out_of_memory_error_metaspace()); >> } >> + >> + if (!is_init_completed()) { >> + vm_exit_during_initialization("OutOfMemoryError", space_string); >> + } >> + >> } > > I don't think you can flip it. Throwing the exception may involve further metaspace allocations, which gives you a circle. 
But for OnError this should not matter. OnError should also work for errors during initialization (whether the subsequent command, e.g. jcmd, succeeds in accessing the half-inited VM is another story). > > I think this is a separate issue: make OnError work when the VM bails out during initialization. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From stuefe at openjdk.java.net Wed Oct 6 13:28:10 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 6 Oct 2021 13:28:10 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v5] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Wed, 6 Oct 2021 06:30:36 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. 
To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Update per reviewers' feedbacks. > > 1. rename the test to TestOnErrorWithSelfAttachingJCmd.java. > 2. rename java_current_transition_into_native() and add the static qualifier. > 3. make unlock_locks_on_error more general. ensure formal name is consistent > with argument name. > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_ > > On 6/10/2021 4:42 pm, Thomas Stuefe wrote: > > > On Wed, 6 Oct 2021 00:37:41 GMT, Xin Liu wrote: > > > I don't like workaround. I intend to do thing right. In my understanding, fork_and_exec() is going to step into Native, isn't it? > > > > > > The problem I have with this is that force-unlocking all mutexes changes the state and possibly destroys evidence in core files. 
> > For instance, let's say you have a crash in a thread owning a piece of memory, and unlocking all mutexes unlocks the mutex guarding this memory, lets other threads access and change the memory. The core file will reflect this. And this will be almost impossible to prove too.
> > I really would prefer to limit this workaround to the specific use case where it is needed.
>
> I admit that the more I think about this the more I think unlocking the mutexes is the wrong thing to do. It introduces non-determinism to the crash and as Thomas says the state can change radically by the time the core dump is taken - thus defeating the whole point of the core dump being useful to analyse the crash.
>
> That said I don't see how this can be limited to the specific usecase of "jcmd %p" ??
>
> I think a different tack is needed here. We want the thread calling fork_and_exec to be seen as safepoint-safe, so perhaps the simplest hack is to switch to _thread_blocked (in a safe manner).
>
> David -----

A simple compromise would have been to just limit the workaround to OOM scenarios. It's a one-line change in VMError and would also reduce the scenarios we need to maintain.

And it makes sense, since with OOMs one usually uses heap dumps or thread dumps, which you need jcmd for (*). In crash scenarios, OTOH, one usually needs cores. In crash scenarios it rarely makes sense to self-attach with jcmd because god knows what would happen.

Further down the line, a better improvement may be to not spawn jcmd at all but to provide specialized support for running jcmd commands on OOM inside the VM, without the help of an intermediate child process. Sort of like `-XX:CommandOnOOM="VM.info; GC.heap_dump my_dump; Thread.print"`.

(*) It just occurred to me that we already have HeapDumpOnOutOfMemoryError, so attaching jcmd to the VM to do a heap dump is not really needed anyway.
-------------

PR: https://git.openjdk.java.net/jdk/pull/5590

From burban at openjdk.java.net Wed Oct 6 13:44:09 2021
From: burban at openjdk.java.net (Bernhard Urban-Forster)
Date: Wed, 6 Oct 2021 13:44:09 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v2]
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Wed, 6 Oct 2021 09:26:35 GMT, Andrew Haley wrote:

> Sorry, stupid mistake: the pop is done in reverse, of course!

Hmm, I think it's fine. Check out the full listing:

0x0000000130009f80: b.ne 0x000000013000a018 // b.any
0x0000000130009f84: stp x0, x1, [sp, #-256]!
0x0000000130009f88: stp x2, x3, [sp, #16]
0x0000000130009f8c: stp x4, x5, [sp, #32]
0x0000000130009f90: stp x6, x7, [sp, #48]
0x0000000130009f94: stp x8, x9, [sp, #64]
0x0000000130009f98: stp x10, x11, [sp, #80]
0x0000000130009f9c: stp x12, x13, [sp, #96]
0x0000000130009fa0: stp x14, x15, [sp, #112]
0x0000000130009fa4: stp x16, x17, [sp, #128]
0x0000000130009fa8: stp x18, x19, [sp, #144]
0x0000000130009fac: stp x20, x21, [sp, #160]
0x0000000130009fb0: stp x22, x23, [sp, #176]
0x0000000130009fb4: stp x24, x25, [sp, #192]
0x0000000130009fb8: stp x26, x27, [sp, #208]
0x0000000130009fbc: stp x28, x29, [sp, #224]
0x0000000130009fc0: stp x30, xzr, [sp, #240]
0x0000000130009fc4: mov x0, x28
;; 0x10564AC0C
0x0000000130009fc8: mov x9, #0xac0c // #44044
0x0000000130009fcc: movk x9, #0x564, lsl #16
0x0000000130009fd0: movk x9, #0x1, lsl #32
0x0000000130009fd4: blr x9
0x0000000130009fd8: ldp x2, x3, [sp, #16]
0x0000000130009fdc: ldp x4, x5, [sp, #32]
0x0000000130009fe0: ldp x6, x7, [sp, #48]
0x0000000130009fe4: ldp x8, x9, [sp, #64]
0x0000000130009fe8: ldp x10, x11, [sp, #80]
0x0000000130009fec: ldp x12, x13, [sp, #96]
0x0000000130009ff0: ldp x14, x15, [sp, #112]
0x0000000130009ff4: ldp x16, x17, [sp, #128]
0x0000000130009ff8: ldp x0, x1, [sp], #144
0x0000000130009ffc: ldp xzr, x19, [sp], #16
0x000000013000a000: ldp x22, x23, [sp, #16]
0x000000013000a004: ldp x24, x25, [sp, #32]
0x000000013000a008: ldp x26, x27, [sp, #48]
0x000000013000a00c: ldp x28, x29, [sp, #64]
0x000000013000a010: ldp x30, xzr, [sp, #80]
0x000000013000a014: ldp x20, x21, [sp], #96
0x000000013000a018: ldur x12, [x29, #-24]
0x000000013000a01c: ldr x22, [x12, #16]
0x000000013000a020: add x22, x22, #0x30
0x000000013000a024: ldr x8, [x28, #8]

In terms of testing, it seems hard to trigger, so I forced the code path via:

diff --git a/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp b/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
index d069345b9c0..046c325faa7 100644
--- a/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
@@ -1392,11 +1392,14 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
   {
     Label no_reguard;
+    Label skip_check;
+    __ b(skip_check);
     __ lea(rscratch1, Address(rthread, in_bytes(JavaThread::stack_guard_state_offset())));
     __ ldrw(rscratch1, Address(rscratch1));
     __ cmp(rscratch1, (u1)StackOverflow::stack_guard_yellow_reserved_disabled);
     __ br(Assembler::NE, no_reguard);
+    __ bind(skip_check);
     __ push(RegSet::range(r0, r30), sp);
     __ mov(c_rarg0, rthread);
     __ mov(rscratch2, CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages));

which crashes with your most recent suggestion btw, but it's fine with what's in this PR right now.

Any better suggestion on how to test it?
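The register-set arithmetic behind this change can be modelled outside HotSpot. The following is only a minimal sketch, assuming a plain 32-bit mask with one bit per general-purpose register; `reg_bit`, `reg_range`, `without` and `count_regs` are hypothetical helpers and are not HotSpot's `RegSet` API. It checks that the `r0-r30` set is the `0x7fffffff` pattern mentioned in the PR description and that dropping `r18` leaves 30 registers, i.e. 15 `stp`/`ldp` pairs.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; this is not HotSpot's RegSet class.

// Bit n stands for general-purpose register rn (n must be below 32).
constexpr std::uint32_t reg_bit(unsigned n) {
  return std::uint32_t{1} << n;
}

// Mask with bits first..last set, inclusive (requires first <= last < 31).
constexpr std::uint32_t reg_range(unsigned first, unsigned last) {
  return (reg_bit(last + 1) - 1) & ~(reg_bit(first) - 1);
}

// Remove register rn from a set.
constexpr std::uint32_t without(std::uint32_t set, unsigned n) {
  return set & ~reg_bit(n);
}

// Number of registers in a set (clears the lowest set bit per iteration).
inline unsigned count_regs(std::uint32_t set) {
  unsigned count = 0;
  for (; set != 0; set &= set - 1) {
    ++count;
  }
  return count;
}

// r0-r30 is the bit pattern quoted in the PR description.
static_assert(reg_range(0, 30) == 0x7fffffffu, "r0-r30 mask");
// The same set with the platform register r18 removed.
static_assert(without(reg_range(0, 30), 18) == 0x7ffbffffu, "r0-r30 without r18");
```

An even register count matters here because the save and restore sequences work on pairs; `count_regs(without(reg_range(0, 30), 18))` gives 30 for this mask.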
------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From aph at openjdk.java.net Wed Oct 6 14:06:16 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 6 Oct 2021 14:06:16 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v2] In-Reply-To: References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: <0GcxjuCVlnLEM8van8gXC4cg7avK8zISdx8hGe02Ap8=.a8f4f92d-be11-469e-8253-2d64fdc32b5b@github.com> On Wed, 6 Oct 2021 09:03:29 GMT, Bernhard Urban-Forster wrote: >> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. >> >> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: >> >> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 >> >> I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. 
>> >> Output of `-XX:+PrintInterpreter` before this change: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes >> -------------------------------------------------------------------------------- >> 0x0000000138809b00: ldr x2, [x12, #16] >> 0x0000000138809b04: ldrh w2, [x2, #44] >> 0x0000000138809b08: add x24, x20, x2, uxtx #3 >> 0x0000000138809b0c: sub x24, x24, #0x8 >> [...] >> 0x0000000138809fa4: stp x16, x17, [sp, #128] >> 0x0000000138809fa8: stp x18, x19, [sp, #144] >> 0x0000000138809fac: stp x20, x21, [sp, #160] >> [...] >> 0x0000000138809fc0: stp x30, xzr, [sp, #240] >> 0x0000000138809fc4: mov x0, x28 >> ;; 0x10864ACCC >> 0x0000000138809fc8: mov x9, #0xaccc // #44236 >> 0x0000000138809fcc: movk x9, #0x864, lsl #16 >> 0x0000000138809fd0: movk x9, #0x1, lsl #32 >> 0x0000000138809fd4: blr x9 >> 0x0000000138809fd8: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000138809ff4: ldp x16, x17, [sp, #128] >> 0x0000000138809ff8: ldp x18, x19, [sp, #144] >> 0x0000000138809ffc: ldp x20, x21, [sp, #160] >> >> >> After: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes >> >> -------------------------------------------------------------------------------- >> 0x0000000108e4db00: ldr x2, [x12, #16] >> 0x0000000108e4db04: ldrh w2, [x2, #44] >> 0x0000000108e4db08: add x24, x20, x2, uxtx #3 >> 0x0000000108e4db0c: sub x24, x24, #0x8 >> [...] >> 0x0000000108e4dfa4: stp x16, x17, [sp, #128] >> 0x0000000108e4dfa8: stp x19, x20, [sp, #144] >> 0x0000000108e4dfac: stp x21, x22, [sp, #160] >> [...] 
>> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] >> 0x0000000108e4dfc0: mov x0, x28 >> ;; 0x107E4A06C >> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 >> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 >> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 >> 0x0000000108e4dfd0: blr x9 >> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000108e4dff0: ldp x16, x17, [sp, #128] >> 0x0000000108e4dff4: ldp x19, x20, [sp, #144] >> 0x0000000108e4dff8: ldp x21, x22, [sp, #160] >> [...] > > Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: > > remove pusha/popa, be explicit in the template generator > > Restore looks like this now: > ``` > 0x0000000106e4dfcc: movk x9, #0x5e4, lsl #16 > 0x0000000106e4dfd0: movk x9, #0x1, lsl #32 > 0x0000000106e4dfd4: blr x9 > 0x0000000106e4dfd8: ldp x2, x3, [sp, #16] > 0x0000000106e4dfdc: ldp x4, x5, [sp, #32] > 0x0000000106e4dfe0: ldp x6, x7, [sp, #48] > 0x0000000106e4dfe4: ldp x8, x9, [sp, #64] > 0x0000000106e4dfe8: ldp x10, x11, [sp, #80] > 0x0000000106e4dfec: ldp x12, x13, [sp, #96] > 0x0000000106e4dff0: ldp x14, x15, [sp, #112] > 0x0000000106e4dff4: ldp x16, x17, [sp, #128] > 0x0000000106e4dff8: ldp x0, x1, [sp], #144 > 0x0000000106e4dffc: ldp xzr, x19, [sp], #16 > 0x0000000106e4e000: ldp x22, x23, [sp, #16] > 0x0000000106e4e004: ldp x24, x25, [sp, #32] > 0x0000000106e4e008: ldp x26, x27, [sp, #48] > 0x0000000106e4e00c: ldp x28, x29, [sp, #64] > 0x0000000106e4e010: ldp x30, xzr, [sp, #80] > 0x0000000106e4e014: ldp x20, x21, [sp], #96 > 0x0000000106e4e018: ldur x12, [x29, #-24] > 0x0000000106e4e01c: ldr x22, [x12, #16] > 0x0000000106e4e020: add x22, x22, #0x30 > 0x0000000106e4e024: ldr x8, [x28, #8] > ``` > > Sorry, stupid mistake: the pop is done in reverse, of course! > > hmm, I think it's fine. Checkout the full listing: > > Any better suggestion on how to test it? You're right, it looks fine. It's a bit odd but it's really OK. 
I'd check it simply by stopping at the start of the push, printing all registers, and stepping through.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From dnsimon at openjdk.java.net Wed Oct 6 14:30:34 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 6 Oct 2021 14:30:34 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
Message-ID: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>

This enhances hs-err logs to:
* Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
* Show machine code for the top frames on the native stack.

An interpreter or stub frame is only shown if it is the crashing frame.

A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).

-------------

Commit messages:
 - dump machine code in hs-err log for top frames on native stack and use abstract assembly if a disassembler is not installed

Changes: https://git.openjdk.java.net/jdk/pull/5446/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8272586
Stats: 279 lines in 4 files changed: 235 ins; 28 del; 16 mod
Patch: https://git.openjdk.java.net/jdk/pull/5446.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/5446/head:pull/5446

PR: https://git.openjdk.java.net/jdk/pull/5446

From dnsimon at openjdk.java.net Wed Oct 6 14:30:35 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 6 Oct 2021 14:30:35 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID:

On Thu, 9 Sep 2021 16:56:57 GMT, Doug Simon wrote:

> This enhances hs-err logs
to:
> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
> * Show machine code for the top frames on the native stack.
>
> An interpreter or stub frame is only shown if it is the crashing frame.
>
> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).

src/hotspot/share/utilities/vmError.cpp line 913:

> 911: // The first 10 unique code units on the stack should be sufficient
> 912: // for investigating crashes.
> 913: int printed_len = 10;

Maybe 10 is too big?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From rkennke at openjdk.java.net Wed Oct 6 15:21:07 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Wed, 6 Oct 2021 15:21:07 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID:

On Thu, 9 Sep 2021 16:56:57 GMT, Doug Simon wrote:

> This enhances hs-err logs to:
> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
> * Show machine code for the top frames on the native stack.
>
> An interpreter or stub frame is only shown if it is the crashing frame.
>
> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).

This is going to be very useful! I used to copy and paste the instruction hex block into disassembler.io; having the relevant assembly in hs_err will make debugging such situations much easier!

This is not a review, just a big thumbs-up on the feature as such!
:-)

Thanks,
Roman

-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From kvn at openjdk.java.net Wed Oct 6 15:26:08 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 6 Oct 2021 15:26:08 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID: <0HmiCwO-Pz_oQdWA4aIvp4dhk7brxurIRBw86xBRkM4=.df6f4f12-ec46-4c1f-b148-9e8f04992114@github.com>

On Thu, 9 Sep 2021 16:56:57 GMT, Doug Simon wrote:

> This enhances hs-err logs to:
> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
> * Show machine code for the top frames on the native stack.
>
> An interpreter or stub frame is only shown if it is the crashing frame.
>
> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).

src/hotspot/share/utilities/vmError.cpp line 307:

> 305: ResourceMark rm(thread);
> 306: Disassembler::decode(cb, st);
> 307: st->cr();

Why RM used only here?
-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From kvn at openjdk.java.net Wed Oct 6 15:29:10 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 6 Oct 2021 15:29:10 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To: <0HmiCwO-Pz_oQdWA4aIvp4dhk7brxurIRBw86xBRkM4=.df6f4f12-ec46-4c1f-b148-9e8f04992114@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <0HmiCwO-Pz_oQdWA4aIvp4dhk7brxurIRBw86xBRkM4=.df6f4f12-ec46-4c1f-b148-9e8f04992114@github.com>
Message-ID:

On Wed, 6 Oct 2021 15:22:58 GMT, Vladimir Kozlov wrote:

>> This enhances hs-err logs to:
>> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
>> * Show machine code for the top frames on the native stack.
>>
>> An interpreter or stub frame is only shown if it is the crashing frame.
>>
>> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
>
> src/hotspot/share/utilities/vmError.cpp line 307:
>
>> 305: ResourceMark rm(thread);
>> 306: Disassembler::decode(cb, st);
>> 307: st->cr();
>
> Why RM used only here?

I see that you copied existing code. Okay.
-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From kvn at openjdk.java.net Wed Oct 6 15:45:08 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 6 Oct 2021 15:45:08 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID:

On Thu, 9 Sep 2021 16:56:57 GMT, Doug Simon wrote:

> This enhances hs-err logs to:
> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
> * Show machine code for the top frames on the native stack.
>
> An interpreter or stub frame is only shown if it is the crashing frame.
>
> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).

Looks good in general. I have a few small comments.

src/hotspot/share/utilities/vmError.cpp line 247:

> 245: * Determines if the code unit denoted by `owner` should be printed.
> 246: * If this method returns true, then `owner` has been added to `printed`.
> 247: * If `owner` was already in `printed`, this methods returns false.

Is it because of recursive calls?

src/hotspot/share/utilities/vmError.cpp line 324:

> 322: // non-walkable C frame. Use frame.sender() for java frames.
> 323: frame invalid;
> 324: if (t && t->is_Java_thread()) {

use `(t != nullptr)` explicitly

src/hotspot/share/utilities/vmError.cpp line 363:

> 361: st->cr();
> 362: fr = next_frame(fr, t);
> 363: if (!fr.pc()) {

Please use `== nullptr`

src/hotspot/share/utilities/vmError.cpp line 918:

> 916:
> 917: frame fr = os::fetch_frame_from_context(_context);
> 918: for (int count = 0; count < printed_len && fr.pc(); ) {

Use `(fr.pc() != nullptr)`

-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From dnsimon at openjdk.java.net Wed Oct 6 16:30:14 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 6 Oct 2021 16:30:14 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs
In-Reply-To:
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID: <_bJv3Mo_usYaTuQosK0Z-SY9JOaDgmd_oDyJmA2YcNo=.8d0f3017-630d-48dc-b4f9-b8e4ee2190d3@github.com>

On Wed, 6 Oct 2021 15:41:27 GMT, Vladimir Kozlov wrote:

>> This enhances hs-err logs to:
>> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
>> * Show machine code for the top frames on the native stack.
>>
>> An interpreter or stub frame is only shown if it is the crashing frame.
>>
>> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
>
> src/hotspot/share/utilities/vmError.cpp line 247:
>
>> 245: * Determines if the code unit denoted by `owner` should be printed.
>> 246: * If this method returns true, then `owner` has been added to `printed`.
>> 247: * If `owner` was already in `printed`, this methods returns false.
>
> Is it because of recursive calls?

Yes - no point in printing the same code more than once.

> src/hotspot/share/utilities/vmError.cpp line 324:
>
>> 322: // non-walkable C frame. Use frame.sender() for java frames.
>> 323: frame invalid;
>> 324: if (t && t->is_Java_thread()) {
>
> use `(t != nullptr)` explicitly

Will do.
BTW, I've never seen `nullptr` before - where is it defined?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From dnsimon at openjdk.java.net Wed Oct 6 16:41:39 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 6 Oct 2021 16:41:39 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v2]
In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID:

> This enhances hs-err logs to:
> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
> * Show machine code for the top frames on the native stack.
>
> An interpreter or stub frame is only shown if it is the crashing frame.
>
> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  use nullptr instead of 0 or implicit NULL tests

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5446/files
  - new: https://git.openjdk.java.net/jdk/pull/5446/files/a9bf07e3..0a69c058

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=00-01

Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
Patch: https://git.openjdk.java.net/jdk/pull/5446.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/5446/head:pull/5446

PR: https://git.openjdk.java.net/jdk/pull/5446

From kvn at openjdk.java.net Wed Oct 6 16:41:42 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 6 Oct 2021 16:41:42 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v2]
In-Reply-To: <_bJv3Mo_usYaTuQosK0Z-SY9JOaDgmd_oDyJmA2YcNo=.8d0f3017-630d-48dc-b4f9-b8e4ee2190d3@github.com>
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <_bJv3Mo_usYaTuQosK0Z-SY9JOaDgmd_oDyJmA2YcNo=.8d0f3017-630d-48dc-b4f9-b8e4ee2190d3@github.com>
Message-ID:

On Wed, 6 Oct 2021 16:25:50 GMT, Doug Simon wrote:

>> src/hotspot/share/utilities/vmError.cpp line 324:
>>
>>> 322: // non-walkable C frame. Use frame.sender() for java frames.
>>> 323: frame invalid;
>>> 324: if (t && t->is_Java_thread()) {
>>
>> use `(t != nullptr)` explicitly
>
> Will do.
> BTW, I've never seen `nullptr` before - where is it defined?
We have started using newer C++ features:
https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md#nullptr

-------------

PR: https://git.openjdk.java.net/jdk/pull/5446

From kvn at openjdk.java.net Wed Oct 6 17:48:10 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 6 Oct 2021 17:48:10 GMT
Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v2]
In-Reply-To:
References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com>
Message-ID: <5qMnXqg3bwOfQU_l9S_ASepXEdPdFI7a4IzQDIjW91k=.50bfdc50-6c93-4929-9166-a09efd177804@github.com>

On Wed, 6 Oct 2021 16:41:39 GMT, Doug Simon wrote:

>> This enhances hs-err logs to:
>> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed.
>> * Show machine code for the top frames on the native stack.
>>
>> An interpreter or stub frame is only shown if it is the crashing frame.
>>
>> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
>
> Doug Simon has updated the pull request incrementally with one additional commit since the last revision:
>
> use nullptr instead of 0 or implicit NULL tests

Good. Someone from the runtime group has to look at this too.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5446

From xliu at openjdk.java.net Wed Oct 6 18:22:14 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Wed, 6 Oct 2021 18:22:14 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v5]
In-Reply-To:
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com>
Message-ID:

On Wed, 6 Oct 2021 06:30:36 GMT, Xin Liu wrote:

>> This patch allows the custom commands of OnError to attach to HotSpot itself.
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd).
>> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Update per reviewers' feedbacks. > > 1. 
rename the test to TestOnErrorWithSelfAttachingJCmd.java. > 2. rename java_current_transition_into_native() and add the static qualifier. > 3. make unlock_locks_on_error more general. ensure formal name is consistent > with argument name. 1. `HeapDumpOnOutOfMemoryError` can't cover OOME from Java world and thread exhaustion. 2. I admit that unlocking mutexes is hacky. To be fair, this problem predates this patch. There is [Jfr::on_vm_shutdown(true)](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/vmError.cpp#L1560) in VMError::report_and_die. If you use "-XX:+FlightRecorder", "-XX:StartFlightRecording:dumponexit=true", this will tamper with the coredump because it unlocks all held mutexes. 3. I just learned from David that "you can't unlock a lock that belongs to another thread!". I need to clean this up. I would like to provide a general solution to the temporary Stop-The-world. Not only for OnError=jcmd %, but also for JFR dumponexit. I think I can have a RAII object to do the bookkeeping of unlocked mutexes. I will clean up everything for coredump. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From nradomski at openjdk.java.net Wed Oct 6 19:38:32 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Wed, 6 Oct 2021 19:38:32 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le Message-ID: Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le.
------------- Commit messages: - Port zgc to linux on ppc64le Changes: https://git.openjdk.java.net/jdk/pull/5842/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5842&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274851 Stats: 1275 lines in 11 files changed: 1264 ins; 0 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/5842.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5842/head:pull/5842 PR: https://git.openjdk.java.net/jdk/pull/5842 From coleenp at openjdk.java.net Wed Oct 6 23:37:18 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 6 Oct 2021 23:37:18 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name Message-ID: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> 8273956: Add checking for rank values This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. I.e., if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. Tested with tier1-8 at one point in time and just retested with tier1-6.
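For readers following the thread, here is a minimal sketch of the idea behind the second and third changes described above. It is not the code from the PR: the numeric values, the helper names (`lower_bound`, `fits`, `requires_safepoint_check`) and the spacing between base ranks are all assumptions made for illustration. Only the rank names `oopstorage`, `nosafepoint` and `safepoint` come from the description.

```cpp
#include <cassert>

// Illustrative base ranks. The values and the gaps between them are
// assumptions; the real definitions in HotSpot may differ.
enum class Rank : int {
  oopstorage  = 10,
  nosafepoint = 20,
  safepoint   = 40
};

// Value of the next lower base rank (hypothetical helper).
// A derived rank must stay strictly above it.
constexpr int lower_bound(Rank base) {
  return base == Rank::safepoint   ? static_cast<int>(Rank::nosafepoint)
       : base == Rank::nosafepoint ? static_cast<int>(Rank::oopstorage)
       : -1;
}

// True if base - n does not run into a different base rank.
constexpr bool fits(Rank base, int n) {
  return n >= 0 && static_cast<int>(base) - n > lower_bound(base);
}

// The only arithmetic defined on Rank: subtraction from a base rank.
// Because Rank is a scoped enum, something like Rank::safepoint + 1
// does not compile.
inline Rank operator-(Rank base, int n) {
  assert(fits(base, n) && "rank arithmetic overlaps a different rank");
  return static_cast<Rank>(static_cast<int>(base) - n);
}

// With all nosafepoint ranks below all safepoint ranks, the safepoint
// check requirement can be derived from the rank alone, so no separate
// always/never flag is needed.
inline bool requires_safepoint_check(Rank r) {
  return static_cast<int>(r) > static_cast<int>(Rank::nosafepoint);
}
```

Under these assumed values, `Rank::nosafepoint - 3` is accepted, while `Rank::nosafepoint - 10` would trip the assert in a debug build because it lands on `oopstorage`.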
------------- Commit messages: - 8274004: Change 'nonleaf' rank name Changes: https://git.openjdk.java.net/jdk/pull/5845/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5845&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274004 Stats: 445 lines in 41 files changed: 98 ins; 117 del; 230 mod Patch: https://git.openjdk.java.net/jdk/pull/5845.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5845/head:pull/5845 PR: https://git.openjdk.java.net/jdk/pull/5845 From dcubed at openjdk.java.net Wed Oct 6 23:45:04 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 6 Oct 2021 23:45:04 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions isn't unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. 
> > Tested with tier1-8 at one point in time and just retested with tier1-6. The PR description mentions: `8273956: Add checking for rank values` but the PR synopsis is: `8274004: Change 'nonleaf' rank name` ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From coleen.phillimore at oracle.com Wed Oct 6 23:48:08 2021 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 6 Oct 2021 19:48:08 -0400 Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: <7226a2eb-2e9f-016c-176f-e7fc8dc00dc4@oracle.com> On 10/6/21 7:45 PM, Daniel D.Daugherty wrote: > On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > >> 8273956: Add checking for rank values >> >> This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. >> >> Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions isn't unmanageable. There are actually not many nested Mutex. >> >> This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. >> >> Tested with tier1-8 at one point in time and just retested with tier1-6. 
> The PR description mentions: > `8273956: Add checking for rank values` > but the PR synopsis is: > `8274004: Change 'nonleaf' rank name` It's actually both. But I should close 8273956 as a duplicate of the first and remove that line in the description. thanks, Coleen > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5845 From dcubed at openjdk.java.net Wed Oct 6 23:55:10 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Wed, 6 Oct 2021 23:55:10 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions isn't unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6.
Just do ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From dholmes at openjdk.java.net Thu Oct 7 01:55:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 7 Oct 2021 01:55:08 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v2] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: <-iNqXCpPPhM2tTKFQvyB-ZQk_xDMTVlwQ9NYA4bppTA=.2fb7bc1b-1452-412b-827b-36fb5ebab4f4@github.com> On Wed, 6 Oct 2021 16:41:39 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > use nullptr instead of 0 or implicit NULL tests Hi Doug, Seems okay from a runtime perspective. A few minor code nits below - and the test needs a change. Thanks, David src/hotspot/share/utilities/vmError.cpp line 259: > 257: return false; > 258: } > 259: if (printed[i] == 0) { I assume 0 is a null check so please use nullptr. src/hotspot/share/utilities/vmError.cpp line 260: > 258: } > 259: if (printed[i] == 0) { > 260: printed[i] = owner; Seems odd to have this side-effect in what appears to be a query - perhaps `add_to_printed` would be more apt: adds `owner` to the list of printed owners, unless already present. Returns `true` if owner was added, and `false` if it was already present, or else we've exceeded the printing limit. ?
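The semantics suggested above for `add_to_printed` could be sketched roughly as follows. This is an assumption-laden illustration, not the code in the PR: the `address` typedef is a stand-in for HotSpot's own type, the parameter order is made up, and the table is assumed to be a fixed-size array whose unused slots hold `nullptr`.

```cpp
typedef unsigned char* address;  // stand-in for HotSpot's address type

// Hypothetical helper: adds `owner` to the table of printed owners
// unless it is already there.
// Returns true if `owner` was added; returns false if it was already
// present or if the table is full (printing limit exceeded).
static bool add_to_printed(address* printed, int printed_len, address owner) {
  for (int i = 0; i < printed_len; i++) {
    if (printed[i] == owner) {
      return false;            // already printed
    }
    if (printed[i] == nullptr) {
      printed[i] = owner;      // first free slot: record and report success
      return true;
    }
  }
  return false;                // no free slot left
}
```

A caller would then print the code for a frame only when the helper returns `true`, which makes the side effect explicit in the name rather than hiding it in something that reads like a query.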
src/hotspot/share/utilities/vmError.cpp line 277: > 275: */ > 276: static bool print_code(outputStream* st, Thread* thread, address pc, bool is_crash_pc, > 277: address* printed, int printed_len) { Nit: Please align the parameters with the line above src/hotspot/share/utilities/vmError.cpp line 282: > 280: // The interpreter CodeBlob is very large so try to print the codelet instead. > 281: InterpreterCodelet* codelet = Interpreter::codelet_containing(pc); > 282: if (codelet != NULL) { Please use `nullptr` in new code. test/hotspot/jtreg/runtime/ErrorHandling/MachCodeFramesInErrorFile.java line 107: > 105: "-Xcomp", > 106: "-XX:CompileCommand=compileonly,MachCodeFramesInErrorFile$Crasher.*", > 107: "-XX:CompileCommand=dontinline,MachCodeFramesInErrorFile$Crasher.*", This looks like it requires C2, so there should be an `@requires` for that in case the test is run under Zero or a MinimalVM build. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5446 From david.holmes at oracle.com Thu Oct 7 01:55:41 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 7 Oct 2021 11:55:41 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v5] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On 7/10/2021 4:22 am, Xin Liu wrote: > On Wed, 6 Oct 2021 06:30:36 GMT, Xin Liu wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Update per reviewers' feedbacks. >> >> 1. rename the test to TestOnErrorWithSelfAttachingJCmd.java. >> 2. rename java_current_transition_into_native() and add the static qualifier. >> 3. make unlock_locks_on_error more general. ensure formal name is consistent >> with argument name. > > 1. `HeapDumpOnOutOfMemoryError` can't cover OOME from Java world and thread exhaustion. > 2. I admit that unlocking mutexes is hacky. 
To be fair, this problem predates this patch. There is [Jfr::on_vm_shutdown(true)](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/vmError.cpp#L1560) in VMError::report_and_die. if you use ""-XX:+FlightRecorder", "-XX:StartFlightRecording:dumponexit=true", this will tamper the coredump because it unlocks all holding mutexes. Two wrongs don't make a right. :) While JFR may need to do some hackery like this for its own purposes, the more it is done the more likely you will run into problems because of it. I'd much prefer to see if we can handle this without resorting to such hacks. Thanks, David > 3. I just learned from David that "you can't unlock a lock that belongs to another thread!". I need to clean up this. > > I would like to provide a general solution to the temporary Stop-The-world. Not only for OnError=jcmd %, but also for JFR dumponexit. I think I can have a RAII object to do the bookkeeping of unlocked mutexes. I will clean up everything for coredump. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From dholmes at openjdk.java.net Thu Oct 7 02:15:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 7 Oct 2021 02:15:06 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. 
Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. Hi Coleen, Overall this looks good! But shouldn't the test: test/hotspot/gtest/runtime/test_safepoint_locks.cpp be replaced by one that checks we lock with/without a safepoint check for a safepoint/nonsafepoint lock respectively? (Or does that test already exist elsewhere?). Also should there not be a gtest to verify your Rank overlap checking? Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From stuefe at openjdk.java.net Thu Oct 7 04:20:11 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 7 Oct 2021 04:20:11 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: <_Vn5AFGH5Fa05S0_Y88L-QkBUdZ-PD7dsK4yg405TPU=.003e14df-d79e-4ddc-a433-d1f450eaa23a@github.com> On Thu, 7 Oct 2021 01:57:39 GMT, David Holmes wrote: > 1. `HeapDumpOnOutOfMemoryError` can't cover OOME from Java world and thread exhaustion. What do you mean with "java world"? About thread exhaustion, you are right. We tried to fix this but run into "works as designed" arguments. 
We do this downstream in SapMachine though, so worst thing you could just take over your patch for Coretto too (see https://github.com/SAP/SapMachine/blob/d574d26282fd2e67aa242de176bbe6cdc1e41205/src/hotspot/share/prims/jvm.cpp#L2920). > 2. I admit that unlocking mutexes is hacky. To be fair, this problem predates this patch. There is [Jfr::on_vm_shutdown(true)](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/vmError.cpp#L1560) in VMError::report_and_die. if you use ""-XX:+FlightRecorder", "-XX:StartFlightRecording:dumponexit=true", this will tamper the coredump because it unlocks all holding mutexes. Yes, and arguably they should not do that either. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From kbarrett at openjdk.java.net Thu Oct 7 06:27:10 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 7 Oct 2021 06:27:10 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 20:39:30 GMT, Leo Korinth wrote: >> The basic problem is that we are relying on undefined behaviour, as documented in the code: >> >> // This whole business of passing information from ResourceObj::operator new >> // to the ResourceObj constructor via fields in the "object" is technically UB. >> // But it seems to work within the limitations of HotSpot usage (such as no >> // multiple inheritance) with the compilers and compiler options we're using. >> // And it gives some possibly useful checking for misuse of ResourceObj. >> >> >> I am removing the undefined behaviour by passing the type of allocation through a thread local variable. 
>> >> This solution has some advantages: >> 1) it is not UB >> 2) it is simpler and easier to understand >> 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8) >> 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work. >> >> When doing the change, I also updated `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret. >> >> I also disallow to "fake" the memory type by explicitly calling `ResourceObj::set_allocation_type`. >> >> This forced me to change two places that is faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because the old naming of `allocated_on_stack()`. >> >> I have also tried to update the comments. In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete object you allocate with new. >> >> Testing on debug build tier1-3 >> Testing on release build tier1 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > Use a thread local buffer so that the compiler might reorder operator new. Changes requested by kbarrett (Reviewer). src/hotspot/share/asm/codeBuffer.cpp line 143: > 141: NOT_PRODUCT(clear_strings()); > 142: > 143: assert(_default_oop_recorder.allocated_on_stack_or_embedded(), "should be embedded object"); It might have been better to do this name change separately, to reduce distraction. src/hotspot/share/asm/codeBuffer.hpp line 369: > 367: // addresses in a sibling section. 
> 368: > 369: class CodeBuffer: public ResourceObj DEBUG_ONLY(COMMA private Scrubber) { Deriving from ResourceObj rather than the previous fakery seems good. I think the multiple-inheritance derivation from Scrubber could be removed, instead adding ~CodeBuffer to do the zapping. src/hotspot/share/asm/codeBuffer.hpp line 376: > 374: // CodeBuffers must be allocated on the stack except for a single > 375: // special case during expansion which is handled internally. This > 376: // is done to guarantee proper cleanup of resources. This comment seems dangling now that the associated operator definitions have been removed. src/hotspot/share/classfile/stackMapFrame.hpp line 66: > 64: > 65: StackMapFrame(const StackMapFrame& cp) : > 66: ResourceObj(), Good find. src/hotspot/share/gc/g1/g1Allocator.hpp line 243: > 241: _g1h(g1h), > 242: _allocation_region(NULL), > 243: _allocated_regions(2, mtGC), Good riddance. src/hotspot/share/memory/allocation.cpp line 154: > 152: > 153: #ifdef ASSERT > 154: thread_local ResourceObj::RecentAllocations ResourceObj::_recent_allocations; Don't use `thread_local`. See the style guide. src/hotspot/share/memory/allocation.cpp line 234: > 232: } > 233: #endif // ASSERT > 234: Keep the blank line after the `#endif` src/hotspot/share/memory/allocation.hpp line 398: > 396: class ResourceObj ALLOCATION_SUPER_CLASS_SPEC { > 397: public: > 398: enum allocation_type : uint8_t { STACK_OR_EMBEDDED, RESOURCE_AREA, C_HEAP, ARENA }; Consider adding an "empty" allocation-type, that indicates an entry in the RecentAllocations is unused. src/hotspot/share/memory/allocation.hpp line 414: > 412: // allocation is done in a recursive step. In that case an assert will trigger. > 413: class RecentAllocations { > 414: static const unsigned BufferSize = 5; "5" seems like an odd choice. 
src/hotspot/share/memory/allocation.hpp line 416: > 414: static const unsigned BufferSize = 5; > 415: uintptr_t _begin[BufferSize]; > 416: uintptr_t _past_end[BufferSize]; I think these should be `const void*` rather than `uintptr_t`. src/hotspot/share/memory/memRegion.hpp line 110: > 108: // A ResourceObj version of MemRegionClosure > 109: > 110: class MemRegionClosureRO: public ResourceObj { I think this class could be deleted entirely, with the only derived class (`DirtyCardToOopClosure`) instead deriving from ResourceObj. src/hotspot/share/prims/jvmtiDeferredUpdates.hpp line 132: > 130: JvmtiDeferredUpdates() : > 131: _relock_count_after_wait(0), > 132: _deferred_locals_updates(1, mtCompiler) { } Good riddance. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From xliu at openjdk.java.net Thu Oct 7 07:12:11 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 7 Oct 2021 07:12:11 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself In-Reply-To: <_Vn5AFGH5Fa05S0_Y88L-QkBUdZ-PD7dsK4yg405TPU=.003e14df-d79e-4ddc-a433-d1f450eaa23a@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <_Vn5AFGH5Fa05S0_Y88L-QkBUdZ-PD7dsK4yg405TPU=.003e14df-d79e-4ddc-a433-d1f450eaa23a@github.com> Message-ID: <7LrFJs_XHUZzd9S3c1me1HdHwGEnphJTMU1DAdu3m5M=.658f567c-ab39-4496-b006-6ed47ac399c2@github.com> On Thu, 7 Oct 2021 04:17:13 GMT, Thomas Stuefe wrote: > > 1. `HeapDumpOnOutOfMemoryError` can't cover OOME from Java world and thread exhaustion. > > What do you mean with "java world"? About thread exhaustion, you are right. We tried to fix this but run into "works as designed" arguments. We do this downstream in SapMachine though, so worst thing you could just take over your patch for Coretto too (see https://github.com/SAP/SapMachine/blob/d574d26282fd2e67aa242de176bbe6cdc1e41205/src/hotspot/share/prims/jvm.cpp#L2920). 
> HeapDumpOnOutOfMemoryError won't react OOME thrown from Java libraries. [JDK-8257790](https://bugs.openjdk.java.net/browse/JDK-8257790). eg. `java.nio.ByteBuffer.allocateDirect(size)` may throw OOME just like thread exhaustion. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From burban at openjdk.java.net Thu Oct 7 07:36:31 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 7 Oct 2021 07:36:31 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3] In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: > r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. > > The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: > > https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 > > I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. 
> > Output of `-XX:+PrintInterpreter` before this change: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes > -------------------------------------------------------------------------------- > 0x0000000138809b00: ldr x2, [x12, #16] > 0x0000000138809b04: ldrh w2, [x2, #44] > 0x0000000138809b08: add x24, x20, x2, uxtx #3 > 0x0000000138809b0c: sub x24, x24, #0x8 > [...] > 0x0000000138809fa4: stp x16, x17, [sp, #128] > 0x0000000138809fa8: stp x18, x19, [sp, #144] > 0x0000000138809fac: stp x20, x21, [sp, #160] > [...] > 0x0000000138809fc0: stp x30, xzr, [sp, #240] > 0x0000000138809fc4: mov x0, x28 > ;; 0x10864ACCC > 0x0000000138809fc8: mov x9, #0xaccc // #44236 > 0x0000000138809fcc: movk x9, #0x864, lsl #16 > 0x0000000138809fd0: movk x9, #0x1, lsl #32 > 0x0000000138809fd4: blr x9 > 0x0000000138809fd8: ldp x2, x3, [sp, #16] > [...] > 0x0000000138809ff4: ldp x16, x17, [sp, #128] > 0x0000000138809ff8: ldp x18, x19, [sp, #144] > 0x0000000138809ffc: ldp x20, x21, [sp, #160] > > > After: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes > > -------------------------------------------------------------------------------- > 0x0000000108e4db00: ldr x2, [x12, #16] > 0x0000000108e4db04: ldrh w2, [x2, #44] > 0x0000000108e4db08: add x24, x20, x2, uxtx #3 > 0x0000000108e4db0c: sub x24, x24, #0x8 > [...] > 0x0000000108e4dfa4: stp x16, x17, [sp, #128] > 0x0000000108e4dfa8: stp x19, x20, [sp, #144] > 0x0000000108e4dfac: stp x21, x22, [sp, #160] > [...] 
> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] > 0x0000000108e4dfc0: mov x0, x28 > ;; 0x107E4A06C > 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 > 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 > 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 > 0x0000000108e4dfd0: blr x9 > 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] > [...] > 0x0000000108e4dff0: ldp x16, x17, [sp, #128] > 0x0000000108e4dff4: ldp x19, x20, [sp, #144] > 0x0000000108e4dff8: ldp x21, x22, [sp, #160] > [...] Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: fix linux/aarch64 build: s/r18/r18_tls ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5828/files - new: https://git.openjdk.java.net/jdk/pull/5828/files/84c5c902..ba55aa0e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5828/head:pull/5828 PR: https://git.openjdk.java.net/jdk/pull/5828 From ihse at openjdk.java.net Thu Oct 7 07:38:08 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 7 Oct 2021 07:38:08 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le In-Reply-To: References: Message-ID: On Wed, 6 Oct 2021 19:27:26 GMT, Niklas Radomski wrote: > Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. Changes requested by ihse (Reviewer). make/autoconf/jvm-features.m4 line 365: > 363: AC_MSG_RESULT([yes]) > 364: else > 365: AC_MSG_RESULT([no, $OPENJDK_TARGET_CPU]) Please use the `$OPENJDK_TARGET_OS-$OPENJDK_TARGET_CPU` error message per the pattern above. make/hotspot/gensrc/GensrcAdlc.gmk line 3: > 1: # > 2: # Copyright (c) 2013, 2021, Oracle and/or its affiliates. All rights reserved. > 3: # Copyright (c) 2021 SAP SE. 
All rights reserved. I must say I'm a bit surprised to see these new copyright headers for trivial changes. At the very least, this has not been SAP practice before. Is this in accordance with SAP legal recommendations? ------------- PR: https://git.openjdk.java.net/jdk/pull/5842 From nradomski at openjdk.java.net Thu Oct 7 08:27:50 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Thu, 7 Oct 2021 08:27:50 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 07:33:47 GMT, Magnus Ihse Bursie wrote: >> Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update autoconf error message >> - Remove copyright headers > > make/hotspot/gensrc/GensrcAdlc.gmk line 3: > >> 1: # >> 2: # Copyright (c) 2013, 2021, Oracle and/or its affiliates. All rights reserved. >> 3: # Copyright (c) 2021 SAP SE. All rights reserved. > > I must say I'm a bit surprised to see these new copyright headers for trivial changes. At the very least, this has not been SAP practice before. Is this in accordance with SAP legal recommendations? I removed them, thank you for the heads-up! ------------- PR: https://git.openjdk.java.net/jdk/pull/5842 From nradomski at openjdk.java.net Thu Oct 7 08:27:47 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Thu, 7 Oct 2021 08:27:47 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: > Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. 
Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: - Update autoconf error message - Remove copyright headers ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5842/files - new: https://git.openjdk.java.net/jdk/pull/5842/files/17946ddf..5da8c1b7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5842&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5842&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5842.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5842/head:pull/5842 PR: https://git.openjdk.java.net/jdk/pull/5842 From dnsimon at openjdk.java.net Thu Oct 7 08:30:10 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 08:30:10 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v2] In-Reply-To: <-iNqXCpPPhM2tTKFQvyB-ZQk_xDMTVlwQ9NYA4bppTA=.2fb7bc1b-1452-412b-827b-36fb5ebab4f4@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <-iNqXCpPPhM2tTKFQvyB-ZQk_xDMTVlwQ9NYA4bppTA=.2fb7bc1b-1452-412b-827b-36fb5ebab4f4@github.com> Message-ID: On Thu, 7 Oct 2021 01:43:05 GMT, David Holmes wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> use nullptr instead of 0 or implicit NULL tests > > src/hotspot/share/utilities/vmError.cpp line 260: > >> 258: } >> 259: if (printed[i] == 0) { >> 260: printed[i] = owner; > > Seems odd to have this side-effect in what appears to be a query - perhaps `add_to_printed` would be more apt: adds `owner` to the list of printed owners, unless already present. Returns `true` if owner was added, and false it if was already present, or else we've exceeded the printing limit. ? That's a good suggestion. I've changed it to be `add_if_absent` and made it more generic. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From aph at openjdk.java.net Thu Oct 7 08:53:09 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 7 Oct 2021 08:53:09 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3] In-Reply-To: References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com> On Thu, 7 Oct 2021 07:36:31 GMT, Bernhard Urban-Forster wrote: >> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. >> >> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: >> >> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 >> >> I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. 
>> >> Output of `-XX:+PrintInterpreter` before this change: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes >> -------------------------------------------------------------------------------- >> 0x0000000138809b00: ldr x2, [x12, #16] >> 0x0000000138809b04: ldrh w2, [x2, #44] >> 0x0000000138809b08: add x24, x20, x2, uxtx #3 >> 0x0000000138809b0c: sub x24, x24, #0x8 >> [...] >> 0x0000000138809fa4: stp x16, x17, [sp, #128] >> 0x0000000138809fa8: stp x18, x19, [sp, #144] >> 0x0000000138809fac: stp x20, x21, [sp, #160] >> [...] >> 0x0000000138809fc0: stp x30, xzr, [sp, #240] >> 0x0000000138809fc4: mov x0, x28 >> ;; 0x10864ACCC >> 0x0000000138809fc8: mov x9, #0xaccc // #44236 >> 0x0000000138809fcc: movk x9, #0x864, lsl #16 >> 0x0000000138809fd0: movk x9, #0x1, lsl #32 >> 0x0000000138809fd4: blr x9 >> 0x0000000138809fd8: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000138809ff4: ldp x16, x17, [sp, #128] >> 0x0000000138809ff8: ldp x18, x19, [sp, #144] >> 0x0000000138809ffc: ldp x20, x21, [sp, #160] >> >> >> After: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes >> >> -------------------------------------------------------------------------------- >> 0x0000000108e4db00: ldr x2, [x12, #16] >> 0x0000000108e4db04: ldrh w2, [x2, #44] >> 0x0000000108e4db08: add x24, x20, x2, uxtx #3 >> 0x0000000108e4db0c: sub x24, x24, #0x8 >> [...] >> 0x0000000108e4dfa4: stp x16, x17, [sp, #128] >> 0x0000000108e4dfa8: stp x19, x20, [sp, #144] >> 0x0000000108e4dfac: stp x21, x22, [sp, #160] >> [...] 
>> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] >> 0x0000000108e4dfc0: mov x0, x28 >> ;; 0x107E4A06C >> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 >> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 >> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 >> 0x0000000108e4dfd0: blr x9 >> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000108e4dff0: ldp x16, x17, [sp, #128] >> 0x0000000108e4dff4: ldp x19, x20, [sp, #144] >> 0x0000000108e4dff8: ldp x21, x22, [sp, #160] >> [...] > > Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: > > fix linux/aarch64 build: s/r18/r18_tls So we have fixed the only use of `popa()` so that it does not clobber R18. That is good. However, I think we've found another bug. `reguard_yellow_pages()` is a native function, so we should be saving floating-point registers too. It looks to me like this code should use `push_call_clobbered_registers()`. Then we'll have no use for `pusha()` in trunk, and your fix for `popa()` can go into 11u. ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From xliu at openjdk.java.net Thu Oct 7 09:15:45 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 7 Oct 2021 09:15:45 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v6] In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. 
> > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... > 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) Xin Liu has updated the pull request incrementally with one additional commit since the last revision: Don't unlock the holding mutexes for the thread State transition. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5590/files - new: https://git.openjdk.java.net/jdk/pull/5590/files/2877637e..62c9e736 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=04-05 Stats: 112 lines in 5 files changed: 37 ins; 74 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5590/head:pull/5590 PR: https://git.openjdk.java.net/jdk/pull/5590 From ihse at openjdk.java.net Thu Oct 7 12:14:07 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 7 Oct 2021 12:14:07 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 08:27:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: > > - Update autoconf error message > - Remove copyright headers Build changes look good now. I leave it to others to determine the actual code changes. ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5842 From coleenp at openjdk.java.net Thu Oct 7 12:29:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 7 Oct 2021 12:29:10 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. 
The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. test/hotspot/gtest/runtime/test_mutex.cpp line 290: > 288: ThreadInVMfromNative invm(THREAD); > 289: > 290: Monitor* monitor_rank_broken = new Monitor(Mutex::oopstorage-4, "monitor_rank_broken"); @dholmes-ora This test checks for overlapping safepoint ranges. ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From coleenp at openjdk.java.net Thu Oct 7 12:38:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 7 Oct 2021 12:38:09 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: <4dtM1iipmdLApuuovC-NqC6ejIpsqdn3r69uRdr6POY=.49a782a0-3c52-45dc-9710-55c09dc2bb58@github.com> On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. 
The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. test/hotspot/gtest/runtime/test_mutex.cpp line 327: > 325: // If the lock above succeeds, try to safepoint to test the NSV implied with this nosafepoint lock. > 326: ThreadBlockInVM tbivm(thread); > 327: thread->print_thread_state_on(tty); I moved the three tests from test_safepoint.cpp here. The third test in that file was checking the specification of _safepoint_check_always/never matched the rank, which is now obsolete. ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From simonis at openjdk.java.net Thu Oct 7 12:42:08 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Thu, 7 Oct 2021 12:42:08 GMT Subject: RFR: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow [v3] In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 15:35:57 GMT, Martin Doerr wrote: >> Volker Simonis has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: >> >> - Fix special case where we're creating an implicit exception for a regular invoke* bytecode >> - Minor updates as requested by @TheRealMDoerr >> - 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow > > Thanks for the update! New version looks good to me and our tests with disabled OmitStackTraceInFastThrow have passed. I think you should choose a couple of jtreg tests and add an additional run with OmitStackTraceInFastThrow disabled. What do you think? Thanks for running the test @TheRealMDoerr. I've also run tier1/tier2/jck_runtime/jck_compiler on both linux/amd64 and linux/aarch64 with a special build where `OmitStackTraceInFastThrow` was disabled. I haven't seen any problems except for `compiler/loopopts/TestDivZeroCheckControl.java` which times out. But that's expected because that test creates `ArithmeticExceptions` in a hot loop. My change makes it run in half the time, but that's still not enough to prevent timeouts :) Anything left before you can formally review the change? ------------- PR: https://git.openjdk.java.net/jdk/pull/5488 From simonis at openjdk.java.net Thu Oct 7 12:46:12 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Thu, 7 Oct 2021 12:46:12 GMT Subject: RFR: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow [v3] In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 19:40:55 GMT, Nils Eliasson wrote: > In the benchmark you are testing a case where the exception isn't used - will the allocation be eliminated then? > > In addition to the benchmark you have provided, I would like to see functional tests for both when the allocation is used, and when it isn't. Have you looked at the new IR Test Framework? Good idea! I'll take a look.
I've now also understood what @TheRealMDoerr meant by "*choose a couple of jtreg tests and add an additional run with OmitStackTraceInFastThrow disabled*" :) I'll do both: add an explicit test and investigate whether it makes sense to run some tests with `-XX:-OmitStackTraceInFastThrow` to exercise the new functionality. ------------- PR: https://git.openjdk.java.net/jdk/pull/5488 From shade at openjdk.java.net Thu Oct 7 12:51:32 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 7 Oct 2021 12:51:32 GMT Subject: RFR: 8274903: Zero: Support AsyncGetCallTrace Message-ID: This is a Zero infrastructure improvement that makes Zero VM work with AsyncGetCallTrace, and by extension, async-profiler. Zero is quite odd in stack management. The "real" stack actually contains the C++ Interpreter and the rest of VM code. The Java stack is reported through the usual "frame" mechanism the rest of VM uses to get the mapping from Template Interpreter, stub, and compiled code. So, to support Java-centric AsyncGetCallTrace, we "only" need Zero to report the proper Java frames from its ZeroStack from the profiling/signal handlers.
Additional testing: - [x] Linux x86_64 Zero `serviceability/AsyncGetCallTrace` now pass - [x] Linux x86_64 Zero works with `async-profiler` ------------- Commit messages: - Initial work: runs async-profiler successfully Changes: https://git.openjdk.java.net/jdk/pull/5848/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5848&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274903 Stats: 136 lines in 4 files changed: 124 ins; 6 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/5848.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5848/head:pull/5848 PR: https://git.openjdk.java.net/jdk/pull/5848 From dnsimon at openjdk.java.net Thu Oct 7 13:33:41 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 13:33:41 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v3] In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: > This enhances hs-err logs to: > * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. > * Show machine code for the top frames on the native stack. > > An interpreter or stub frame is only shown if it is the crashing frame. > > A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). 
Doug Simon has updated the pull request incrementally with five additional commits since the last revision: - improved test for platforms such as Windows where only a single frame can have its code printed - more nullptr usage changed should_print_code to add_if_absent made printed_capacity a const value handle platforms where native stack cannot be walked directly by VMError improved Crasher such that the crash happens in a JIT compiled frame - refined MachCodeFramesInErrorFile test - omit timestamp from nmethod decoding and printing - removed leftover debug printing ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5446/files - new: https://git.openjdk.java.net/jdk/pull/5446/files/0a69c058..ba55d0e3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=01-02 Stats: 169 lines in 6 files changed: 83 ins; 26 del; 60 mod Patch: https://git.openjdk.java.net/jdk/pull/5446.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5446/head:pull/5446 PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Thu Oct 7 13:46:08 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 13:46:08 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v3] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: On Thu, 7 Oct 2021 13:33:41 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). 
> > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improved test for platforms such as Windows where only a single frame can have its code printed > - more nullptr usage > changed should_print_code to add_if_absent > made printed_capacity a const value > handle platforms where native stack cannot be walked directly by VMError > improved Crasher such that the crash happens in a JIT compiled frame > - refined MachCodeFramesInErrorFile test > - omit timestamp from nmethod decoding and printing > - removed leftover debug printing I have pushed further changes to this PR to address test failures: * omit timestamp from nmethod decoding and printing (0621eb90dd9644b73ed3e595fbd8dee6fd12a25e) * support printing the crashing frame on platforms such as Windows where stack walking cannot be done without platform support * changed MachCodeFramesInErrorFile to pass if Crasher fails to crash in the expected frame (driving C2 with VM options is not 100% guaranteed to give the expected outcome) I would appreciate a review of these changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Thu Oct 7 13:49:12 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 13:49:12 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v3] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: On Thu, 7 Oct 2021 13:33:41 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. 
>> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). > > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improved test for platforms such as Windows where only a single frame can have its code printed > - more nullptr usage > changed should_print_code to add_if_absent > made printed_capacity a const value > handle platforms where native stack cannot be walked directly by VMError > improved Crasher such that the crash happens in a JIT compiled frame > - refined MachCodeFramesInErrorFile test > - omit timestamp from nmethod decoding and printing > - removed leftover debug printing src/hotspot/share/compiler/compileTask.cpp line 244: > 242: if (timestamp) { > 243: // Print current time > 244: st->print("%7d ", (int)tty->time_stamp().milliseconds()); I'm not sure why a timestamp should be included when printing the contents of an nmethod outside the context of a CompileTask. The timestamp was causing the `DisassembleCodeBlobTest` to fail as it expects disassembling a given nmethod twice to produce the same result. ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From pliden at openjdk.java.net Thu Oct 7 14:07:12 2021 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 7 Oct 2021 14:07:12 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 08:27:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: > > - Update autoconf error message > - Remove copyright headers Nice to see ZGC ported to linux/ppc! I don't have a lot to say about the ppc-specific code. 
As you know, we at Oracle don't build and test on that platform, so any problems here will go unnoticed by us. It would be good to see a review or two from your colleagues at SAP. src/hotspot/share/gc/z/zBarrierSetAssembler.cpp line 30: > 28: > 29: Address ZBarrierSetAssemblerBase::address_bad_mask_from_thread(Register thread) { > 30: return Address(thread, (intptr_t) ZThreadLocalData::address_bad_mask_offset()); Instead of casting here in this platform agnostic code, I'd suggest that you add a new constructor for `Address` on PPC, one that takes `(Register, ByteSize)` arguments. Other platforms have that, so I'm a bit surprised that PPC doesn't already have that too. src/hotspot/share/gc/z/zBarrierSetAssembler.cpp line 34: > 32: > 33: Address ZBarrierSetAssemblerBase::address_bad_mask_from_jni_env(Register env) { > 34: return Address(env, (intptr_t) (ZThreadLocalData::address_bad_mask_offset() - JavaThread::jni_environment_offset())); .... and we would avoid there cast here also. ------------- Changes requested by pliden (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5842 From coleenp at openjdk.java.net Thu Oct 7 15:28:31 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 7 Oct 2021 15:28:31 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. 
Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix overlap error message printing and add a test. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5845/files - new: https://git.openjdk.java.net/jdk/pull/5845/files/3c6a5b6c..59bfd945 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5845&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5845&range=00-01 Stats: 16 lines in 2 files changed: 13 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5845.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5845/head:pull/5845 PR: https://git.openjdk.java.net/jdk/pull/5845 From pchilanomate at openjdk.java.net Thu Oct 7 16:02:08 2021 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Thu, 7 Oct 2021 16:02:08 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Thu, 7 Oct 2021 15:28:31 GMT, Coleen Phillimore wrote: >> Also fixes: 8273956: Add checking for rank values >> >> This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. 
The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. >> >> Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. >> >> This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. >> >> Tested with tier1-8 at one point in time and just retested with tier1-6. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix overlap error message printing and add a test. LGTM! Thanks, Patricio ------------- Marked as reviewed by pchilanomate (Committer). PR: https://git.openjdk.java.net/jdk/pull/5845 From coleenp at openjdk.java.net Thu Oct 7 16:10:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 7 Oct 2021 16:10:08 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Thu, 7 Oct 2021 15:59:42 GMT, Patricio Chilano Mateo wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix overlap error message printing and add a test. > > LGTM! > > Thanks, > Patricio Thanks for the review @pchilano and offline discussion about check_for_overlap. 
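As a rough illustration of the rank scheme discussed in this thread (an `enum class` rank that only supports subtraction from a base rank, with a check that the result does not run into a lower base rank), a minimal sketch might look like the following. The numeric values, the `base_ranks` table, and the helpers `lower_bound_of` and `fits_below` are assumptions made up for this example; they are not the definitions in HotSpot's `mutex.hpp`, and the actual `check_for_overlap` logic is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical values; the gaps leave room for "base - n" ranks.
enum class Rank : int {
  oopstorage  = 10,
  nosafepoint = 20,
  safepoint   = 40
};

// Base ranks in ascending order (illustrative table, not from HotSpot).
static const Rank base_ranks[] = {
  Rank::oopstorage, Rank::nosafepoint, Rank::safepoint
};

// Value of the highest base rank strictly below `r`, or -1 if there is none.
inline int lower_bound_of(Rank r) {
  int below = -1;
  for (Rank b : base_ranks) {
    if (static_cast<int>(b) < static_cast<int>(r)) {
      below = static_cast<int>(b);
    }
  }
  return below;
}

// True if `r - n` stays strictly above the next lower base rank.
inline bool fits_below(Rank r, int n) {
  return n >= 0 && static_cast<int>(r) - n > lower_bound_of(r);
}

// Subtraction is the only arithmetic defined on Rank. Because Rank is an
// enum class, an expression such as `Rank::safepoint + 1` does not compile.
inline Rank operator-(Rank r, int n) {
  assert(fits_below(r, n) && "rank arithmetic overlaps a lower rank");
  return static_cast<Rank>(static_cast<int>(r) - n);
}
```

In this sketch `Rank::nosafepoint - 3` is accepted, while `Rank::nosafepoint - 10` would trip the assert because it lands on `oopstorage`. Every rank derived from `nosafepoint` also stays below `safepoint`, which is the property that makes a separate safepoint-check flag redundant.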
------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From github.com+42899633+eastig at openjdk.java.net Thu Oct 7 16:27:12 2021 From: github.com+42899633+eastig at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 7 Oct 2021 16:27:12 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v8] In-Reply-To: References: <5BXaz78z6bOTNs5Pi1ffTsU_WOADzU5prpsiWEtw-i8=.bea9d849-2d8d-48d0-8393-bc3b5174ab37@github.com> <57NBCKVSwh09wtVC2X3uWiei7QgKOlpdzI-MO_KyZbI=.c702c7e7-8301-4269-92fa-e5e2c7387221@github.com> <7u0gO49QpF80Rfs3TQZsFgGc7zgmHiurWP3Q00xEQic=.07209741-d84f-4beb-83ce-52804a607f1f@github.com> Message-ID: On Wed, 29 Sep 2021 06:44:29 GMT, Christian Hagedorn wrote: >> The [IR test framework](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/ir_framework/README.md) can parse the C2 opto assembly output, and also control compilation level and flags through method annotations. I'm not sure if we could use that to check the C1 output though. Maybe @chhagedorn has some ideas? > > The framework only supports regex matching on C2's ideal and opto assembly output. You would still need the current checks for C1's output. So, I'm not sure if it's worth transforming the C2 checks only to use the framework. > The [IR test framework](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/ir_framework/README.md) can parse the C2 opto assembly output, and also control compilation level and flags through method annotations. I'm not sure if we could use that to check the C1 output though. Maybe @chhagedorn has some ideas? I wrote a test using the IR test framework. The framework checks the output of `-XX:+PrintIdeal -XX:+PrintOptoAssembly`. It cannot check what instructions and how many of them have been emitted. It can only check onSpinWait is intrinsified and the format of the intrinsic. I think the framework needs to support the output of `PrintAssembly`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From rkennke at openjdk.java.net Thu Oct 7 16:57:14 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 7 Oct 2021 16:57:14 GMT Subject: RFR: 8274024: Use regular accessors to internal fields of oopDesc In-Reply-To: References: Message-ID: <9zKzKj8Ny95GND0g2qF7oJ_shlxK8IL3t1rAN_z_NEo=.aa40f24a-5fe7-492c-8ca5-ac06a26c7d85@github.com> On Mon, 20 Sep 2021 18:59:34 GMT, Roman Kennke wrote: > Currently, we are using 'raw' accessors to initialize the mark, Klass*, (array-)length and klass_gap of oops. This is ugly and confusing and we should just use the regular accessors. > > Testing: > - [ ] tier1 > - [ ] tier2 > - [ ] hotspot_gc I'm withdrawing this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/5585 From rkennke at openjdk.java.net Thu Oct 7 16:57:15 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 7 Oct 2021 16:57:15 GMT Subject: Withdrawn: 8274024: Use regular accessors to internal fields of oopDesc In-Reply-To: References: Message-ID: On Mon, 20 Sep 2021 18:59:34 GMT, Roman Kennke wrote: > Currently, we are using 'raw' accessors to initialize the mark, Klass*, (array-)length and klass_gap of oops. This is ugly and confusing and we should just use the regular accessors. > > Testing: > - [ ] tier1 > - [ ] tier2 > - [ ] hotspot_gc This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/5585 From rkennke at openjdk.java.net Thu Oct 7 16:59:11 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 7 Oct 2021 16:59:11 GMT Subject: RFR: 8272723: Don't use Access API to access primitive fields [v3] In-Reply-To: References: Message-ID: On Fri, 20 Aug 2021 08:51:54 GMT, Roman Kennke wrote: >> For earlier incarnations of Shenandoah, we needed to put barriers before accessing primitive fields. 
This is no longer necessary nor implemented/used by any GC, and we should simplify the code to do plain access instead. >> >> (We may want to remove remaining primitive access machinery in the Access API soon) >> >> Testing: >> - [x] build x86_32 and x86_64 >> - [x] tier1 >> - [x] tier2 >> - [x] hotspot_gc > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into JDK-8272723 > - Remove redundant asserts > - Revert to use RawAccess for volatile accesses > - Fix alignment > - Consolidate (obj)field_addr() variants > - Remove remaining primitive Access API uses > - 8272723: Don't use Access API to access primitive fields This PR has been lingering for a while. Can I get another review, please? Maybe @fisk or @stefank? Thanks, Roman ------------- PR: https://git.openjdk.java.net/jdk/pull/5187 From dnsimon at openjdk.java.net Thu Oct 7 17:26:10 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 17:26:10 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v3] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: On Thu, 7 Oct 2021 13:33:41 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). 
> > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improved test for platforms such as Windows where only a single frame can have its code printed > - more nullptr usage > changed should_print_code to add_if_absent > made printed_capacity a const value > handle platforms where native stack cannot be walked directly by VMError > improved Crasher such that the crash happens in a JIT compiled frame > - refined MachCodeFramesInErrorFile test > - omit timestamp from nmethod decoding and printing > - removed leftover debug printing The MachCodeFramesInErrorFile test is crashing on macosx-aarch64. I've isolated the problem in lldb: Process 43544 stopped * thread #3, stop reason = signal SIGBUS * frame #0: 0x00000001064cc8b4 libjvm.dylib`PcDescCache::add_pc_desc(this=0x00000001186083b0, pc_desc=0x0000000118608770) at nmethod.cpp:395:18 frame #1: 0x00000001064d4588 libjvm.dylib`PcDescContainer::find_pc_desc_internal(this=0x00000001186083b0, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true, search=0x000000017008d0f0) at nmethod.cpp:2220:20 frame #2: 0x0000000105bd23a8 libjvm.dylib`PcDescContainer::find_pc_desc(this=0x00000001186083b0, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true, search=0x000000017008d0f0) at compiledMethod.hpp:136:12 frame #3: 0x0000000105bd230c libjvm.dylib`CompiledMethod::find_pc_desc(this=0x0000000118608310, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true) at compiledMethod.hpp:404:31 frame #4: 0x0000000105d326fc libjvm.dylib`CompiledMethod::pc_desc_near(this=0x0000000118608310, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001") at compiledMethod.hpp:232:45 frame #5: 0x00000001064d7c8c libjvm.dylib`nmethod::scope_desc_in(this=0x0000000118608310, begin="\U0000001f \U00000003\xd5\xe9s@\xd1?\U00000001", end="\xe9s@\xd1?\U00000001") at nmethod.cpp:3129:15 frame #6: 0x00000001064d6c6c libjvm.dylib`nmethod::has_code_comment(this=0x0000000118608310, 
begin="\U0000001f \U00000003\xd5\xe9s@\xd1?\U00000001", end="\xe9s@\xd1?\U00000001") at nmethod.cpp:3255:20 frame #7: 0x00000001064cf9fc libjvm.dylib`nmethod::decode2(this=0x0000000118608310, ost=0x000000017008df38) const at nmethod.cpp:2983:79 frame #8: 0x0000000105e3f244 libjvm.dylib`Disassembler::decode(cb=0x0000000118608310, st=0x000000017008df38) at disassembler.cpp:875:21 frame #9: 0x00000001067dd568 libjvm.dylib`print_code(st=0x000000017008df38, thread=0x000000010080bc20, pc="J\U00000001@\xf9\x8b\xcfA\xf9\U0000007f\U00000001@\xb9\x88\xc3N9\xc8", is_crash_pc=true, printed=0x000000017008dc50, printed_capacity=5) at vmError.cpp:307:9 frame #10: 0x00000001067dbe18 libjvm.dylib`VMError::report(st=0x000000017008df38, _verbose=true) at vmError.cpp:928:16 This looks like a pre-existing bug. @vnkozlov would be able to help me with this? ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From kvn at openjdk.java.net Thu Oct 7 18:26:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 7 Oct 2021 18:26:09 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v3] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: On Thu, 7 Oct 2021 17:23:20 GMT, Doug Simon wrote: >> Doug Simon has updated the pull request incrementally with five additional commits since the last revision: >> >> - improved test for platforms such as Windows where only a single frame can have its code printed >> - more nullptr usage >> changed should_print_code to add_if_absent >> made printed_capacity a const value >> handle platforms where native stack cannot be walked directly by VMError >> improved Crasher such that the crash happens in a JIT compiled frame >> - refined MachCodeFramesInErrorFile test >> - omit timestamp from nmethod decoding and printing >> - removed leftover debug printing > > The MachCodeFramesInErrorFile test is crashing on macosx-aarch64. 
I've isolated the problem in lldb: > > Process 43544 stopped > * thread #3, stop reason = signal SIGBUS > * frame #0: 0x00000001064cc8b4 libjvm.dylib`PcDescCache::add_pc_desc(this=0x00000001186083b0, pc_desc=0x0000000118608770) at nmethod.cpp:395:18 > frame #1: 0x00000001064d4588 libjvm.dylib`PcDescContainer::find_pc_desc_internal(this=0x00000001186083b0, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true, search=0x000000017008d0f0) at nmethod.cpp:2220:20 > frame #2: 0x0000000105bd23a8 libjvm.dylib`PcDescContainer::find_pc_desc(this=0x00000001186083b0, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true, search=0x000000017008d0f0) at compiledMethod.hpp:136:12 > frame #3: 0x0000000105bd230c libjvm.dylib`CompiledMethod::find_pc_desc(this=0x0000000118608310, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001", approximate=true) at compiledMethod.hpp:404:31 > frame #4: 0x0000000105d326fc libjvm.dylib`CompiledMethod::pc_desc_near(this=0x0000000118608310, pc=" \U00000003\xd5\xe9s@\xd1?\U00000001") at compiledMethod.hpp:232:45 > frame #5: 0x00000001064d7c8c libjvm.dylib`nmethod::scope_desc_in(this=0x0000000118608310, begin="\U0000001f \U00000003\xd5\xe9s@\xd1?\U00000001", end="\xe9s@\xd1?\U00000001") at nmethod.cpp:3129:15 > frame #6: 0x00000001064d6c6c libjvm.dylib`nmethod::has_code_comment(this=0x0000000118608310, begin="\U0000001f \U00000003\xd5\xe9s@\xd1?\U00000001", end="\xe9s@\xd1?\U00000001") at nmethod.cpp:3255:20 > frame #7: 0x00000001064cf9fc libjvm.dylib`nmethod::decode2(this=0x0000000118608310, ost=0x000000017008df38) const at nmethod.cpp:2983:79 > frame #8: 0x0000000105e3f244 libjvm.dylib`Disassembler::decode(cb=0x0000000118608310, st=0x000000017008df38) at disassembler.cpp:875:21 > frame #9: 0x00000001067dd568 libjvm.dylib`print_code(st=0x000000017008df38, thread=0x000000010080bc20, pc="J\U00000001@\xf9\x8b\xcfA\xf9\U0000007f\U00000001@\xb9\x88\xc3N9\xc8", is_crash_pc=true, printed=0x000000017008dc50, printed_capacity=5) at vmError.cpp:307:9 > frame 
#10: 0x00000001067dbe18 libjvm.dylib`VMError::report(st=0x000000017008df38, _verbose=true) at vmError.cpp:928:16 > > > This looks like a pre-existing bug. > @vnkozlov would be able to help me with this? @dougxc I looked at issues with SIGBUS (0xa) that may be related: https://bugs.openjdk.java.net/browse/JDK-8266453 https://bugs.openjdk.java.net/browse/JDK-8262894 https://bugs.openjdk.java.net/browse/JDK-8262896 Their fixes enable WX. Maybe we need something like that here. ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Thu Oct 7 20:27:34 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 20:27:34 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v4] In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: > This enhances hs-err logs to: > * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. > * Show machine code for the top frames on the native stack. > > An interpreter or stub frame is only shown if it is the crashing frame. > > A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: disable write protection before decoding nmethod ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5446/files - new: https://git.openjdk.java.net/jdk/pull/5446/files/ba55d0e3..62cc8664 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=02-03 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5446.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5446/head:pull/5446 PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Thu Oct 7 20:30:11 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 20:30:11 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v4] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: On Thu, 7 Oct 2021 20:27:34 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > disable write protection before decoding nmethod Thanks, you're right. I placed it in `nmethod::decode2`: 62cc866440084366dc4b69d47391557a9457f6bd BTW, I noticed that often nmethod info is written to `tty` during crash reporting. 
I tracked this down to [this change](https://github.com/dougxc/jdk/commit/4af237427144060d0b97c14a086401b0a3781198#diff-82948e0048fb88e1e9281b61394d9a9e2e0c8e75ab1a834c3bf943dbde73e2ffR618) where the `st` argument is ignored. @TheRealMDoerr do you think that should be fixed? ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Thu Oct 7 20:40:39 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Thu, 7 Oct 2021 20:40:39 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v5] In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> > This enhances hs-err logs to: > * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. > * Show machine code for the top frames on the native stack. > > An interpreter or stub frame is only shown if it is the crashing frame. > > A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: disable write protection before decoding nmethod ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5446/files - new: https://git.openjdk.java.net/jdk/pull/5446/files/62cc8664..5e00264b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5446&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5446.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5446/head:pull/5446 PR: https://git.openjdk.java.net/jdk/pull/5446 From bylokhov at amazon.com Thu Oct 7 21:39:59 2021 From: bylokhov at amazon.com (Sergey Bylokhov) Date: Thu, 7 Oct 2021 14:39:59 -0700 Subject: Best practices to work on arrays via JNI from a few threads Message-ID: Hello, hotspot team. I would like to clarify what the current best practices are for working with arrays from JNI. Currently I am working on a performance improvement in the color conversion area in the sun.java2d.cmm package. The logic there is simple: pass the array of data (which can be one pixel or a huge image) to native code via JNI, then call some littlecms code (a third-party library) and save the result back. For now that code uses GetByteArrayElements/ReleaseByteArrayElements. I have found that in some cases we spend most of the time (60%) copying data. So I have a few ideas on how to improve the code: - Use GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical - it will increase the performance of the single-threaded code, but may have some contention issues - Split the work across threads Do I understand correctly that I cannot use both improvements together? The Critical API, if not in the current implementation then at least theoretically, may return a copy of the array, so if one thread gets a copy it will overwrite the whole array and discard the changes of the other threads.
Is my understanding correct that since there is no GetPrimitiveArrayRegionCritical API, I need to use GetByteArrayRegion/SetByteArrayRegion for each slice in each thread? Perhaps there is some other way to do this job without copying the data back and forth? Is it possible to create a view of the array and then lock only that slice via GetPrimitiveArrayCritical? Or has the Critical API become obsolete due to the many issues described here?: * https://bugs.openjdk.java.net/browse/JDK-8199919? * http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/052153.html Thank you for any suggestions. PS: I have read the long, similar thread "GetPrimitiveArrayCritical vs GetByteArrayRegion": http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/051898.html where Martin suggested adding "GetPrimitiveArrayRegionCritical"; as far as I understand, we have no such plans? -- Best regards, Sergey. From kvn at openjdk.java.net Thu Oct 7 23:03:11 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 7 Oct 2021 23:03:11 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v5] In-Reply-To: <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> Message-ID: On Thu, 7 Oct 2021 20:40:39 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log).
> > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > disable write protection before decoding nmethod Update looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From david.holmes at oracle.com Thu Oct 7 23:57:37 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 8 Oct 2021 09:57:37 +1000 Subject: Best practices to work on arrays via JNI from a few threads In-Reply-To: References: Message-ID: Hi Sergey, On 8/10/2021 7:39 am, Sergey Bylokhov wrote: > Hello, hotspot team. > > I would like to clarify what is the current best practices to work with > arrays from the JNI. > > Currently I am working on the performance improvement in the color > conversion area in the sun.java2d.cmm package. The logic there is simple > pass the array of data(can be one pixel or huge image) to the native via > JNI, then call some littlecms code(this is third party library) and save > result back. For now that code uses > GetByteArrayElements/ReleaseByteArrayElements. I have found that in some > cases we spent most of the time(60%) copying data. So I have a few ideas > on how to improve the code: > - Use the GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical - it > will increase the performance of the single threaded code, but may have > some contention issue > - Split the work across the threads > > Do I understand correctly that I cannot use both improvements together? > The Critical API if not in the current implementation, but theoretically > may return a copy of array, so if one thread will get a copy it will > overwrite the whole array and delete the changes of the other threads.
The final parameter to GetPrimitiveArrayCritical is "jboolean *isCopy", so you can pass the address of a variable there and know whether you have a copy or not, and then decide how to proceed from there. Hotspot will not make a copy but will use either object-pinning (for GCs that do that - only Shenandoah) or the GCLocker. Deferring GC may cause you other problems so you have to evaluate how long you will need to process the array data in the worst case. You can process the raw array in multiple threads, but then you need to synchronize those threads to know when the array can be released again. > Is my understanding correct that since there is no > GetPrimitiveArrayRegionCritical API, I need to use > GetByteArrayRegion/SetByteArrayRegion for each slice in each thread? If you want to parallelise the copying overhead then yes, you need to use the region API in each thread. > Probably we have some other way to do this job without coping the data > fourth and back? Is it possible to create a view of the array and then > lock only that slice via GetPrimitiveArrayCritical? An array is an array, it can't be partitioned as you describe. > Or the Critical API became obsolete due to many issues described here?: > * https://bugs.openjdk.java.net/browse/JDK-8199919? > * > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/052153.html > > Thank you for any suggestions. > > PS: I have read the long similar thread "GetPrimitiveArrayCritical vs > GetByteArrayRegion": > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/051898.html > Where Martin suggested to add the "GetPrimitiveArrayRegionCritical", as > far as understand we do not have such plans? None of the RFEs from that discussion were picked up.
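A minimal C++ sketch of the multi-threaded processing mentioned in this reply, offered as an illustration only and not as code from this thread. It models just the native-side coordination: `data` stands in for the pointer that GetPrimitiveArrayCritical would return, and the JNI calls themselves appear only in comments so the example builds without a JDK. The names `convert_slice` and `process_in_parallel` are hypothetical, and the byte inversion is a placeholder for the real littlecms work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-slice work; stands in for the real color conversion.
static void convert_slice(std::uint8_t* data, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        data[i] = static_cast<std::uint8_t>(255 - data[i]);
    }
}

// Processes `len` bytes at `data` with up to `workers` native threads,
// each on its own disjoint slice.
// Assumption: in a JNI method, `data` would come from
// GetPrimitiveArrayCritical (after checking that *isCopy is JNI_FALSE),
// and ReleasePrimitiveArrayCritical would be called only after this
// function has returned.
void process_in_parallel(std::uint8_t* data, std::size_t len, unsigned workers) {
    assert(data != nullptr || len == 0);
    if (workers == 0) workers = 1;
    const std::size_t chunk = (len + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(len, static_cast<std::size_t>(w) * chunk);
        const std::size_t end = std::min(len, begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        pool.emplace_back(convert_slice, data, begin, end);
    }
    // Synchronization point: every worker must finish before the caller
    // may release the array.
    for (std::thread& t : pool) t.join();
}
```

The join before returning is the synchronization step: only once every worker has finished is it safe to release the array. The JNI specification also restricts what native code may do inside a critical region, so whether starting or waiting on threads there suits a given workload is a separate question that this sketch does not settle.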
Cheers, David From david.holmes at oracle.com Fri Oct 8 00:01:56 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 8 Oct 2021 10:01:56 +1000 Subject: Best practices to work on arrays via JNI from a few threads In-Reply-To: References: Message-ID: <28e0f738-7d67-3f9d-96d8-826b5010c54b@oracle.com> Clarification ... On 8/10/2021 9:57 am, David Holmes wrote: > Hi Sergey, > > On 8/10/2021 7:39 am, Sergey Bylokhov wrote: >> Hello, hotspot team. >> >> I would like to clarify what is the current best practices to work >> with arrays from the JNI. >> >> Currently I am working on the performance improvement in the color >> conversion area in the sun.java2d.cmm package. The logic there is >> simple pass the array of data(can be one pixel or huge image) to the >> native via JNI, then call some littlecms code(this is third party >> library) and save result back. For now that code uses >> GetByteArrayElements/ReleaseByteArrayElements. I have found that in >> some cases we spent most of the time(60%) copying data. So I have a >> few ideas on how to improve the code: >> - Use the GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical - >> it will increase the performance of the single threaded code, but may >> have some contention issue >> - Split the work across the threads >> >> Do I understand correctly that I cannot use both improvements together? >> The Critical API if not in the current implementation, but >> theoretically may return a copy of array, so if one thread will get a >> copy it will overwrite the whole array and delete the changes of the >> other threads. > > The final parameter to GetPrimitiveArrayCritical is "jboolean *isCopy", > so you can pass the address of a variable there and know whether you > have a copy or not, and then decide how to proceed from there. Hotspot > will not make a copy but will use either object-pinning (for GC's that > do that - only Shenandoah) or the GCLocker.
Deferring GC may cause you > other problems so you have to evaluate how long you will need to process > the array data in the worst case. You can process the raw array in > multiple threads, but then you need to synchronize those threads to know > when the array can be released again. I was thinking of a model where one thread calls GetPrimitiveArrayCritical, verifies it has a non-copy, and then shares it with other threads. You could of course have each thread call GetPrimitiveArrayCritical and just work on its own region of the array, but if you wanted to guard against a theoretical VM that might return a copy then the fallback code would be much more complicated. David ----- >> Is my understanding correct that since there is no >> GetPrimitiveArrayRegionCritical API, I need to use >> GetByteArrayRegion/SetByteArrayRegion for each slice in each thread? > > If you want to parallelise the copying overhead then yes you need to use > the region API in each thread. > >> Probably we have some other way to do this job without coping the data >> fourth and back? Is it possible to create a view of the array and then >> lock only that slice via GetPrimitiveArrayCritical? > > An array is an array, it can't be partitioned as you describe. > >> Or the Critical API became obsolete due to many issues described here?: >> * https://bugs.openjdk.java.net/browse/JDK-8199919? >> * >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/052153.html >> >> >> Thank you for any suggestions. >> >> PS: I have read the long similar thread "GetPrimitiveArrayCritical vs >> GetByteArrayRegion": >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/051898.html >> >> Where Martin suggested to add the "GetPrimitiveArrayRegionCritical", >> as far as understand we do not have such plans? > > None of the RFE's from that discussion were picked up.
> > Cheers, > David From dholmes at openjdk.java.net Fri Oct 8 00:41:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 8 Oct 2021 00:41:08 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v5] In-Reply-To: <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> Message-ID: On Thu, 7 Oct 2021 20:40:39 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack. >> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. Hi Doug, Updates to non-compiler code look fine to me. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5446 From dholmes at openjdk.java.net Fri Oct 8 00:56:18 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 8 Oct 2021 00:56:18 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Thu, 7 Oct 2021 15:28:31 GMT, Coleen Phillimore wrote: >> Also fixes: 8273956: Add checking for rank values >> >> This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. 
The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. >> >> Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. >> >> This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. >> >> Tested with tier1-8 at one point in time and just retested with tier1-6. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix overlap error message printing and add a test. Hi Coleen, A couple of comments on tests, but otherwise okay. Thanks, David test/hotspot/gtest/runtime/test_mutex.cpp line 292: > 290: Monitor* monitor_rank_broken = new Monitor(Mutex::oopstorage-4, "monitor_rank_broken"); > 291: monitor_rank_broken->lock_without_safepoint_check(); > 292: monitor_rank_broken->unlock(); This is dead code right - the assertion failure will stop us getting here. 
test/hotspot/gtest/runtime/test_mutex.cpp line 302: > 300: Monitor* monitor_rank_broken = new Monitor(Mutex::safepoint-40, "monitor_rank_broken"); > 301: monitor_rank_broken->lock_without_safepoint_check(); > 302: monitor_rank_broken->unlock(); Ditto - dead code test/hotspot/gtest/runtime/test_mutex.cpp line 310: > 308: ThreadInVMfromNative invm(THREAD); > 309: > 310: Monitor* monitor_rank_broken = new Monitor(Mutex::safepoint-1, "monitor_rank_broken"); This rank is not actually broken is it - otherwise we won't get to the next line. test/hotspot/gtest/runtime/test_mutex.cpp line 313: > 311: Monitor* monitor_rank_also_broken = new Monitor(monitor_rank_broken->rank()-39, "monitor_rank_also_broken"); > 312: monitor_rank_also_broken->lock_without_safepoint_check(); > 313: monitor_rank_also_broken->unlock(); Dead code ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5845 From bylokhov at amazon.com Fri Oct 8 01:15:21 2021 From: bylokhov at amazon.com (Sergey Bylokhov) Date: Thu, 7 Oct 2021 18:15:21 -0700 Subject: Best practices to work on arrays via JNI from a few threads In-Reply-To: <28e0f738-7d67-3f9d-96d8-826b5010c54b@oracle.com> References: <28e0f738-7d67-3f9d-96d8-826b5010c54b@oracle.com> Message-ID: <19610e48-6020-8bb0-5ee1-ae9a07ccfba8@amazon.com> On 10/7/21 5:01 PM, David Holmes wrote: >> The final parameter to GetPrimitiveArrayCritical is "jboolean *isCopy", >> so you can pass the address of a variable there and know whether you >> have a copy or not, and then decide how to proceed from there. Hotspot >> will not make a copy but will use either object-pinning (for GC's that >> do that - only Shenandoah) or the GCLocker. Deferring GC may cause you >> other problems so you have to evaluate how long you will need to process >> the array data in the worst case. You can process the raw array in >> multiple threads, but then you need to synchronize those threads to know >> when the array can be released again. 
> > I was thinking of a model where one thread calls > GetPrimitiveArrayCritical, verifies it has a non-copy, and then shares > it with other threads. You could of course have each thread call > GetPrimitiveArrayCritical and just work on its own region of the array, > but if you wanted to guard against a theoretical VM that might return a > copy then the fallback code would be much more complicated. But how can I share it with other Java threads if it is not possible to call Java methods after GetPrimitiveArrayCritical? The other way to implement it is to call GetPrimitiveArrayCritical and then use native threads. But I really would like to use a Java thread pool. -- Best regards, Sergey. From david.holmes at oracle.com Fri Oct 8 01:32:24 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 8 Oct 2021 11:32:24 +1000 Subject: Best practices to work on arrays via JNI from a few threads In-Reply-To: <19610e48-6020-8bb0-5ee1-ae9a07ccfba8@amazon.com> References: <28e0f738-7d67-3f9d-96d8-826b5010c54b@oracle.com> <19610e48-6020-8bb0-5ee1-ae9a07ccfba8@amazon.com> Message-ID: On 8/10/2021 11:15 am, Sergey Bylokhov wrote: > On 10/7/21 5:01 PM, David Holmes wrote: >>> The final parameter to GetPrimitiveArrayCritical is "jboolean *isCopy", >>> so you can pass the address of a variable there and know whether you >>> have a copy or not, and then decide how to proceed from there. Hotspot >>> will not make a copy but will use either object-pinning (for GC's that >>> do that - only Shenandoah) or the GCLocker. Deferring GC may cause you >>> other problems so you have to evaluate how long you will need to process >>> the array data in the worst case. You can process the raw array in >>> multiple threads, but then you need to synchronize those threads to know >>> when the array can be released again. >> >> I was thinking of a model where one thread calls >> GetPrimitiveArrayCritical, verifies it has a non-copy, and then shares >> it with other threads.
You could of course have each thread call >> GetPrimitiveArrayCritical and just work on its own region of the array, >> but if you wanted to guard against a theoretical VM that might return a >> copy then the fallback code would be much more complicated. > > But how I can share it to some other java threads if it is not possible > to call java methods after GetPrimitiveArrayCritical? > > The other way to implement it is to call GetPrimitiveArrayCritical > and then use the native threads. But I really would like to use java > thread pool. The raw array has to be operated on in native code - there is no way to pass it back to Java. So those threads, wherever they originate, will have to call a native method to do this. If only one thread calls GetPrimitiveArrayCritical (GPAC) then it needs to coordinate with the other threads in native code. If each thread does its own call to GPAC then there is no issue with coordination. But if your primary concern was the copying overhead then just use GPAC to avoid the copying and stick to one thread. David ----- From dnsimon at openjdk.java.net Fri Oct 8 08:07:09 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Fri, 8 Oct 2021 08:07:09 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v5] In-Reply-To: References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> Message-ID: On Thu, 7 Oct 2021 23:00:00 GMT, Vladimir Kozlov wrote: >> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. > > Update looks good. Thanks @vnkozlov and @dholmes-ora for the input and reviews.
------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From dnsimon at openjdk.java.net Fri Oct 8 08:10:14 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Fri, 8 Oct 2021 08:10:14 GMT Subject: Integrated: 8272586: emit abstract machine code in hs-err logs In-Reply-To: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> Message-ID: <5GEy3O6FP5T2W8Czt0WkATnrKHlWksIn-ByDybPpQ8I=.2345faf2-c7fa-447a-80be-10e5bd769f12@github.com> On Thu, 9 Sep 2021 16:56:57 GMT, Doug Simon wrote: > This enhances hs-err logs to: > * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. > * Show machine code for the top frames on the native stack. > > An interpreter or stub frame is only shown if it is the crashing frame. > > A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). This pull request has now been integrated. Changeset: b60837a7 Author: Doug Simon URL: https://git.openjdk.java.net/jdk/commit/b60837a7d5d6f920d2fb968369564df155dc1018 Stats: 338 lines in 7 files changed: 288 ins; 20 del; 30 mod 8272586: emit abstract machine code in hs-err logs Reviewed-by: kvn, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From xliu at openjdk.java.net Fri Oct 8 08:32:11 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Fri, 8 Oct 2021 08:32:11 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v6] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Thu, 7 Oct 2021 09:15:45 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. 
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. 
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Don't unlock the holding mutexes for the thread State transition. Hi, Reviewers, Could you take a look at the new revision? I removed 'unlock_locks_on_error' completely. I reuse JavaThreadInVMAndNative from jfrEmergencyDump.cpp. I changed `JavaThreadInVMAndNative::transition_to_native` to support re-entrance. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From kbarrett at openjdk.java.net Fri Oct 8 09:22:06 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 8 Oct 2021 09:22:06 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 In-Reply-To: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: On Fri, 1 Oct 2021 05:04:45 GMT, Kim Barrett wrote: > Please review this change to the linux-aarch64 port to support relaxed > atomic add operations. Both ldxr/stxr and LSE-based implementations are > provided. The appropriate one to use is selected at runtime, using the existing > infrastructure for that selection. > > Testing: > mach5 tier1-3 > Some hand-checked performance and functional testing that both LSE and > non-LSE implementations work, and that the relaxed operations are faster > than the conservative default. ping? @theRealAph ?
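As a rough illustration of what "relaxed atomic add" means in the request above, here is a minimal portable C++ sketch. It uses `std::atomic` as a stand-in and is not the linux-aarch64 assembly from the patch, nor HotSpot's own `Atomic` API; the names `EventCounter`, `record`, and `add_ordered` are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical statistics counter, used only to contrast the two orderings.
struct EventCounter {
  std::atomic<std::uint64_t> value{0};

  // Relaxed add: the increment itself is atomic, but it imposes no ordering
  // on the memory accesses around it. Suitable for counters whose value is
  // not used to publish other data.
  void record(std::uint64_t n = 1) {
    value.fetch_add(n, std::memory_order_relaxed);
  }

  // Sequentially consistent add, standing in for a "conservative default".
  // Returns the new value (fetch_add returns the old one).
  std::uint64_t add_ordered(std::uint64_t n) {
    return value.fetch_add(n, std::memory_order_seq_cst) + n;
  }

  std::uint64_t get() const {
    return value.load(std::memory_order_relaxed);
  }
};
```

Both calls produce the same arithmetic result; the difference is only in the ordering guarantees the hardware and compiler must provide, which is presumably where the speedup mentioned in the request comes from.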
------------- PR: https://git.openjdk.java.net/jdk/pull/5785 From aph at openjdk.java.net Fri Oct 8 09:33:07 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 8 Oct 2021 09:33:07 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 In-Reply-To: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: On Fri, 1 Oct 2021 05:04:45 GMT, Kim Barrett wrote: > Please review this change to the linux-aarch64 port to support relaxed > atomic add operations. Both ldxr/stxr and LSE-based implementations are > provided. The appropriate one to use is selected at runtime, using the existing > infrastructure for that selection. > > Testing: > mach5 tier1-3 > Some hand-checked performance and functional testing that both LSE and > non-LSE implementations work, and that the relaxed operations are faster > than the conservative default. Looks good, thanks. ------------- Marked as reviewed by aph (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5785 From mdoerr at openjdk.java.net Fri Oct 8 10:12:13 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 8 Oct 2021 10:12:13 GMT Subject: RFR: 8272586: emit abstract machine code in hs-err logs [v5] In-Reply-To: <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> References: <_x8GBM6jheaTcSGxuQBfVS95SZwy6rIzyyA_05DSkwc=.db255415-591f-450e-bb5a-0707feae8727@github.com> <7RAYmO3tmTPL8afNThkD469tIudcLJs7P1tVT3yIc8E=.e265dad5-a2f2-4c25-8585-9424cbe23dac@github.com> Message-ID: On Thu, 7 Oct 2021 20:40:39 GMT, Doug Simon wrote: >> This enhances hs-err logs to: >> * Show the [abstract machine code](https://bugs.openjdk.java.net/browse/JDK-8213084) of the crashing frame if a disassembler is not installed. >> * Show machine code for the top frames on the native stack.
>> >> An interpreter or stub frame is only shown if it is the crashing frame. >> >> A sample of the enhanced hs-err log can be seen [here](https://bugs.openjdk.java.net/secure/attachment/96664/hs_err_pid7179.log). > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. > Thanks, you're right. I placed it in `nmethod::decode2`: [5e00264](https://github.com/openjdk/jdk/commit/5e00264be9cf5871020a3023aa20c87fa72bdb43) > > BTW, I noticed that often nmethod info is written to `tty` during crash reporting. I tracked this down to [this change](https://github.com/dougxc/jdk/commit/4af237427144060d0b97c14a086401b0a3781198#diff-82948e0048fb88e1e9281b61394d9a9e2e0c8e75ab1a834c3bf943dbde73e2ffR618) where the `st` argument is ignored. @TheRealMDoerr do you think that should be fixed? `nm->print_nmethod(verbose);` was already used before my change. I had only moved it out of `os::print_location`. But yes, would be good to fix it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5446 From coleenp at openjdk.java.net Fri Oct 8 12:26:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 8 Oct 2021 12:26:46 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: On Thu, 7 Oct 2021 15:28:31 GMT, Coleen Phillimore wrote: >> Also fixes: 8273956: Add checking for rank values >> >> This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. 
Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. >> >> Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. >> >> This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. >> >> Tested with tier1-8 at one point in time and just retested with tier1-6. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix overlap error message printing and add a test. I removed the dead test code and retested. Thanks for the code review David, and all the discussions leading to this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From coleenp at openjdk.java.net Fri Oct 8 12:26:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 8 Oct 2021 12:26:42 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v3] In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: <_j_8HU0x6wY_wdykmr-yK3m3Vh9xe_8EBHAndUvqaTI=.6868eeb9-8087-45bd-8aea-06ed9cda720f@github.com> > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. 
Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove dead code in test. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5845/files - new: https://git.openjdk.java.net/jdk/pull/5845/files/59bfd945..e8df15de Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5845&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5845&range=01-02 Stats: 8 lines in 1 file changed: 0 ins; 6 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5845.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5845/head:pull/5845 PR: https://git.openjdk.java.net/jdk/pull/5845 From coleenp at openjdk.java.net Fri Oct 8 12:26:55 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 8 Oct 2021 12:26:55 GMT Subject: RFR: 8274004: Change 'nonleaf' rank name [v2] In-Reply-To: References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: <6D3k_Hszkk1_wh4hVsg_SN-KjzC-qtLzz7y7KY8oy_c=.7c9d92e6-0564-45b5-84fa-d3750f218626@github.com> On Fri, 8 Oct 2021 00:48:05 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix overlap error message printing and add a test. 
> > test/hotspot/gtest/runtime/test_mutex.cpp line 292: > >> 290: Monitor* monitor_rank_broken = new Monitor(Mutex::oopstorage-4, "monitor_rank_broken"); >> 291: monitor_rank_broken->lock_without_safepoint_check(); >> 292: monitor_rank_broken->unlock(); > > This is dead code right - the assertion failure will stop us getting here. Yes, it is dead code. I'll remove it and retest to make sure nothing surprising happens. > test/hotspot/gtest/runtime/test_mutex.cpp line 302: > >> 300: Monitor* monitor_rank_broken = new Monitor(Mutex::safepoint-40, "monitor_rank_broken"); >> 301: monitor_rank_broken->lock_without_safepoint_check(); >> 302: monitor_rank_broken->unlock(); > > Ditto - dead code removed. > test/hotspot/gtest/runtime/test_mutex.cpp line 310: > >> 308: ThreadInVMfromNative invm(THREAD); >> 309: >> 310: Monitor* monitor_rank_broken = new Monitor(Mutex::safepoint-1, "monitor_rank_broken"); > > This rank is not actually broken is it - otherwise we won't get to the next line. right. It's not broken. I'll rename it to 'monitor_rank_ok'. > test/hotspot/gtest/runtime/test_mutex.cpp line 313: > >> 311: Monitor* monitor_rank_also_broken = new Monitor(monitor_rank_broken->rank()-39, "monitor_rank_also_broken"); >> 312: monitor_rank_also_broken->lock_without_safepoint_check(); >> 313: monitor_rank_also_broken->unlock(); > > Dead code removed. 
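The rank checks exercised by these tests can be sketched in standalone C++ as follows. This is only a sketch under stated assumptions, not the HotSpot implementation: the numeric rank values are invented for the example, the helper names `lower_bound_exclusive` and `rank_is_valid` are hypothetical, and an exception is thrown where HotSpot would fail an assertion.

```cpp
#include <stdexcept>

// Hypothetical base ranks; the real values in HotSpot may differ.
enum class Rank : int { tty = 7, oopstorage = 10, nosafepoint = 16, safepoint = 36 };

// Largest base rank strictly below `value`, or -1 if there is none.
inline int lower_bound_exclusive(int value) {
  const int bases[] = {
    static_cast<int>(Rank::tty), static_cast<int>(Rank::oopstorage),
    static_cast<int>(Rank::nosafepoint), static_cast<int>(Rank::safepoint)
  };
  int lower = -1;
  for (int b : bases) {
    if (b < value && b > lower) lower = b;
  }
  return lower;
}

// True if `rank - n` stays strictly above the next lower base rank.
inline bool rank_is_valid(Rank rank, int n) {
  int current = static_cast<int>(rank);
  return n >= 0 && (current - n) > lower_bound_exclusive(current);
}

// Subtraction is the only arithmetic defined on Rank; it rejects results
// that would overlap with a different base rank.
inline Rank operator-(Rank rank, int n) {
  if (!rank_is_valid(rank, n)) {
    throw std::out_of_range("rank subtraction overlaps a lower rank");
  }
  return static_cast<Rank>(static_cast<int>(rank) - n);
}
```

With these assumed values, `Rank::safepoint - 1` is accepted, while `Rank::oopstorage - 4`, `Rank::safepoint - 40`, and subtracting 39 from `Rank::safepoint - 1` are all rejected, mirroring the cases discussed in the review.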
------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From coleenp at openjdk.java.net Fri Oct 8 12:26:57 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 8 Oct 2021 12:26:57 GMT Subject: Integrated: 8274004: Change 'nonleaf' rank name In-Reply-To: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> References: <6s02U-nIpDg-XFfG32B8-dHSZqzY79fHVLrOM_OLBkY=.94efe120-31fb-41c6-a9a9-05cb233414b6@github.com> Message-ID: <0lCdhQcwqr2iY7jits8rmozKtqgOpJ1f4NAWCRoWQIc=.8d3c7b24-6629-4743-bd4c-537198df641d@github.com> On Wed, 6 Oct 2021 23:27:17 GMT, Coleen Phillimore wrote: > Also fixes: 8273956: Add checking for rank values > > This change does 3 things. I could separate them but this has all been tested together and most of the change is mechanical. The first is a simple rename of nonleaf => safepoint. The second change is to add the enum class Rank which only allows subtraction from a base rank, and some additional type checking so that rank arithmetic doesn't overlap with a different rank. Ie if you say nosafepoint-n, that doesn't overlap with oopstorage. The third change is to remove the _safepoint_check_always/never flag. That is now intrinsic in the ranking, as all nosafepoint ranks are lower than all safepoint ranks. > > Future changes could add rank names in the middle of nosafepoint and safepoint, like handshake. As of now, the rank subtractions are not unmanageable. There are actually not many nested Mutex. > > This is the last of the planned subtasks for Mutex ranking cleanup. If you have other ideas or suggestions, please let me know. > > Tested with tier1-8 at one point in time and just retested with tier1-6. This pull request has now been integrated. 
Changeset: 6364719c Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/6364719cd1c57220769ea580d958da8dc2fdf7f9 Stats: 452 lines in 41 files changed: 105 ins; 117 del; 230 mod 8274004: Change 'nonleaf' rank name 8273956: Add checking for rank values Reviewed-by: dholmes, pchilanomate ------------- PR: https://git.openjdk.java.net/jdk/pull/5845 From mdoerr at openjdk.java.net Fri Oct 8 12:38:12 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 8 Oct 2021 12:38:12 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 08:27:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: > > - Update autoconf error message > - Remove copyright headers Nice contribution! Thanks for doing the changes I had requested during my offline review. New version needs a few minor fixes, but looks great in general. src/hotspot/cpu/ppc/ppc.ad line 8142: > 8140: match(Set res (CompareAndExchangeP mem_ptr (Binary src1 src2))); > 8141: predicate((((CompareAndSwapNode*)n)->order() != MemNode::acquire && ((CompareAndSwapNode*)n)->order() != MemNode::seqcst) > 8142: && n->as_Load()->barrier_data() == 0); Needs to be `as_LoadStore()`. src/hotspot/cpu/ppc/ppc.ad line 8157: > 8155: match(Set res (CompareAndExchangeP mem_ptr (Binary src1 src2))); > 8156: predicate((((CompareAndSwapNode*)n)->order() == MemNode::acquire || ((CompareAndSwapNode*)n)->order() == MemNode::seqcst) > 8157: && n->as_Load()->barrier_data() == 0); Needs to be `as_LoadStore()`. 
src/hotspot/cpu/ppc/ppc.ad line 8379: > 8377: instruct getAndSetP(iRegPdst res, iRegPdst mem_ptr, iRegPsrc src, flagsRegCR0 cr0) %{ > 8378: match(Set res (GetAndSetP mem_ptr src)); > 8379: predicate(n->as_Load()->barrier_data() == 0); Needs to be `as_LoadStore()`. ------------- Changes requested by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5842 From eosterlund at openjdk.java.net Fri Oct 8 15:22:10 2021 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 8 Oct 2021 15:22:10 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 08:27:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: > > - Update autoconf error message > - Remove copyright headers Thank you for contributing a ZGC port to PPC. Apart from what has already been said, this looks good to me. ------------- Marked as reviewed by eosterlund (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5842 From nradomski at openjdk.java.net Fri Oct 8 15:41:51 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Fri, 8 Oct 2021 15:41:51 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v2] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 13:54:16 GMT, Per Liden wrote: >> Niklas Radomski has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update autoconf error message >> - Remove copyright headers > > src/hotspot/share/gc/z/zBarrierSetAssembler.cpp line 30: > >> 28: >> 29: Address ZBarrierSetAssemblerBase::address_bad_mask_from_thread(Register thread) { >> 30: return Address(thread, (intptr_t) ZThreadLocalData::address_bad_mask_offset()); > > Instead of casting here in this platform agnostic code, I'd suggest that you add a new constructor for `Address` on PPC, one that takes `(Register, ByteSize)` arguments. Other platforms have that, so I'm a bit surprised that PPC doesn't already have that too. Good catch, thank you for the suggestion! Totally agreed, I changed it accordingly. ------------- PR: https://git.openjdk.java.net/jdk/pull/5842 From nradomski at openjdk.java.net Fri Oct 8 15:41:47 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Fri, 8 Oct 2021 15:41:47 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v3] In-Reply-To: References: Message-ID: > Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. 
Niklas Radomski has updated the pull request incrementally with three additional commits since the last revision: - Remove superfluous copyright change - Fix predicate conditions - Add ByteSize constructor to Address ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5842/files - new: https://git.openjdk.java.net/jdk/pull/5842/files/5da8c1b7..052ad5c8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5842&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5842&range=01-02 Stats: 9 lines in 3 files changed: 3 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/5842.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5842/head:pull/5842 PR: https://git.openjdk.java.net/jdk/pull/5842 From eosterlund at openjdk.java.net Fri Oct 8 16:40:13 2021 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 8 Oct 2021 16:40:13 GMT Subject: RFR: 8272723: Don't use Access API to access primitive fields [v3] In-Reply-To: References: Message-ID: <_Sm7dIzy7UfJu2xdA-uaXdUrJUNOy3k-MyS34bQLlhE=.f9e0e05b-baee-46d6-82e4-26978a59197c@github.com> On Fri, 20 Aug 2021 08:51:54 GMT, Roman Kennke wrote: >> For earlier incarnations of Shenandoah, we needed to put barriers before accessing primitive fields. This is no longer necessary nor implemented/used by any GC, and we should simplify the code to do plain access instead. >> >> (We may want to remove remaining primitive access machinery in the Access API soon) >> >> Testing: >> - [x] build x86_32 and x86_64 >> - [x] tier1 >> - [x] tier2 >> - [x] hotspot_gc > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into JDK-8272723 > - Remove redundant asserts > - Revert to use RawAccess for volatile accesses > - Fix alignment > - Consolidate (obj)field_addr() variants > - Remove remaining primitive Access API uses > - 8272723: Don't use Access API to access primitive fields Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5187 From pliden at openjdk.java.net Fri Oct 8 17:14:08 2021 From: pliden at openjdk.java.net (Per Liden) Date: Fri, 8 Oct 2021 17:14:08 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v3] In-Reply-To: References: Message-ID: On Fri, 8 Oct 2021 15:41:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with three additional commits since the last revision: > > - Remove superfluous copyright change > - Fix predicate conditions > - Add ByteSize constructor to Address Marked as reviewed by pliden (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/5842 From github.com+42899633+eastig at openjdk.java.net Fri Oct 8 20:14:14 2021 From: github.com+42899633+eastig at openjdk.java.net (Evgeny Astigeevich) Date: Fri, 8 Oct 2021 20:14:14 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v3] In-Reply-To: <5nm0e8NHUdcYJJzhxbBq8rlJcn3Y6fSYzlRrGlm_RtE=.6177e231-f9e7-42df-861f-6974d6648e5c@github.com> References: <5nm0e8NHUdcYJJzhxbBq8rlJcn3Y6fSYzlRrGlm_RtE=.6177e231-f9e7-42df-861f-6974d6648e5c@github.com> Message-ID: On Wed, 22 Sep 2021 13:43:02 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: >> >> Move emitting code to MacroAssembler::spin_wait >> >> Code emitting spin pauses is moved to MacroAssembler::spin_wait. >> As OptoAssembly output is changed, tests are updated to parse >> PrintAssembly. > > It's pretty much expected, yes. The debuginfo isn't all that precise. Hi @theRealAph @stooart-mon and others, There is an ongoing discussion on the CSR request: https://bugs.openjdk.java.net/browse/JDK-8274564. I would appreciate it if you shared your opinion. Thanks, Evgeny ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From goetz at openjdk.java.net Sat Oct 9 12:11:08 2021 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Sat, 9 Oct 2021 12:11:08 GMT Subject: RFR: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 10:28:07 GMT, Martin Doerr wrote: > The default implementation of BarrierSetAssembler::resolve_jobject should be generic and support all GCs. Individual GCs can still implement an optimized version. LGTM ------------- Marked as reviewed by goetz (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/5821 From dholmes at openjdk.java.net Mon Oct 11 00:27:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 00:27:09 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v6] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Thu, 7 Oct 2021 09:15:45 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... 
>> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Don't unlock the holding mutexes for the thread State transition. I wrote earlier: > We have to decide whether it is safest/best to simply not transition to native if holding locks, or whether it is okay to proceed knowing that there are risks of secondary crashes, or hangs, if we do. so it seems you are now taking the second option. Okay fair enough - it is a simpler approach. But I don't see the point in using `JavaThreadInVMAndNative` here - nor "promoting" it to look like some kind of general utility function. AFAICS this boils down to just doing an explicit and simple: `set_thread_state(_thread_in_native);` which can be done before the onError loop. And you can save the state first and restore after the loop. Basically just three lines of code. Thanks, David src/hotspot/share/utilities/vmError.cpp line 1639: > 1637: jtivm.transition_to_native(); > 1638: if (os::fork_and_exec(cmd) < 0) { > 1639: out.print_cr("os::fork_and_exec failed: %s (%s=%d)", I think you may need to restore the state if fork_and_exec fails - or perhaps switch to using print_raw_cr? ------------- Changes requested by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5590 From stefank at openjdk.java.net Mon Oct 11 07:03:12 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 11 Oct 2021 07:03:12 GMT Subject: RFR: 8272723: Don't use Access API to access primitive fields [v3] In-Reply-To: References: Message-ID: On Fri, 20 Aug 2021 08:51:54 GMT, Roman Kennke wrote: >> For earlier incarnations of Shenandoah, we needed to put barriers before accessing primitive fields. This is no longer necessary nor implemented/used by any GC, and we should simplify the code to do plain access instead. >> >> (We may want to remove remaining primitive access machinery in the Access API soon) >> >> Testing: >> - [x] build x86_32 and x86_64 >> - [x] tier1 >> - [x] tier2 >> - [x] hotspot_gc > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into JDK-8272723 > - Remove redundant asserts > - Revert to use RawAccess for volatile accesses > - Fix alignment > - Consolidate (obj)field_addr() variants > - Remove remaining primitive Access API uses > - 8272723: Don't use Access API to access primitive fields Looks good ------------- Marked as reviewed by stefank (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5187 From dnsimon at openjdk.java.net Mon Oct 11 08:37:27 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 08:37:27 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable Message-ID: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. 
There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). ------------- Commit messages: - make max code printed in hs-err log configurable and also scan Java stack for code to print - do not unnecessarily pollute tty Changes: https://git.openjdk.java.net/jdk/pull/5875/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274986 Stats: 166 lines in 6 files changed: 71 ins; 37 del; 58 mod Patch: https://git.openjdk.java.net/jdk/pull/5875.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875 PR: https://git.openjdk.java.net/jdk/pull/5875 From never at openjdk.java.net Mon Oct 11 08:37:28 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Mon, 11 Oct 2021 08:37:28 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Fri, 8 Oct 2021 22:28:53 GMT, Doug Simon wrote: > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. 
> In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). src/hotspot/share/code/codeBlob.cpp line 658: > 656: if (st == tty) { > 657: nm->print_nmethod(verbose); > 658: } I think this should be something like this: ResourceMark rm; st->print(INTPTR_FORMAT " is at entry_point+%d in (nmethod*)" INTPTR_FORMAT, p2i(addr), (int)(addr - nm->entry_point()), p2i(nm)); - if (verbose) { - st->print(" for "); - nm->method()->print_value_on(st); - } st->cr(); - nm->print_nmethod(verbose); + if (verbose && st == tty) { + nm->print_nmethod(verbose); + } else { + nm->print_on(st); + } return; } st->print_cr(INTPTR_FORMAT " is at code_begin+%d in ", p2i(addr), (int)(addr - code_begin())); src/hotspot/share/runtime/globals.hpp line 1339: > 1337: "number of stack frames to print in VM-level stack dump") \ > 1338: \ > 1339: develop(int, ErrorLogPrintCodeLimit, 3, \ Shouldn't this be at least diagnostic? develop flags are fairly useless. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Mon Oct 11 08:37:29 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 08:37:29 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Mon, 11 Oct 2021 03:37:28 GMT, Tom Rodriguez wrote: >> This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. 
There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. >> In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. >> The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. >> >> There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). > > src/hotspot/share/code/codeBlob.cpp line 658: > >> 656: if (st == tty) { >> 657: nm->print_nmethod(verbose); >> 658: } > > I think this should be something like this: > > ResourceMark rm; > st->print(INTPTR_FORMAT " is at entry_point+%d in (nmethod*)" INTPTR_FORMAT, > p2i(addr), (int)(addr - nm->entry_point()), p2i(nm)); > - if (verbose) { > - st->print(" for "); > - nm->method()->print_value_on(st); > - } > st->cr(); > - nm->print_nmethod(verbose); > + if (verbose && st == tty) { > + nm->print_nmethod(verbose); > + } else { > + nm->print_on(st); > + } > return; > } > st->print_cr(INTPTR_FORMAT " is at code_begin+%d in ", p2i(addr), (int)(addr - code_begin())); I don't think so. As far as I can see, the intent is to have a single line of output for an nmethod followed by an optional " for ..." suffix when `verbose == true`. The `nm->print_nmethod(verbose)` call then prints extra multi-line detail for the nmethod with the number of lines printed governed by `verbose`. This code seems like it went from being hard coded to always go to `tty` and was then parameterized to go to an arbitrary stream but the evolution accidentally overlooked some code that still goes to `tty`. I don't want to make extensive changes here as there really should be a single effort to rationalize all dumping to ensure it's parameterized. 
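Editorial sketch of the stream-routing point discussed above. This is not HotSpot code: `FakeNMethod`, its two print functions, and this `dump_for_addr` are hypothetical stand-ins, with `std::cout` playing the role of `tty` and a `std::ostream&` playing the role of the `st` parameter. It only illustrates the shape of Tom's suggestion, assuming the detailed dump is hard-wired to one global stream while the summary honors the caller's stream.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an nmethod; not HotSpot's class.
struct FakeNMethod {
  std::string name;

  // Short summary that honors the caller's stream.
  void print_on(std::ostream& st) const {
    st << "Compiled method " << name << '\n';
  }

  // Detailed dump that is hard-wired to one global stream
  // (std::cout stands in for tty here).
  void print_nmethod(bool verbose) const {
    std::cout << "nmethod " << name << (verbose ? " (verbose)" : "") << '\n';
  }
};

// Take the global-stream path only when the caller asked for verbose output
// on that very stream; otherwise keep everything on the supplied stream so
// nothing leaks onto the terminal.
void dump_for_addr(const FakeNMethod& nm, std::ostream& st, bool verbose) {
  if (verbose && &st == &std::cout) {
    nm.print_nmethod(verbose);
  } else {
    nm.print_on(st);
  }
}
```

With this shape, a caller that passes an error-log stream gets the summary in that log, and only an explicit verbose request against the global stream triggers the detailed dump.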
> src/hotspot/share/runtime/globals.hpp line 1339: > >> 1337: "number of stack frames to print in VM-level stack dump") \ >> 1338: \ >> 1339: develop(int, ErrorLogPrintCodeLimit, 3, \ > > Shouldn't this be at least diagnostic? develop flags are fairly useless. I chose develop to be consistent with `StackPrintLimit`. What's the process for adding a diagnostic flag @dholmes-ora ? Should we change both `StackPrintLimit` and `ErrorLogPrintCodeLimit` to be diagnostic? I agree with Tom that having these as develop flags is of limited use. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dholmes at openjdk.java.net Mon Oct 11 09:09:10 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 09:09:10 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: <4yEJ0qZpdQ5c466801kermZpB_gM42ZhlsDEey44TNk=.54dc15b7-9bb5-48e8-bdd3-f65eeb2406eb@github.com> On Mon, 11 Oct 2021 08:18:47 GMT, Doug Simon wrote: >> src/hotspot/share/runtime/globals.hpp line 1339: >> >>> 1337: "number of stack frames to print in VM-level stack dump") \ >>> 1338: \ >>> 1339: develop(int, ErrorLogPrintCodeLimit, 3, \ >> >> Shouldn't this be at least diagnostic? develop flags are fairly useless. > > I chose develop to be consistent with `StackPrintLimit`. > What's the process for adding a diagnostic flag @dholmes-ora ? Should we change both `StackPrintLimit` and `ErrorLogPrintCodeLimit` to be diagnostic? I agree with Tom that having these as develop flags is of limited use. I have no idea what the history of `StackPrintLimit` is but there doesn't seem to be any call to have it be a non-develop flag. I'm assuming 100 is plenty and no one ever needs to change it. For the new flag, develop is pretty useless if the issue is that the default value may not show enough information. Making it diagnostic seems reasonable.
If people hit an issue and the default value doesn't show enough information again then they can change the value and hopefully diagnose the problem further. David ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Mon Oct 11 09:39:12 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 09:39:12 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: <4yEJ0qZpdQ5c466801kermZpB_gM42ZhlsDEey44TNk=.54dc15b7-9bb5-48e8-bdd3-f65eeb2406eb@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <4yEJ0qZpdQ5c466801kermZpB_gM42ZhlsDEey44TNk=.54dc15b7-9bb5-48e8-bdd3-f65eeb2406eb@github.com> Message-ID: On Mon, 11 Oct 2021 09:06:25 GMT, David Holmes wrote: >> I chose develop to be consistent with `StackPrintLimit`. >> What's the process for adding a diagnostic flag @dholmes-ora ? Should we change both `StackPrintLimit` and `ErrorLogPrintCodeLimit` to be diagnostic? I agree with Tom that having these as develop flags is of limited use. > > I have no idea what the history of `StackPrintLimit` is but there doesn't seem to be any call to have it be a non-develop flag. I'm assuming 100 is plenty and no one ever needs to change it. > > For the new flag, develop is pretty useless if the issue is that the default value may not show enough information. Making it diagnostic seems reasonable. If people hit an issue and the default value doesn't show enough information again then they can change the value and hopefully diagnose the problem further. > > David Ok, let's just make `ErrorLogPrintCodeLimit` be diagnostic. Is there more process for this apart from simply making the change in this PR?
------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From shade at openjdk.java.net Mon Oct 11 09:59:36 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 11 Oct 2021 09:59:36 GMT Subject: RFR: 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present Message-ID: tier1 test that is newly added by [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) fails on hsdis-enabled machine. I chose to fix it by annotating the real disassembly with `[Disassembly]`, and checking for its existence in the test. Additional testing: - [x] Test now passes when `hsdis` is installed - [x] Test still passes when `hsdis` is not installed ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/5888/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5888&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275031 Stats: 9 lines in 2 files changed: 9 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5888.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5888/head:pull/5888 PR: https://git.openjdk.java.net/jdk/pull/5888 From rkennke at openjdk.java.net Mon Oct 11 10:41:13 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 11 Oct 2021 10:41:13 GMT Subject: Integrated: 8272723: Don't use Access API to access primitive fields In-Reply-To: References: Message-ID: <83W1F7xXj1hiAA9ydUySth-lb5QSwGt6IGWDdcEKek8=.47a17cea-dafa-47c1-a25a-bb4d4ff82692@github.com> On Thu, 19 Aug 2021 14:27:14 GMT, Roman Kennke wrote: > For earlier incarnations of Shenandoah, we needed to put barriers before accessing primitive fields. This is no longer necessary nor implemented/used by any GC, and we should simplify the code to do plain access instead. > > (We may want to remove remaining primitive access machinery in the Access API soon) > > Testing: > - [x] build x86_32 and x86_64 > - [x] tier1 > - [x] tier2 > - [x] hotspot_gc This pull request has now been integrated. 
Changeset: 3edee1e1 Author: Roman Kennke URL: https://git.openjdk.java.net/jdk/commit/3edee1e1feed564397ac47a32c0394d7798bac17 Stats: 198 lines in 9 files changed: 4 ins; 92 del; 102 mod 8272723: Don't use Access API to access primitive fields Reviewed-by: stefank, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/5187 From dholmes at openjdk.java.net Mon Oct 11 11:53:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 11:53:09 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 In-Reply-To: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: On Fri, 1 Oct 2021 05:04:45 GMT, Kim Barrett wrote: > Please review this change to the linux-aarch64 port to support relaxed > atomic add operations. Both ldxr/stxr and LSE-based implementations are > provided. The appropriate one to use is selected at runtime, using the existing > infrastructure for that selection. > > Testing: > mach5 tier1-3 > Some hand-checked performance and functional testing that both LSE and > non-LSE implementations work, and that the relaxed operations are faster > than the conservative default. Hi Kim, Not an ARM expert but seeing as this just elides the dmb this looks fine. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer).
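Editorial sketch of what "relaxed" versus "conservative" means for an atomic add, written with standard C++ `std::atomic` rather than HotSpot's own atomic API. The function names `add_relaxed` and `add_conservative` are hypothetical, and `std::memory_order_seq_cst` is only the nearest standard analogue of a fully fenced add; it is an assumption of this sketch, not a statement about how HotSpot implements its default ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic add with no ordering constraint on surrounding memory accesses.
// On AArch64 a compiler can typically emit this without a trailing barrier.
inline std::uint64_t add_relaxed(std::atomic<std::uint64_t>& counter,
                                 std::uint64_t delta) {
  // fetch_add returns the old value; add delta to report the new one.
  return counter.fetch_add(delta, std::memory_order_relaxed) + delta;
}

// Atomic add with the strongest standard ordering, used here as a stand-in
// for a conservative, fully ordered add.
inline std::uint64_t add_conservative(std::atomic<std::uint64_t>& counter,
                                      std::uint64_t delta) {
  return counter.fetch_add(delta, std::memory_order_seq_cst) + delta;
}
```

Both variants update the counter atomically and return the same values; they differ only in how much reordering of neighbouring loads and stores they forbid, which is why the relaxed form suits plain statistics counters.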
PR: https://git.openjdk.java.net/jdk/pull/5785 From dholmes at openjdk.java.net Mon Oct 11 12:06:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 12:06:06 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <4yEJ0qZpdQ5c466801kermZpB_gM42ZhlsDEey44TNk=.54dc15b7-9bb5-48e8-bdd3-f65eeb2406eb@github.com> Message-ID: On Mon, 11 Oct 2021 09:35:46 GMT, Doug Simon wrote: >> I have no idea what the history of `StackPrintLimit` is but there doesn't seem to be any call to have it be a non-develop flag. I'm assuming 100 is plenty and no one ever needs to change it. >> >> For the new flag, develop is pretty useless if the issue is that the default value may not show enough information. Making it diagnostic seems reasonable. If people hit an issue and the default value doesn't show enough information again then they can change the value and hopefully diagnose the problem further. >> >> David > > Ok, let's just make `ErrorLogPrintCodeLimit` be diagnostic. Is there more process for this apart from simply making the change in this PR? Only the change in the PR is needed - no CSR request for diagnostic flags. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dholmes at openjdk.java.net Mon Oct 11 12:14:10 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 12:14:10 GMT Subject: RFR: 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present In-Reply-To: References: Message-ID: <74KYNBmyJ7s_h1dC4tDDnSYYeAquf7uSRGZJnrD4oX0=.89aac852-3683-4597-8bd6-e030b3ef950f@github.com> On Mon, 11 Oct 2021 09:52:03 GMT, Aleksey Shipilev wrote: > tier1 test that is newly added by [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) fails on hsdis-enabled machine.
I chose to fix it by annotating the real disassembly with `[Disassembly]`, and checking for its existence in the test. > > Additional testing: > - [x] Test now passes when `hsdis` is installed > - [x] Test still passes when `hsdis` is not installed This seems quite reasonable. @dougxc please also review. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5888 From dnsimon at openjdk.java.net Mon Oct 11 12:19:06 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 12:19:06 GMT Subject: RFR: 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:52:03 GMT, Aleksey Shipilev wrote: > tier1 test that is newly added by [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) fails on hsdis-enabled machine. I chose to fix it by annotating the real disassembly with `[Disassembly]`, and checking for its existence in the test. > > Additional testing: > - [x] Test now passes when `hsdis` is installed > - [x] Test still passes when `hsdis` is not installed Marked as reviewed by dnsimon (Committer). Looks good to me. Thanks @shipilev and sorry for the breakage. ------------- PR: https://git.openjdk.java.net/jdk/pull/5888 From dnsimon at openjdk.java.net Mon Oct 11 12:30:43 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 12:30:43 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v2] In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. 
There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. > In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). Doug Simon has updated the pull request incrementally with one additional commit since the last revision: made ErrorLogPrintCodeLimit a diagnostic flag ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5875/files - new: https://git.openjdk.java.net/jdk/pull/5875/files/5b141c15..ae0fa83a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5875.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875 PR: https://git.openjdk.java.net/jdk/pull/5875 From pliden at openjdk.java.net Mon Oct 11 12:33:42 2021 From: pliden at openjdk.java.net (Per Liden) Date: Mon, 11 Oct 2021 12:33:42 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure Message-ID: I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: * Rename AbstractGangTask to WorkerTask * Rename WorkGang to WorkerThreads * Fold GangWorker into WorkerThread * Fold WorkManager into WorkerThreads * Move SubTaskDone and friends to a new workerUtils.hpp/cpp I've split 
things up into several commits to make it easier to review. Testing: Passes Tier 1-3 on all Oracle platforms. ------------- Commit messages: - Clean up naming in ReferenceProcessor - Clean up naming in Shenandoah - Clean up naming in HeapDumper/HeapInspector - Clean up naming in G1 - Clean up naming in pretouch - Clean up naming in WeakProcessor - Clean up naming in ZGC - Remove unused log tag - Rename workgroup.hpp/cpp to workerThread.hpp/cpp - Clean up worker code - ... and 10 more: https://git.openjdk.java.net/jdk/compare/c032186b...39b23f42 Changes: https://git.openjdk.java.net/jdk/pull/5886/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5886&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275035 Stats: 2415 lines in 105 files changed: 881 ins; 1083 del; 451 mod Patch: https://git.openjdk.java.net/jdk/pull/5886.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5886/head:pull/5886 PR: https://git.openjdk.java.net/jdk/pull/5886 From stefank at openjdk.java.net Mon Oct 11 12:46:09 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 11 Oct 2021 12:46:09 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:21:21 GMT, Per Liden wrote: > I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: > > * Rename AbstractGangTask to WorkerTask > * Rename WorkGang to WorkerThreads > * Fold GangWorker into WorkerThread > * Fold WorkManager into WorkerThreads > * Move SubTaskDone and friends to a new workerUtils.hpp/cpp > > I've split things up into several commits to make it easier to review. > > Testing: Passes Tier 1-3 on all Oracle platforms. Looks good ------------- Marked as reviewed by stefank (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5886 From coleenp at openjdk.java.net Mon Oct 11 16:32:28 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 11 Oct 2021 16:32:28 GMT Subject: RFR: 8274794: Make Thread::_owned_locks available in product Message-ID: Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. ------------- Commit messages: - Add test to print out locks in error file. - 8274794: Make Thread::_owned_locks available in product Changes: https://git.openjdk.java.net/jdk/pull/5896/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5896&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274794 Stats: 317 lines in 10 files changed: 199 ins; 98 del; 20 mod Patch: https://git.openjdk.java.net/jdk/pull/5896.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5896/head:pull/5896 PR: https://git.openjdk.java.net/jdk/pull/5896 From never at openjdk.java.net Mon Oct 11 17:31:07 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Mon, 11 Oct 2021 17:31:07 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v2] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Mon, 11 Oct 2021 08:30:48 GMT, Doug Simon wrote: >> src/hotspot/share/code/codeBlob.cpp line 658: >> >>> 656: if (st == tty) { >>> 657: nm->print_nmethod(verbose); >>> 658: } >> >> I think this should be something like this: >> >> ResourceMark rm; >> st->print(INTPTR_FORMAT " is at entry_point+%d in (nmethod*)" INTPTR_FORMAT, >> p2i(addr), (int)(addr - nm->entry_point()), p2i(nm)); >> - if (verbose) { >> - st->print(" for "); >> - nm->method()->print_value_on(st); >> - } >> st->cr(); >> - 
nm->print_nmethod(verbose); >> + if (verbose && st == tty) { >> + nm->print_nmethod(verbose); >> + } else { >> + nm->print_on(st); >> + } >> return; >> } >> st->print_cr(INTPTR_FORMAT " is at code_begin+%d in ", p2i(addr), (int)(addr - code_begin())); > > I don't think so. As far as I can see, the intent is to have a single line of output for an nmethod followed by an optional " for ..." suffix when `verbose == true`. > The `nm->print_nmethod(verbose)` call then prints extra multi-line detail for the nmethod with the number of lines printed governed by `verbose`. > This code seems like it went from being hard coded to always go to `tty` and was then parameterized to go to an arbitrary stream but the evolution accidentally overlooked some code that still goes to `tty`. > I don't want to make extensive changes here as there really should be a single effort to rationalize all dumping to ensure it's parameterized. I agree you might not want to bite this off in this PR, but this piece of code is the reason you commonly see random nmethods appearing on the tty just before a crash. verbose is only ever true when called from findpc in debug.cpp. All the other non-verbose work print_nmethod does is useless, like writing to the xtty, and otherwise boils down to nm->print_on(tty). But all these printing paths could use a rework so I'm fine if you skip it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From mdoerr at openjdk.java.net Mon Oct 11 18:28:56 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 11 Oct 2021 18:28:56 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v3] In-Reply-To: References: Message-ID: <4dGhPv7DKhCFo7EdlOW5AmNHgmTsI9gR_ExPcRqb6wM=.008ec779-aeba-4f0f-97f6-5e04ca8f4ed0@github.com> On Fri, 8 Oct 2021 15:41:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le.
> > Niklas Radomski has updated the pull request incrementally with three additional commits since the last revision: > > - Remove superfluous copyright change > - Fix predicate conditions > - Add ByteSize constructor to Address Thanks for doing the requested fixes. LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5842 From coleenp at openjdk.java.net Mon Oct 11 18:39:44 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 11 Oct 2021 18:39:44 GMT Subject: RFR: 8274794: Make Thread::_owned_locks available in product [v2] In-Reply-To: References: Message-ID: > Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. > Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix optimized build (move print_on) up. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5896/files - new: https://git.openjdk.java.net/jdk/pull/5896/files/0f185c2d..69802d3f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5896&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5896&range=00-01 Stats: 30 lines in 1 file changed: 15 ins; 15 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5896.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5896/head:pull/5896 PR: https://git.openjdk.java.net/jdk/pull/5896 From dnsimon at openjdk.java.net Mon Oct 11 19:32:15 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 19:32:15 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v2] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Mon, 11 Oct 2021 17:28:10 GMT, Tom Rodriguez wrote: >> I don't think so. As far as I can see, the intent is to have a single line of output for an nmethod followed by an optional " for ..." suffix when `verbose == true`. >> The `nm->print_nmethod(verbose)` call then prints extra multi-line detail for the nmethod with the number of lines printed governed by `verbose`. >> This code seems like it went from being hard coded to always go to `tty` and was then parameterized to go to an arbitrary stream but the evolution accidentally overlooked some code that still goes to `tty`. >> I don't want to make extensive changes here as there really should be a single effort to rationalize all dumping to ensure it's parameterized. > > I agree you might not want to bite this off in this PR, but this piece of code is the reason you commonly see random nmethods appearing on the tty just before a crash. verbose is only ever true when called from findpc in debug.cpp. All the other non-verbose work print_nmethod does is useless, like writing to the xtty, and otherwise boils down to nm->print_on(tty).
But all these printing paths could use a rework so I'm fine if you skip it. As you convinced me in a side conversation, your suggestion has the advantage that we now at least get this output in a hs_err file instead of losing it altogether. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Mon Oct 11 19:45:23 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Mon, 11 Oct 2021 19:45:23 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v3] In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. > In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814).
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: print extra info for an nmethod in CodeBlob::dump_for_addr to st instead of tty in non-verbose mode ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5875/files - new: https://git.openjdk.java.net/jdk/pull/5875/files/ae0fa83a..feb40153 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=01-02 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5875.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875 PR: https://git.openjdk.java.net/jdk/pull/5875 From never at openjdk.java.net Mon Oct 11 19:52:59 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Mon, 11 Oct 2021 19:52:59 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v3] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: <1euAUwriI-73wiESteRc5-qiBWWWHu-DbZZSD3fDkJQ=.cf7accca-94b0-4b86-b582-1277b3b6a996@github.com> On Mon, 11 Oct 2021 19:45:23 GMT, Doug Simon wrote: >> This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. >> In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. >> The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. 
>> >> There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > print extra info for an nmethod in CodeBlob::dump_for_addr to st instead of tty in non-verbose mode Marked as reviewed by never (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From duke at openjdk.java.net Mon Oct 11 20:17:36 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 11 Oct 2021 20:17:36 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: Message-ID: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> > This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). > > It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: > > - `none`: no implementation for spin pauses. This is the default value. > - `nop`: use `nop` instruction for spin pauses. > - `isb`: use `isb` instruction for spin pauses. > - `yield`: use `yield` instruction for spin pauses. > > And `OnSpinWaitInstCount=count`, where `count` specifies the number of `OnSpinWaitInst` instructions and can be in the `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. > > The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`.
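Editorial sketch of the option rules described above, assuming only what the description states: four instruction choices, a count in the range 1..99, and an error when a count is given while the instruction is `none`. The names `SpinWaitInst`, `parse_inst`, `config_is_valid` and `spin_wait_sequence` are hypothetical and do not come from the patch; the "emitted" sequence is just a list of mnemonics, not real code generation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class SpinWaitInst { None, Nop, Isb, Yield };

// Maps the textual option value to the enum; returns false for unknown text.
bool parse_inst(const std::string& text, SpinWaitInst& out) {
  if (text == "none")  { out = SpinWaitInst::None;  return true; }
  if (text == "nop")   { out = SpinWaitInst::Nop;   return true; }
  if (text == "isb")   { out = SpinWaitInst::Isb;   return true; }
  if (text == "yield") { out = SpinWaitInst::Yield; return true; }
  return false;
}

// A count may only be given together with a real instruction,
// and it must lie in 1..99.
bool config_is_valid(SpinWaitInst inst, bool count_given, int count) {
  if (!count_given) return true;
  if (inst == SpinWaitInst::None) return false;
  return count >= 1 && count <= 99;
}

// Lists the mnemonics a spin pause would consist of for a validated
// configuration: `count` copies of the chosen instruction, or nothing.
std::vector<std::string> spin_wait_sequence(SpinWaitInst inst, int count) {
  const char* mnemonic = "";
  switch (inst) {
    case SpinWaitInst::None:  return {};
    case SpinWaitInst::Nop:   mnemonic = "nop";   break;
    case SpinWaitInst::Isb:   mnemonic = "isb";   break;
    case SpinWaitInst::Yield: mnemonic = "yield"; break;
  }
  return std::vector<std::string>(static_cast<std::size_t>(count),
                                  std::string(mnemonic));
}
```

Keeping validation separate from emission means an invalid combination can be rejected at startup, before any intrinsic code is generated.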
> > Testing: > > - `make test TEST="gtest"`: Passed > - `make run-test TEST="tier1"`: Passed > - `make run-test TEST="tier2"`: Passed > - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed > > CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5562/files - new: https://git.openjdk.java.net/jdk/pull/5562/files/122ea2e9..85fe15b2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=08-09 Stats: 6 lines in 1 file changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5562.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5562/head:pull/5562 PR: https://git.openjdk.java.net/jdk/pull/5562 From nradomski at openjdk.java.net Mon Oct 11 20:18:54 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Mon, 11 Oct 2021 20:18:54 GMT Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v3] In-Reply-To: References: Message-ID: <1R7fNrUlVve0iCeLPoSqdffRJzwjNMBjv7ELUcE3fNc=.bbc9ad55-a976-4129-9da0-dad205a7a50a@github.com> On Fri, 8 Oct 2021 15:41:47 GMT, Niklas Radomski wrote: >> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. > > Niklas Radomski has updated the pull request incrementally with three additional commits since the last revision: > > - Remove superfluous copyright change > - Fix predicate conditions > - Add ByteSize constructor to Address Thank you for your reviews! Glad to see that the changes have been so well received. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5842 From dholmes at openjdk.java.net Mon Oct 11 22:03:53 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 11 Oct 2021 22:03:53 GMT Subject: RFR: 8274794: Make Thread::_owned_locks available in product [v2] In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 18:39:44 GMT, Coleen Phillimore wrote: >> Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. >> Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix optimized build (move print_on) up. Hi Coleen, Sorry but I have significant concerns with parts of this change. First, it isn't safe to print the owned locks of all threads during error reporting as those threads are still running and could be updating their list of owned locks. I also think there is still a need for the _mutex_array because that allows you to check what each mutex thinks its state is, independently of what the threads think, i.e. if there's a bug in the owned_locks code and a locked lock was mistakenly elided, then you now have no way to know that. My understanding of the motivation for this change was to allow printing of the dynamically defined locks held by the current thread during error reporting - but this change tries to go way beyond that. Thanks, David ------------- Changes requested by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/5896 From dholmes at openjdk.java.net Tue Oct 12 02:12:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 02:12:48 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:21:21 GMT, Per Liden wrote: > I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: > > * Rename AbstractGangTask to WorkerTask > * Rename WorkGang to WorkerThreads > * Fold GangWorker into WorkerThread > * Fold WorkManager into WorkerThreads > * Move SubTaskDone and friends to a new workerUtils.hpp/cpp > > I've split things up into several commits to make it easier to review. > > Testing: Passes Tier 1-3 on all Oracle platforms. Changes to non GC files look fine. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5886 From xliu at openjdk.java.net Tue Oct 12 04:35:17 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 12 Oct 2021 04:35:17 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. 
> > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... > 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) Xin Liu has updated the pull request incrementally with one additional commit since the last revision: Revert JavaThreadInVMAndNative change. Only change the state to Native if current thread is Java Thread and it's in VM. Restore thread state after fork_and_exec() if it's changed. 
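The save/switch/restore behaviour described in this revision can be pictured as a small RAII guard. The sketch below is a standalone illustration under stated assumptions, not the `VMErrorForceInNative` class from the patch: `ThreadState`, `MockThread`, `ScopedForceInNative` and `state_during_guard` are hypothetical stand-ins, and the real HotSpot thread-state API is not used.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's thread states; not the real definitions.
enum ThreadState { thread_in_java, thread_in_vm, thread_in_native };

// Hypothetical minimal thread object carrying only a state field.
struct MockThread {
  ThreadState state;
};

// RAII guard: if the thread is currently "in VM", switch it to "in native"
// for the lifetime of the guard and restore the saved state on destruction.
// A null thread or any other starting state leaves everything untouched.
class ScopedForceInNative {
  MockThread* _thread;
  ThreadState _saved;
  bool _changed;

 public:
  explicit ScopedForceInNative(MockThread* t)
      : _thread(t), _saved(thread_in_native), _changed(false) {
    if (_thread != nullptr && _thread->state == thread_in_vm) {
      _saved = _thread->state;
      _thread->state = thread_in_native;
      _changed = true;
    }
  }

  ~ScopedForceInNative() {
    if (_changed) {
      assert(_thread != nullptr);
      _thread->state = _saved;  // restore only if the constructor changed it
    }
  }

  ScopedForceInNative(const ScopedForceInNative&) = delete;
  ScopedForceInNative& operator=(const ScopedForceInNative&) = delete;

  bool changed() const { return _changed; }
};

// Hypothetical helper: reports the state seen while the guard is active.
// A blocking call such as fork/exec would sit where the return statement is.
inline ThreadState state_during_guard(MockThread& t) {
  ScopedForceInNative guard(&t);
  return t.state;
}
```

Because the restore happens in the destructor, every exit path out of the guarded scope puts the state back, which is the main reason to prefer a guard object over paired set/reset calls.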
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5590/files - new: https://git.openjdk.java.net/jdk/pull/5590/files/62c9e736..b7ce3f9b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=05-06 Stats: 55 lines in 2 files changed: 51 ins; 2 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5590/head:pull/5590 PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Tue Oct 12 05:35:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 05:35:51 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 04:35:17 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. 
To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Revert JavaThreadInVMAndNative change. > > Only change the state to Native if current thread is Java Thread and it's in VM. > Restore thread state after fork_and_exec() if it's changed. src/hotspot/share/runtime/interfaceSupport.inline.hpp line 215: > 213: }; > 214: > 215: class JavaThreadInVMAndNative : public StackObj { Shouldn't this have been removed again? src/hotspot/share/utilities/vmError.cpp line 45: > 43: #include "runtime/frame.inline.hpp" > 44: #include "runtime/init.hpp" > 45: #include "runtime/interfaceSupport.inline.hpp" Why is this needed now? src/hotspot/share/utilities/vmError.cpp line 1348: > 1346: public: > 1347: VMErrorForceInNative(Thread* t): _jt(t != NULL && t->is_Java_thread() ? JavaThread::cast(t) : NULL) { > 1348: if (_jt != NULL && _jt->thread_state() == _thread_in_vm) { What if it is _thread_in_Java? Isn't that possible. 
I think you can unconditionally change the state to `_thread_in_native` and then restore it. src/hotspot/share/utilities/vmError.cpp line 1659: > 1657: VMErrorForceInNative fn(Thread::current_or_null()); > 1658: if (os::fork_and_exec(cmd) < 0) { > 1659: out.print_cr("os::fork_and_exec failed: %s (%s=%d)", I'm still concerned about using out.print_cr if you are _thread_in_native - is it safe? If it is, and if `print_raw` is also safe to execute when in native, then you could just apply the VMErrorForceInNative before the while loop. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xliu at openjdk.java.net Tue Oct 12 05:48:56 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 12 Oct 2021 05:48:56 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 05:28:19 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert JavaThreadInVMAndNative change. >> >> Only change the state to Native if current thread is Java Thread and it's in VM. >> Restore thread state after fork_and_exec() if it's changed. > > src/hotspot/share/runtime/interfaceSupport.inline.hpp line 215: > >> 213: }; >> 214: >> 215: class JavaThreadInVMAndNative : public StackObj { > > Shouldn't this have been removed again? oh, yeah. 
sorry ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Tue Oct 12 06:08:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 06:08:51 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 04:35:17 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... 
>> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Revert JavaThreadInVMAndNative change. > > Only change the state to Native if current thread is Java Thread and it's in VM. > Restore thread state after fork_and_exec() if it's changed. Changes requested by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xliu at openjdk.java.net Tue Oct 12 06:08:52 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 12 Oct 2021 06:08:52 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 05:25:31 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert JavaThreadInVMAndNative change. >> >> Only change the state to Native if current thread is Java Thread and it's in VM. >> Restore thread state after fork_and_exec() if it's changed. > > src/hotspot/share/utilities/vmError.cpp line 1348: > >> 1346: public: >> 1347: VMErrorForceInNative(Thread* t): _jt(t != NULL && t->is_Java_thread() ? 
JavaThread::cast(t) : NULL) { >> 1348: if (_jt != NULL && _jt->thread_state() == _thread_in_vm) { > > What if it is _thread_in_Java? Isn't that possible. > > I think you can unconditionally change the state to `_thread_in_native` and then restore it. > // _thread_in_Java : Executing either interpreted or compiled Java code (or could be in a stub) You mean it's possible to get here from the interpreter or from compiled code? The HotSpot runtime is so complex. I can't wrap my head around it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xliu at openjdk.java.net Tue Oct 12 06:15:51 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 12 Oct 2021 06:15:51 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 05:27:15 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert JavaThreadInVMAndNative change. >> >> Only change the state to Native if current thread is Java Thread and it's in VM. >> Restore thread state after fork_and_exec() if it's changed. > > src/hotspot/share/utilities/vmError.cpp line 1659: > >> 1657: VMErrorForceInNative fn(Thread::current_or_null()); >> 1658: if (os::fork_and_exec(cmd) < 0) { >> 1659: out.print_cr("os::fork_and_exec failed: %s (%s=%d)", > > I'm still concerned about using out.print_cr if you are _thread_in_native - is it safe? If it is, and if `print_raw` is also safe to execute when in native, then you could just apply the VMErrorForceInNative before the while loop. I think it's safe. IIUC, a Java thread in Native is safe if we keep track of all oops in handles. In that sense, we protect them from GC. An fdStream should have nothing to do with oops and GC.
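For readers following the thread-state discussion, a deliberately simplified model of why the state matters: a safepoint can only begin once no thread remains in a state that still has to reach a poll. The sketch assumes that threads in native or blocked states count as safe and everything else must be waited for; the names (`ThreadState`, `is_safepoint_safe`, `threads_blocking_safepoint`) are hypothetical and do not correspond to HotSpot code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced set of thread states for the model.
enum ThreadState {
  thread_in_java,
  thread_in_vm,
  thread_in_native,
  thread_blocked
};

// Simplifying assumption of this model: native and blocked threads are
// treated as safepoint-safe; threads in Java or in the VM are not.
inline bool is_safepoint_safe(ThreadState s) {
  return s == thread_in_native || s == thread_blocked;
}

// Number of threads a safepoint request would still have to wait for.
inline int threads_blocking_safepoint(const std::vector<ThreadState>& states) {
  int waiting = 0;
  for (ThreadState s : states) {
    if (!is_safepoint_safe(s)) {
      ++waiting;
    }
  }
  return waiting;
}
```

In this model, a thread that stays in `thread_in_vm` while it waits for a child process keeps the count at one forever, which matches the timeout in the log above; marking it `thread_in_native` for the duration of the wait brings the count to zero.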
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From shade at openjdk.java.net Tue Oct 12 06:25:52 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 06:25:52 GMT Subject: RFR: 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:52:03 GMT, Aleksey Shipilev wrote: > tier1 test that is newly added by [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) fails on hsdis-enabled machine. I chose to fix it by annotating the real disassembly with `[Disassembly]`, and checking for its existence in the test. > > Additional testing: > - [x] Test now passes when `hsdis` is installed > - [x] Test still passes when `hsdis` is not installed Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/5888 From shade at openjdk.java.net Tue Oct 12 06:25:53 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 12 Oct 2021 06:25:53 GMT Subject: Integrated: 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:52:03 GMT, Aleksey Shipilev wrote: > tier1 test that is newly added by [JDK-8272586](https://bugs.openjdk.java.net/browse/JDK-8272586) fails on hsdis-enabled machine. I chose to fix it by annotating the real disassembly with `[Disassembly]`, and checking for its existence in the test. > > Additional testing: > - [x] Test now passes when `hsdis` is installed > - [x] Test still passes when `hsdis` is not installed This pull request has now been integrated. 
Changeset: a5f09d10 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/a5f09d1088d9dd610139370739e9fbd6e34416cb Stats: 9 lines in 2 files changed: 9 ins; 0 del; 0 mod 8275031: runtime/ErrorHandling/MachCodeFramesInErrorFile.java fails when hsdis is present Reviewed-by: dholmes, dnsimon ------------- PR: https://git.openjdk.java.net/jdk/pull/5888 From xliu at openjdk.java.net Tue Oct 12 07:02:12 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 12 Oct 2021 07:02:12 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. > > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. 
To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... > 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) Xin Liu has updated the pull request incrementally with one additional commit since the last revision: Change to VM unconditionally as long as current thread is JavaThread. Hoist VMErrorForceNative out of While. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5590/files - new: https://git.openjdk.java.net/jdk/pull/5590/files/b7ce3f9b..09d59f47 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5590&range=06-07 Stats: 49 lines in 2 files changed: 8 ins; 38 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5590.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5590/head:pull/5590 PR: https://git.openjdk.java.net/jdk/pull/5590 From dnsimon at openjdk.java.net Tue Oct 12 07:34:22 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 07:34:22 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v4] In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. > In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). Doug Simon has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - print extra info for an nmethod in CodeBlob::dump_for_addr to st instead of tty in non-verbose mode - made ErrorLogPrintCodeLimit a diagnostic flag - make max code printed in hs-err log configurable and also scan Java stack for code to print - do not unnecessarily pollute tty ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5875/files - new: https://git.openjdk.java.net/jdk/pull/5875/files/feb40153..83e6abd4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=02-03 Stats: 835 lines in 73 files changed: 346 ins; 247 del; 242 mod Patch: https://git.openjdk.java.net/jdk/pull/5875.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875 PR: https://git.openjdk.java.net/jdk/pull/5875 From dholmes at openjdk.java.net Tue Oct 12 08:00:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 08:00:50 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v4] In-Reply-To: <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> Message-ID: On Tue, 12 Oct 2021 07:34:22 GMT, Doug Simon wrote: >> This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. >> In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. 
For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. >> The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. >> >> There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). > > Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - print extra info for an nmethod in CodeBlob::dump_for_addr to st instead of tty in non-verbose mode > - made ErrorLogPrintCodeLimit a diagnostic flag > - make max code printed in hs-err log configurable and also scan Java stack for code to print > - do not unnecessarily pollute tty Hi Doug, A couple of minor comments/queries but otherwise this seems okay. Thanks, David src/hotspot/share/runtime/flags/jvmFlagLimit.cpp line 37: > 35: #include "oops/markWord.hpp" > 36: #include "runtime/task.hpp" > 37: #include "utilities/vmError.hpp" Why is this needed? src/hotspot/share/utilities/vmError.cpp line 920: > 918: // Even though ErrorLogPrintCodeLimit is ranged checked > 919: // during argument parsing, there's no way to prevent it > 920: // being set to a value outside the range. I don't understand what you mean here. Do we not abort the VM if the flag value is out-of-range? src/hotspot/share/utilities/vmError.cpp line 925: > 923: // Scan the native stack > 924: if (!_print_native_stack_used) { > 925: // Only try print code of the crashing frame since Existing typo: s/try/try to/ ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Tue Oct 12 08:13:55 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 08:13:55 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v4] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> Message-ID: <-9KCrDWTzsu7HeZNK0nEHN8UwI3UmQUv8XX1gWyxB_M=.dfe28c56-c374-4558-87cf-7cf7349d76da@github.com> On Tue, 12 Oct 2021 07:36:36 GMT, David Holmes wrote: >> Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - print extra info for an nmethod in CodeBlob::dump_for_addr to st instead of tty in non-verbose mode >> - made ErrorLogPrintCodeLimit a diagnostic flag >> - make max code printed in hs-err log configurable and also scan Java stack for code to print >> - do not unnecessarily pollute tty > > src/hotspot/share/runtime/flags/jvmFlagLimit.cpp line 37: > >> 35: #include "oops/markWord.hpp" >> 36: #include "runtime/task.hpp" >> 37: #include "utilities/vmError.hpp" > > Why is this needed? For the definition of `VMError::max_error_log_print_code` to be available just like `task.hpp` is included to make `PeriodicTask::min_interval` (used in defining the range of `PerfDataSamplingInterval`) available. > src/hotspot/share/utilities/vmError.cpp line 920: > >> 918: // Even though ErrorLogPrintCodeLimit is ranged checked >> 919: // during argument parsing, there's no way to prevent it >> 920: // being set to a value outside the range. > > I don't understand what you mean here. Do we not abort the VM if the flag value is out-of-range? `ErrorLogPrintCodeLimit` is a writable global variable so this is just being extra defensive should anything update `ErrorLogPrintCodeLimit` (e.g. 
`ErrorLogPrintCodeLimit = 100;`) after argument parsing. > src/hotspot/share/utilities/vmError.cpp line 925: > >> 923: // Scan the native stack >> 924: if (!_print_native_stack_used) { >> 925: // Only try print code of the crashing frame since > > Existing typo: s/try/try to/ Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From aph at openjdk.java.net Tue Oct 12 08:14:57 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 12 Oct 2021 08:14:57 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Mon, 11 Oct 2021 20:17:36 GMT, Evgeny Astigeevich wrote: >> This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). >> >> It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: >> >> - `none`: no implementation for spin pauses. This is the default value. >> - `nop`: use `nop` instruction for spin pauses. >> - `isb`: use `isb` instruction for spin pauses. >> - `yield`: use `yield` instruction for spin pauses. >> >> And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. >> >> The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`.
>> >> Testing: >> >> - `make test TEST="gtest"`: Passed >> - `make run-test TEST="tier1"`: Passed >> - `make run-test TEST="tier2"`: Passed >> - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed >> >> CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC Can we have a simple (as simple as possible) JMH benchmark, please? It should be something like a couple of threads racing to count up to a million. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From dnsimon at openjdk.java.net Tue Oct 12 08:28:07 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 08:28:07 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5] In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. > In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). 
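The interaction between the hard limit of 10, the stack-allocated duplicate buffer, and a flag value that might be out of range can be sketched as below. This is an illustration only, not the code in `VMError::print_code`: `kMaxPrintCode`, `clamp_print_limit` and `PrintedSet` are hypothetical names, and the lower bound of 0 is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name for the hard limit of 10 mentioned in the description.
const int kMaxPrintCode = 10;

// Defensive clamp: whatever value the flag holds at crash time, the result
// never exceeds the capacity of the fixed-size buffer below.
inline int clamp_print_limit(int requested) {
  if (requested < 0) return 0;  // lower bound of 0 is an assumption
  if (requested > kMaxPrintCode) return kMaxPrintCode;
  return requested;
}

// Fixed-capacity record of addresses already printed. It needs no heap
// allocation, so it can live on the stack during error reporting.
struct PrintedSet {
  std::uintptr_t entries[kMaxPrintCode];
  int count = 0;

  // Returns true if addr has not been seen and there is still room under the
  // clamped limit; in that case addr is recorded.
  bool add_if_new(std::uintptr_t addr, int limit) {
    int effective = clamp_print_limit(limit);
    for (int i = 0; i < count; i++) {
      if (entries[i] == addr) return false;  // duplicate, skip it
    }
    if (count >= effective) return false;    // limit reached
    entries[count++] = addr;
    return true;
  }
};
```

Clamping at the point of use, rather than trusting the range check done at argument parsing, means a later write to the flag cannot push `count` past the end of `entries`.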
Doug Simon has updated the pull request incrementally with two additional commits since the last revision: - clarify need for defensive code - grammar fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5875/files - new: https://git.openjdk.java.net/jdk/pull/5875/files/83e6abd4..29060fc9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5875&range=03-04 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5875.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875 PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Tue Oct 12 08:28:08 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 08:28:08 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v4] In-Reply-To: <-9KCrDWTzsu7HeZNK0nEHN8UwI3UmQUv8XX1gWyxB_M=.dfe28c56-c374-4558-87cf-7cf7349d76da@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> <-9KCrDWTzsu7HeZNK0nEHN8UwI3UmQUv8XX1gWyxB_M=.dfe28c56-c374-4558-87cf-7cf7349d76da@github.com> Message-ID: On Tue, 12 Oct 2021 08:08:25 GMT, Doug Simon wrote: >> src/hotspot/share/utilities/vmError.cpp line 920: >> >>> 918: // Even though ErrorLogPrintCodeLimit is ranged checked >>> 919: // during argument parsing, there's no way to prevent it >>> 920: // being set to a value outside the range. >> >> I don't understand what you mean here. Do we not abort the VM if the flag value is out-of-range? > > `ErrorLogPrintCodeLimit` is a writable global variable so this is just being extra defensive should anything update `ErrorLogPrintCodeLimit` (e.g. `ErrorLogPrintCodeLimit == 100;`) after argument parsing. 
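A minimal standalone sketch of the defensive clamp under discussion, assuming the flag is an ordinary writable global. `MIN2` is reimplemented locally so the snippet compiles outside HotSpot, and `effective_print_limit` and the concrete values are illustrative assumptions rather than the code in the PR.

```cpp
// Local stand-in for HotSpot's MIN2 so the sketch is self-contained.
template <typename T>
constexpr T MIN2(T a, T b) { return a < b ? a : b; }

// Assumed for illustration: hard cap of 10, as stated in the PR description.
constexpr long max_error_log_print_code = 10;

// Writable global, standing in for the real flag.
long ErrorLogPrintCodeLimit = 3;

// Clamp at the point of use: even if something assigns an out-of-range
// value after argument parsing, the bound of the stack buffer is respected.
long effective_print_limit() {
  return MIN2(ErrorLogPrintCodeLimit, max_error_log_print_code);
}
```

The sketch only guards the upper bound; a negative value would need a separate check.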
I clarified the comment: 29060fc92429d9c33b5f90d5e9caba08a54b0e4e ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From kbarrett at openjdk.java.net Tue Oct 12 09:44:16 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 12 Oct 2021 09:44:16 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 [v2] In-Reply-To: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: > Please review this change to the linux-aarch64 port to support relaxed > atomic add operations. Both ldxr/stxr and LSE-based implementations are > provided. The appropriate one to use selected at runtime, using the existing > infrastructure for that selection. > > Testing: > mach5 tier1-3 > Some hand-checked performance and functional testing that both LSE and > non-LSE implementations work, and that the relaxed operations are faster > than the conservative default. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'master' into aarch64_touch_exp - relaxed atomic add ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5785/files - new: https://git.openjdk.java.net/jdk/pull/5785/files/be7d43e4..942221ce Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5785&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5785&range=00-01 Stats: 11761 lines in 530 files changed: 6041 ins; 3729 del; 1991 mod Patch: https://git.openjdk.java.net/jdk/pull/5785.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5785/head:pull/5785 PR: https://git.openjdk.java.net/jdk/pull/5785 From kbarrett at openjdk.java.net Tue Oct 12 09:44:18 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 12 Oct 2021 09:44:18 GMT Subject: RFR: 8274615: Support relaxed atomic add for linux-aarch64 [v2] In-Reply-To: References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: On Fri, 8 Oct 2021 09:30:14 GMT, Andrew Haley wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into aarch64_touch_exp >> - relaxed atomic add > > Looks good, thanks. Thanks @theRealAph and @dholmes-ora for reviews. 
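For readers less familiar with the distinction in the PR description, the difference between a relaxed and a more strongly ordered atomic add can be illustrated with standard C++ atomics. This is an analogy only: HotSpot uses its own `Atomic` class and hand-written AArch64 stubs rather than `std::atomic`, and the function names below are made up. `memory_order_seq_cst` is used as the closest standard counterpart of the "conservative default"; that mapping is an assumption, not something stated in the thread.

```cpp
#include <atomic>
#include <cstdint>

std::atomic<std::uint64_t> counter{0};

// Relaxed: the add itself is atomic, but it imposes no ordering on
// surrounding memory accesses. Typical use: statistics counters.
std::uint64_t add_relaxed(std::uint64_t v) {
  return counter.fetch_add(v, std::memory_order_relaxed) + v;
}

// Sequentially consistent: the add additionally participates in a single
// total order of seq_cst operations, which usually costs more on AArch64.
std::uint64_t add_ordered(std::uint64_t v) {
  return counter.fetch_add(v, std::memory_order_seq_cst) + v;
}
```

Both functions return the new value, i.e. the value after the addition.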
------------- PR: https://git.openjdk.java.net/jdk/pull/5785 From kbarrett at openjdk.java.net Tue Oct 12 09:44:20 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 12 Oct 2021 09:44:20 GMT Subject: Integrated: 8274615: Support relaxed atomic add for linux-aarch64 In-Reply-To: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> References: <1V2HK9V1rgEy2RkQmT_1PrIIbQcOAd94etBUs026Yds=.d5740b78-e13d-4abd-a0c7-3a3fbc2a4ac0@github.com> Message-ID: On Fri, 1 Oct 2021 05:04:45 GMT, Kim Barrett wrote: > Please review this change to the linux-aarch64 port to support relaxed > atomic add operations. Both ldxr/stxr and LSE-based implementations are > provided. The appropriate one to use selected at runtime, using the existing > infrastructure for that selection. > > Testing: > mach5 tier1-3 > Some hand-checked performance and functional testing that both LSE and > non-LSE implementations work, and that the relaxed operations are faster > than the conservative default. This pull request has now been integrated. 
Changeset: 8de26361 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/8de26361f7d789c7b317536198c891756038a8ea Stats: 64 lines in 4 files changed: 51 ins; 0 del; 13 mod 8274615: Support relaxed atomic add for linux-aarch64 Reviewed-by: aph, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/5785 From dnsimon at openjdk.java.net Tue Oct 12 09:52:57 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 09:52:57 GMT Subject: Integrated: 8274986: max code printed in hs-err logs should be configurable In-Reply-To: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Fri, 8 Oct 2021 22:28:53 GMT, Doug Simon wrote: > This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. > In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. > The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. > > There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). This pull request has now been integrated. 
Changeset: 33050f80 Author: Doug Simon URL: https://git.openjdk.java.net/jdk/commit/33050f8013366f5e3a01ab1a75ba3fee9cc73089 Stats: 171 lines in 6 files changed: 76 ins; 38 del; 57 mod 8274986: max code printed in hs-err logs should be configurable Reviewed-by: never, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dholmes at openjdk.java.net Tue Oct 12 11:45:58 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 11:45:58 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v4] In-Reply-To: <-9KCrDWTzsu7HeZNK0nEHN8UwI3UmQUv8XX1gWyxB_M=.dfe28c56-c374-4558-87cf-7cf7349d76da@github.com> References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <97Qv0DRICqk36UFMtjY8IX7k-K6TPAW6ZZk5KPNz-_0=.4c895dae-c478-4ed9-ac86-94e5809fed4b@github.com> <-9KCrDWTzsu7HeZNK0nEHN8UwI3UmQUv8XX1gWyxB_M=.dfe28c56-c374-4558-87cf-7cf7349d76da@github.com> Message-ID: <9SbNldj32iBS3XPUP-fCJcSir0-yqLcwTiTtEf032R0=.9a470d97-ea74-4591-bf3d-d4cafd352402@github.com> On Tue, 12 Oct 2021 08:06:22 GMT, Doug Simon wrote: >> src/hotspot/share/runtime/flags/jvmFlagLimit.cpp line 37: >> >>> 35: #include "oops/markWord.hpp" >>> 36: #include "runtime/task.hpp" >>> 37: #include "utilities/vmError.hpp" >> >> Why is this needed? > > For the definition of `VMError::max_error_log_print_code` to be available just like `task.hpp` is included to make `PeriodicTask::min_interval` (used in defining the range of `PerfDataSamplingInterval`) available. I'm guessing there is some macro weirdness involved when defining the flag in globals.hpp. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dholmes at openjdk.java.net Tue Oct 12 11:45:59 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 11:45:59 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: On Tue, 12 Oct 2021 08:28:07 GMT, Doug Simon wrote: >> This PR adds a `ErrorLogPrintCodeLimit` (develop) option for configuring the amount of code printed in a hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in `VMError::print_code` is stack allocated. >> In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub. >> The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods. >> >> There's one other minor change to address [this comment](https://github.com/openjdk/jdk/pull/5446#issuecomment-938518814). > > Doug Simon has updated the pull request incrementally with two additional commits since the last revision: > > - clarify need for defensive code > - grammar fix src/hotspot/share/utilities/vmError.cpp line 921: > 919: // during argument parsing, there's no way to prevent it > 920: // subsequently (i.e., after parsing) being set to a > 921: // value outside the range. This is overly defensive IMO. Flags should never be touched after initialization is complete and we assume we can trust ourselves. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From dnsimon at openjdk.java.net Tue Oct 12 11:57:57 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 11:57:57 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> Message-ID: <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com> On Tue, 12 Oct 2021 11:42:47 GMT, David Holmes wrote: >> Doug Simon has updated the pull request incrementally with two additional commits since the last revision: >> >> - clarify need for defensive code >> - grammar fix > > src/hotspot/share/utilities/vmError.cpp line 921: > >> 919: // during argument parsing, there's no way to prevent it >> 920: // subsequently (i.e., after parsing) being set to a >> 921: // value outside the range. > > This is overly defensive IMO. Flags should never be touched after initialization is complete and we assume we can trust ourselves. In general I agree but in error reporting code I get extra defensive plus the defensive code is small and not on a hot path. That said, I won't object to it being undone in a subsequent PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From david.holmes at oracle.com Tue Oct 12 12:02:55 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 12 Oct 2021 22:02:55 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: <1e531383-4b9e-5b10-9d37-69b9f1fbc098@oracle.com> On 12/10/2021 4:08 pm, Xin Liu wrote: > On Tue, 12 Oct 2021 05:25:31 GMT, David Holmes wrote: > >>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Revert JavaThreadInVMAndNative change. 
>>> >>> Only change the state to Native if current thread is Java Thread and it's in VM. >>> Restore thread state after fork_and_exec() if it's changed. >> >> src/hotspot/share/utilities/vmError.cpp line 1348: >> >>> 1346: public: >>> 1347: VMErrorForceInNative(Thread* t): _jt(t != NULL && t->is_Java_thread() ? JavaThread::cast(t) : NULL) { >>> 1348: if (_jt != NULL && _jt->thread_state() == _thread_in_vm) { >> >> What if it is _thread_in_Java? Isn't that possible. >> >> I think you can unconditionally change the state to `_thread_in_native` and then restore it. > >> // _thread_in_Java : Executing either interpreted or compiled Java code (or could be in a stub) > > you mean it's possible to be here from interpreter, or compiled code? > Hotspot runtime is so complex. I can't wrap my head around it. A crash can happen in any code. David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From david.holmes at oracle.com Tue Oct 12 12:04:22 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 12 Oct 2021 22:04:22 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: <6e337385-ca91-cf87-b5be-c5c1c777cfc4@oracle.com> On 12/10/2021 4:15 pm, Xin Liu wrote: > On Tue, 12 Oct 2021 05:27:15 GMT, David Holmes wrote: > >>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Revert JavaThreadInVMAndNative change. >>> >>> Only change the state to Native if current thread is Java Thread and it's in VM. >>> Restore thread state after fork_and_exec() if it's changed. 
>> >> src/hotspot/share/utilities/vmError.cpp line 1659: >> >>> 1657: VMErrorForceInNative fn(Thread::current_or_null()); >>> 1658: if (os::fork_and_exec(cmd) < 0) { >>> 1659: out.print_cr("os::fork_and_exec failed: %s (%s=%d)", >> >> I'm still concerned about using out.print_cr if you are _thread_in_native - is it safe? If it is, and if `print_raw` is also safe to execute when in native, then you could just apply the VMErrorForceInNative before the while loop. > > I think it's safe. IIUC, a java thread in Native is safe if we keep track all oops in handles. > In that sense, we prevent them from GC. A fdStream should have nothing to do with oops and GC. Can out.print_cr acquire a lock (e.g. ttyLock?) ? If so then it could block and that would use a ThreadBlockInVM which requires it to be _thread_in_vm. That is the kind of "safety" I'm concerned about. David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From nradomski at openjdk.java.net Tue Oct 12 12:21:52 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Tue, 12 Oct 2021 12:21:52 GMT Subject: RFR: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 10:28:07 GMT, Martin Doerr wrote: > The default implementation of BarrierSetAssembler::resolve_jobject should be generic and support all GCs. Single GCs can still implement an optimized version. Marked as reviewed by nradomski (Author). ------------- PR: https://git.openjdk.java.net/jdk/pull/5821 From mdoerr at openjdk.java.net Tue Oct 12 12:21:52 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 12 Oct 2021 12:21:52 GMT Subject: RFR: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 10:28:07 GMT, Martin Doerr wrote: > The default implementation of BarrierSetAssembler::resolve_jobject should be generic and support all GCs. 
Single GCs can still implement an optimized version. Thanks for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/5821 From mdoerr at openjdk.java.net Tue Oct 12 12:21:53 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 12 Oct 2021 12:21:53 GMT Subject: Integrated: 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers In-Reply-To: References: Message-ID: On Tue, 5 Oct 2021 10:28:07 GMT, Martin Doerr wrote: > The default implementation of BarrierSetAssembler::resolve_jobject should be generic and support all GCs. Single GCs can still implement an optimized version. This pull request has now been integrated. Changeset: e16b93ad Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/e16b93ad52c96fddd9097c2cb0fa78ae781c547b Stats: 34 lines in 3 files changed: 31 ins; 0 del; 3 mod 8274770: [PPC64] resolve_jobject needs a generic implementation to support load barriers Reviewed-by: goetz, nradomski ------------- PR: https://git.openjdk.java.net/jdk/pull/5821 From burban at openjdk.java.net Tue Oct 12 12:39:14 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 12 Oct 2021 12:39:14 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v4] In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: > r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. 
>
> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here:
>
> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404
>
> I haven't seen this particular instance cause any issues in practice _yet_, presumably because it is hard to align the stars in order to trigger a problem (between the `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macros, and that causes trouble as explained in the link above.
>
> Output of `-XX:+PrintInterpreter` before this change:
>
> ----------------------------------------------------------------------
> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes
> --------------------------------------------------------------------------------
> 0x0000000138809b00: ldr x2, [x12, #16]
> 0x0000000138809b04: ldrh w2, [x2, #44]
> 0x0000000138809b08: add x24, x20, x2, uxtx #3
> 0x0000000138809b0c: sub x24, x24, #0x8
> [...]
> 0x0000000138809fa4: stp x16, x17, [sp, #128]
> 0x0000000138809fa8: stp x18, x19, [sp, #144]
> 0x0000000138809fac: stp x20, x21, [sp, #160]
> [...]
> 0x0000000138809fc0: stp x30, xzr, [sp, #240]
> 0x0000000138809fc4: mov x0, x28
> ;; 0x10864ACCC
> 0x0000000138809fc8: mov x9, #0xaccc // #44236
> 0x0000000138809fcc: movk x9, #0x864, lsl #16
> 0x0000000138809fd0: movk x9, #0x1, lsl #32
> 0x0000000138809fd4: blr x9
> 0x0000000138809fd8: ldp x2, x3, [sp, #16]
> [...]
> 0x0000000138809ff4: ldp x16, x17, [sp, #128] > 0x0000000138809ff8: ldp x18, x19, [sp, #144] > 0x0000000138809ffc: ldp x20, x21, [sp, #160] > > > After: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes > > -------------------------------------------------------------------------------- > 0x0000000108e4db00: ldr x2, [x12, #16] > 0x0000000108e4db04: ldrh w2, [x2, #44] > 0x0000000108e4db08: add x24, x20, x2, uxtx #3 > 0x0000000108e4db0c: sub x24, x24, #0x8 > [...] > 0x0000000108e4dfa4: stp x16, x17, [sp, #128] > 0x0000000108e4dfa8: stp x19, x20, [sp, #144] > 0x0000000108e4dfac: stp x21, x22, [sp, #160] > [...] > 0x0000000108e4dfbc: stp x29, x30, [sp, #224] > 0x0000000108e4dfc0: mov x0, x28 > ;; 0x107E4A06C > 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 > 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 > 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 > 0x0000000108e4dfd0: blr x9 > 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] > [...] > 0x0000000108e4dff0: ldp x16, x17, [sp, #128] > 0x0000000108e4dff4: ldp x19, x20, [sp, #144] > 0x0000000108e4dff8: ldp x21, x22, [sp, #160] > [...] 
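The register-set change described above can be illustrated with plain bit arithmetic. The sketch below is not the HotSpot register-set API; it only shows that clearing bit 18 of the `0x7fff_ffff` mask leaves 30 registers and makes `x17` directly followed by `x19`, matching the pairing in the "After" listing. `kAllGpr`, `kSpillMask` and `regs_in` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t kAllGpr = 0x7fffffffu;               // r0..r30
constexpr std::uint32_t kSpillMask = kAllGpr & ~(1u << 18);  // same set without r18

// Expand a mask into the ascending list of register numbers it selects.
std::vector<int> regs_in(std::uint32_t mask) {
  std::vector<int> regs;
  for (int r = 0; r < 32; r++) {
    if (mask & (1u << r)) regs.push_back(r);
  }
  return regs;
}
```

With an even number of registers left, the last store pairs `x29` with `x30`, so the `xzr` padding visible in the "before" listing is no longer needed.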
Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: use {push,pop}_CPU_state and fix r18 usage there as well ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5828/files - new: https://git.openjdk.java.net/jdk/pull/5828/files/ba55aa0e..d7ca7b39 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=02-03 Stats: 18 lines in 2 files changed: 7 ins; 7 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5828/head:pull/5828 PR: https://git.openjdk.java.net/jdk/pull/5828 From burban at openjdk.java.net Tue Oct 12 13:10:00 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Tue, 12 Oct 2021 13:10:00 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3] In-Reply-To: <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com> Message-ID: On Thu, 7 Oct 2021 08:50:15 GMT, Andrew Haley wrote: >> Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: >> >> fix linux/aarch64 build: s/r18/r18_tls > > So we have fixed the only use of `popa()` so that it does not clobber R18. That is good. > However, I think we've found another bug. `reguard_yellow_pages()` is a native function, so we should be saving floating-point registers too. It looks to me like this code should use `push_call_clobbered_registers()`. Then we'll have no use for `pusha()` in trunk, and your fix for `popa()` can go into 11u. Thanks for the suggestion @theRealAph. 
By looking through the code I discovered another buggy set of helpers, `{push,pop}_CPU_state()`, which also messed with `r18`. Does it look fine to you, or should the code around `reguard_yellow_pages()` rather use `{push,pop}_call_clobbered_registers()`?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From coleen.phillimore at oracle.com Tue Oct 12 13:19:06 2021
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 12 Oct 2021 09:19:06 -0400
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v7]
In-Reply-To: <6e337385-ca91-cf87-b5be-c5c1c777cfc4@oracle.com>
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <6e337385-ca91-cf87-b5be-c5c1c777cfc4@oracle.com>
Message-ID: <6166f716-7e3f-ae49-8a81-00ce185e9640@oracle.com>

On 10/12/21 8:04 AM, David Holmes wrote:
> On 12/10/2021 4:15 pm, Xin Liu wrote:
>> On Tue, 12 Oct 2021 05:27:15 GMT, David Holmes wrote:
>>
>>>> Xin Liu has updated the pull request incrementally with one
>>>> additional commit since the last revision:
>>>>
>>>>    Revert JavaThreadInVMAndNative change.
>>>>
>>>>    Only change the state to Native if current thread is Java
>>>>    Thread and it's in VM.
>>>>    Restore thread state after fork_and_exec() if it's changed.
>>>
>>> src/hotspot/share/utilities/vmError.cpp line 1659:
>>>
>>>> 1657:       VMErrorForceInNative fn(Thread::current_or_null());
>>>> 1658:       if (os::fork_and_exec(cmd) < 0) {
>>>> 1659:         out.print_cr("os::fork_and_exec failed: %s (%s=%d)",
>>>
>>> I'm still concerned about using out.print_cr if you are
>>> _thread_in_native - is it safe? If it is, and if `print_raw` is also
>>> safe to execute when in native, then you could just apply the
>>> VMErrorForceInNative before the while loop.
>>
>> I think it's safe. IIUC, a java thread in Native is safe if we keep
>> track all oops in handles.
>> In that sense, we prevent them from GC.
A fdStream should have >> nothing to do with oops and GC. > > Can out.print_cr acquire a lock (e.g. ttyLock?) ? If so then it could > block and that would use a ThreadBlockInVM which requires it to be > _thread_in_vm. That is the kind of "safety" I'm concerned about. ttyLock is taken and released inside of print_cr, but it's a nosafepoint lock so would not block. Coleen > > David > ----- > >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/5590 >> From coleenp at openjdk.java.net Tue Oct 12 13:28:51 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 12 Oct 2021 13:28:51 GMT Subject: RFR: 8274794: Make Thread::_owned_locks available in product [v2] In-Reply-To: References: Message-ID: <59m-yGgbtXkojaOIE_3kBV6SLRWdqkgGo3sAQ0fS7p4=.7ea72277-ddd4-4a72-baba-5aaa20621366@github.com> On Mon, 11 Oct 2021 18:39:44 GMT, Coleen Phillimore wrote: >> Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. >> Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix optimized build (move print_on) up. in opposite order, I've found the mutex_array to be not useful at all for debugging - didn't know it existed for 20yrs. You can check the lock you're interested in directly because it's a global name. The small chance of a thread adding a lock during error reporting and potential crash, is handled by the error reporting "STEP" framework, so I don't see that as a concern. There are a few locks that would be useful to see in error handling (like the HandshakeState_lock) that are not covered by the mutex_array because they're not global. It would have helped debug some bug I was looking for several months ago. 
I was actually motivated to do this because of some awful looking code in https://github.com/openjdk/jdk/pull/5590, but apparently that's been removed.

My real motivation, besides accurate error reporting, is to remove the 'def' macros in mutexLocker.cpp so that the default case of _allow_vm_block isn't specified, so that we can see the safepoint checking locks that pass true (and so block safepoints) but really should not pass true. Some of these are cut/paste code. This is the change that I wanted to get to: https://github.com/openjdk/jdk/commit/38b58c59b4bc9fbd62a57c3550d3e02f30efe8cb

-------------

PR: https://git.openjdk.java.net/jdk/pull/5896

From stuefe at openjdk.java.net Tue Oct 12 16:17:51 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 12 Oct 2021 16:17:51 GMT
Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5]
In-Reply-To: <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com>
References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com>
Message-ID:

On Tue, 12 Oct 2021 11:55:03 GMT, Doug Simon wrote:

>> src/hotspot/share/utilities/vmError.cpp line 921:
>>
>>> 919: // during argument parsing, there's no way to prevent it
>>> 920: // subsequently (i.e., after parsing) being set to a
>>> 921: // value outside the range.
>>
>> This is overly defensive IMO. Flags should never be touched after initialization is complete and we assume we can trust ourselves.
>
> In general I agree but in error reporting code I get extra defensive plus the defensive code is small and not on a hot path.
> That said, I won't object to it being undone in a subsequent PR.

I thought the point of MIN2 here is to handle ErrorLogPrintCodeLimit < max.
IMHO ErrorLogPrintCodeLimit > max would be just an assert-worthy error, since as David wrote flag values should not be changed after initialization. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From mcimadamore at openjdk.java.net Tue Oct 12 16:37:57 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 12 Oct 2021 16:37:57 GMT Subject: RFR: 8275063: Implementation of Memory Access API (Second incubator) Message-ID: This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. [1] - https://openjdk.java.net/jeps/419 ------------- Commit messages: - Fix issues with Aarch64 VaList implementation - Drop argument type profiling for removed `MemoryAccess` class - Add missing files - Tweak javadoc for MemeorySegment/MemoryAddress::setUtf8String - Initial push from panama/foreign Changes: https://git.openjdk.java.net/jdk/pull/5907/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275063 Stats: 14262 lines in 186 files changed: 6583 ins; 5144 del; 2535 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From mcimadamore at openjdk.java.net Tue Oct 12 16:37:57 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 12 Oct 2021 16:37:57 GMT Subject: RFR: 8275063: Implementation of Memory Access API (Second incubator) In-Reply-To: References: Message-ID: <-PiXRcdXSgem16bbZeDqOiJpOJu7FT_FaB8nFYHoGig=.91ba78cb-66d9-4ed3-82e9-184ab4b2616c@github.com> On Tue, 12 Oct 2021 11:16:51 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-419 [1]. 
A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
>
> [1] - https://openjdk.java.net/jeps/419

This PR contains mainly API changes. We have tried to simplify the API by removing redundant classes and moving functionality to where it belongs. Below we list the main changes introduced in this PR. A big thanks to all who helped along the way: @briangoetz, @ChrisHegarty, @FrauBoes, @iwanowww, @JornVernee, @PaulSandoz, @sundararajana and @rose00.

### Value layouts and carriers

This is perhaps the biggest change in the API, which has a knock-on effect in other areas as well. In the past, a value layout used to be a fairly *neutral* description of a piece of memory containing a scalar value. A value layout has a size, alignment and endianness. Since a value layout contains no information on *how* the value is to be dereferenced by Java clients, said clients have to specify a *carrier* when obtaining a var handle from a value layout.

In this iteration, we have decided to attach carrier types to value layouts - so, in addition to size, alignment and endianness, all value layouts now have a `carrier` accessor, which returns a `j.l.Class`. You will find several hand-specialized versions of `ValueLayout`, one for each main carrier type (e.g. `ValueLayout.OfInt` for the `int` carrier).

Attaching carrier information to layouts simplifies the API in many ways:

* When obtaining a var handle from a layout, it is no longer necessary to provide a carrier; the layout in fact contains all the necessary information for the dereference operation to occur.
* Similarly, when linking downcall method handles, the `MethodType` parameter is no longer necessary, as now carrier information can be inferred from the provided `FunctionDescriptor`.
* We can express dereference operations in a more general fashion - e.g. `get(ValueLayout.OfInt)` instead of `getInt(ByteOrder)`. Note how the new form is more *complete*.

This iteration also adds support for `boolean` and `MemoryAddress` carriers in memory access var handles.

### Layout attributes

To help the `CLinker` distinguish between a 32-bit layout modelling a C `int` and a similar layout modelling a C `float` we have in the past resorted to *layout attributes* - that is, we injected custom classification information in layouts, and then required clients working with the `CLinker` API to only work with such augmented layouts.

Because of the changes described above, this is no longer necessary: a layout is always associated with a Java carrier, so the `CLinker` can always disambiguate between `ValueLayout.OfInt` and `ValueLayout.OfFloat` even though they have the same size. For this reason, API support for custom layout attributes has been dropped in this iteration.

Similarly, platform-dependent layout constants in `CLinker` have been removed; clients interacting with the foreign linker can simply use the basic layout constants defined in `ValueLayout` (e.g. `JAVA_INT`, `JAVA_FLOAT`, ...) - which is not too different from using the JNI types `jint` and `jfloat`. Of course tools (such as `jextract`) are free to define *custom* layouts which model C types for a specific platform.

### Memory dereference

Previous iterations of the API provided ready-made dereference operations as static methods in the `MemoryAccess` class. This class is now removed, and all the dereference operations have been moved to `MemorySegment` and `MemoryAddress`. As mentioned before, the new dereference operations have a new form. Instead of:

    MemorySegment segment = ...
    MemoryAccess.getIntAtOffset(segment, /* offset */ 100, /* endianness */ ByteOrder.nativeOrder());

The new API works as follows:

    MemorySegment segment = ...
    int val = segment.get(JAVA_INT, /* offset */ 100);

Note that the new dereference method is not static, and that parameters such as endianness are now omitted, since clients can just specify the value layout they want to work with.
Also, since the new dereference methods are not static, we no longer need the workaround to enable VM argument type profiling (this was necessary to make static methods in the `MemoryAccess` class perform reasonably well in the face of profile pollution).

The same dereference operations are also available in `MemoryAddress`; when working with native code it might be necessary to dereference a raw pointer. In Java 17, to write a basic comparator function for qsort, the following code is needed:

    static int qsortCompare(MemoryAddress addr1, MemoryAddress addr2) {
        return MemoryAccess.getIntAtOffset(MemorySegment.globalNativeSegment(), addr1.toRawLongValue()) -
               MemoryAccess.getIntAtOffset(MemorySegment.globalNativeSegment(), addr2.toRawLongValue());
    }

With the proposed changes, the above code simplifies to:

    static int qsortCompare(MemoryAddress addr1, MemoryAddress addr2) {
        return addr1.get(C_INT, 0) - addr2.get(C_INT, 0);
    }

Which is far more readable. Note that dereferencing a memory address is still a potentially unsafe operation, as an address has no spatial/temporal bounds. For this reason, dereference operations on `MemoryAddress` are marked as *restricted*.

### Memory copy

This iteration adds more support for copying Java arrays to and from memory segments. In Java 17, it is possible to copy a Java array into a memory segment, as follows:

    int[] array = ....
    MemorySegment segment = ...
    MemorySegment heapView = MemorySegment.ofArray(array);
    segment.asSlice(startSegmentOffset, nelems * 4)
           .copyFrom(heapView.asSlice(startArrayOffset * 4, nelems * 4));

This code snippet is suboptimal for several reasons:

* three temporary segments have to be created: the heap view, plus the two slices
* note how the code has to carefully slice the source/target segment to make sure that only the desired elements are copied, and at the desired target offset in the segment.
* offset in arrays is expressed in elements, whereas offset in segments is expressed in bytes - which invites mistakes.
* it is not possible to specify custom endianness/alignment for the copy operation

With the changes in this PR, the above code becomes:

    int[] array = ....
    MemorySegment segment = ...
    MemorySegment.copy(array, startArrayOffset, segment, JAVA_INT, startSegmentOffset, nelems);

The above code is much simpler, with less potential for mistakes. Also, the extra value layout allows clients to inject additional alignment constraints and endianness (if required).

### Role of `MemoryAddress`

In Java 17, `MemoryAddress` has a scope accessor. This is useful when reasoning about an address obtained from a memory segment, but is far less useful when thinking about an address received from a native call. While it is possible to model the latter as an address associated with the *global scope*, `MemoryAddress` supports several operations which allow clients to attach spatial and temporal bounds to an address, turning it into a `MemorySegment`. If the address already has a scope, the semantics of some of these operations become confusing.

For this reason, in this iteration `MemoryAddress` is only used to model raw native addresses. There is no scope associated with a memory address - dereferencing a raw address is always unsafe. This change brings more clarity to the API, as `MemoryAddress` is nothing but a simple wrapper around a `long` value. This also means that obtaining a `MemoryAddress` from a heap segment is no longer a possibility; in other words, clients that don't care about native interop should probably just use `MemorySegment` and forget about `MemoryAddress`.

### Downcall method handle safety

In Java 17, by-reference parameters to downcall method handles are passed as `MemoryAddress` arguments. This means that e.g. passing a segment by-reference requires a conversion (from segment to memory address, using `MemorySegment::address`).
This conversion is lossy, as we lose information about the original memory segment (spatial and temporal bounds). As a result, passing parameters by-reference to downcall method handles is less safe.

The changes described in this PR introduce stronger safety guarantees for by-reference parameters. The `CLinker` will now map any such parameter to the `Addressable` type - a common super interface of all things that can be passed by reference. This means clients can pass a memory segment *directly* to a downcall method handle, no conversion required. Because of that, the `CLinker` runtime can make sure that e.g. arguments passed by reference are kept alive for the entire duration of the native call.

### Native symbols

In Java 17, looking up a symbol on a native library is done using the `SymbolLookup` interface. This interface used to return a plain `MemoryAddress` (the address of the native function). Given the changes described above, `MemoryAddress` is no longer a great choice:

* a `MemoryAddress` now models a raw native memory address, and has no scope
* a `MemoryAddress` can be easily dereferenced

For this reason, a new abstraction is added, namely `NativeSymbol`, which is used to model the entry point of a symbol (a function or a global variable) in a native library. A native symbol is `Addressable`, has a name and a resource scope. Since native symbols have a scope, the `CLinker` runtime can make sure that the scope of the native symbol corresponding to the native function being executed cannot be closed prematurely. This effectively allows clients to support safe library loading abstractions which support [deterministic library unloading](https://github.com/sundararajana/panama-jextract-samples/blob/master/dlopen/Dlopen.java).

Additionally, we have tweaked the `CLinker::upcallStub` method to also return an *anonymous* `NativeSymbol`, rather than a raw `MemoryAddress`.
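
The liveness guarantee described above (arguments and native symbols stay alive for the whole call) can be pictured with a small self-contained model. This is only a sketch and not the `CLinker` implementation: `ToyScope`, `callWithLiveScope` and the counter scheme are hypothetical names chosen for illustration, and the sketch simply assumes that a scope pinned by an in-flight call refuses to close.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical model, not the jdk.incubator.foreign implementation.
final class ToyScope implements AutoCloseable {
    private static final int CLOSED = -1;
    // >= 0: number of in-flight users; CLOSED: the scope has been closed.
    private final AtomicInteger state = new AtomicInteger(0);

    private void acquire() {
        int s;
        do {
            s = state.get();
            if (s == CLOSED) {
                throw new IllegalStateException("scope already closed");
            }
        } while (!state.compareAndSet(s, s + 1));
    }

    private void release() {
        state.decrementAndGet();
    }

    boolean isAlive() {
        return state.get() != CLOSED;
    }

    @Override
    public void close() {
        // Closing succeeds only when no call is currently using the scope.
        if (!state.compareAndSet(0, CLOSED)) {
            throw new IllegalStateException("scope is in use or already closed");
        }
    }

    // Stand-in for a downcall: the scope is pinned for the whole call.
    static int callWithLiveScope(ToyScope scope, IntSupplier nativeCall) {
        scope.acquire();
        try {
            return nativeCall.getAsInt();
        } finally {
            scope.release();
        }
    }

    public static void main(String[] args) {
        ToyScope scope = new ToyScope();
        int result = callWithLiveScope(scope, () -> {
            try {
                scope.close(); // attempted premature close during the call
                return -1;
            } catch (IllegalStateException e) {
                return 42;     // the close was rejected while the call was running
            }
        });
        System.out.println(result + " alive=" + scope.isAlive());
        scope.close();         // succeeds once the call has returned
        System.out.println("alive=" + scope.isAlive());
    }
}
```

In this model a premature `close()` fails loudly instead of leaving the callee with a dangling address. The real API may well enforce this differently.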

### Resource scope tweaks

This PR removes the distinction between *implicit* and *explicit* scopes. Now all scopes (except for the *global scope*) are closeable, and can be associated with a `Cleaner`, if required.

Another change in the resource scope API is in how temporal scope dependencies are handled. In Java 17 scopes could be acquired and released:

    MemorySegment segment = ...
    ResourceScope.Handle segmentHandle = segment.scope().acquire();
    try {
        // use the segment here
    } finally {
        segment.scope().release(segmentHandle);
    }

This PR removes the `ResourceScope::acquire` and `ResourceScope::release` methods, and instead allows clients to capture dependencies between scopes in a more explicit fashion:

    MemorySegment segment = ...
    try (ResourceScope criticalScope = ResourceScope.newConfinedScope()) {
        criticalScope.keepAlive(segment.scope());
        // use the segment here
    }

API javadoc: http://cr.openjdk.java.net/~mcimadamore/8275064/javadoc/jdk/incubator/foreign/package-summary.html
Specdiff: http://cr.openjdk.java.net/~mcimadamore/8275064/specdiff_out/overview-summary.html

-------------

PR: https://git.openjdk.java.net/jdk/pull/5907

From duke at openjdk.java.net Tue Oct 12 16:48:53 2021
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Tue, 12 Oct 2021 16:48:53 GMT
Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10]
In-Reply-To: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com>
References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com>
Message-ID: <8DTPmK-vB5Qfkg7t4S68Lk1w7g9mLHjqwgGcoy2C52M=.9bca4a23-f449-43b3-ac93-1963e77818ca@github.com>

On Mon, 11 Oct 2021 20:17:36 GMT, Evgeny Astigeevich wrote:

>> This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html).
>>
>> It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be:
>>
>> - `none`: no implementation for spin pauses.
This is the default value. >> - `nop`: use `nop` instruction for spin pauses. >> - `isb`: use `isb` instruction for spin pauses. >> - `yield`: use `yield` instruction for spin pauses. >> >> And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. >> >> The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. >> >> Testing: >> >> - `make test TEST="gtest"`: Passed >> - `make run-test TEST="tier1"`: Passed >> - `make run-test TEST="tier2"`: Passed >> - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed >> >> CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC As the new options are changed to diagnostic, CSR is not needed. See https://bugs.openjdk.java.net/browse/JDK-8274564 for details. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From erikj at openjdk.java.net Tue Oct 12 17:03:52 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 12 Oct 2021 17:03:52 GMT Subject: RFR: 8275063: Implementation of Memory Access API (Second incubator) In-Reply-To: References: Message-ID: On Tue, 12 Oct 2021 11:16:51 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/419 Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5907

From Divino.Cesar at microsoft.com Tue Oct 12 17:48:55 2021
From: Divino.Cesar at microsoft.com (Cesar Soares Lucas)
Date: Tue, 12 Oct 2021 17:48:55 +0000
Subject: [External] : Re: RFC - Improving C2 Escape Analysis
In-Reply-To:
References: <20210930140335.648146897@eggemoggin.niobe.net>
Message-ID:

Hi, again, John.

First of all, we greatly appreciated the feedback that you provided to us. Thank you.

I talked with my manager and the team and it's clear to us that contributing to Valhalla is the best long-term strategy. Therefore, we'll be looking into how to contribute to it in the near future.

That being said, we believe that improving the current EA and dependent optimizations seems to be the best short-term strategy for us and for that reason we'll invest our immediate efforts in that direction. We also believe that this approach will help us gain more experience with the problem and the platform overall.

Cheers,
Cesar

________________________________
From: Cesar Soares Lucas
Sent: October 5, 2021 12:30 PM
To: John Rose
Cc: Mark Reinhold; hotspot-dev at openjdk.java.net; Brian Stafford; Martijn Verburg
Subject: Re: [External] : Re: RFC - Improving C2 Escape Analysis

Hi John,

Thank you for answering my questions and providing some examples of "hard-cases"! I'm going over all this with my teammates and I'll come back to you as soon as possible.

Regards,
Cesar

________________________________
From: John Rose
Sent: October 4, 2021 3:20 PM
To: Cesar Soares Lucas
Cc: Mark Reinhold; hotspot-dev at openjdk.java.net; Brian Stafford; Martijn Verburg
Subject: Re: [External] : Re: RFC - Improving C2 Escape Analysis

On Oct 4, 2021, at 2:32 PM, Cesar Soares Lucas wrote:
>
> ...
> I think it's safe to say that the Inlining horizon and the presence of control/data-flow merge points in the compilation unit are the major hurdles to the effectiveness of the current EA and dependent optimizations.
> I understand that Valhalla (Inline Types) will help tremendously with the inlining horizon aspect of the problem. How much of a problem is control-flow/data-flow merge points in a world where Inline Types are in place?

There's an easy thought experiment or two here:

Suppose a class V is known to be identity-insensitive. This means a physical copy of V can be recopied into another location (on the heap or even registers or stack) with no violation of semantics. If there is some use u(v) of a V-value v, we can pass the physical representation of v through a copy operation c(v) which preserves the value but physically rebuffers it somewhere else, so that the new graph is u(c(v)). We call this "splitting the use" u(v).

Experiment 1: At the horizon of the current task, split all incoming and outgoing uses of type V. Check how this transform affects your favorite EA. I think you suddenly don't care about identity escapes of those values, beyond the horizon, because they take up new identities there.

Experiment 2: In the current task, split all uses of type V that touch phis. (Where there are already copies, of the form c(v), use the identity c(c(v)) = c(v).) Again, check how this affects your favorite flow-sensitive EA. Now I think you have decoupled questions of identity from the control flow graph.

The limiting case is to treat values of type V as always splitting. And you probably don't ever need explicit c-nodes. The c notation helps to explain how V-type references behave differently from normal references, in classic EA algorithms.

> From my (external) point of view, I can imagine that this will still be a problem.

What's an example of such a problem, that remains after uses are split with the c-operator? (There's an issue of removing needless splits, but if the c-operator is never materialized, then the problem reduces, I think, to tracking "souvenirs" of last-known bufferings of given value bundles.)

> If so, it looks to me that this is an issue that once solved would benefit not only the current EA implementation but also Valhalla. What do you [all] think?
>
> You mention about the "remaining [EA] hard cases once Inline Types are in place". Can you please expand a little bit on that?

Oops, I left a sentence incomplete there. It should have been simply:

Not to make EA unnecessary, by any means, but rather to focus EA better on some remaining "hard cases".

The hard cases continue to be behavior of objects outside of the inlining horizon that might disrupt optimistic optimizations inside the horizon. A couple of examples:

A. We might wish to use thread-local allocation for "confined" objects which never leave the thread, but which need to be accessed as if they were on the heap. (Stack allocation is a *special case* of thread-local allocation: It requires an extra assurance that the object lives only during the lifetime of a stack frame. BTW I think concurrent cross-thread access to thread-local objects is likely to be very buggy, which is why I draw the line at confinement, not the different line of "during the *global* lifetime of a stack frame".)

B. We might wish to vectorize over data sources and sinks assuming (after a check perhaps) that the memory streams are non-overlapping with writes we are vectorizing (a "noalias" condition).

A hard case of A. is when code might let a mostly-confined reference escape to somewhere another thread can see it. There are many ways to build a fence around this, such as static inter-task analysis beyond the horizon (what is called "interprocedural", but the JIT task not the procedure is the key unit). Or GC barriers that detect escapes dynamically.

A hard case of B. is similar, but you might test for different conditions, trying to prove that distinct alias categories stay distinct beyond the horizon.
Actually, I don't have any good suggestions for that, other than the usual technique of predicating on a non-overlap condition inside the horizon, before a loop that needs the condition.

Even if we use only static inter-task techniques to analyze and summarize the global access to references (passed to and from method calls and data structures), there is probably always a dynamic component. At least, if you add a new subclass you probably have to adjoin the summaries of its overriding methods, if you are relying on summaries of related methods; in the worst case you might have to retract optimizations in running code (as we do when we de-opt when a devirtualization is no longer correct).

Turning that around, there might be cases where a reference *might* escape but it dynamically *does not*. (Example: A so-far-never-taken path makes a reference escape, where the other paths keep it confined.) We might impose dynamic checks on methods whose behavior we have summarized (for the benefit of compile tasks which do not inline them), so that the summary can be sharper, at the cost of doing de-opt if the sharper summary is no longer valid.

There's a frontier here, of more and more clever and aggressive and speculative optimizations we can do. I think it helps to have them operate on fewer (larger?) objects. So I guess what I'm hoping for here is that if many small objects are made identity-free, then we can better use limited resources of time, space, and engineering cleverness to track the remaining objects whose access we wish to optimize.

> Thank you again for taking the time to read the report and provide your perspective!

My pleasure!
From dnsimon at openjdk.java.net Tue Oct 12 19:58:57 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Tue, 12 Oct 2021 19:58:57 GMT Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5] In-Reply-To: References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com> Message-ID: On Tue, 12 Oct 2021 16:14:27 GMT, Thomas Stuefe wrote: >> In general I agree but in error reporting code I get extra defensive plus the defensive code is small and not on a hot path. >> That said, I won't object to it being undone in a subsequent PR. > > I thought the point of MIN2 here is to handle ErrorLogPrintCodeLimit < max. IMHO ErrorLogPrintCodeLimit > max would be just an assert-worthy error, since as David wrote flag values should not be changed after initialization. The point is to ensure that we don't run off the end of the stack allocated `printed` array in the (granted, unlikely) case that `ErrorLogPrintCodeLimit` is (accidentally) updated after arg parsing . I'm not sure an assert is the best thing as it would cause error reporting to recurse. Maybe I was being too defensive but I figured the overhead is negligible so why not be ultra-safe. ------------- PR: https://git.openjdk.java.net/jdk/pull/5875 From coleenp at openjdk.java.net Tue Oct 12 20:09:56 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 12 Oct 2021 20:09:56 GMT Subject: RFR: 8274794: Make Thread::_owned_locks available in product [v2] In-Reply-To: References: Message-ID: <2UHmgS4PYxXYwVfF9o090WjhnSq0iyitoE9F2yBDZ9A=.943464f7-4f72-4bcd-ac8a-88ec46078a6f@github.com> On Mon, 11 Oct 2021 18:39:44 GMT, Coleen Phillimore wrote: >> Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. 
>> Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix optimized build (move print_on) up. I have a better patch, withdrawing this one. ------------- PR: https://git.openjdk.java.net/jdk/pull/5896 From coleenp at openjdk.java.net Tue Oct 12 20:09:56 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 12 Oct 2021 20:09:56 GMT Subject: Withdrawn: 8274794: Make Thread::_owned_locks available in product In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 16:23:13 GMT, Coleen Phillimore wrote: > Make thread owned locks available in product and change the error message printing to print by thread so that not only locks in mutexLocker are printed in the hs_err file. > Tested with tier1-3 and with product build, and tested hs_err files manually with some selected temporary ShouldNotReachHere, also new test. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/5896 From john.r.rose at oracle.com Tue Oct 12 20:26:11 2021 From: john.r.rose at oracle.com (John Rose) Date: Tue, 12 Oct 2021 20:26:11 +0000 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: That sounds like an excellent approach: Get some wins and build some infrastructure before Valhalla, but with an eye towards improving Valhalla also. I was in a conversation just this morning where stack allocation came up, as a possible way to handle some instances of identity-free Valhalla objects. The point would be to allow access consistent with the layout of a heap object, but with allocation on the stack. 
The tactics we use today are either full scalarization (preferable usually) or heap buffering (like an old-school boxed primitive). Stack allocation might be a useful intermediate tactic. (Maybe also consider thread-local heaps, with explicit deallocation or just their own GC.)

Those extra thread-local or stack/activation-local tactics are expensive and may not pay off in the long run, but you are doing the right thing in investigating them. If and when you find wins for regular pre-Valhalla objects, you are also finding wins (probably) for Valhalla objects.

Anyway, thanks for looking at this stuff. It's a pleasure to collaborate.

On Oct 12, 2021, at 10:48 AM, Cesar Soares Lucas wrote:

Hi, again, John.

First of all, we greatly appreciated the feedback that you provided to us. Thank you.

I talked with my manager and the team and it's clear to us that contributing to Valhalla is the best long-term strategy. Therefore, we'll be looking into how to contribute to it in the near future.

That being said, we believe that improving the current EA and dependent optimizations seems to be the best short-term strategy for us and for that reason we'll invest our immediate efforts in that direction. We also believe that this approach will help us gain more experience with the problem and the platform overall.

From mcimadamore at openjdk.java.net Tue Oct 12 20:51:02 2021
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Tue, 12 Oct 2021 20:51:02 GMT
Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v2]
In-Reply-To:
References:
Message-ID:

> This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
> > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix TestLinkToNativeRBP test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/be0dd36e..23f69054 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=00-01 Stats: 7 lines in 1 file changed: 1 ins; 4 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From dholmes at openjdk.java.net Tue Oct 12 23:07:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 23:07:51 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself In-Reply-To: <6166f716-7e3f-ae49-8a81-00ce185e9640@oracle.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <6166f716-7e3f-ae49-8a81-00ce185e9640@oracle.com> Message-ID: On Tue, 12 Oct 2021 13:20:21 GMT, coleen.phillimore at oracle.com wrote: > ttyLock is taken and released inside of print_cr, but it's a nosafepoint lock so would not block. Okay - thanks @coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Tue Oct 12 23:11:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 12 Oct 2021 23:11:50 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 07:02:12 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). 
>> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. >> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. 
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint:
>> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000]
>> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE
>> [10.616s][warning][safepoint]
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list)
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Change to VM unconditionally as long as current thread is JavaThread.
>
> Hoist VMErrorForceNative out of While.

This seems fine to me. Approval is conditional on hearing about further testing. Have you tested this by running an app with `OnError='jcmd %p'` and then throwing random signals at it to force crashes in different places? You can run -Xint and -Xcomp to increase the chances of hitting different code locations.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5590

From psandoz at openjdk.java.net Wed Oct 13 00:06:03 2021
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Wed, 13 Oct 2021 00:06:03 GMT
Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 12 Oct 2021 20:51:02 GMT, Maurizio Cimadamore wrote:

>> This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
>>
>> [1] - https://openjdk.java.net/jeps/419
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix TestLinkToNativeRBP test

Like with previous reviews of code for Panama JEPs there are not many comments on this PR, since prior reviews occurred for PRs in the panama-foreign repo.
src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 77: > 75: * Furthermore, if the function descriptor's return layout is a group layout, the resulting downcall method handle accepts > 76: * an extra parameter of type {@link SegmentAllocator}, which is used by the linker runtime to allocate the > 77: * memory region associated with the struct returned by the downcall method handle. Suggestion: * memory region associated with the struct returned by the downcall method handle. src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 88: > 86: * in which the specialized signature of a given variable arity callsite is described in full. Alternatively, > 87: * if the foreign library allows it, clients might also be able to interact with variable arity methods > 88: * using by passing a trailing parameter of type {@link VaList}. Suggestion: * by passing a trailing parameter of type {@link VaList}. src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/CLinker.java line 145: > 143: * Lookup a symbol in the standard libraries associated with this linker. > 144: * The set of symbols available for lookup is unspecified, as it depends on the platform and on the operating system. > 145: * @return a linker-specific library lookup which is suitable to find symbols in the standard libraries associated with this linker. Suggestion: * @return a symbol in the standard libraries associated with this linker. src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 93: > 91: Objects.requireNonNull(argLayouts); > 92: Arrays.stream(argLayouts).forEach(Objects::requireNonNull); > 93: return new FunctionDescriptor(null, argLayouts); We need to clone `argLayouts`, otherwise the user can modify the contents. Internally, using `List.of` is useful, since it does the cloning and null checks, and that is the type that is returned. 
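
The aliasing concern raised in this comment can be reproduced with plain collections. The following is a minimal sketch, not the `FunctionDescriptor` code: `DescriptorSketch`, `aliased` and `copied` are hypothetical names, and strings stand in for layouts. It relies only on documented `java.util` behavior: `Arrays.asList` returns a view backed by the caller's array, while `List.of` copies the elements, rejects nulls, and returns an unmodifiable list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a descriptor holding argument layouts.
final class DescriptorSketch {
    private final List<String> argLayouts;

    private DescriptorSketch(List<String> argLayouts) {
        this.argLayouts = argLayouts;
    }

    // Keeps a view over the caller's array: later writes to the array leak in.
    static DescriptorSketch aliased(String... layouts) {
        return new DescriptorSketch(Arrays.asList(layouts));
    }

    // List.of copies the array, throws NullPointerException on null
    // elements, and returns an unmodifiable list.
    static DescriptorSketch copied(String... layouts) {
        return new DescriptorSketch(List.of(layouts));
    }

    List<String> argumentLayouts() {
        return argLayouts;
    }

    public static void main(String[] args) {
        String[] input = {"JAVA_INT", "JAVA_FLOAT"};
        DescriptorSketch a = aliased(input);
        DescriptorSketch c = copied(input);

        input[0] = "JAVA_LONG"; // the caller mutates its array afterwards

        System.out.println(a.argumentLayouts()); // [JAVA_LONG, JAVA_FLOAT]
        System.out.println(c.argumentLayouts()); // [JAVA_INT, JAVA_FLOAT]
    }
}
```

The copied variant also protects the accessor: calling `set` on the list it returns throws `UnsupportedOperationException`, whereas the `Arrays.asList` view accepts the write.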
Also `argumentLayouts` uses `Arrays.asList` which supports modification of the contents.

src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/SegmentAllocator.java line 394:

> 392: *
> 393: * The returned allocator might throw an {@link OutOfMemoryError} if the total memory allocated with this allocator
> 394: * exceeds the arena size, or the system capacity. Furthermore, the returned allocator is not thread safe, and all

The "the returned allocator is not thread safe" contradicts the prior sentence "An allocator associated with a shared resource scope is thread-safe and allocation requests may be performed concurrently". Perhaps just say:

"The returned allocator is not thread safe if the given scope is a shared scope. Concurrent allocation needs to be guarded with synchronization primitives."

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/AbstractMemorySegmentImpl.java line 260:

> 258:
> 259: @Override
> 260: public final MemorySegment asOverlappingSlice(MemorySegment other) {

Please ignore these comments if you wish. Adding `max() // exclusive` to complement `min()` might be useful. In these cases I find it easier to sort the segments, e.g. `var a = this; var b = other; if (a.min() > b.min()) { // swap a and b }`; the subsequent logic then tends to get simpler, e.g. the overlap test is `if (b.min() < a.max())` and `segmentOffset` is always positive.

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/ConfinedScope.java line 100:

> 98: do {
> 99: value = (int)ASYNC_RELEASE_COUNT.getVolatile(this);
> 100: } while (!ASYNC_RELEASE_COUNT.compareAndSet(this, value, value + 1));

Use `getAndAdd` (and ignore the return value).

-------------

PR: https://git.openjdk.java.net/jdk/pull/5907

From stuefe at openjdk.java.net Wed Oct 13 04:44:52 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 13 Oct 2021 04:44:52 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8]
In-Reply-To:
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com>
Message-ID:

On Tue, 12 Oct 2021 07:02:12 GMT, Xin Liu wrote:

>> This patch allows the custom commands of OnError to attach to HotSpot itself.
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd).
>> This prevents cmds which require safepoint synchronization from deadlock.
>> eg. OnError='jcmd %p Thread.print'.
>>
>> Without this patch, we will encounter a deadlock at safepoint synchronization.
>> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`.
>> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Change to VM unconditionally as long as current thread is JavaThread. > > Hoist VMErrorForceNative out of While. This is almost good to me. Thanks for refraining from the mutex unlock. Wrt the tests, David is right, test it with some signals. But IMHO a manual test is safe enough here. 
Cheers, Thomas

src/hotspot/share/utilities/vmError.cpp line 1646:

> 1644: // at safepoints.
> 1645: VMErrorForceInNative fn(Thread::current_or_null());
> 1646:

I have similar concerns as David about printing needing to acquire locks. Even if that is not a problem with tty specifically, it's something to keep in mind (e.g. with the recent attempt to print logging via network sockets, such things can creep in).

I would in that case move the RAII object close around fork_and_exec. Only do this for fork, and undo it after returning, doing the printing stuff in the original state instead.

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5590

From stuefe at openjdk.java.net Wed Oct 13 04:49:52 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 13 Oct 2021 04:49:52 GMT
Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5]
In-Reply-To:
References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com>
Message-ID:

On Tue, 12 Oct 2021 19:55:54 GMT, Doug Simon wrote:

>> I thought the point of MIN2 here is to handle ErrorLogPrintCodeLimit < max. IMHO ErrorLogPrintCodeLimit > max would be just an assert-worthy error, since as David wrote flag values should not be changed after initialization.
>
> The point is to ensure that we don't run off the end of the stack allocated `printed` array in the (granted, unlikely) case that `ErrorLogPrintCodeLimit` is (accidentally) updated after arg parsing.
> I'm not sure an assert is the best thing as it would cause error reporting to recurse.
> Maybe I was being too defensive but I figured the overhead is negligible so why not be ultra-safe.

Recursive asserts would be caught by secondary error handling and show up as "Error occurred during error reporting" printout. Not ideal, but at least won't endanger the rest of the printing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5875

From dholmes at openjdk.java.net Wed Oct 13 05:06:53 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 13 Oct 2021 05:06:53 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8]
In-Reply-To:
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com>
Message-ID: <3BAKh1StGdHMg2w3cvKuOxXmTWi128y_xCKLMKJ3zQU=.7782fab0-279a-480a-8b22-446784f96027@github.com>

On Wed, 13 Oct 2021 04:39:33 GMT, Thomas Stuefe wrote:

>> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Change to VM unconditionally as long as current thread is JavaThread.
>>
>> Hoist VMErrorForceNative out of While.
>
> src/hotspot/share/utilities/vmError.cpp line 1646:
>
>> 1644: // at safepoints.
>> 1645: VMErrorForceInNative fn(Thread::current_or_null());
>> 1646:
>
> I have similar concerns as David about printing needing to acquire locks. Even if that is not a problem with tty specifically, it's something to keep in mind (e.g. with the recent attempt to print logging via network sockets, such things can creep in).
>
> I would in that case move the RAII object close around fork_and_exec. Only do this for fork, and undo it after returning, doing the printing stuff in the original state instead.

I suggested moving the RAII object so it is only used once - under the assumption that output stream usage was safe to do while in native. Otherwise we need this:

    int ret = 0;
    {
      VMErrorForceInNative fn(_thread);
      ret = os::fork_and_exec(cmd);
    }
    if (ret < 0) {
      out.print_cr("os::fork_and_exec failed: %s (%s=%d)",
                   os::strerror(errno), os::errno_name(errno), errno);
    }

which is a bit clunky but okay I can live with it.
And note we already have `_thread` to use so don't need to call `Thread::current_or_null()` (which should have been `Thread::current_or_null_safe()` in any case.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/5590

From xliu at openjdk.java.net Wed Oct 13 06:56:00 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Wed, 13 Oct 2021 06:56:00 GMT
Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8]
In-Reply-To:
References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com>
Message-ID:

On Tue, 12 Oct 2021 07:02:12 GMT, Xin Liu wrote:

>> This patch allows the custom commands of OnError to attach to HotSpot itself.
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd).
>> This prevents cmds which require safepoint synchronization from deadlock.
>> eg. OnError='jcmd %p Thread.print'.
>>
>> Without this patch, we will encounter a deadlock at safepoint synchronization.
>> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`.
>>
>>
>> Aborting due to java.lang.OutOfMemoryError: Java heap space
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (debug.cpp:364), pid=94632, tid=94633
>> # fatal error: OutOfMemory encountered: Java heap space
>> #
>> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk)
>> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log
>> #
>> # -XX:OnError="jcmd %p Thread.print"
>> # Executing /bin/sh -c "jcmd 94632 Thread.print" ...
>> 94632:
>> [10.616s][warning][safepoint]
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected:
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint:
>> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000]
>> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE
>> [10.616s][warning][safepoint]
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list)
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Change to VM unconditionally as long as current thread is JavaThread.
>
> Hoist VMErrorForceNative out of While.

Hi, Reviewers,

I ran the Renaissance benchmark and sent SIGSEGV to random Java threads.

    $java -XX:+SafepointTimeout -XX:OnError='jcmd %p Thread.print' -jar ./renaissance-gpl-0.13.0.jar log-regression &
    $kill -11 11131 (not pid=11130, but its first java thread "main"#1, or some other java threads)

It seems to be okay for both `_thread_in_Java` and `_thread_in_VM` in most cases. However, it's not safe when the Java thread was in `_thread_blocked`. For instance, it was in `java.lang.Runtime.gc()V+0 java.base`. It is easy to get stuck when the VM state was at a safepoint.

    # -XX:OnError="jcmd %p Thread.print"
    original thread state = _thread_blocked
    # Executing /bin/sh -c "jcmd 37330 Thread.print" ...
    37330:

I just realized this is a reckless hack. We can't guarantee to avoid deadlock. We shouldn't go this way. How about we drop this?
-------------

PR: https://git.openjdk.java.net/jdk/pull/5590

From duke at openjdk.java.net Wed Oct 13 07:18:52 2021
From: duke at openjdk.java.net (openjdk-notifier[bot])
Date: Wed, 13 Oct 2021 07:18:52 GMT
Subject: RFR: 8274851: [PPC64] Port zgc to linux on ppc64le [v3]
In-Reply-To:
References:
Message-ID:

On Fri, 8 Oct 2021 15:41:47 GMT, Niklas Radomski wrote:

>> Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le.
>
> Niklas Radomski has updated the pull request incrementally with three additional commits since the last revision:
>
> - Remove superfluous copyright change
> - Fix predicate conditions
> - Add ByteSize constructor to Address

The dependent pull request has now been integrated, and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork:

    git checkout 8275049_ZGC_log_register
    git fetch https://git.openjdk.java.net/jdk master
    git merge FETCH_HEAD
    # if there are conflicts, follow the instructions given by git merge
    git commit -m "Merge master"
    git push

-------------

PR: https://git.openjdk.java.net/jdk/pull/5842

From aph at openjdk.java.net Wed Oct 13 07:25:53 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 13 Oct 2021 07:25:53 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3]
In-Reply-To: <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com>
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com>
Message-ID:

On Thu, 7 Oct 2021 08:50:15 GMT, Andrew Haley wrote:

>> Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fix linux/aarch64 build: s/r18/r18_tls
>
> So we have fixed the only use of `popa()` so that it does not clobber R18. That is good.
> However, I think we've found another bug. `reguard_yellow_pages()` is a native function, so we should be saving floating-point registers too. It looks to me like this code should use `push_call_clobbered_registers()`. Then we'll have no use for `pusha()` in trunk, and your fix for `popa()` can go into 11u.

> Thanks for the suggestion @theRealAph. By looking through the code I discovered another buggy set of helpers, `{push,pop}_CPU_state()` which also messed with `r18`.

Sure, so we can do the same thing there, if we still need it.

> Does it look fine to you, or should the code around `reguard_yellow_pages()` rather use `{push,pop}_call_clobbered_registers()`?

Yes, very much so.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From aph at openjdk.java.net Wed Oct 13 07:25:55 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 13 Oct 2021 07:25:55 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v4]
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

On Tue, 12 Oct 2021 12:39:14 GMT, Bernhard Urban-Forster wrote:

>> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
>> >> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: >> >> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 >> >> I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. >> >> Output of `-XX:+PrintInterpreter` before this change: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes >> -------------------------------------------------------------------------------- >> 0x0000000138809b00: ldr x2, [x12, #16] >> 0x0000000138809b04: ldrh w2, [x2, #44] >> 0x0000000138809b08: add x24, x20, x2, uxtx #3 >> 0x0000000138809b0c: sub x24, x24, #0x8 >> [...] >> 0x0000000138809fa4: stp x16, x17, [sp, #128] >> 0x0000000138809fa8: stp x18, x19, [sp, #144] >> 0x0000000138809fac: stp x20, x21, [sp, #160] >> [...] >> 0x0000000138809fc0: stp x30, xzr, [sp, #240] >> 0x0000000138809fc4: mov x0, x28 >> ;; 0x10864ACCC >> 0x0000000138809fc8: mov x9, #0xaccc // #44236 >> 0x0000000138809fcc: movk x9, #0x864, lsl #16 >> 0x0000000138809fd0: movk x9, #0x1, lsl #32 >> 0x0000000138809fd4: blr x9 >> 0x0000000138809fd8: ldp x2, x3, [sp, #16] >> [...] 
>> 0x0000000138809ff4: ldp x16, x17, [sp, #128] >> 0x0000000138809ff8: ldp x18, x19, [sp, #144] >> 0x0000000138809ffc: ldp x20, x21, [sp, #160] >> >> >> After: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes >> >> -------------------------------------------------------------------------------- >> 0x0000000108e4db00: ldr x2, [x12, #16] >> 0x0000000108e4db04: ldrh w2, [x2, #44] >> 0x0000000108e4db08: add x24, x20, x2, uxtx #3 >> 0x0000000108e4db0c: sub x24, x24, #0x8 >> [...] >> 0x0000000108e4dfa4: stp x16, x17, [sp, #128] >> 0x0000000108e4dfa8: stp x19, x20, [sp, #144] >> 0x0000000108e4dfac: stp x21, x22, [sp, #160] >> [...] >> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] >> 0x0000000108e4dfc0: mov x0, x28 >> ;; 0x107E4A06C >> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 >> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 >> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 >> 0x0000000108e4dfd0: blr x9 >> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000108e4dff0: ldp x16, x17, [sp, #128] >> 0x0000000108e4dff4: ldp x19, x20, [sp, #144] >> 0x0000000108e4dff8: ldp x21, x22, [sp, #160] >> [...] 
> > Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: > > use {push,pop}_CPU_state and fix r18 usage there as well src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 1400: > 1398: __ br(Assembler::NE, no_reguard); > 1399: > 1400: __ push_CPU_state(); Suggestion: __ push_call_clobbered_registers(); src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 1404: > 1402: __ mov(rscratch2, CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages)); > 1403: __ blr(rscratch2); > 1404: __ pop_CPU_state(); Suggestion: __ pop_call_clobbered_registers(); ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From david.holmes at oracle.com Wed Oct 13 07:26:53 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 13 Oct 2021 17:26:53 +1000 Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: <54f257c1-934a-fac2-a295-704531996afb@oracle.com> Hi Xin, On 13/10/2021 4:56 pm, Xin Liu wrote: > Hi, Reviewers, > > I run benchmark renaissance and strike SIGSEGV to any java threads randomly. > > $java -XX:+SafepointTimeout -XX:OnError='jcmd %p Thread.print' -jar ./renaissance-gpl-0.13.0.jar log-regression & > $kill -11 11131 (not pid=11130, but its first java thread "main"#1, or some other java threads) > > It seem that okay for both `_thread_in_Java` and `_thread_in_VM` in most cases. However, it's not safe when the java thread was in `_thread_in_blocked`. For instance, it was in java.lang.Runtime.gc()V+0 java.base` . It is easy to get stuck when VM state was at safepoint. Right ... if the crash happens at a safepoint you can't reconnect to the VM at all. > > # -XX:OnError="jcmd %p Thread.print" > original thread state = _thread_blocked > # Executing /bin/sh -c "jcmd 37330 Thread.print" ... 
> 37330: > > > I just realize this is a reckless hack. we can't guarantee to avoid deadlock. We shouldn't go this way. How about we drop this? You certainly cannot guarantee anything if you want to try and inspect the state of the VM that is crashing. But I thought the intent here was just to try and expand the set of cases where you can actually manage this. At the moment when you crash there are two possibilities: - jcmd can't work no matter what you do - jcmd could work if not for the safepoint issue the second set of possibilities is what you are trying to enable via this patch. What we really have no idea of is the relative sizes of these two sets of circumstances. :) Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/5590 > From xliu at openjdk.java.net Wed Oct 13 07:30:03 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 13 Oct 2021 07:30:03 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: <3BAKh1StGdHMg2w3cvKuOxXmTWi128y_xCKLMKJ3zQU=.7782fab0-279a-480a-8b22-446784f96027@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <3BAKh1StGdHMg2w3cvKuOxXmTWi128y_xCKLMKJ3zQU=.7782fab0-279a-480a-8b22-446784f96027@github.com> Message-ID: On Wed, 13 Oct 2021 05:02:50 GMT, David Holmes wrote: >> src/hotspot/share/utilities/vmError.cpp line 1646: >> >>> 1644: // at safepoints. >>> 1645: VMErrorForceInNative fn(Thread::current_or_null()); >>> 1646: >> >> I have similar concerns as David about printing to acquire locks. Even if that is not a problem with tty specifically, its something to keep in mind (e.g. with the recent attempt to print logging via network sockets, such things can creep in). >> >> I would in that case move the RAII object close around fork_and_exec. Only do this for fork, and undo it after returning, doing the printing stuff in the original state instead. 
> > I suggested moving the RAII object so it is only used once - under the assumption that output stream usage was safe to do while in native. Otherwise we need this: > > int ret = 0; > { > VMErrorForceInNative fn(_thread); > ret = os::fork_and_exec(cmd); > } > if (ret < 0) { > out.print_cr("os::fork_and_exec failed: %s (%s=%d)", > os::strerror(errno), os::errno_name(errno), errno); > } > > which is a bit clunky but okay I can live with it. And note we already have `_thread` to use so don't need to call `Thread::current_or_null()` (which should have been `Thread::current_or_null_safe()` in any case.) It's risky indeed. if we can't eradicate the deadlock situation, I feel it's not worth it. Even though ttyLock works in Native now, it doesn't mean it will be safe in the future. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From nradomski at openjdk.java.net Wed Oct 13 07:40:57 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Wed, 13 Oct 2021 07:40:57 GMT Subject: Integrated: 8274851: [PPC64] Port zgc to linux on ppc64le In-Reply-To: References: Message-ID: On Wed, 6 Oct 2021 19:27:26 GMT, Niklas Radomski wrote: > Port the Z garbage collector ([JDK-8209683](https://bugs.openjdk.java.net/browse/JDK-8209683)) to linux on ppc64le. This pull request has now been integrated. 
Changeset: 337b73a4
Author: Niklas Radomski
Committer: Martin Doerr
URL: https://git.openjdk.java.net/jdk/commit/337b73a459ba24aa529b7b097617434be1d0030e
Stats: 1273 lines in 11 files changed: 1265 ins; 0 del; 8 mod

8274851: [PPC64] Port zgc to linux on ppc64le

Reviewed-by: ihse, pliden, mdoerr, eosterlund

-------------

PR: https://git.openjdk.java.net/jdk/pull/5842

From dnsimon at openjdk.java.net Wed Oct 13 07:49:53 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 13 Oct 2021 07:49:53 GMT
Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5]
In-Reply-To:
References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com>
Message-ID:

On Wed, 13 Oct 2021 04:47:18 GMT, Thomas Stuefe wrote:

>> The point is to ensure that we don't run off the end of the stack allocated `printed` array in the (granted, unlikely) case that `ErrorLogPrintCodeLimit` is (accidentally) updated after arg parsing.
>> I'm not sure an assert is the best thing as it would cause error reporting to recurse.
>> Maybe I was being too defensive but I figured the overhead is negligible so why not be ultra-safe.
>
> Recursive asserts would be caught by secondary error handling and show up as "Error occurred during error reporting" printout. Not ideal, but at least won't endanger the rest of the printing.

Yes but in a production scenario (which is where robust error reporting is critical), the assert is ignored and we end up with potential buffer overflow.
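The trade-off between clamping and asserting can be sketched with a loose Java analogy. This is not the HotSpot code: `ClampedLimitDemo`, `MAX_ENTRIES`, `limit` and `record` are hypothetical names, and the mutable `limit` field merely stands in for a flag such as `ErrorLogPrintCodeLimit` being changed after startup. The analogy is that Java `assert` statements are skipped unless the JVM runs with `-ea`, much as debug-only asserts are absent from product builds, whereas a clamp is always executed.

```java
final class ClampedLimitDemo {

    /** Fixed capacity of the bookkeeping array (hypothetical value). */
    static final int MAX_ENTRIES = 8;

    /** Stand-in for a runtime flag that could be changed after startup. */
    static volatile int limit = 4;

    /**
     * Stores addr in printed if there is still room and returns the new count.
     * Clamping with Math.min keeps the index in range even when limit has
     * grown past the array length; an assert would be skipped without -ea.
     */
    static int record(long[] printed, int count, long addr) {
        int max = Math.min(limit, printed.length);
        if (count < max) {
            printed[count++] = addr;
        }
        return count;
    }

    public static void main(String[] args) {
        long[] printed = new long[MAX_ENTRIES];
        limit = 100; // simulate an accidental late update of the flag

        int count = 0;
        for (long addr = 0; addr < 20; addr++) {
            count = record(printed, count, addr);
        }
        System.out.println(count); // 8: never exceeds the array capacity
    }
}
```

Unlike C++, Java would throw `ArrayIndexOutOfBoundsException` rather than overrun the buffer, so the sketch only illustrates why an unconditional bound is cheaper to reason about than a check that may be compiled out or disabled.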
-------------

PR: https://git.openjdk.java.net/jdk/pull/5875

From stuefe at openjdk.java.net Wed Oct 13 08:27:59 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 13 Oct 2021 08:27:59 GMT
Subject: RFR: 8274986: max code printed in hs-err logs should be configurable [v5]
In-Reply-To:
References: <8Z8-tyqFF9kDrGQvQrpkZhQSobWwrA50gzfSoeocsXg=.91f77626-67aa-4d41-ab4b-ee4129f1a2dc@github.com> <6Rgeo_st3AIVFJxc8jU-hdBnCf7YzphvraQ3HolXOOE=.027904b7-42dc-49c6-a075-c62b899aafdd@github.com>
Message-ID:

On Wed, 13 Oct 2021 07:46:27 GMT, Doug Simon wrote:

>> Recursive asserts would be caught by secondary error handling and show up as "Error occurred during error reporting" printout. Not ideal, but at least won't endanger the rest of the printing.
>
> Yes but in a production scenario (which is where robust error reporting is critical), the assert is ignored and we end up with potential buffer overflow.

Could be a guarantee then.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5875

From burban at openjdk.java.net Wed Oct 13 09:03:08 2021
From: burban at openjdk.java.net (Bernhard Urban-Forster)
Date: Wed, 13 Oct 2021 09:03:08 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v5]
In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID:

> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
> > The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: > > https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 > > I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. > > Output of `-XX:+PrintInterpreter` before this change: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes > -------------------------------------------------------------------------------- > 0x0000000138809b00: ldr x2, [x12, #16] > 0x0000000138809b04: ldrh w2, [x2, #44] > 0x0000000138809b08: add x24, x20, x2, uxtx #3 > 0x0000000138809b0c: sub x24, x24, #0x8 > [...] > 0x0000000138809fa4: stp x16, x17, [sp, #128] > 0x0000000138809fa8: stp x18, x19, [sp, #144] > 0x0000000138809fac: stp x20, x21, [sp, #160] > [...] > 0x0000000138809fc0: stp x30, xzr, [sp, #240] > 0x0000000138809fc4: mov x0, x28 > ;; 0x10864ACCC > 0x0000000138809fc8: mov x9, #0xaccc // #44236 > 0x0000000138809fcc: movk x9, #0x864, lsl #16 > 0x0000000138809fd0: movk x9, #0x1, lsl #32 > 0x0000000138809fd4: blr x9 > 0x0000000138809fd8: ldp x2, x3, [sp, #16] > [...] 
> 0x0000000138809ff4: ldp x16, x17, [sp, #128] > 0x0000000138809ff8: ldp x18, x19, [sp, #144] > 0x0000000138809ffc: ldp x20, x21, [sp, #160] > > > After: > > ---------------------------------------------------------------------- > method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes > > -------------------------------------------------------------------------------- > 0x0000000108e4db00: ldr x2, [x12, #16] > 0x0000000108e4db04: ldrh w2, [x2, #44] > 0x0000000108e4db08: add x24, x20, x2, uxtx #3 > 0x0000000108e4db0c: sub x24, x24, #0x8 > [...] > 0x0000000108e4dfa4: stp x16, x17, [sp, #128] > 0x0000000108e4dfa8: stp x19, x20, [sp, #144] > 0x0000000108e4dfac: stp x21, x22, [sp, #160] > [...] > 0x0000000108e4dfbc: stp x29, x30, [sp, #224] > 0x0000000108e4dfc0: mov x0, x28 > ;; 0x107E4A06C > 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 > 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 > 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 > 0x0000000108e4dfd0: blr x9 > 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] > [...] > 0x0000000108e4dff0: ldp x16, x17, [sp, #128] > 0x0000000108e4dff4: ldp x19, x20, [sp, #144] > 0x0000000108e4dff8: ldp x21, x22, [sp, #160] > [...] 
Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision:

  Apply suggestions from code review

  Co-authored-by: Andrew Haley

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5828/files
  - new: https://git.openjdk.java.net/jdk/pull/5828/files/d7ca7b39..0bf88e8b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5828&range=03-04

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5828.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5828/head:pull/5828

PR: https://git.openjdk.java.net/jdk/pull/5828

From burban at openjdk.java.net Wed Oct 13 09:05:49 2021
From: burban at openjdk.java.net (Bernhard Urban-Forster)
Date: Wed, 13 Oct 2021 09:05:49 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3]
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com>
Message-ID: <_rTV8PdPmEPg0IqbdI92IKvwEflp2BB3CSwa4sJoaTo=.49754ebf-062a-49f6-bb9a-fca61bae9f7d@github.com>

On Wed, 13 Oct 2021 07:21:23 GMT, Andrew Haley wrote:

> > Does it look fine to you, or should the code around `reguard_yellow_pages()` rather use `{push,pop}_call_clobbered_registers()`?
>
> Yes, very much so.

Done, thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From aph at openjdk.java.net Wed Oct 13 09:14:52 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 13 Oct 2021 09:14:52 GMT
Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v5]
In-Reply-To:
References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com>
Message-ID: <_5MELTb40jP8GvePxU2HQpuq4t1ta0GI5Np4Pz52tt0=.b097bbce-40a1-4b46-943e-d777ea2b8d30@github.com>

On Wed, 13 Oct 2021 09:03:08 GMT, Bernhard Urban-Forster wrote:

>> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side.
>> >> Output of `-XX:+PrintInterpreter` before this change: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes >> -------------------------------------------------------------------------------- >> 0x0000000138809b00: ldr x2, [x12, #16] >> 0x0000000138809b04: ldrh w2, [x2, #44] >> 0x0000000138809b08: add x24, x20, x2, uxtx #3 >> 0x0000000138809b0c: sub x24, x24, #0x8 >> [...] >> 0x0000000138809fa4: stp x16, x17, [sp, #128] >> 0x0000000138809fa8: stp x18, x19, [sp, #144] >> 0x0000000138809fac: stp x20, x21, [sp, #160] >> [...] >> 0x0000000138809fc0: stp x30, xzr, [sp, #240] >> 0x0000000138809fc4: mov x0, x28 >> ;; 0x10864ACCC >> 0x0000000138809fc8: mov x9, #0xaccc // #44236 >> 0x0000000138809fcc: movk x9, #0x864, lsl #16 >> 0x0000000138809fd0: movk x9, #0x1, lsl #32 >> 0x0000000138809fd4: blr x9 >> 0x0000000138809fd8: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000138809ff4: ldp x16, x17, [sp, #128] >> 0x0000000138809ff8: ldp x18, x19, [sp, #144] >> 0x0000000138809ffc: ldp x20, x21, [sp, #160] >> >> >> After: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes >> >> -------------------------------------------------------------------------------- >> 0x0000000108e4db00: ldr x2, [x12, #16] >> 0x0000000108e4db04: ldrh w2, [x2, #44] >> 0x0000000108e4db08: add x24, x20, x2, uxtx #3 >> 0x0000000108e4db0c: sub x24, x24, #0x8 >> [...] >> 0x0000000108e4dfa4: stp x16, x17, [sp, #128] >> 0x0000000108e4dfa8: stp x19, x20, [sp, #144] >> 0x0000000108e4dfac: stp x21, x22, [sp, #160] >> [...] 
>> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] >> 0x0000000108e4dfc0: mov x0, x28 >> ;; 0x107E4A06C >> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 >> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 >> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 >> 0x0000000108e4dfd0: blr x9 >> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000108e4dff0: ldp x16, x17, [sp, #128] >> 0x0000000108e4dff4: ldp x19, x20, [sp, #144] >> 0x0000000108e4dff8: ldp x21, x22, [sp, #160] >> [...] > > Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Andrew Haley Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From mcimadamore at openjdk.java.net Wed Oct 13 12:07:58 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 13 Oct 2021 12:07:58 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v3] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Paul Sandoz ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/23f69054..d6bf27ff Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From mcimadamore at openjdk.java.net Wed Oct 13 12:48:06 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 13 Oct 2021 12:48:06 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v4] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/d6bf27ff..27e71ee7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=02-03 Stats: 35 lines in 4 files changed: 8 ins; 11 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From aph at openjdk.java.net Wed Oct 13 14:49:15 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 13 Oct 2021 14:49:15 GMT Subject: RFR: 8275052: AArch64: Fix flag handling for MacOS. Message-ID: This is more of the fallout from JDK-8273297. We've noticed that blocks of less than 8kbytes are very slowly encrypted with AES/GCM. This is because of incorrect flag handling in vm_version_bsd_aarch64.cpp. Before this patch: Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units AESGCMBench.decrypt 8191 256 avgt 6 95821.232 ± 2447.246 ns/op AESGCMBench.decrypt 8192 256 avgt 6 3653.619 ± 83.432 ns/op After: Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units AESGCMBench.decrypt 8191 256 avgt 6 3371.735 ± 70.204 ns/op AESGCMBench.decrypt 8192 256 avgt 6 3119.928 ± 80.375 ns/op ------------- Commit messages: - Fix flag handling for MacOS. 
Changes: https://git.openjdk.java.net/jdk/pull/5894/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5894&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275052 Stats: 16 lines in 2 files changed: 7 ins; 7 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5894.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5894/head:pull/5894 PR: https://git.openjdk.java.net/jdk/pull/5894 From ngasson at openjdk.java.net Wed Oct 13 14:49:15 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Wed, 13 Oct 2021 14:49:15 GMT Subject: RFR: 8275052: AArch64: Fix flag handling for MacOS. In-Reply-To: References: Message-ID: <43kbCGG1652j3DrZj62IQDVdAt9JF2x9vusfZSYlxZ8=.9fc59351-7c2d-47ca-ba2b-2afa7fcc500b@github.com> On Mon, 11 Oct 2021 13:06:14 GMT, Andrew Haley wrote: > This is more of the fallout from JDK-8273297. > We've noticed that blocks of less than 8kbytes are very slowly encrypted with AES/GCM. This is because of incorrect flag handling in vm_version_bsd_aarch64.cpp. > > Before this patch: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 95821.232 ± 2447.246 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3653.619 ± 83.432 ns/op > > > After: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 3371.735 ± 70.204 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3119.928 ± 80.375 ns/op Marked as reviewed by ngasson (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5894 From adinn at openjdk.java.net Wed Oct 13 14:56:48 2021 From: adinn at openjdk.java.net (Andrew Dinn) Date: Wed, 13 Oct 2021 14:56:48 GMT Subject: RFR: 8275052: AArch64: Fix flag handling for MacOS. In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 13:06:14 GMT, Andrew Haley wrote: > This is more of the fallout from JDK-8273297. > We've noticed that blocks of less than 8kbytes are very slowly encrypted with AES/GCM. 
This is because of incorrect flag handling in vm_version_bsd_aarch64.cpp. > > Before this patch: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 95821.232 ± 2447.246 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3653.619 ± 83.432 ns/op > > > After: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 3371.735 ± 70.204 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3119.928 ± 80.375 ns/op yes, looks ok. ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5894 From ayang at openjdk.java.net Wed Oct 13 14:57:52 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 13 Oct 2021 14:57:52 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure In-Reply-To: References: Message-ID: <0_2hNibwTQ_O4ogxshpot8dy46d3J9NkHQ2CCP4TMIo=.3f95825d-1aa9-44e6-b068-50c7515102a9@github.com> On Mon, 11 Oct 2021 09:21:21 GMT, Per Liden wrote: > I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: > > * Rename AbstractGangTask to WorkerTask > * Rename WorkGang to WorkerThreads > * Fold GangWorker into WorkerThread > * Fold WorkManager into WorkerThreads > * Move SubTaskDone and friends to a new workerUtils.hpp/cpp > > I've split things up into several commits to make it easier to review. > > Testing: Passes Tier 1-3 on all Oracle platforms. Marked as reviewed by ayang (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/5886 From psandoz at openjdk.java.net Wed Oct 13 18:37:23 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 13 Oct 2021 18:37:23 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) Message-ID: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. No API enhancements were required and only a few additional tests were needed. ------------- Commit messages: - Apply patch from https://github.com/openjdk/panama-vector/pull/142 - Apply patch from https://github.com/openjdk/panama-vector/pull/139 - Apply patch from https://github.com/openjdk/panama-vector/pull/151 - Add new files. 
- 8271515: Integration of JEP 417: Vector API (Third Incubator) Changes: https://git.openjdk.java.net/jdk/pull/5873/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8271515 Stats: 21999 lines in 106 files changed: 16228 ins; 2077 del; 3694 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From kvn at openjdk.java.net Wed Oct 13 19:35:52 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 13 Oct 2021 19:35:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Fri, 8 Oct 2021 21:25:26 GMT, Paul Sandoz wrote: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. > > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. C2 and x86 changes seems fine. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/5873 From coleenp at openjdk.java.net Wed Oct 13 20:37:03 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 13 Oct 2021 20:37:03 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" Message-ID: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. ------------- Commit messages: - 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" Changes: https://git.openjdk.java.net/jdk/pull/5935/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5935&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274338 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5935.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5935/head:pull/5935 PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Wed Oct 13 21:36:49 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 13 Oct 2021 21:36:49 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: <6q4u2H-jsjQgu0xR2dkZ9sLVV9j8_7tu01egJ_uVbn0=.cd24a440-f53b-41d1-8b59-e52c7c7a4fa4@github.com> On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. 
Not really reproduceable otherwise even with sleeps. Hi Coleen, Given the array_klasses() is accessed lock-free (using acquire/release semantics) I don't see how adding this lock prevents a partially restored array from being observed? The lock prevents concurrent restoration operations, and prevents restoration concurrently with creation (if that is even possible), but readers are still lock-free AFAICS. Thanks, David src/hotspot/share/oops/instanceKlass.cpp line 2588: > 2586: if (array_klasses() != NULL) { > 2587: // To get a consistent list of classes we need MultiArray_lock to ensure > 2588: // array classes aren't observed while they are being created. s/created/restored ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Wed Oct 13 22:51:52 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 13 Oct 2021 22:51:52 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. If you look at everywhere array classes are created, there's a MultiArray_lock protecting them until the mirror is created so that it seems atomic when added to the ClassLoaderData::_klasses and the mirror is created and added to the klass. The JVMTI code locks MultiArray_lock to iterate through the classes (with a similar comment, actually copied). 
The restore_unshareable_info code doesn't lock restoring the mirror and adding the array Klass to the ClassLoaderData, but it should. The lock could be lower down but just taking it once seemed just as good and in a better place with less conditionals. ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From duke at openjdk.java.net Wed Oct 13 23:21:50 2021 From: duke at openjdk.java.net (duke) Date: Wed, 13 Oct 2021 23:21:50 GMT Subject: Withdrawn: 8270554: Shenandoah: Optimize heap scan loop In-Reply-To: References: Message-ID: On Thu, 15 Jul 2021 14:37:26 GMT, Roman Kennke wrote: > This is a fall-out from Lilliput. I noticed that in the heap scan loop, we load the size of objects in the size-based object scanner, even though all of the object closures already load the size, or at least in some cases, the Klass* which is necessary to determine the size. We can optimize that by making the scan loop co-operate with the closures. In other words, this changes the loop to avoid double-loading the Klass* in most cases in the size-based part of the scan loop. > > Note: the motivation in Lilliput is not performance, but correctness, because there loading the Klass* means loading the header, and this needs to be done carefully because of concurrent evacuation and concurrent locking code both messing with the header, and thus depends a lot on the actual closures to do it correctly. > > Implementation notes: > - SH::evacuate_object() has been changed so that it can return both the forwardee and the size. I opted to return the size as return-value because otherwise I'd have to null check an incoming pointer in the cases when we're not interested in the size. The way it is done, it can simply be ignored (and optimized-out) by the compiler. > - I added a do_object_size() variant to all affected iterators. I tried to do it with templates, but could not figure out how to please the compiler. > - While I was at it, I marked all do_object() methods as 'inline'. 
> - I ran some benchmarks. I think I see consistent but small improvements in evac and update-refs times, but it's not large enough to say that it is a definite improvement. > > Testing: > - [x] hotspot_gc_shenandoah > - [x] tier1 (+UseShenandoahGC) > - [x] specjvm testing This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/4797 From iklam at openjdk.java.net Thu Oct 14 00:06:54 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 14 Oct 2021 00:06:54 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: <5dpWXNoc6z9E2Kmg3ncvUl11oEjC_krmC19w3_TShgI=.aec10bf2-2355-4bd9-8c92-6645d3a95455@github.com> On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. LGTM ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Thu Oct 14 03:46:46 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 03:46:46 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. 
> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. If the JVMTI code already acquires the lock to iterate through the classes then I can see why taking the lock in restore_unshareable_info might fix the observed problem. But that begs the question as to why JVMTI, as an observer, needs to take the lock in the first place, when these objects should not be observable until after they have been safely published? If the objects are not being safely published then other observers may also need to take out this lock. The real bug here seems to be that instanceKlasses can be found and inspected before InstanceKlass::restore_unshareable_info has been applied to that instanceKlass! Looking at Klass::restore_unshareable_info it has this: // Add to class loader list first before creating the mirror // (same order as class file parsing) loader_data->add_class(this); And that to me seems a bug as now the class can be found through the CLD before it is ready to be observed. You fixed the current problem where JVMTI stumbles over a null mirror related to an array class, but what other inappropriately set data in the instanceKlass might it also look at? 
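The locking pattern under discussion can be sketched as below. This is a minimal model under stated assumptions, not HotSpot code: `ArrayClass`, `ClassList`, `restore` and `count_incomplete` are hypothetical stand-ins for the array Klass, the ClassLoaderData class list, the restore path and a JVMTI-style observer, and `std::mutex` stands in for MultiArray_lock.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an array class whose mirror is set in a second step.
struct ArrayClass {
  const void* mirror = nullptr;
};

class ClassList {
 public:
  // Registers the class and sets its mirror under a single lock hold, so an
  // observer that takes the same lock never sees the intermediate state.
  void restore(ArrayClass* k, const void* mirror) {
    std::lock_guard<std::mutex> guard(lock_);
    classes_.push_back(k);   // the class becomes reachable here...
    k->mirror = mirror;      // ...and is completed before the lock is released
  }

  // Observer: counts registered classes whose mirror is still missing.
  int count_incomplete() {
    std::lock_guard<std::mutex> guard(lock_);
    int n = 0;
    for (const ArrayClass* k : classes_) {
      if (k->mirror == nullptr) ++n;
    }
    return n;
  }

 private:
  std::mutex lock_;
  std::vector<ArrayClass*> classes_;
};
```

The sketch only protects observers that take the lock. A reader that walks the list without the lock could still see a registered class with a null mirror, which is the concern raised above about publishing before initialization is complete.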
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From xliu at openjdk.java.net Thu Oct 14 04:18:53 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 14 Oct 2021 04:18:53 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> <6166f716-7e3f-ae49-8a81-00ce185e9640@oracle.com> Message-ID: On Tue, 12 Oct 2021 23:04:22 GMT, David Holmes wrote: >> On 10/12/21 8:04 AM, David Holmes wrote: >>> On 12/10/2021 4:15 pm, Xin Liu wrote: >>>> On Tue, 12 Oct 2021 05:27:15 GMT, David Holmes >>>> wrote: >>>> >>>>>> Xin Liu has updated the pull request incrementally with one >>>>>> additional commit since the last revision: >>>>>> >>>>>> Revert JavaThreadInVMAndNative change. >>>>>> Only change the state to Native if current thread is Java >>>>>> Thread and it's in VM. >>>>>> Restore thread state after fork_and_exec() if it's changed. >>>>> >>>>> src/hotspot/share/utilities/vmError.cpp line 1659: >>>>> >>>>>> 1657: VMErrorForceInNative fn(Thread::current_or_null()); >>>>>> 1658: if (os::fork_and_exec(cmd) < 0) { >>>>>> 1659: out.print_cr("os::fork_and_exec failed: %s (%s=%d)", >>>>> >>>>> I'm still concerned about using out.print_cr if you are >>>>> _thread_in_native - is it safe? If it is, and if `print_raw` is also >>>>> safe to execute when in native, then you could just apply the >>>>> VMErrorForceInNative before the while loop. >>>> >>>> I think it's safe. IIUC, a java thread in Native is safe if we keep >>>> track all oops in handles. >>>> In that sense, we prevent them from GC. A fdStream should have >>>> nothing to do with oops and GC. >>> >>> Can out.print_cr acquire a lock (e.g. ttyLock?) ? If so then it could >>> block and that would use a ThreadBlockInVM which requires it to be >>> _thread_in_vm. That is the kind of "safety" I'm concerned about. 
>> >> ttyLock is taken and released inside of print_cr, but it's a nosafepoint >> lock so would not block. >> Coleen >> >>> >>> David >>> ----- >>> > >> ttyLock is taken and released inside of print_cr, but it's a nosafepoint lock so would not block. > > Okay - thanks @coleenp hi, @dholmes-ora , @tstuefe , Thank you for helping me in this PR. Let me drop this one. This is 10x more harder than I initially thought. `VMError::report_and_die()` could be called from anywhere, anytime. I thought _thread must be in VM state if it is a Java Thread. I was wrong. Like David pointed out, this patch can only solve "the second possibilities". Without it, we can claim that "Deadlock when jcmd of OnError attaches to itself" is a feature instead of a bug. But we shouldn't leave undefined behavior to HotSpot for "the 1st set of possibilities". Further, transitioning current Java Thread to Native can't guarantee it's safepoint safe either. It's still up to the last frame. static bool safepoint_safe_with(JavaThread *thread, JavaThreadState state) { switch(state) { case _thread_in_native: // native threads are safe if they have no java stack or have walkable stack return !thread->has_last_Java_frame() || thread->frame_anchor()->walkable(); ... I don't like nondeterministic behavior, which leaves more questions than answers. We can revisit it later if we have a better solution. 
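The point about the last frame can be illustrated with a small self-contained model. Everything here is a hypothetical sketch: `ModelThread`, `ForceInNative` and `safepoint_safe` are stand-ins, not HotSpot types, and the predicate models only the `_thread_in_native` case quoted above, treating every other state as unsafe (a simplification).

```cpp
#include <cassert>

enum ThreadState { in_vm, in_native };

// Hypothetical model of the thread fields the quoted check depends on.
struct ModelThread {
  ThreadState state = in_vm;
  bool has_last_java_frame = false;
  bool anchor_walkable = false;
};

// Models only the in-native branch of the quoted condition.
inline bool safepoint_safe(const ModelThread& t) {
  if (t.state != in_native) return false;
  return !t.has_last_java_frame || t.anchor_walkable;
}

// RAII guard: switches the thread to native and restores the saved state
// when the scope ends. A null thread is tolerated and left untouched.
class ForceInNative {
 public:
  explicit ForceInNative(ModelThread* t)
      : thread_(t), saved_(t != nullptr ? t->state : in_vm) {
    if (thread_ != nullptr) thread_->state = in_native;
  }
  ~ForceInNative() {
    if (thread_ != nullptr) thread_->state = saved_;
  }
  ForceInNative(const ForceInNative&) = delete;
  ForceInNative& operator=(const ForceInNative&) = delete;

 private:
  ModelThread* thread_;
  ThreadState saved_;
};
```

In this model, a thread with a last Java frame and a non-walkable anchor stays unsafe even while the guard is active, which matches the conclusion that changing the state alone is not sufficient.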
------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From xliu at openjdk.java.net Thu Oct 14 04:18:54 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 14 Oct 2021 04:18:54 GMT Subject: Withdrawn: 8273608: Deadlock when jcmd of OnError attaches to itself In-Reply-To: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Mon, 20 Sep 2021 22:02:37 GMT, Xin Liu wrote: > This patch allows the custom commands of OnError to attach to HotSpot itself. > It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). > This prevents cmds which require safepoint synchronization from deadlock. > eg. OnError='jcmd %p Thread.print'. > > Without this patch, we will encounter a deadlock at safepoint synchronization. > `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. > > > Aborting due to java.lang.OutOfMemoryError: Java heap space > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (debug.cpp:364), pid=94632, tid=94633 > # fatal error: OutOfMemory encountered: Java heap space > # > # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log > # > # -XX:OnError="jcmd %p Thread.print" > # Executing /bin/sh -c "jcmd 94632 Thread.print" ... 
> 94632: > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. > [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: > [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] > [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE > [10.616s][warning][safepoint] > [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From dholmes at openjdk.java.net Thu Oct 14 04:26:55 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 04:26:55 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Tue, 12 Oct 2021 07:02:12 GMT, Xin Liu wrote: >> This patch allows the custom commands of OnError to attach to HotSpot itself. >> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd). >> This prevents cmds which require safepoint synchronization from deadlock. >> eg. OnError='jcmd %p Thread.print'. >> >> Without this patch, we will encounter a deadlock at safepoint synchronization. >> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`. 
>> >> >> Aborting due to java.lang.OutOfMemoryError: Java heap space >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (debug.cpp:364), pid=94632, tid=94633 >> # fatal error: OutOfMemory encountered: Java heap space >> # >> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again >> # >> # An error report file with more information is saved as: >> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log >> # >> # -XX:OnError="jcmd %p Thread.print" >> # Executing /bin/sh -c "jcmd 94632 Thread.print" ... >> 94632: >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected: >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint. >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint: >> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000] >> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE >> [10.616s][warning][safepoint] >> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list) > > Xin Liu has updated the pull request incrementally with one additional commit since the last revision: > > Change to VM unconditionally as long as current thread is JavaThread. > > Hoist VMErrorForceNative out of While. Hi Xin, I must apologise for misleading you with the use of _thread_in_native, I had forgotten that safepoint-safety also relied on the stack state. 
David ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From stuefe at openjdk.java.net Thu Oct 14 04:53:56 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 14 Oct 2021 04:53:56 GMT Subject: RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8] In-Reply-To: References: <3mILJs2Lcq7t5gUDP70FH2LVsm-NT2UTsm1JY-rCKB0=.d4dd4b55-9476-43ad-a0cf-ce6d2a9bef4e@github.com> Message-ID: On Thu, 14 Oct 2021 04:22:33 GMT, David Holmes wrote: >> Xin Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Change to VM unconditionally as long as current thread is JavaThread. >> >> Hoist VMErrorForceNative out of While. > > Hi Xin, > > I must apologise for misleading you with the use of _thread_in_native, I had forgotten that safepoint-safety also relied on the stack state. > > David > hi, @dholmes-ora , @tstuefe , > > Thank you for helping me in this PR. Let me drop this one. > > This is 10x more harder than I initially thought. `VMError::report_and_die()` could be called from anywhere, anytime. I thought _thread must be in VM state if it is a Java Thread. I was wrong. > > Like David pointed out, this patch can only solve "the second possibilities". Without it, we can claim that "Deadlock when jcmd of OnError attaches to itself" is a feature instead of a bug. But we shouldn't leave undefined behavior to HotSpot for "the 1st set of possibilities". Further, transitioning current Java Thread to Native can't guarantee it's safepoint safe either. It's still up to the last frame. > > ``` > static bool safepoint_safe_with(JavaThread *thread, JavaThreadState state) { > switch(state) { > case _thread_in_native: > // native threads are safe if they have no java stack or have walkable stack > return !thread->has_last_Java_frame() || thread->frame_anchor()->walkable(); > ... > ``` > > I don't like nondeterministic behavior, which leaves more questions than answers. 
We can revisit it later if we have a better solution. Hi Xin, sorry for your frustration. The complexity of this was not clear to me as well. Thinking about this, self-attaching jcmd for analysis on error is maybe not the best solution to your problem. If the VM is crashy, all bets are off to what happens and we may hang and/or spoil the core. For hanging, we have the crash timeout watcher which will eventually kick the VM, but still, this is not perfect (takes too long, and you lose the core). And if the VM is in a java OOM, it is in a deterministic state but may have a high memory footprint which may prevent it from forking successfully. `os::fork_and_exec()` uses fork(), not vfork() or posix_spawn(), on some platforms. See the https://github.com/openjdk/jdk/pull/5698 for a discussion (stalled, waiting on input). An alternative I would like would be to add a way to execute jcmds from within the VM itself on OOM, without the need to spawn a child process, without involving the attach framework. We'd still have to dive into native, but since we would restrict this to deterministic OOM scenarios this should work. For example an option like `-XX:CommandOnOOM=...`, to complement the existing `OnOOM` switches. These have the problem that they don't fire - for backward compatibility reasons - for all cases of OOM. But a new switch is free to handle this differently. 
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/5590 From iklam at openjdk.java.net Thu Oct 14 05:08:59 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 14 Oct 2021 05:08:59 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 03:43:42 GMT, David Holmes wrote: > If the JVMTI code already acquires the lock to iterate through the classes then I can see why taking the lock in restore_unshareable_info might fix the observed problem. But that begs the question as to why JVMTI, as an observer, needs to take the lock in the first place, when these objects should not be observable until after they have been safely published? If the objects are not being safely published then other observers may also need to take out this lock. The real bug here seems to be that instanceKlasses can be found and inspected before InstanceKlass::restore_unshareable_info has been applied to that instanceKlass! Looking at Klass::restore_unshareable_info it has this: > > ``` > // Add to class loader list first before creating the mirror > // (same order as class file parsing) > loader_data->add_class(this); > ``` > > And that to me seems a bug as now the class can be found through the CLD before it is ready to be observed. > > You fixed the current problem where JVMTI stumbles over a null mirror related to an array class, but what other inappropriately set data in the instanceKlass might it also look at? Yes, I agree that's fishy. But I think the bug may not be the call to add_class. 
If you look at ClassFileParser::fill_instance_klass(), it also does this very early, before many fields of the class are initialized: const bool publicize = !is_internal(); _loader_data->add_class(ik, publicize); whereas Klass::restore_unshareable_info always uses publicize==true. Could that be an issue? Also, when a class is parsed, the mirror is created almost at the very end. However, for archived classes, the mirror is created when only the Klass portion are initialized. Maybe we should delay the mirror creation to a later time? ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Thu Oct 14 05:48:49 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 05:48:49 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 05:06:00 GMT, Ioi Lam wrote: > whereas Klass::restore_unshareable_info always uses publicize==true. Could that be an issue? The definition of publicize is just: `publicize = !is_internal();` and I can't see how that would change the visibility in terms of the JVMTI code. > Maybe we should delay the mirror creation to a later time? I think there should be a lot of similarity between the process for regular loading and the loading from the archive. But ultimately both should have the same property: no publishing of the klass until after it is fully "initialized". But neither process seems to ensure that, which is puzzling ... unless I'm not looking high enough up for some locking in the process in both cases. Thinking further on the fix ... without knowing exactly how the assertion failure arises, it seems to me that the fix cannot be complete. 
If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5935

From ddong at openjdk.java.net Thu Oct 14 05:54:16 2021
From: ddong at openjdk.java.net (Denghui Dong)
Date: Thu, 14 Oct 2021 05:54:16 GMT
Subject: RFR: 8275259: Add support for Java level DCmd
Message-ID: <48CHRA6hPJvj0wmgRjRKbV0b9WSv71sVk3NafMIdHi4=.a70075a9-68f7-4df7-bcd6-7de6de7cb259@github.com>

Last week I proposed extending DCmd to allow Java developers to customize their own diagnostic commands. At present, I have implemented a preliminary version.

In the current implementation, I provided some simple APIs in the Java layer (under sun.management.cmd) for defining and registering commands.

- Executable interface

  User-defined commands need to implement this interface, which contains only one simple method:

      /**
       * @param output the output when this executable is running
       */
      void execute(PrintWriter output);

- Add two annotations (@Command and @Parameter) to describe the command meta info

- Use Factory API to register command, the following forms are supported

      @Command(name = "My.Echo", description = "Echo description")
      class Echo implements Executable {

          @Parameter(name = "text", ordinal=0, isMandatory = true)
          String text;

          @Parameter(name = "repeat", isMandatory = true, defaultValue = "1")
          int repeat;

          @Override
          public void execute(PrintWriter out) {
              for (int i = 0 ; i < repeat; i++) {
                  out.println(text);
              }
          }
      }

      Factory.register(Echo.class);

      Factory.register("My.Date", output -> {
          output.println(new Date());
      });

- When the command is running, the VM will call `Executor.executeCommand` to execute the command. In the implementation of Executor, I introduced a simple timeout mechanism to prevent the command channel from being blocked.
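
To illustrate the kind of timeout mechanism described above, here is a minimal sketch. It is not the code from the pull request: the class `TimedCommandRunner`, the nested `Executable` stand-in, the worker-thread name and the returned messages are all hypothetical. It only shows one common way to bound a command's run time with a `Future`.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the proposed sun.management.cmd API.
final class TimedCommandRunner {

    /** Stand-in for the proposal's Executable interface (assumed shape). */
    public interface Executable {
        void execute(PrintWriter output);
    }

    // Daemon workers, so a command that never returns cannot keep the VM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "dcmd-worker");
        t.setDaemon(true);
        return t;
    });

    /** Runs the command and returns its output, or a short message on timeout or failure. */
    public static String run(Executable command, long timeoutMillis) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        Future<?> future = POOL.submit(() -> command.execute(out));
        try {
            // Wait for the command, but never longer than the given timeout.
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; the caller is unblocked either way
            return "command timed out after " + timeoutMillis + " ms";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return "command interrupted";
        } catch (ExecutionException e) {
            return "command failed: " + e.getCause();
        }
        out.flush();
        return buffer.toString();
    }
}
```

With this shape, `TimedCommandRunner.run(out -> out.print("hello"), 1000)` returns the command's output, while a command that sleeps past the limit yields the timeout message instead of blocking the caller. Note that cancellation only interrupts the worker, so a command that ignores interrupts keeps running in the background.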
At the VM layer, I extended the existing DCmd framework(implemented JavaDCmd and JavaDCmdFactoryImpl) to be compatible with existing functions (jmx, help, etc.). In terms of security, considering that the SecurityManager will be deprecated, there are not too many considerations for the time being. Any input is appreciated. Thanks, Denghui ------------- Commit messages: - 8275259: Add support for Java level DCmd Changes: https://git.openjdk.java.net/jdk/pull/5938/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5938&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275259 Stats: 1159 lines in 12 files changed: 1158 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5938.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5938/head:pull/5938 PR: https://git.openjdk.java.net/jdk/pull/5938 From dholmes at openjdk.java.net Thu Oct 14 05:57:49 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 05:57:49 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. AFAICS `publicize` only affects logging in `add_class`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Thu Oct 14 06:02:53 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 06:02:53 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > This is the assert it gets in ClassLoaderData::loaded_classes_do, where this is the NULL CLD. > 359 if (k->is_array_klass() || (k->is_instance_klass() && InstanceKlass::cast(k)->is_loaded())) { > 360 #ifdef ASSERT > 361 oop m = k->java_mirror(); > 362 assert(m != NULL, "NULL mirror"); Maybe it is the assertion that is wrong? If we are actually adding incomplete classes to the CLD such that they are visible to traversal, then the code should not be assuming they are complete and a NULL mirror may be perfectly acceptable? 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From pliden at openjdk.java.net Thu Oct 14 09:03:18 2021 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 14 Oct 2021 09:03:18 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure [v2] In-Reply-To: References: Message-ID: <-Syf1OT7GZmoQxHt8U4f-A5sEDEy4Yppqwqfj4B1FXo=.bbb9637d-bc9f-4e35-a1a5-0c94a90d825f@github.com> > I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: > > * Rename AbstractGangTask to WorkerTask > * Rename WorkGang to WorkerThreads > * Fold GangWorker into WorkerThread > * Fold WorkManager into WorkerThreads > * Move SubTaskDone and friends to a new workerUtils.hpp/cpp > > I've split things up into several commits to make it easier to review. > > Testing: Passes Tier 1-3 on all Oracle platforms. Per Liden has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Merge master - Clean up naming in ReferenceProcessor - Clean up naming in Shenandoah - Clean up naming in HeapDumper/HeapInspector - Clean up naming in G1 - Clean up naming in pretouch - Clean up naming in WeakProcessor - Clean up naming in ZGC - Remove unused log tag - Rename workgroup.hpp/cpp to workerThread.hpp/cpp - ... 
and 11 more: https://git.openjdk.java.net/jdk/compare/8b1b6f9f...9a89f785 ------------- Changes: https://git.openjdk.java.net/jdk/pull/5886/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5886&range=01 Stats: 2415 lines in 105 files changed: 881 ins; 1083 del; 451 mod Patch: https://git.openjdk.java.net/jdk/pull/5886.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5886/head:pull/5886 PR: https://git.openjdk.java.net/jdk/pull/5886 From coleenp at openjdk.java.net Thu Oct 14 11:48:17 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 11:48:17 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v2] In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix comment. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5935/files - new: https://git.openjdk.java.net/jdk/pull/5935/files/045366d4..b5ba03a1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5935&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5935&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5935.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5935/head:pull/5935 PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Thu Oct 14 11:48:20 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 11:48:20 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v2] In-Reply-To: <6q4u2H-jsjQgu0xR2dkZ9sLVV9j8_7tu01egJ_uVbn0=.cd24a440-f53b-41d1-8b59-e52c7c7a4fa4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> <6q4u2H-jsjQgu0xR2dkZ9sLVV9j8_7tu01egJ_uVbn0=.cd24a440-f53b-41d1-8b59-e52c7c7a4fa4@github.com> Message-ID: On Wed, 13 Oct 2021 21:30:38 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comment. > > src/hotspot/share/oops/instanceKlass.cpp line 2588: > >> 2586: if (array_klasses() != NULL) { >> 2587: // To get a consistent list of classes we need MultiArray_lock to ensure >> 2588: // array classes aren't observed while they are being created. > > s/created/restored fixed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Thu Oct 14 11:55:49 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 11:55:49 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v2] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 11:48:17 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix comment. Thanks Ioi for the review and for answering David's questions. Let me try to answer further. In ClassLoaderData::loaded_classes_do, the InstanceKlass is added to the CLD but it's not marked as 'loaded' until the mirror is created, so it won't be followed. The array case has no such initialization state field, so both adding to CLD and updating the mirror must be as-if atomic. This is accomplished by locking the MultiArray_lock. In the creation case, the ArrayKlass is added to the CLD after the mirror is created, still with the lock. In the restore_unshareable_info, the class is added to the CLD first because it shares code with InstanceKlass (it's in Klass::restore_unshareable_info). We could add some 'if' statements in Klass::restore_unshareable_info to fix the order of the ArrayKlass case, but taking out the lock will still be needed to be consistent with the creation case, and is sufficient here also. Also, I would not like to take out the null check for this case. 
Having code find a null mirror further down will likely cause problems in whatever the do_klass() callback wants (in this case, likely wants the mirror). ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From pliden at openjdk.java.net Thu Oct 14 14:08:56 2021 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 14 Oct 2021 14:08:56 GMT Subject: RFR: 8275035: Clean up worker thread infrastructure [v2] In-Reply-To: <-Syf1OT7GZmoQxHt8U4f-A5sEDEy4Yppqwqfj4B1FXo=.bbb9637d-bc9f-4e35-a1a5-0c94a90d825f@github.com> References: <-Syf1OT7GZmoQxHt8U4f-A5sEDEy4Yppqwqfj4B1FXo=.bbb9637d-bc9f-4e35-a1a5-0c94a90d825f@github.com> Message-ID: On Thu, 14 Oct 2021 09:03:18 GMT, Per Liden wrote: >> I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: >> >> * Rename AbstractGangTask to WorkerTask >> * Rename WorkGang to WorkerThreads >> * Fold GangWorker into WorkerThread >> * Fold WorkManager into WorkerThreads >> * Move SubTaskDone and friends to a new workerUtils.hpp/cpp >> >> I've split things up into several commits to make it easier to review. >> >> Testing: Passes Tier 1-3 on all Oracle platforms. > > Per Liden has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: > > - Merge master > - Clean up naming in ReferenceProcessor > - Clean up naming in Shenandoah > - Clean up naming in HeapDumper/HeapInspector > - Clean up naming in G1 > - Clean up naming in pretouch > - Clean up naming in WeakProcessor > - Clean up naming in ZGC > - Remove unused log tag > - Rename workgroup.hpp/cpp to workerThread.hpp/cpp > - ... and 11 more: https://git.openjdk.java.net/jdk/compare/8b1b6f9f...9a89f785 Thanks all for reviewing! 
------------- PR: https://git.openjdk.java.net/jdk/pull/5886 From pliden at openjdk.java.net Thu Oct 14 14:08:57 2021 From: pliden at openjdk.java.net (Per Liden) Date: Thu, 14 Oct 2021 14:08:57 GMT Subject: Integrated: 8275035: Clean up worker thread infrastructure In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 09:21:21 GMT, Per Liden wrote: > I propose that we clean up our GangWorker/WorkGang and related classes, to remove abstractions we no longer need (after CMS was removed, MutexDispatcher was removed, Parallel is now using WorkGang, etc) and adjusting names as follows: > > * Rename AbstractGangTask to WorkerTask > * Rename WorkGang to WorkerThreads > * Fold GangWorker into WorkerThread > * Fold WorkManager into WorkerThreads > * Move SubTaskDone and friends to a new workerUtils.hpp/cpp > > I've split things up into several commits to make it easier to review. > > Testing: Passes Tier 1-3 on all Oracle platforms. This pull request has now been integrated. Changeset: 54b88707 Author: Per Liden URL: https://git.openjdk.java.net/jdk/commit/54b887076612c0eaa410a849178f8ba0c4ed3eeb Stats: 2415 lines in 105 files changed: 881 ins; 1083 del; 451 mod 8275035: Clean up worker thread infrastructure Reviewed-by: stefank, ayang ------------- PR: https://git.openjdk.java.net/jdk/pull/5886 From burban at openjdk.java.net Thu Oct 14 14:54:53 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 14 Oct 2021 14:54:53 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v3] In-Reply-To: References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> <6JpuK39Hiw9D4DlQoUKxqrIxSADCAjyxCxiy3niPBwI=.d17d804f-df3c-4f83-af63-4057cd0ca632@github.com> Message-ID: <4xfxEIBgf6rnZWISOlngf_cVaepjji22V1civ6MJ1BQ=.93f0932d-94bd-46c7-9508-0084b40827c1@github.com> On Wed, 13 Oct 2021 07:21:23 GMT, Andrew Haley wrote: >> So we have fixed the only use of `popa()` so that it does not 
clobber R18. That is good. >> However, I think we've found another bug. `reguard_yellow_pages()` is a native function, so we should be saving floating-point registers too. It looks to me like this code should use `push_call_clobbered_registers()`. Then we'll have no use for `pusha()` in trunk, and your fix for `popa()` can go into 11u. > >> Thanks for the suggestion @theRealAph. By looking through the code I discovered another buggy set of helpers, `{push,pop}_CPU_state()` which also messed with `r18`. > > Sure, so we can do the same thing there, if we stil need it. > >> Does it look fine to you, or should the code around `reguard_yellow_pages()` rather use `{push,pop}-call_clobbered_registers()`? > > Yes, very much so. @theRealAph would you mind to sponsor this change? ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From coleenp at openjdk.java.net Thu Oct 14 16:03:22 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 16:03:22 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. > Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add assert that MultiArray_lock is held for loaded_classes_do. 
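
As an illustration of the invariant this commit asserts, here is a small Java analogy. It is not HotSpot code, and every name in it (`ArrayClassRegistry`, `ArrayKlassModel`, `multiArrayLock`) is hypothetical. It models the idea from the discussion above: an array class is added to the list before its mirror is set, so both steps happen under one lock, and the iteration checks that the caller holds that same lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Java analogy only; the real code lives in HotSpot's C++ sources.
final class ArrayClassRegistry {

    /** Simplified model of an array class whose mirror is set after it is listed. */
    static final class ArrayKlassModel {
        final String name;
        Object mirror; // null until the "restore" step completes
        ArrayKlassModel(String name) { this.name = name; }
    }

    private final ReentrantLock multiArrayLock = new ReentrantLock();
    private final List<ArrayKlassModel> classes = new ArrayList<>();

    /** Adds the class first and sets the mirror second, all under the lock. */
    void restore(String name) {
        multiArrayLock.lock();
        try {
            ArrayKlassModel k = new ArrayKlassModel(name);
            classes.add(k);              // visible in the list before the mirror exists
            k.mirror = "mirror:" + name; // completed before the lock is released
        } finally {
            multiArrayLock.unlock();
        }
    }

    /** Iterates the classes; the caller must already hold the lock. */
    void loadedClassesDo(Consumer<ArrayKlassModel> visitor) {
        if (!multiArrayLock.isHeldByCurrentThread()) {
            throw new IllegalStateException("multiArrayLock must be held");
        }
        for (ArrayKlassModel k : classes) {
            if (k.mirror == null) {
                throw new AssertionError("NULL mirror");
            }
            visitor.accept(k);
        }
    }

    /** Convenience wrapper that runs an action while holding the lock. */
    void withLock(Runnable action) {
        multiArrayLock.lock();
        try {
            action.run();
        } finally {
            multiArrayLock.unlock();
        }
    }
}
```

Because an observer can only iterate while holding the lock, it never sees the window between `classes.add(k)` and the mirror assignment, which is why the null-mirror check can stay in place.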
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5935/files - new: https://git.openjdk.java.net/jdk/pull/5935/files/b5ba03a1..4047ee88 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5935&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5935&range=01-02 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5935.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5935/head:pull/5935 PR: https://git.openjdk.java.net/jdk/pull/5935 From burban at openjdk.java.net Thu Oct 14 17:01:57 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 14 Oct 2021 17:01:57 GMT Subject: Integrated: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler In-Reply-To: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: On Tue, 5 Oct 2021 20:33:31 GMT, Bernhard Urban-Forster wrote: > r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. > > The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: > > https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 > > I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). 
But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above.
> 
> Output of `-XX:+PrintInterpreter` before this change:
> 
> ----------------------------------------------------------------------
> method entry point (kind = native)  [0x0000000138809b00, 0x000000013880a280]  1920 bytes
> --------------------------------------------------------------------------------
>   0x0000000138809b00:   ldr   x2, [x12, #16]
>   0x0000000138809b04:   ldrh  w2, [x2, #44]
>   0x0000000138809b08:   add   x24, x20, x2, uxtx #3
>   0x0000000138809b0c:   sub   x24, x24, #0x8
>   [...]
>   0x0000000138809fa4:   stp   x16, x17, [sp, #128]
>   0x0000000138809fa8:   stp   x18, x19, [sp, #144]
>   0x0000000138809fac:   stp   x20, x21, [sp, #160]
>   [...]
>   0x0000000138809fc0:   stp   x30, xzr, [sp, #240]
>   0x0000000138809fc4:   mov   x0, x28
>   ;; 0x10864ACCC
>   0x0000000138809fc8:   mov   x9, #0xaccc   // #44236
>   0x0000000138809fcc:   movk  x9, #0x864, lsl #16
>   0x0000000138809fd0:   movk  x9, #0x1, lsl #32
>   0x0000000138809fd4:   blr   x9
>   0x0000000138809fd8:   ldp   x2, x3, [sp, #16]
>   [...]
>   0x0000000138809ff4:   ldp   x16, x17, [sp, #128]
>   0x0000000138809ff8:   ldp   x18, x19, [sp, #144]
>   0x0000000138809ffc:   ldp   x20, x21, [sp, #160]
> 
> 
> After:
> 
> ----------------------------------------------------------------------
> method entry point (kind = native)  [0x0000000108e4db00, 0x0000000108e4e280]  1920 bytes
> 
> --------------------------------------------------------------------------------
>   0x0000000108e4db00:   ldr   x2, [x12, #16]
>   0x0000000108e4db04:   ldrh  w2, [x2, #44]
>   0x0000000108e4db08:   add   x24, x20, x2, uxtx #3
>   0x0000000108e4db0c:   sub   x24, x24, #0x8
>   [...]
>   0x0000000108e4dfa4:   stp   x16, x17, [sp, #128]
>   0x0000000108e4dfa8:   stp   x19, x20, [sp, #144]
>   0x0000000108e4dfac:   stp   x21, x22, [sp, #160]
>   [...]
>   0x0000000108e4dfbc:   stp   x29, x30, [sp, #224]
>   0x0000000108e4dfc0:   mov   x0, x28
>   ;; 0x107E4A06C
>   0x0000000108e4dfc4:   mov   x9, #0xa06c   // #41068
>   0x0000000108e4dfc8:   movk  x9, #0x7e4, lsl #16
>   0x0000000108e4dfcc:   movk  x9, #0x1, lsl #32
>   0x0000000108e4dfd0:   blr   x9
>   0x0000000108e4dfd4:   ldp   x2, x3, [sp, #16]
>   [...]
>   0x0000000108e4dff0:   ldp   x16, x17, [sp, #128]
>   0x0000000108e4dff4:   ldp   x19, x20, [sp, #144]
>   0x0000000108e4dff8:   ldp   x21, x22, [sp, #160]
>   [...]

This pull request has now been integrated.

Changeset: ede3f4e9
Author: Bernhard Urban-Forster
Committer: Andrew Haley
URL: https://git.openjdk.java.net/jdk/commit/ede3f4e94c752a8457b7c24e001bd122845d2f6a
Stats: 25 lines in 3 files changed: 8 ins; 13 del; 4 mod

8274795: AArch64: avoid spilling and restoring r18 in macro assembler

Reviewed-by: aph

-------------

PR: https://git.openjdk.java.net/jdk/pull/5828

From psandoz at openjdk.java.net Thu Oct 14 17:21:25 2021
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 14 Oct 2021 17:21:25 GMT
Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v2]
In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com>
References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com>
Message-ID: <0LvaNkXLAVEoVt15uJ3Y2ZFzbctyTk9A76JhKXmZBqo=.f827b200-6aca-49ec-bcf5-76dbe194f4a0@github.com>

> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE.
> 
> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend.
> 
> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work.
> > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: Apply patch from https://github.com/openjdk/panama-vector/pull/152 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5873/files - new: https://git.openjdk.java.net/jdk/pull/5873/files/f558b74b..bfb8fff8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From psandoz at openjdk.java.net Thu Oct 14 17:30:52 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 14 Oct 2021 17:30:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v2] In-Reply-To: <0LvaNkXLAVEoVt15uJ3Y2ZFzbctyTk9A76JhKXmZBqo=.f827b200-6aca-49ec-bcf5-76dbe194f4a0@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <0LvaNkXLAVEoVt15uJ3Y2ZFzbctyTk9A76JhKXmZBqo=.f827b200-6aca-49ec-bcf5-76dbe194f4a0@github.com> Message-ID: On Thu, 14 Oct 2021 17:21:25 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. >> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. 
>> 
>> No API enhancements were required and only a few additional tests were needed.
> 
> Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Apply patch from https://github.com/openjdk/panama-vector/pull/152

@njian there is a conflict with `macroAssembler_aarch64.cpp`:

    --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
    +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
    @@@ -2580,8 -2495,8 +2571,13 @@@ void MacroAssembler::pop_call_clobbered
      }

      void MacroAssembler::push_CPU_state(bool save_vectors, bool use_sve,
    ++<<<<<<< HEAD
    +                                     int sve_vector_size_in_bytes, int total_predicate_in_bytes) {
    +   push(0x3fffffff, sp);         // integer registers except lr & sp
    ++=======
    +                                     int sve_vector_size_in_bytes) {
    +   push(RegSet::range(r0, r29), sp); // integer registers except lr & sp
    ++>>>>>>> master
        if (save_vectors && use_sve && sve_vector_size_in_bytes > 16) {
          sub(sp, sp, sve_vector_size_in_bytes * FloatRegisterImpl::number_of_registers);
          for (int i = 0; i < FloatRegisterImpl::number_of_registers; i++) {

I think the resolution is this:

    @@ -2581,7 +2572,7 @@ void MacroAssembler::pop_call_clobbered_registers_except(RegSet exclude) {
     void MacroAssembler::push_CPU_state(bool save_vectors, bool use_sve,
                                         int sve_vector_size_in_bytes, int total_predicate_in_bytes) {
    -  push(0x3fffffff, sp);         // integer registers except lr & sp
    +  push(RegSet::range(r0, r29), sp); // integer registers except lr & sp

is that correct?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5873

From duke at openjdk.java.net Thu Oct 14 20:12:42 2021
From: duke at openjdk.java.net (Evgeny Astigeevich)
Date: Thu, 14 Oct 2021 20:12:42 GMT
Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v11]
In-Reply-To:
References:
Message-ID:

> This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html).
> > It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: > > - `none`: no implementation for spin pauses. This is the default value. > - `nop`: use `nop` instruction for spin pauses. > - `isb`: use `isb` instruction for spin pauses. > - `yield`: use `yield` instruction for spin pauses. > > And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. > > The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. > > Testing: > > - `make test TEST="gtest"`: Passed > - `make run-test TEST="tier1"`: Passed > - `make run-test TEST="tier2"`: Passed > - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed > > CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Add simple microbenchmark to measure counting with pauses ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5562/files - new: https://git.openjdk.java.net/jdk/pull/5562/files/85fe15b2..970d8d6f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=09-10 Stats: 81 lines in 1 file changed: 81 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5562.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5562/head:pull/5562 PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Thu Oct 14 20:12:43 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 14 Oct 2021 20:12:43 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Tue, 12 Oct 2021 08:11:20 GMT, Andrew Haley 
wrote: > threads racing to count up Done. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Thu Oct 14 20:12:45 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 14 Oct 2021 20:12:45 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Mon, 11 Oct 2021 20:17:36 GMT, Evgeny Astigeevich wrote: >> This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). >> >> It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: >> >> - `none`: no implementation for spin pauses. This is the default value. >> - `nop`: use `nop` instruction for spin pauses. >> - `isb`: use `isb` instruction for spin pauses. >> - `yield`: use `yield` instruction for spin pauses. >> >> And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. >> >> The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. >> >> Testing: >> >> - `make test TEST="gtest"`: Passed >> - `make run-test TEST="tier1"`: Passed >> - `make run-test TEST="tier2"`: Passed >> - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed >> >> CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC

Results:

- 1 `isb`

Benchmark                               (maxNum)  Mode  Cnt    Score   Error  Units
ThreadOnSpinWait.count:withOnSpinWait    1000000  avgt   10   18.029 ± 0.001  ms/op
ThreadOnSpinWait.count:withSleep0        1000000  avgt   10  337.506 ± 2.530  ms/op
ThreadOnSpinWait.count:withoutPause      1000000  avgt   10    2.663 ± 0.073  ms/op

- 1 `yield`

Benchmark                               (maxNum)  Mode  Cnt    Score   Error  Units
ThreadOnSpinWait.count:withOnSpinWait    1000000  avgt   10    2.667 ± 0.069  ms/op
ThreadOnSpinWait.count:withSleep0        1000000  avgt   10  339.933 ± 4.954  ms/op
ThreadOnSpinWait.count:withoutPause      1000000  avgt   10    2.677 ± 0.075  ms/op

------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From burban at openjdk.java.net Thu Oct 14 20:21:53 2021 From: burban at openjdk.java.net (Bernhard Urban-Forster) Date: Thu, 14 Oct 2021 20:21:53 GMT Subject: RFR: JDK-8274795: AArch64: avoid spilling and restoring r18 in macro assembler [v5] In-Reply-To: References: <3FQD96CcugPZkfoXtOx0r5iSV4rqpzq2UG7rswcZgso=.44ed43bb-27d8-4a3e-96cb-4658c7231626@github.com> Message-ID: On Wed, 13 Oct 2021 09:03:08 GMT, Bernhard Urban-Forster wrote: >> r18 should not be used as it is reserved as platform register. Linux is fine with userspace using it, but Windows and also recently macOS (https://github.com/openjdk/jdk11u-dev/pull/301#issuecomment-911998917 ) are actually using it on the kernel side. >> >> The macro assembler uses the bit pattern `0x7fff_ffff` (== `r0-r30`) to specify which registers to spill; fortunately this helper is only used here: >> >> https://github.com/openjdk/jdk/blob/c05dc268acaf87236f30cf700ea3ac778e3b20e5/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp#L1400-L1404 >> >> I haven't seen causing this particular instance any issues in practice _yet_, presumably because it looks hard to align the stars in order to trigger a problem (between `stp` and `ldp` of `r18` a transition to kernel space must happen *and* the kernel needs to do something with `r18`). But jdk11u-dev has more usages of the `::pusha`/`::popa` macro and that causes troubles as explained in the link above. 
>> >> Output of `-XX:+PrintInterpreter` before this change: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000138809b00, 0x000000013880a280] 1920 bytes >> -------------------------------------------------------------------------------- >> 0x0000000138809b00: ldr x2, [x12, #16] >> 0x0000000138809b04: ldrh w2, [x2, #44] >> 0x0000000138809b08: add x24, x20, x2, uxtx #3 >> 0x0000000138809b0c: sub x24, x24, #0x8 >> [...] >> 0x0000000138809fa4: stp x16, x17, [sp, #128] >> 0x0000000138809fa8: stp x18, x19, [sp, #144] >> 0x0000000138809fac: stp x20, x21, [sp, #160] >> [...] >> 0x0000000138809fc0: stp x30, xzr, [sp, #240] >> 0x0000000138809fc4: mov x0, x28 >> ;; 0x10864ACCC >> 0x0000000138809fc8: mov x9, #0xaccc // #44236 >> 0x0000000138809fcc: movk x9, #0x864, lsl #16 >> 0x0000000138809fd0: movk x9, #0x1, lsl #32 >> 0x0000000138809fd4: blr x9 >> 0x0000000138809fd8: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000138809ff4: ldp x16, x17, [sp, #128] >> 0x0000000138809ff8: ldp x18, x19, [sp, #144] >> 0x0000000138809ffc: ldp x20, x21, [sp, #160] >> >> >> After: >> >> ---------------------------------------------------------------------- >> method entry point (kind = native) [0x0000000108e4db00, 0x0000000108e4e280] 1920 bytes >> >> -------------------------------------------------------------------------------- >> 0x0000000108e4db00: ldr x2, [x12, #16] >> 0x0000000108e4db04: ldrh w2, [x2, #44] >> 0x0000000108e4db08: add x24, x20, x2, uxtx #3 >> 0x0000000108e4db0c: sub x24, x24, #0x8 >> [...] >> 0x0000000108e4dfa4: stp x16, x17, [sp, #128] >> 0x0000000108e4dfa8: stp x19, x20, [sp, #144] >> 0x0000000108e4dfac: stp x21, x22, [sp, #160] >> [...] 
>> 0x0000000108e4dfbc: stp x29, x30, [sp, #224] >> 0x0000000108e4dfc0: mov x0, x28 >> ;; 0x107E4A06C >> 0x0000000108e4dfc4: mov x9, #0xa06c // #41068 >> 0x0000000108e4dfc8: movk x9, #0x7e4, lsl #16 >> 0x0000000108e4dfcc: movk x9, #0x1, lsl #32 >> 0x0000000108e4dfd0: blr x9 >> 0x0000000108e4dfd4: ldp x2, x3, [sp, #16] >> [...] >> 0x0000000108e4dff0: ldp x16, x17, [sp, #128] >> 0x0000000108e4dff4: ldp x19, x20, [sp, #144] >> 0x0000000108e4dff8: ldp x21, x22, [sp, #160] >> [...] > > Bernhard Urban-Forster has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Andrew Haley Thank you Andrew! ------------- PR: https://git.openjdk.java.net/jdk/pull/5828 From coleenp at openjdk.java.net Thu Oct 14 21:00:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 21:00:09 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file Message-ID: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. Tested with tier1-8. 
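The registration scheme described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `TrackedMutex`, `owned_by`, the integer thread ids, and the use of `std::mutex` in place of ThreadCritical are all hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a HotSpot Mutex; only the bookkeeping is modelled.
class TrackedMutex {
 public:
  explicit TrackedMutex(std::string name) : _name(std::move(name)) {
    // std::mutex plays the role that ThreadCritical has in the message.
    std::lock_guard<std::mutex> guard(registry_lock());
    registry().push_back(this);
  }

  ~TrackedMutex() {
    std::lock_guard<std::mutex> guard(registry_lock());
    std::vector<TrackedMutex*>& r = registry();
    // Search from the end: long-lived mutexes were registered first,
    // so a mutex being destroyed is likely near the back.
    auto it = std::find(r.rbegin(), r.rend(), this);
    if (it != r.rend()) {
      r.erase(std::next(it).base());
    }
  }

  TrackedMutex(const TrackedMutex&) = delete;
  TrackedMutex& operator=(const TrackedMutex&) = delete;

  // Ownership bookkeeping only; no real blocking is modelled here.
  void lock(int thread_id) { _owner = thread_id; }
  void unlock() { _owner = -1; }

  // Error-reporting path: walks the registry without taking the lock,
  // mirroring the trade-off described in the message.
  static std::vector<std::string> owned_by(int thread_id) {
    std::vector<std::string> names;
    for (const TrackedMutex* m : registry()) {
      if (m->_owner == thread_id) names.push_back(m->_name);
    }
    return names;
  }

 private:
  static std::vector<TrackedMutex*>& registry() {
    static std::vector<TrackedMutex*> r;
    return r;
  }
  static std::mutex& registry_lock() {
    static std::mutex l;
    return l;
  }

  std::string _name;
  int _owner = -1;
};
```

Every mutex registers itself on construction and removes itself on destruction, so the error path can list owned locks without knowing where each one was declared.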
------------- Commit messages: - 8274794: Print all owned locks in hs_err file Changes: https://git.openjdk.java.net/jdk/pull/5958/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274794 Stats: 284 lines in 8 files changed: 199 ins; 79 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/5958.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5958/head:pull/5958 PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Thu Oct 14 21:13:02 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 14 Oct 2021 21:13:02 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT Message-ID: There are Mutexes that have a safepoint checking rank, that are locked via: MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. But if you lock MutexLocker ml(nosafepoint_lock); // default safepoint check These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). Maybe this shouldn't be fixed, and we should just document the discrepancy. Tested with tier1-3. 
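The behaviour described in this message, before the change, can be summarised as a small truth table. The sketch below models only what the message states; `Rank`, `Flag`, `ThreadKind` and `would_assert` are hypothetical names for illustration and are not HotSpot identifiers.

```cpp
#include <cassert>

// Hypothetical model of the consistency check described in the message.
enum class Rank { safepoint, nosafepoint };                  // rank the lock was created with
enum class Flag { safepoint_check, no_safepoint_check };     // how the caller locks it
enum class ThreadKind { java, non_java };

constexpr bool would_assert(Rank rank, Flag flag, ThreadKind thread) {
  // A nosafepoint lock taken with the default safepoint check asserts for any thread.
  // A safepoint-checking lock taken with the no-safepoint-check flag asserts
  // only when the caller is a JavaThread; NonJavaThreads slip through.
  return (rank == Rank::nosafepoint && flag == Flag::safepoint_check) ||
         (rank == Rank::safepoint && flag == Flag::no_safepoint_check &&
          thread == ThreadKind::java);
}
```

The asymmetry between the two mismatched cases for a NonJavaThread is the discrepancy the message is about.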
------------- Commit messages: - 8274301: Locking with _no_safepoint_check_flag does not check NJT Changes: https://git.openjdk.java.net/jdk/pull/5959/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5959&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274301 Stats: 48 lines in 11 files changed: 10 ins; 7 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/5959.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5959/head:pull/5959 PR: https://git.openjdk.java.net/jdk/pull/5959 From dholmes at openjdk.java.net Thu Oct 14 22:12:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 22:12:48 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: <14VVyFHj2yD6j7XXf4FBlYcnrEkXCCWay8EEoToG_lo=.e7c19075-f0c6-4699-94e6-8740c5261b9f@github.com> On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. I will state again: If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. The added lock simply ensures that the JVMTI code and the restore_unshareable_info code cannot execute concurrently - it does not guarantee that the JVMTI code can't execute until after restore_unshareable_info. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From iklam at openjdk.java.net Thu Oct 14 22:27:46 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 14 Oct 2021 22:27:46 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: <14VVyFHj2yD6j7XXf4FBlYcnrEkXCCWay8EEoToG_lo=.e7c19075-f0c6-4699-94e6-8740c5261b9f@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> <14VVyFHj2yD6j7XXf4FBlYcnrEkXCCWay8EEoToG_lo=.e7c19075-f0c6-4699-94e6-8740c5261b9f@github.com> Message-ID: On Thu, 14 Oct 2021 22:09:41 GMT, David Holmes wrote: > I will state again: If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. > > The added lock simply ensures that the JVMTI code and the restore_unshareable_info code cannot execute concurrently - it does not guarantee that the JVMTI code can't execute until after restore_unshareable_info. For an InstanceKlass k to be iterated on by ClassLoaderData::loaded_classes_do(), k->is_loaded() must be true. The class enters the "loaded" state when SystemDictionary::add_to_hierarchy(k) is called, which happens after k->restore_unshareable_info() has completed. So the current implementation is safe w.r.t. InstanceKlasses. ArrayKlasses would be *kind of* safe after this PR, as the JVMTI code will not see an ArrayKlass that's in the middle of restore_unshareable_info(). However, it will still be possible for the JVMTI code to see the `[LFooBar;` class without seeing the `FooBar` class. This may produce unexpected results. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Thu Oct 14 22:40:45 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 14 Oct 2021 22:40:45 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. We hit an assertion failure because execution of loaded_klasses_do by JVMTI encountered a shared array class that had a NULL mirror. So two things need to be known: 1. At what point does that shared array class become visible to loaded_klasses_do? and 2. At what point does the shared array class have its mirror set? Either we must guarantee that the mirror is set before the array class is visible, or the two actions must be atomic with respect to the execution of loaded_klasses_do. The latter implies that any locking must cover both actions. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Fri Oct 15 02:29:48 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 02:29:48 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. Yes: 1. the shared array class becomes visible to loaded_klasses_do when ClassLoaderData::add_class() is called in Klass::restore_unshareable_info 2. the shared array class has its mirror set afterward in that same function: Klass::restore_unshareable_info The two actions are guaranteed to be atomic with respect to loaded_classes_do because the MultiArray_lock is held while both 1 and 2 are executed. The lock keeps 1 and 2 together. The lock keeps JVMTI from iterating through the CLDG while the ArrayKlass is in an inconsistent state. @iklam you're right - we'd see the ObjArrayKlass first since the InstanceKlass will be added to the CLD before it's marked as is_loaded(). I don't know what the result of that would be. You could open a new bug to investigate that and try to write a test case (using JVMTI) that would show whether it got confused. 
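The argument in this message, that steps 1 and 2 are safe because one lock covers both and the iterator takes the same lock, can be sketched as below. This is a simplified model and not HotSpot code: `ArrayKlassModel`, `ClassList` and the `std::mutex` standing in for MultiArray_lock are assumptions made for the example.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an array class whose mirror is set late.
struct ArrayKlassModel {
  const void* mirror = nullptr;
};

class ClassList {
 public:
  // Steps 1 and 2 from the message, both performed under the same lock.
  void restore(ArrayKlassModel* ak, const void* mirror) {
    std::lock_guard<std::mutex> guard(_lock);
    _classes.push_back(ak);  // 1. the class becomes visible to iteration
    ak->mirror = mirror;     // 2. the mirror is set afterwards
  }

  // Iteration holds the same lock, so it can never run between 1 and 2.
  template <typename Visitor>
  void loaded_classes_do(Visitor visit) {
    std::lock_guard<std::mutex> guard(_lock);
    for (ArrayKlassModel* ak : _classes) {
      visit(*ak);
    }
  }

 private:
  std::mutex _lock;  // plays the role of MultiArray_lock in this sketch
  std::vector<ArrayKlassModel*> _classes;
};
```

Note that the lock does not order the iterator after `restore`; it only ensures the iterator sees either no entry at all or an entry with its mirror already set.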
------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From njian at openjdk.java.net Fri Oct 15 02:35:54 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Fri, 15 Oct 2021 02:35:54 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v2] In-Reply-To: <0LvaNkXLAVEoVt15uJ3Y2ZFzbctyTk9A76JhKXmZBqo=.f827b200-6aca-49ec-bcf5-76dbe194f4a0@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <0LvaNkXLAVEoVt15uJ3Y2ZFzbctyTk9A76JhKXmZBqo=.f827b200-6aca-49ec-bcf5-76dbe194f4a0@github.com> Message-ID: On Thu, 14 Oct 2021 17:21:25 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. >> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. >> >> No API enhancements were required and only a few additional tests were needed. > > Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: > > Apply patch from https://github.com/openjdk/panama-vector/pull/152 > @njian there is a conflict with `macroAssembler_aarch64.cpp`: > > I think the resolution is this: > > ``` > @@ -2581,7 +2572,7 @@ void MacroAssembler::pop_call_clobbered_registers_except(RegSet exclude) { > > void MacroAssembler::push_CPU_state(bool save_vectors, bool use_sve, > int sve_vector_size_in_bytes, int total_predicate_in_bytes) { > - push(0x3fffffff, sp); // integer registers except lr & sp > + push(RegSet::range(r0, r29), sp); // integer registers except lr & sp > ``` > > is that correct? Yes, I think so. 
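The equivalence relied on in this resolution can be checked with a few lines of arithmetic. The sketch assumes a register set is a plain bit mask with bit i standing for register r<i>; `reg_range` and `without` are hypothetical helpers written for this example, not HotSpot's `RegSet` API.

```cpp
#include <cassert>
#include <cstdint>

// Bit i set means general-purpose register r<i> is in the set (assumed layout).
// Requires first <= last <= 31.
constexpr std::uint32_t reg_range(unsigned first, unsigned last) {
  return static_cast<std::uint32_t>(
      ((std::uint64_t{1} << (last - first + 1)) - 1) << first);
}

// Remove a single register from a set.
constexpr std::uint32_t without(std::uint32_t set, unsigned reg) {
  return set & ~(std::uint32_t{1} << reg);
}

// r0..r29 is the 30-bit mask used in push_CPU_state.
static_assert(reg_range(0, 29) == 0x3fffffffu, "r0..r29 == 0x3fffffff");
// r0..r30 is the mask mentioned in the r18 thread.
static_assert(reg_range(0, 30) == 0x7fffffffu, "r0..r30 == 0x7fffffff");
// Dropping r18 clears exactly bit 18.
static_assert(without(reg_range(0, 30), 18) == 0x7ffbffffu, "r0..r30 minus r18");
```

Under that assumption the literal and the range form describe the same registers, which is why the named form is the safer one to keep after the merge.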
------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From dholmes at openjdk.java.net Fri Oct 15 02:46:53 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 15 Oct 2021 02:46:53 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Fri Oct 15 02:46:54 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 15 Oct 2021 02:46:54 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Fri, 15 Oct 2021 02:26:22 GMT, Coleen Phillimore wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add assert that MultiArray_lock is held for loaded_classes_do. > > Yes: > 1. the shared array class becomes visible to loaded_klasses_do when ClassLoaderData::add_class() is called in Klass::restore_unshareable_info > 2. 
the shared array class has its mirror set afterward in that same function: Klass::restore_unshareable_info > > The guarantee that the two actions are atomic wrt to loaded_classes_do are because the MultiArray_lock is held while both 1 and 2 are executed. The lock keeps 1 and 2 together. The lock keeps Jvmti from iterating through the CLDG while the ArrayKlass is in an inconsistent state. > > @iklam you're right - we'd see the ObjArrayKlass first since the InstanceKlass will be added to the CLD before its marked as is_loaded(). I don't know what the result of that would be. You could open a new bug for that to investigate that and try to write a test case (using JVMTI that would show if it got confused). @coleenp , @iklam thanks for your patience on this. My confusion/concern was around whether `loaded_classes_do` could find the `ik` and from that use `ik->array_klasses()` to get to the `ak` and then find it has no mirror. But Coleen assures me that we never do that and will only access the `ak` via `loaded_classes_do` directly, which is safe due to the additional locking added. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From iklam at openjdk.java.net Fri Oct 15 03:06:54 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 15 Oct 2021 03:06:54 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From iklam at openjdk.java.net Fri Oct 15 03:06:55 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 15 Oct 2021 03:06:55 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> <14VVyFHj2yD6j7XXf4FBlYcnrEkXCCWay8EEoToG_lo=.e7c19075-f0c6-4699-94e6-8740c5261b9f@github.com> Message-ID: On Thu, 14 Oct 2021 22:24:53 GMT, Ioi Lam wrote: >> I will state again: If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. >> >> The added lock simply ensures that the JVMTI code and the restore_unshareable_info code cannot execute concurrently - it does not guarantee that the JVMTI code can't execute until after restore_unshareable_info. > >> I will state again: If the JVMTI code is racing with the execution of restore_unshareable_info and the setting of the mirror, then even with the lock added the JVMTI code could still get there first and find the mirror not set. >> >> The added lock simply ensures that the JVMTI code and the restore_unshareable_info code cannot execute concurrently - it does not guarantee that the JVMTI code can't execute until after restore_unshareable_info. > > For an InstanceKlass k to be iterated on by ClassLoaderData::loaded_classes_do(), k->is_loaded() must be true. The class enters the "loaded" state when SystemDictionay::add_to_hierarchy(k) is called, which happens after k->restore_unshareable_info() has completed. 
So the current implementation is safe w.r.t. InstanceKlasses. > > For ArrayKlasses, they would be *kind of* safe after this PR, as the JVMTI code will not see an ArrayKlasses that's in the middle of restore_unshareable_info(). However, it will still be possible for the JVMTI code to see the `[LFooBar;` class without seeing the `FooBar;` class. This may produce unexpected results. > @iklam you're right - we'd see the ObjArrayKlass first since the InstanceKlass will be added to the CLD before its marked as is_loaded(). I don't know what the result of that would be. You could open a new bug for that to investigate that and try to write a test case (using JVMTI that would show if it got confused). I filed https://bugs.openjdk.java.net/browse/JDK-8275318 "loaded_classes_do may see ArrayKlass before InstanceKlass is loaded" ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From dholmes at openjdk.java.net Fri Oct 15 04:02:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 15 Oct 2021 04:02:48 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. 
> > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. Hi Coleen, If a `MutexLocker`/`MonitorLocker` is only acquired by NJTs then the `_no_safepoint_check` flag is redundant. If it is also used by a JT then of course the flag must remain. You don't need to make any changes to `Monitor::wait()` because NJTs should only be calling `Monitor::wait_without_safepoint_check()`. The `MonitorLocker::wait` call should call the appropriate `Monitor` function based on the flag and/or the type of thread. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From stuefe at openjdk.java.net Fri Oct 15 05:44:05 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 05:44:05 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks Message-ID: This is one of a number of RFEs I plan in order to improve and simplify C-heap overflow checking in hotspot. For the whole story please refer to https://bugs.openjdk.java.net/browse/JDK-8275301. This proposal adds NMT buffer overflow checking. As laid out in JDK-8275301: - it would give us C-heap overflow checking in release builds - the additional costs are negligible - NMT needs intact headers anyway. Faced with buffer overwrites today, it would maybe crash or maybe account wrongly, but it's a bit of a lottery really. The error reports would also be confusing. - it is a preparation for future code removal (the memory guarding done in debug only in os::malloc() and friends, and possibly the guarding done with CheckJNICalls) Patch notes: 1) The malloc header is changed such that it contains a 16-bit canary directly preceding the user payload of the allocation. 
On 64-bit, we don't even need to enlarge the malloc header: we carve some bits out by decreasing the size of the bucket index bit field to 16 bits. The bucket index field is used to store the bucket slot of the malloc site table in NMT detail mode. The malloc site table width is 512 atm, so 65k gives plenty of room for growing the malloc site table should we ever want to. On 32-bit, I had to enlarge the header from 8 bytes to 16 bytes. That is because there were not enough bits to spare for a canary. On the upside, 8 bytes were not enough anyway, strictly speaking, to guarantee proper alignment e.g. for 128bit data types on all 32-bit platforms. See e.g. the malloc alignment the glibc uses. I also took the freedom of re-arranging the malloc header fields a bit to minimize the difference between 32-bit and 64-bit platforms, and to align each field optimally according to its size. I also switched from bitfields to real types in order to be able to do a sizeof() on them. For more details, see the comment in mallocTracker.hpp. 2) I added a footer canary trailing the user allocation to catch tail buffer overruns. For simplicity reasons (alignment) and to save some cycles I made it a byte only. That is enough to catch most overrun scenarios. If you think this is too small, I'm open to change it. 3) I put a bit of work into error reporting. When NMT detects corruption, it will now print out a hex dump of the corrupted area to tty before asserting. 4) I added a bunch of gtests to test various heap overwrite scenarios. I also had to extend the gtest macros a bit because I wanted these tests of course to run in release builds too, but we did not have a death test macro for release builds yet (there are possibilities for code simplification here too, but that's for another RFE). (Note that these gtests, to test anything, need to run with NMT switched on. We do this as part of our NMT jtreg-controlled gtests in tier1). 
Even though the patch adds more code than it removes, it prepares possible code removal (if we can agree to do that) and the net result will be less complexity, not more. Again, see JDK-8275301 for details.

--------------

Example output a buffer overrun would provide:

Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)
NMT Block at 0x00005600f86136b0, corruption at: 0x00005600f86136c1:
0x00005600f86136a8: 21 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
0x00005600f86136b8: 00 00 00 00 0f 00 1f fa 00 61 00 00 00 00 00 00
0x00005600f86136c8: 41 39 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f86136d8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f86136e8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f86136f8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f8613708: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f8613718: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f8613728: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00005600f8613738: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
assert failed: fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (mallocTracker.cpp:203), pid=10805, tid=10805
# fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)
#

-------

Tests:
- manual tests with Linux x64, x86, minimal build
- GHAs all clean
- SAP nightlies in progress.
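The header and footer canary idea from the patch notes can be sketched in a few lines. This is an illustration only, not the layout in mallocTracker.hpp: the `Header` fields, the canary values, and the functions `guarded_malloc`, `check_block` and `guarded_free` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Arbitrary canary values chosen for this sketch.
constexpr std::uint16_t kHeaderCanary = 0xE99E;
constexpr std::uint8_t kFooterCanary = 0xE8;

// Hypothetical 16-byte header; the canary occupies the last two bytes,
// so it sits directly in front of the user payload.
struct Header {
  std::uint64_t size;     // payload size in bytes
  std::uint32_t unused0;  // placeholder for other bookkeeping
  std::uint16_t unused1;  // placeholder for other bookkeeping
  std::uint16_t canary;
};
static_assert(sizeof(Header) == 16, "header keeps the payload 16-byte aligned");
static_assert(offsetof(Header, canary) == 14, "canary directly precedes the payload");

enum class BlockState { ok, header_broken, footer_broken };

// Allocates header + payload + one footer byte and returns the payload pointer.
void* guarded_malloc(std::size_t size) {
  auto* raw = static_cast<unsigned char*>(std::malloc(sizeof(Header) + size + 1));
  if (raw == nullptr) return nullptr;
  Header h{size, 0, 0, kHeaderCanary};
  std::memcpy(raw, &h, sizeof h);
  raw[sizeof(Header) + size] = kFooterCanary;  // single-byte footer canary
  return raw + sizeof(Header);
}

// Verifies both canaries of a block returned by guarded_malloc.
BlockState check_block(const void* payload) {
  const auto* p = static_cast<const unsigned char*>(payload);
  Header h;
  std::memcpy(&h, p - sizeof(Header), sizeof h);
  if (h.canary != kHeaderCanary) return BlockState::header_broken;
  if (p[h.size] != kFooterCanary) return BlockState::footer_broken;
  return BlockState::ok;
}

void guarded_free(void* payload) {
  if (payload != nullptr) {
    std::free(static_cast<unsigned char*>(payload) - sizeof(Header));
  }
}
```

The header is checked first because a damaged header makes the stored size, and therefore the footer position, untrustworthy.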
------------- Commit messages: - Let NMT do overflow detection Changes: https://git.openjdk.java.net/jdk/pull/5952/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5952&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275320 Stats: 356 lines in 9 files changed: 321 ins; 4 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/5952.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5952/head:pull/5952 PR: https://git.openjdk.java.net/jdk/pull/5952 From stuefe at openjdk.java.net Fri Oct 15 06:08:47 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 06:08:47 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Hi Coleen, Should we limit this to debug-only, to not pay the costs for analysis we don't need? Would it make sense to use a linked list instead? Removal would be far cheaper. Also, since every Mutex lives in this list anyway without exception, we could make the list pointers just part of the Mutex class itself, to save having to allocate a node. I wonder whether we should, for global Mutexes, skip the destruction sequence altogether. 
Oh, I see the global mutexes are all new'ed, so we never delete them. Printing on error is a bit of a problem with no perfect solution. If you print without locking, you risk crashes. But those would be caught by secondary error handling. But you don't get your full list. If you draw the lock OTOH, you may deadlock if the lock is already active, but that would be caught by the STEP timeout system too. So I think what you do is okay either way. Cheers, Thomas src/hotspot/share/runtime/mutex.cpp line 284: > 282: static void remove_mutex(Mutex* var) { > 283: ThreadCritical tc; > 284: int i = _mutex_array->find_from_end(var); Ouch. You count on Mutexes either living forever or being very short-lived here? ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From dholmes at openjdk.java.net Fri Oct 15 06:12:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 15 Oct 2021 06:12:51 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: <5FDmp96YqdMogHuWeT8IgQ_boNe7Pvq8a5fpUF_Ibic=.27c30498-e568-4067-9375-61a02287d31d@github.com> On Fri, 15 Oct 2021 05:46:10 GMT, Thomas Stuefe wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > src/hotspot/share/runtime/mutex.cpp line 284: > >> 282: static void remove_mutex(Mutex* var) { >> 283: ThreadCritical tc; >> 284: int i = _mutex_array->find_from_end(var); > > Ouch. 
You count on Mutexes either living forever or being very short-lived here? I think the view is that the static mutexes will populate from the start of the array so any dynamic mutex, which is the only candidate for removal, must be closer to the end. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From dholmes at openjdk.java.net Fri Oct 15 06:31:54 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 15 Oct 2021 06:31:54 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: <-EsJtgWgiLvWX2J6kWm1B1BzyyKDffoGPaLiab5a4w0=.df833afc-0705-4d0d-ab28-665bd75a676e@github.com> On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Hi Coleen, This looks good to me. One minor nit with the new test. :) Thanks, David src/hotspot/share/runtime/mutex.cpp line 278: > 276: _mutex_array = new (ResourceObj::C_HEAP, mtThread) GrowableArray(128, mtThread); > 277: } > 278: ThreadCritical tc; You could avoid the TC on first call by duplicating the push line after the allocation and returning. This also avoids any potential issue with TC needing more initialization than has been done when this is first called.
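The suggestion above can be sketched roughly like this. It is an illustration under stated assumptions, not the patch itself: `std::vector` and `std::mutex` stand in for GrowableArray and ThreadCritical, `add_mutex` and `g_guarded_pushes` are hypothetical names, and the first call is assumed to happen while the VM is still single-threaded (the quoted lines likewise allocate the array before taking ThreadCritical).

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Stand-ins for HotSpot's GrowableArray and ThreadCritical (assumption for this sketch).
static std::vector<const void*>* g_registry = nullptr;
static std::mutex g_registry_guard;
static int g_guarded_pushes = 0;  // only here to make the two paths observable

void add_mutex(const void* m) {
  if (g_registry == nullptr) {
    // First call: assumed to run before any other thread exists,
    // so the entry is pushed without taking the guard.
    g_registry = new std::vector<const void*>();
    g_registry->reserve(128);
    g_registry->push_back(m);
    return;
  }
  // Every later call takes the guard before touching the registry.
  std::lock_guard<std::mutex> guard(g_registry_guard);
  g_registry->push_back(m);
  ++g_guarded_pushes;
}
```

The duplicated push keeps the guard out of the very first registration, which matters if the guard itself is not usable that early in startup.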
src/hotspot/share/runtime/mutexLocker.hpp line 176: > 174: void print_owned_locks_on_error(outputStream* st); > 175: > 176: char *lock_name(Mutex *mutex); Good catch! test/hotspot/jtreg/runtime/ErrorHandling/ErrorFileLocksTest.java line 41: > 39: import java.util.regex.Pattern; > 40: > 41: public class ErrorFileLocksTest { Nit: could you s/Locks/Mutex please. My brain parses this as a Filelock test. :) ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5958 From stuefe at openjdk.java.net Fri Oct 15 08:30:56 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 08:30:56 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <5FDmp96YqdMogHuWeT8IgQ_boNe7Pvq8a5fpUF_Ibic=.27c30498-e568-4067-9375-61a02287d31d@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> <5FDmp96YqdMogHuWeT8IgQ_boNe7Pvq8a5fpUF_Ibic=.27c30498-e568-4067-9375-61a02287d31d@github.com> Message-ID: On Fri, 15 Oct 2021 06:09:44 GMT, David Holmes wrote: > I think the view is that the static mutexes will populate from the start of the array so any dynamic mutex, which is the only candidate for removal, must be closer to the end. That's what I understood too. But there can be a ton of dynamic Mutexes, so this does not really help that much.
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From stuefe at openjdk.java.net Fri Oct 15 08:42:52 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 08:42:52 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: <22-EJFZ1l816xBIk5Lb4LXbm_KzeTeuResPzSn8GGpE=.38c74bbd-6b56-4651-a02b-d41f4679629e@github.com> On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Thinking this through further, I think that this may not be the best approach. The number of dynamically generated Mutexes can be very high, and their lifetime is unpredictable. For example, each ClassLoaderData is guarded by a Mutex, and we can have a lot of them (not sure if JEP 416 will do away with reflection-specific loaders, but just think of dynamic languages like jruby. Or just people who like class loaders :-). I just ran a test with 5000 loaders and the array gets quite big; linear search is not good here, especially since GrowableArray compacts itself on every removal. My alternative proposal would be, as I wrote, to use a doubly linked list. Removal would be O(1). And if you embed the list pointers in the Mutex itself, you don't need much more memory.
Compared to your patch you'd pay 8 bytes per Mutex more because you need two pointers, but you'd save the overhead GrowableArray causes. Less allocation churn too since you save allocation of the array. And it also solves the problem of releasing memory if you release a bunch of Mutexes. With a GrowableArray, it has to shrink itself, but if you embed the list pointers in Mutex you get memory management for free. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From duke at openjdk.java.net Fri Oct 15 10:10:52 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Fri, 15 Oct 2021 10:10:52 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Tue, 12 Oct 2021 08:11:20 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: >> >> Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC > > Can we have a simple (as simple as possible) JMH benchmark, please? It should be something like a couple of threads racing to count up to a million. @theRealAph, any comments on the microbenchmark I wrote? ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From ngasson at openjdk.java.net Fri Oct 15 10:36:50 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Fri, 15 Oct 2021 10:36:50 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Tue, 12 Oct 2021 08:11:20 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: >> >> Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC > > Can we have a simple (as simple as possible) JMH benchmark, please? 
It should be something like a couple of threads racing to count up to a million. > @theRealAph, any comments on the microbenchmark I wrote? I think to show the benefit of the `onSpinWait()` changes you'd need some contention between the threads - e.g. if `nowork()` was updating a shared counter. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From coleenp at openjdk.java.net Fri Oct 15 12:13:52 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 12:13:52 GMT Subject: RFR: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" [v3] In-Reply-To: References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Thu, 14 Oct 2021 16:03:22 GMT, Coleen Phillimore wrote: >> The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. >> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add assert that MultiArray_lock is held for loaded_classes_do. Thank you David and Ioi. ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Fri Oct 15 12:13:53 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 12:13:53 GMT Subject: Integrated: 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" In-Reply-To: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> References: <6I27lbCLtBQosJOoPHek1nw4wUVc-t7AGM5VD77YFEc=.06bd69a4-30bd-4adf-9c5f-ad3d80914dc4@github.com> Message-ID: On Wed, 13 Oct 2021 20:29:46 GMT, Coleen Phillimore wrote: > The MultiArray_lock isn't held when restoring shared arrays, so an observer can find one that isn't completely restored. 
> Tested with tier1-3 in progress, and CDS tests locally. Not really reproduceable otherwise even with sleeps. This pull request has now been integrated. Changeset: 172aed1a Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/172aed1a2d75756b140cb723133ac5fb67f7745e Stats: 6 lines in 2 files changed: 6 ins; 0 del; 0 mod 8274338: com/sun/jdi/RedefineCrossEvent.java failed "assert(m != __null) failed: NULL mirror" Reviewed-by: dholmes, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/5935 From coleenp at openjdk.java.net Fri Oct 15 12:23:55 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 12:23:55 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <-EsJtgWgiLvWX2J6kWm1B1BzyyKDffoGPaLiab5a4w0=.df833afc-0705-4d0d-ab28-665bd75a676e@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> <-EsJtgWgiLvWX2J6kWm1B1BzyyKDffoGPaLiab5a4w0=.df833afc-0705-4d0d-ab28-665bd75a676e@github.com> Message-ID: On Fri, 15 Oct 2021 06:13:28 GMT, David Holmes wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > src/hotspot/share/runtime/mutex.cpp line 278: > >> 276: _mutex_array = new (ResourceObj::C_HEAP, mtThread) GrowableArray(128, mtThread); >> 277: } >> 278: ThreadCritical tc; > > You could avoid the TC on first call by duplicating the push line after the allocation and returning. 
This also avoids any potential issue with TC needing more initialization than has been done when this is first called. Ok, seems like a good idea. > test/hotspot/jtreg/runtime/ErrorHandling/ErrorFileLocksTest.java line 41: > >> 39: import java.util.regex.Pattern; >> 40: >> 41: public class ErrorFileLocksTest { > > Nit: could you s/Locks/Mutex please. My brain parses this as a Filelock test. :) I like TestErrorFileMutex.java with Test at the beginning. That matches other names in that directory. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Fri Oct 15 12:23:55 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 12:23:55 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> <5FDmp96YqdMogHuWeT8IgQ_boNe7Pvq8a5fpUF_Ibic=.27c30498-e568-4067-9375-61a02287d31d@github.com> Message-ID: On Fri, 15 Oct 2021 08:27:24 GMT, Thomas Stuefe wrote: >> I think the view is that the static mutexes will populate from the start of the array so any dynamic mutex, which is the only candidate for removal, must be closer to the end. > >> I think the view is that the static mutexes will populate from the start of the array so any dynamic mutex, which is the only candidate for removal, must be closer to the end. > > That's what I understood too. But there can be a ton of dynamic Mutexes, so this does not really help that much. Yes. That's what my logging found. There is a set of 100 or so static mutexes, and then we have startThread_lock, which is very short-lived, plus TaskTermination_lock and HandshakeState_lock per thread. The latter isn't that short-lived.
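The doubly linked list alternative proposed earlier in this thread, with the list pointers embedded in the Mutex itself, could look roughly like the sketch below. `TrackedMutex` and `MutexList` are hypothetical stand-ins invented for illustration. Locking of the list (ThreadCritical in the patch under review) is deliberately left out.

```cpp
#include <cassert>

// Hypothetical stand-in for Mutex: the two list pointers live in the object
// itself, so registering a mutex needs no separate node allocation.
struct TrackedMutex {
  const char* name;
  TrackedMutex* prev = nullptr;
  TrackedMutex* next = nullptr;
  explicit TrackedMutex(const char* n) : name(n) {}
};

struct MutexList {
  TrackedMutex* head = nullptr;

  // Links m in at the front of the list: O(1).
  void add(TrackedMutex* m) {
    m->prev = nullptr;
    m->next = head;
    if (head != nullptr) head->prev = m;
    head = m;
  }

  // Unlinks m using only its own pointers: O(1), no search, no compaction.
  void remove(TrackedMutex* m) {
    if (m->prev != nullptr) m->prev->next = m->next; else head = m->next;
    if (m->next != nullptr) m->next->prev = m->prev;
    m->prev = nullptr;
    m->next = nullptr;
  }

  // Walks the list, as a printer for the error report would.
  int count() const {
    int n = 0;
    for (const TrackedMutex* m = head; m != nullptr; m = m->next) ++n;
    return n;
  }
};
```

Compared with an array, removal no longer depends on where the entry sits, at the price of two pointers per mutex.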
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Fri Oct 15 12:26:53 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 12:26:53 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Thanks for looking at this Thomas and for the experiments. Mine weren't so large. I'll change it to a doubly linked list. That does seem safer and better. I made Mutex smaller by removing the embedded char[] so I think we can handle another pointer in the class. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From zgu at openjdk.java.net Fri Oct 15 12:38:47 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 15 Oct 2021 12:38:47 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 15:49:05 GMT, Thomas Stuefe wrote: > This is part of a number of RFE I plan to improve and simplify C-heap overflow checking in hotspot. For the whole story please refer to https://bugs.openjdk.java.net/browse/JDK-8275301. > > This proposal adds NMT buffer overflow checking. 
As laid out in JDK-8275301: > > - it would give us C-heap overflow checking in release builds > - the additional costs are neglectable > - NMT needs intact headers anyway. Faced with buffer overwrites today, it would maybe crash or maybe account wrongly, but it's a bit of a lottery really. The error reports would also be confusing. > - it is a preparation for future code removal (the memory guarding done in debug only in os::malloc() and friends, and possibly the guarding done with CheckJNICalls) > > Patch notes: > > 1) The malloc header is changed such that it contains a 16-bit canary directly preceding the user payload of the allocation. > > On 64-bit, we don't even need to enlarge the malloc header: we carve some bits out by decreasing the size of the bucket index bit field to 16 bits. The bucket index field is used to store the bucket slot of the malloc site table in NMT detail mode. The malloc site table width is 512 atm, so 65k gives plenty of room for growing the malloc site table should we ever want to. > > On 32-bit, I had to enlarge the header from 8 bytes to 16 bytes. That is because there were not enough bits to spare for a canary. On the upside, 8 bytes were not enough anyway, strictly speaking, to guarantee proper alignment e.g. for 128bit data types on all 32-bit platforms. See e.g. the malloc alignment the glibc uses. > > I also took the freedom of re-arranging the malloc header fields a bit to minimize the difference between 32-bit and 64-bit platforms, and to align each field optimally according to its size. I also switched from bitfields to real types in order to be able to do a sizeof() on them. > > For more details, see the comment in mallocTracker.hpp. > > 2) I added a footer canary trailing the user allocation to catch tail buffer overruns. For simplicity reasons (alignment) and to save some cycles I made it a byte only. That is enough to catch most overrun scenarios. If you think this is too small, I'm open to change it. 
> > 3) I put a bit of work into error reporting. When NMT detects corruption, it will now print out a hex dump of the corrupted area to tty before asserting. > > 4) I added a bunch of gtests to test various heap overwrite scenarios. I also had to extend the gtest macros a bit because I wanted these tests of course to run in release builds too, but we did not have a death test macro for release builds yet (there are possibilities for code simplification here too, but that's for another RFE). > > (Note that these gtests, to test anything, need to run with NMT switched on. We do this as part of our NMT jtreg-controlled gtests in tier1). > > Even though the patch adds more code than it removes, it prepares possible code removal (if we can agree to do that) and the net result will be less complexity, not more. Again, see JDK-8275301 for details. > > -------------- > > Example output a buffer overrun would provide: > > > Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?) 
> NMT Block at 0x00005600f86136b0, corruption at: 0x00005600f86136c1: > 0x00005600f86136a8: 21 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 > 0x00005600f86136b8: 00 00 00 00 0f 00 1f fa 00 61 00 00 00 00 00 00 > 0x00005600f86136c8: 41 39 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136d8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136e8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136f8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613708: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613718: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613728: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613738: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > assert failed: fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)# > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (mallocTracker.cpp:203), pid=10805, tid=10805 > # fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?) > # > > ------- > > Tests: > - manual tests with Linux x64, x86, minimal build > - GHAs all clean > - SAP nightlies in progress. Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? 
------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From aph at openjdk.java.net Fri Oct 15 13:12:57 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 15 Oct 2021 13:12:57 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Tue, 12 Oct 2021 08:11:20 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: >> >> Make OnSpinWaitInst/OnSpinWaitInstCount DIAGNOSTIC > > Can we have a simple (as simple as possible) JMH benchmark, please? It should be something like a couple of threads racing to count up to a million. > @theRealAph, any comments on the microbenchmark I wrote? Something like this works well:

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;
    import org.openjdk.jmh.annotations.*;

    // Class header and imports are assumed; only the class body was posted.
    @State(Scope.Benchmark)
    @BenchmarkMode(Mode.AvgTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public class ThreadOnSpinWait {
        @Param({"1000000"})
        public int maxNum;

        @Param({"4"})
        public int threadCount;

        AtomicInteger theCounter;
        Thread[] threads;

        // Each thread races to bump the shared counter and backs off after a lost CAS.
        void work() {
            for (;;) {
                int prev = theCounter.get();
                if (prev >= maxNum) {
                    break;
                }
                if (theCounter.compareAndExchange(prev, prev + 1) != prev) {
                    Thread.onSpinWait();
                }
            }
        }

        @Setup(Level.Trial)
        public void foo() {
            theCounter = new AtomicInteger();
        }

        // Threads cannot be restarted, so create a fresh set for every invocation.
        @Setup(Level.Invocation)
        public void setup() {
            theCounter.set(0);
            threads = new Thread[threadCount];
            for (int i = 0; i < threads.length; i++) {
                threads[i] = new Thread(this::work);
            }
        }

        @Benchmark
        public void trial() throws Exception {
            for (int i = 0; i < threads.length; i++) {
                threads[i].start();
            }
            for (int i = 0; i < threads.length; i++) {
                threads[i].join();
            }
        }
    }

Before:

    Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
    ThreadOnSpinWait.trial   1000000              2  avgt    3  43.830 ± 32.543  ms/op

With `-XX:OnSpinWaitInst=isb -XX:OnSpinWaitInstCount=4`

    Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
    ThreadOnSpinWait.trial   1000000              2  avgt    3  22.181 ± 11.592  ms/op

With `-XX:OnSpinWaitInst=isb -XX:OnSpinWaitInstCount=1`

    Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
    ThreadOnSpinWait.trial   1000000              2  avgt    3  36.281 ± 31.700  ms/op

This is Apple M1, where you have to be very careful because there's some processor frequency scaling going on.
------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From stuefe at openjdk.java.net Fri Oct 15 13:19:49 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 13:19:49 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 12:35:47 GMT, Zhengyu Gu wrote: > Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? GuardedMemory has a number of disadvantages, and I'd like to remove it in favor of NMT doing buffer overrun checks. For my full reasoning, please see the umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301: --------- Disadvantages of the current solution: - We have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without resorting to external tools or debug c-libs. - The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc.
All that means that in debug builds we test something different than what we ship in release builds. - The canary in the headers of the debug-only guards does not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes. - The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller. - The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It also makes statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build). - None of the current overflow checkers prints out hex dumps of the violated memory. That is what the libc usually does and it is very useful. --------- Thanks, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From coleenp at openjdk.java.net Fri Oct 15 13:21:49 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 13:21:49 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not.
> > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. Yes, it's true that the _no_safepoint_check is redundant but we check this for all non-safepointing lock acquisitions, just not safepointing ones. Because we want to keep the check, I think it should check both classes of locks. If we do enforce this MutexLocker parameter is consistent for safepointing checking locks, the MonitorLocker wait code does this: bool wait(int64_t timeout = 0) { return _flag == Mutex::_safepoint_check_flag ? as_monitor()->wait(timeout) : as_monitor()->wait_without_safepoint_check(timeout); } So if we use MonitorLocker(safepoint_check_lock) and then wait on a safepoint checking lock, then the wait with safepoint check will be called on a NJT. Which is why I added it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From stuefe at openjdk.java.net Fri Oct 15 13:26:58 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 13:26:58 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 13:16:24 GMT, Thomas Stuefe wrote: > > Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? > > GuardedMemory has a number of disadvantages, and I'd like to remove it in favor of NMT doing buffer overrun checks. For my full reasoning, please see my reasoning in the umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301: > > Disadvantages of the current solution: > > * We have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. 
I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without resorting to external tools or debug c-libs. > * The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds. > * The canary in the headers of the debug-only guards does not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes. > * The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller. > * The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It also makes statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build). > * None of the current overflow checkers prints out hex dumps of the violated memory. That is what the libc usually does and it is very useful. > > Thanks, Thomas P.S. I contemplated doing the NMT overflow checks and the removal of the old guarding code in one RFE but was concerned that it would be too confusing and get stuck in review limbo. Maybe that was wrong. But this RFE here makes more sense when viewed as part of a whole.
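The point about canary placement in the quoted list can be illustrated with a small sketch. Everything here is hypothetical (`Header`, its fields, `init_block`, `header_intact`, the canary value); it only shows why a canary that ends the header, directly in front of the payload, catches an underflow of a single byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 16-byte header; the canary is the last field so that it
// borders the payload with no other field or padding in between.
struct Header {
  std::uint64_t size;
  std::uint32_t flags;
  std::uint16_t bucket;
  std::uint16_t canary;
};
static_assert(sizeof(Header) == 16, "no padding expected after the canary");
static_assert(offsetof(Header, canary) + sizeof(std::uint16_t) == sizeof(Header),
              "canary must be the last bytes of the header");

constexpr std::uint16_t kHeaderCanary = 0xE99E;  // illustrative value

// Writes a header at the start of 'block' and returns the payload pointer.
unsigned char* init_block(unsigned char* block, std::uint64_t size) {
  Header h{size, 0, 0, kHeaderCanary};
  std::memcpy(block, &h, sizeof h);
  return block + sizeof(Header);
}

// Returns true if the canary in front of the payload is unchanged.
bool header_intact(const unsigned char* payload) {
  Header h;
  std::memcpy(&h, payload - sizeof(Header), sizeof h);
  return h.canary == kHeaderCanary;
}
```

If other header fields sat between the canary and the payload, a write to `payload[-1]` would clobber those fields and leave the canary untouched.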
------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From stefank at openjdk.java.net Fri Oct 15 13:58:06 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 15 Oct 2021 13:58:06 GMT Subject: RFR: 8275334: Move class loading Events to a separate section in hs_err files Message-ID: The generic "Events" section in hs_err files tends to fill up with class loading events. I propose that we move class loading events to their own section. We already have a section for class unloading, so adding a class loading section is also good for symmetry reasons. This is what it will look like after the patch: Classes loaded (20 events): Event: 4,426 Loading class com/sun/org/apache/xerces/internal/xinclude/MultipleScopeNamespaceSupport done Event: 4,426 Loading class com/sun/org/apache/xerces/internal/xinclude/XIncludeNamespaceSupport done Event: 4,428 Loading class com/sun/org/apache/xerces/internal/parsers/AbstractSAXParser$LocatorProxy Event: 4,428 Loading class org/xml/sax/ext/Locator2 Event: 4,428 Loading class org/xml/sax/ext/Locator2 done Event: 4,428 Loading class com/sun/org/apache/xerces/internal/parsers/AbstractSAXParser$LocatorProxy done Event: 4,430 Loading class com/sun/org/apache/xerces/internal/impl/XMLScanner$NameType Event: 4,430 Loading class com/sun/org/apache/xerces/internal/impl/XMLScanner$NameType done Event: 4,435 Loading class org/xml/sax/helpers/LocatorImpl Event: 4,435 Loading class org/xml/sax/helpers/LocatorImpl done Event: 4,498 Loading class java/util/Vector$ListItr Event: 4,498 Loading class java/util/Vector$ListItr done Event: 4,499 Loading class java/lang/StrictMath Event: 4,499 Loading class java/lang/StrictMath done Event: 4,623 Loading class java/util/BitSet Event: 4,623 Loading class java/util/BitSet done Event: 4,656 Loading class java/util/ArrayList$ListItr Event: 4,657 Loading class java/util/ArrayList$ListItr done Event: 4,693 Loading class java/util/ArrayList$SubList Event: 4,694 Loading class 
java/util/ArrayList$SubList done Classes unloaded (20 events): Event: 3,445 Thread 0x00007f061c2578d0 Unloading class 0x0000000100088000 'avrora/Main' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a3c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a3c00' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a3400 'java/lang/invoke/LambdaForm$MH+0x00000001000a3400' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a1c00' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1400 'java/lang/invoke/LambdaForm$MH+0x00000001000a1400' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1000 'java/lang/invoke/LambdaForm$MH+0x00000001000a1000' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a0c00' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0800 'java/lang/invoke/LambdaForm$MH+0x00000001000a0800' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0400 'java/lang/invoke/LambdaForm$MH+0x00000001000a0400' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0000 'java/lang/invoke/LambdaForm$MH+0x00000001000a0000' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010012c000 'jdk/internal/reflect/GeneratedMethodAccessor1' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100123000 'java/lang/invoke/LambdaForm$MH+0x0000000100123000' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100121800 'java/lang/invoke/LambdaForm$MH+0x0000000100121800' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100121000 'java/lang/invoke/LambdaForm$MH+0x0000000100121000' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009fc00 'java/lang/invoke/LambdaForm$MH+0x000000010009fc00' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009ec00 
'java/lang/invoke/LambdaForm$MH+0x000000010009ec00' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009e800 'java/lang/invoke/LambdaForm$MH+0x000000010009e800' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009e400 'java/lang/invoke/LambdaForm$MH+0x000000010009e400' Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009cc00 'java/lang/invoke/LambdaForm$MH+0x000000010009cc00' Event: 5,914 Thread 0x00007f061c2578d0 Unloading class 0x0000000100123000 'jdk/internal/reflect/GeneratedMethodAccessor2' ... Events (20 events): Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dada610 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dadd990 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dadfd90 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dae9e90 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daea490 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daeee90 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daf0710 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daf2410 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dafe810 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060db01d10 Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060db63c90 Event: 5,681 Thread 0x00007f05901b3bd0 Thread exited: 0x00007f05901b3bd0 Event: 5,681 Thread 0x00007f058c2319d0 Thread exited: 0x00007f058c2319d0 Event: 5,687 Thread 0x00007f05cc533030 Thread exited: 0x00007f05cc533030 Event: 5,687 Thread 0x00007f058c229ff0 Thread exited: 0x00007f058c229ff0 Event: 5,715 Thread 0x00007f058401fb00 Thread exited: 0x00007f058401fb00 Event: 5,715 Thread 0x00007f05cc449b80 Thread exited: 0x00007f05cc449b80 Event: 5,715 Thread 0x00007f05cc45b6f0 Thread exited: 0x00007f05cc45b6f0 Event: 5,872 Thread 0x00007f0584179750 Thread added: 0x00007f0584179750 Event: 5,872 
Protecting memory [0x00007f05f85f8000,0x00007f05f85fc000] with protection modes 0 Before the patch: Classes unloaded (20 events): Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a3c00' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'java/lang/invoke/LambdaForm$MH+0x00000001000a3400' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a1c00' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1400 'java/lang/invoke/LambdaForm$MH+0x00000001000a1400' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1000 'java/lang/invoke/LambdaForm$MH+0x00000001000a1000' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a0c00' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0800 'java/lang/invoke/LambdaForm$MH+0x00000001000a0800' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0400 'java/lang/invoke/LambdaForm$MH+0x00000001000a0400' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0000 'java/lang/invoke/LambdaForm$MH+0x00000001000a0000' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010012c000 'jdk/internal/reflect/GeneratedMethodAccessor1' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x0000000100123000 'java/lang/invoke/LambdaForm$MH+0x0000000100123000' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x0000000100121800 'java/lang/invoke/LambdaForm$MH+0x0000000100121800' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x0000000100121000 'java/lang/invoke/LambdaForm$MH+0x0000000100121000' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009fc00 'java/lang/invoke/LambdaForm$MH+0x000000010009fc00' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009ec00 
'java/lang/invoke/LambdaForm$MH+0x000000010009ec00' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009e800 'java/lang/invoke/LambdaForm$MH+0x000000010009e800' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009e400 'java/lang/invoke/LambdaForm$MH+0x000000010009e400' Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009cc00 'java/lang/invoke/LambdaForm$MH+0x000000010009cc00' Event: 5,964 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'jdk/internal/reflect/GeneratedMethodAccessor2' Event: 6,059 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'jdk/internal/reflect/GeneratedMethodAccessor3' ... Events (20 events): Event: 4,662 loading class java/util/BitSet Event: 4,663 loading class java/util/BitSet done Event: 4,697 loading class java/util/ArrayList$ListItr Event: 4,697 loading class java/util/ArrayList$ListItr done Event: 4,738 loading class java/util/ArrayList$SubList Event: 4,739 loading class java/util/ArrayList$SubList done Event: 4,745 Thread 0x00007fe0940ebe70 Thread added: 0x00007fe0940ebe70 Event: 4,745 Protecting memory [0x00007fe109bb9000,0x00007fe109bbd000] with protection modes 0 Event: 4,803 Thread 0x00007fe0c03f1850 Thread added: 0x00007fe0c03f1850 Event: 4,803 Protecting memory [0x00007fe109ab8000,0x00007fe109abc000] with protection modes 0 Event: 4,809 Thread 0x00007fe0c412be00 Thread added: 0x00007fe0c412be00 Event: 4,809 Protecting memory [0x00007fe1097b5000,0x00007fe1097b9000] with protection modes 0 Event: 5,839 Thread 0x00007fe0c412be00 Thread exited: 0x00007fe0c412be00 Event: 5,846 Thread 0x00007fe0c03f1850 Thread exited: 0x00007fe0c03f1850 Event: 5,874 Thread 0x00007fe0940ebe70 Thread exited: 0x00007fe0940ebe70 Event: 5,874 Thread 0x00007fe0c03044d0 Thread exited: 0x00007fe0c03044d0 Event: 5,874 Thread 0x00007fe0dc465540 Thread exited: 0x00007fe0dc465540 Event: 5,894 Thread 0x00007fe0c02e7420 Thread exited: 0x00007fe0c02e7420 Event: 6,024 Thread 0x00007fe09c184030 
Thread added: 0x00007fe09c184030 Event: 6,024 Protecting memory [0x00007fe1095b2000,0x00007fe1095b6000] with protection modes 0 ------------- Commit messages: - 8275334: Move class loading Events to a separate section in hs_err files Changes: https://git.openjdk.java.net/jdk/pull/5968/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5968&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275334 Stats: 20 lines in 3 files changed: 19 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5968.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5968/head:pull/5968 PR: https://git.openjdk.java.net/jdk/pull/5968 From duke at openjdk.java.net Fri Oct 15 14:08:08 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 15 Oct 2021 14:08:08 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh Message-ID: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change shows a 1.87x improvement on a microbenchmark.
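For context, unsignedMultiplyHigh returns the upper 64 bits of the 128-bit unsigned product of two 64-bit values; the one-operand x86 `mul` leaves exactly that half in RDX. The following is a rough portable sketch of the same arithmetic using 32-bit halves. The function name `unsigned_multiply_high` is hypothetical, and this is neither the JDK's Java fallback nor the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of the unsigned 128-bit product a * b, computed without
// any 128-bit type by splitting both operands into 32-bit halves.
uint64_t unsigned_multiply_high(uint64_t a, uint64_t b) {
  const uint64_t mask = 0xFFFFFFFFu;
  const uint64_t a_lo = a & mask, a_hi = a >> 32;
  const uint64_t b_lo = b & mask, b_hi = b >> 32;

  const uint64_t lo_lo = a_lo * b_lo;  // contributes to bits 0..63
  const uint64_t hi_lo = a_hi * b_lo;  // contributes to bits 32..95
  const uint64_t lo_hi = a_lo * b_hi;  // contributes to bits 32..95
  const uint64_t hi_hi = a_hi * b_hi;  // contributes to bits 64..127

  // Sum of the middle column; this cannot overflow 64 bits.
  const uint64_t cross = (lo_lo >> 32) + (hi_lo & mask) + lo_hi;

  // Everything that ends up at bit 64 or above.
  return hi_hi + (hi_lo >> 32) + (cross >> 32);
}
```

A sequence like this needs four multiplies plus shifts and adds, which is the kind of work a single `mul` can replace.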
------------- Commit messages: - fix typo in function name - 8275167: x86 intrinsic for unsignedMultiplyHigh Changes: https://git.openjdk.java.net/jdk/pull/5933/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5933&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275167 Stats: 59 lines in 11 files changed: 59 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5933.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5933/head:pull/5933 PR: https://git.openjdk.java.net/jdk/pull/5933 From stuefe at openjdk.java.net Fri Oct 15 14:09:01 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 15 Oct 2021 14:09:01 GMT Subject: RFR: 8275334: Move class loading Events to a separate section in hs_err files In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 13:49:58 GMT, Stefan Karlsson wrote: > The generic "Events" section in hs_err files tend to fill up with class loading events. I propose that we move class loading events to its own section. > > We already have a section for class unloading, so adding a class loading section is also good for symmetry reasons. 
Looks good. ------------- Marked as reviewed by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/5968 From coleenp at openjdk.java.net Fri Oct 15 14:25:52 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 14:25:52 GMT Subject: RFR: 8275334: Move class loading Events to a separate section in hs_err files In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 13:49:58 GMT, Stefan Karlsson wrote: > The generic "Events" section in hs_err files tend to fill up with class loading events. I propose that we move class loading events to its own section. > > We already have a section for class unloading, so adding a class loading section is also good for symmetry reasons. Looks good. Thank you! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5968 From bpb at openjdk.java.net Fri Oct 15 15:56:51 2021 From: bpb at openjdk.java.net (Brian Burkhalter) Date: Fri, 15 Oct 2021 15:56:51 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. The `java.lang.Math` change looks good, of course. On the rest I cannot comment but it is good to see something being done here. ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From aph at openjdk.java.net Fri Oct 15 16:17:51 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 15 Oct 2021 16:17:51 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark.
src/hotspot/share/opto/mulnode.cpp line 468: > 466: } > 467: > 468: //============================================================================= MulHiLNode::Value() and UMulHiLNode::Value() seem to be identical. Perhaps some refactoring would be in order, maybe make a common shared routine. ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From coleenp at openjdk.java.net Fri Oct 15 16:28:49 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 16:28:49 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. The new printing in the error file looks like this. I'm not sure if it'll be confusing to see how many mutexes have been created, or maybe the message should be reworded somewhat? VM Mutex/Monitor currently owned by a thread: [0x0000146688024e00] Threads_lock - owner thread: 0x00001466880283c0 allow_vm_block safepoint-1 Total Mutex count 580 The allow_vm_block and safepoint-1 are NOT printed in PRODUCT mode. 
I took out this because I don't know what it's trying to say and it doesn't seem meaningful: // print format used by Mutex::print_on_error() st->print_cr(" ([mutex/lock_event])"); ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Fri Oct 15 16:47:30 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 15 Oct 2021 16:47:30 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Make mutex_array be a linked list. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5958/files - new: https://git.openjdk.java.net/jdk/pull/5958/files/3ff9cc02..665d5cc0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=00-01 Stats: 97 lines in 3 files changed: 41 ins; 38 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/5958.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5958/head:pull/5958 PR: https://git.openjdk.java.net/jdk/pull/5958 From mcimadamore at openjdk.java.net Fri Oct 15 16:54:58 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 15 Oct 2021 16:54:58 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v5] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. 
> > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: * Fix javadoc issue in VaList * Fix bug in concurrent logic for shared scope acquire ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/27e71ee7..5ed94c30 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=03-04 Stats: 10 lines in 2 files changed: 2 ins; 3 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From mcimadamore at openjdk.java.net Fri Oct 15 16:56:08 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Fri, 15 Oct 2021 16:56:08 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v5] In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 16:54:58 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/419 > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > * Fix javadoc issue in VaList > * Fix bug in concurrent logic for shared scope acquire src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/SharedScope.java line 138: > 136: ResourceCleanup prev = (ResourceCleanup) FST.getAcquire(this); > 137: cleanup.next = prev; > 138: ResourceCleanup newSegment = (ResourceCleanup) FST.compareAndExchangeRelease(this, prev, cleanup); In this place we can actually overwrite CLOSED_LIST (if getAcquired has seen such value). 
While this particular add will fail later on (since we check witness), another concurrent add might then pass, since it sees a value other than CLOSED_LIST (which is supposed to be a terminal state). Also, after consulting with @DougLea and @shipilev - it seems like using volatile semantics is the way to go here. This should fix a spurious (and very rare) failure on TestResourceScope on arm64. ------------- PR: https://git.openjdk.java.net/jdk/pull/5907 From aph at openjdk.java.net Fri Oct 15 16:58:52 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 15 Oct 2021 16:58:52 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. test/micro/org/openjdk/bench/java/lang/MathBench.java line 547: > 545: return Math.unsignedMultiplyHigh(long747, long13); > 546: } > 547: As far as I can tell, the JMH overhead dominates when trying to measure the latency of events in the nanosecond range. `unsignedMultiplyHigh` should have a latency of maybe 1.5-2ns. Is that what you saw? 
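A minimal sketch of the hazard raised in the SharedScope.java review comment above: an add must never replace the terminal CLOSED_LIST value, so the sentinel is re-checked before every compare-and-swap attempt. This is C++ rather than Java, and the names `Cleanup`, `CleanupList` and `closed()` are hypothetical; it is not the JDK implementation. The default sequentially consistent atomics are used here as an assumed stand-in for the "volatile semantics" mentioned in the comment.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a cleanup node; not the JDK's ResourceCleanup.
struct Cleanup {
    Cleanup* next = nullptr;
};

class CleanupList {
public:
    // Pushes c onto the list. Returns false if the list is already closed;
    // the sentinel is checked before every CAS so it is never overwritten.
    bool add(Cleanup* c) {
        assert(c != nullptr);
        Cleanup* prev = head_.load();          // seq_cst load
        for (;;) {
            if (prev == closed()) return false;   // terminal state: refuse
            c->next = prev;
            // On failure, prev is refreshed with the current head and the
            // loop re-checks the sentinel before trying again.
            if (head_.compare_exchange_weak(prev, c)) return true;
        }
    }

    // Moves the list to its terminal state and returns the chain of
    // registered cleanups (nullptr if there were none or it was closed).
    Cleanup* close() {
        Cleanup* old = head_.exchange(closed());
        return old == closed() ? nullptr : old;
    }

    bool is_closed() const { return head_.load() == closed(); }

private:
    // Unique address used as the "closed" marker.
    static Cleanup* closed() {
        static Cleanup sentinel;
        return &sentinel;
    }

    std::atomic<Cleanup*> head_{nullptr};
};
```

With this shape, a late `add` observes the sentinel and fails without publishing anything, so a concurrent `add` can never see a non-closed head after `close()` has run.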
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From kvn at openjdk.java.net Fri Oct 15 18:48:48 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 15 Oct 2021 18:48:48 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. ------------- Changes requested by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Fri Oct 15 19:21:53 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 15 Oct 2021 19:21:53 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Fri, 15 Oct 2021 18:45:41 GMT, Vladimir Kozlov wrote: > How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). Used that test to verify the correctness of the results. 
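For readers following the correctness question, here is a small sketch of what an unsigned multiply-high computes and how it differs from the signed one. It is written in portable C++ with hypothetical function names; it is neither the intrinsic nor the code in MultiplicationTests.java. The signed variant is derived from the unsigned one by the usual correction terms, which is one way to see why `mul` and `imul` give different upper halves for negative inputs.

```cpp
#include <cassert>
#include <cstdint>

// Upper 64 bits of the unsigned 128-bit product a * b, built from
// 32-bit partial products so that no 128-bit type is required.
std::uint64_t unsigned_multiply_high(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFULL;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo;
    const std::uint64_t p1 = a_lo * b_hi;
    const std::uint64_t p2 = a_hi * b_lo;
    const std::uint64_t p3 = a_hi * b_hi;

    // Carry out of the middle 64 bits into the upper half.
    const std::uint64_t carry = ((p0 >> 32) + (p1 & mask) + (p2 & mask)) >> 32;
    return p3 + (p1 >> 32) + (p2 >> 32) + carry;
}

// Signed upper half, obtained from the unsigned one: subtract b when a is
// negative and a when b is negative (arithmetic modulo 2^64).
// Assumption: the final unsigned-to-signed conversion wraps, which is
// guaranteed from C++20 and is what mainstream compilers do before that.
std::int64_t multiply_high(std::int64_t a, std::int64_t b) {
    std::uint64_t hi = unsigned_multiply_high(static_cast<std::uint64_t>(a),
                                              static_cast<std::uint64_t>(b));
    if (a < 0) hi -= static_cast<std::uint64_t>(b);
    if (b < 0) hi -= static_cast<std::uint64_t>(a);
    return static_cast<std::int64_t>(hi);
}

// A few hand-checked values, including cases where the two results differ.
void check_known_values() {
    assert(unsigned_multiply_high(747, 13) == 0);
    assert(unsigned_multiply_high(UINT64_MAX, UINT64_MAX) == 0xFFFFFFFFFFFFFFFEULL);
    assert(multiply_high(-1, -1) == 0);
    assert(unsigned_multiply_high(1ULL << 63, 2) == 1);
    assert(multiply_high(INT64_MIN, 2) == -1);
}
```

The pair `(-1, -1)` is a convenient probe: the signed upper half is 0, while the same bit patterns treated as unsigned give 0xFFFFFFFFFFFFFFFE.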
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Fri Oct 15 19:34:00 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 15 Oct 2021 19:34:00 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Fri, 15 Oct 2021 16:14:25 GMT, Andrew Haley wrote: >> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. > > src/hotspot/share/opto/mulnode.cpp line 468: > >> 466: } >> 467: >> 468: //============================================================================= > > MulHiLNode::Value() and UMulHiLNode::Value() seem to be identical. Perhaps some refactoring would be in order, maybe make a common shared routine. Sure, will do the refactoring to use a shared routine. > test/micro/org/openjdk/bench/java/lang/MathBench.java line 547: > >> 545: return Math.unsignedMultiplyHigh(long747, long13); >> 546: } >> 547: > > As far as I can tell, the JMH overhead dominates when trying to measure the latency of events in the nanosecond range. `unsignedMultiplyHigh` should have a latency of maybe 1.5-2ns. Is that what you saw? Yes, the JMH overhead was dominating the measurement of latency. The latency observed for `unsignedMultiplyHigh` was 2.3ns with the intrinsic enabled. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From kvn at openjdk.java.net Fri Oct 15 20:22:45 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 15 Oct 2021 20:22:45 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> On Fri, 15 Oct 2021 19:19:13 GMT, Vamsi Parasa wrote: > > How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. > > Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). Used that test to verify the correctness of the results. Good. It seems I have old version of the test. Did you run it with -Xcomp? How you verified that intrinsic is used? ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Fri Oct 15 21:06:52 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 15 Oct 2021 21:06:52 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> Message-ID: On Fri, 15 Oct 2021 20:19:31 GMT, Vladimir Kozlov wrote: > > > How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. > > > > > > Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). 
Used that test to verify the correctness of the results. > > Good. It seems I have old version of the test. Did you run it with -Xcomp? How you verified that intrinsic is used? The tests were not run with -Xcomp. Looked at the generated x86 code using the disassembler (hsdis) and verified that the correct instruction (`mul`, rather than the `imul` generated without the intrinsic) was being emitted. ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From psandoz at openjdk.java.net Sat Oct 16 00:56:14 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Sat, 16 Oct 2021 00:56:14 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. > > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds accesses throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits: - Merge branch 'master' into JDK-8271515-vector-api - Apply patch from https://github.com/openjdk/panama-vector/pull/152 - Apply patch from https://github.com/openjdk/panama-vector/pull/142 - Apply patch from https://github.com/openjdk/panama-vector/pull/139 - Apply patch from https://github.com/openjdk/panama-vector/pull/151 - Add new files. - 8271515: Integration of JEP 417: Vector API (Third Incubator) ------------- Changes: https://git.openjdk.java.net/jdk/pull/5873/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=02 Stats: 21998 lines in 106 files changed: 16227 ins; 2077 del; 3694 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From stuefe at openjdk.java.net Sat Oct 16 07:48:52 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 16 Oct 2021 07:48:52 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Fri, 15 Oct 2021 16:47:30 GMT, Coleen Phillimore wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make mutex_array be a linked list. Hi Coleen, thank you for taking my suggestion. 
I still think that this may be better suited for debug only, like the other analysis features of Mutexes, but I leave that up to you. There is a case for making this available in release builds too. Apart from that, the rest are mostly unimportant nits. I'll leave it up to you what you take from them. Cheers, Thomas src/hotspot/share/runtime/mutex.cpp line 273: > 271: // Array used to print owned locks on error. > 272: static Mutex* _mutex_array = NULL; > 273: static int _array_count = 0; Do we need this counter? Could we not just count the mutexes while printing them? src/hotspot/share/runtime/mutex.cpp line 280: > 278: ThreadCritical tc; > 279: Mutex* old_next = _next_mutex; > 280: assert(old_next != nullptr, "only static mutexes don't have a next"); This was a brainteaser till I realized that the end of the list is composed of the Mutexes created first, which we don't delete. Maybe add that as a comment? src/hotspot/share/runtime/mutex.cpp line 322: > 320: } > 321: _array_count += 1; > 322: } Proposal: maybe factor this block (and its counterpart in ~Mutex) into their own private methods, "add_to_global_mutex_list" and "remove_from_global_mutex_list"? That would also make the difference between `_next` and `_next_mutex` a bit clearer to the casual reader. src/hotspot/share/runtime/mutex.cpp line 335: > 333: #ifdef ASSERT > 334: if (_allow_vm_block) { > 335: st->print("%s", " allow_vm_block"); nit: shorten to `print_raw(" allow_vm_block")` ? src/hotspot/share/runtime/mutex.cpp line 344: > 342: // Print all mutexes/monitors that are currently owned by a thread; called > 343: // by fatal error handler. > 344: void Mutex::print_owned_locks_on_error(outputStream* st) { Maybe a small comment that we explicitly avoid locking here, because we are in error handling and locking may block? That we would rather risk a secondary crash than block in error handling? src/hotspot/share/runtime/mutex.cpp line 345: > 343: // by fatal error handler.
> 344: void Mutex::print_owned_locks_on_error(outputStream* st) { > 345: ResourceMark rm; Why is this mark needed? src/hotspot/share/runtime/mutex.cpp line 346: > 344: void Mutex::print_owned_locks_on_error(outputStream* st) { > 345: ResourceMark rm; > 346: st->print("VM Mutex/Monitor currently owned by a thread: "); nit: use plural? e.g. "All VM Mutexes/Monitors currently owned by any thread:" src/hotspot/share/runtime/mutex.cpp line 361: > 359: } > 360: if (none) st->print_cr("None"); > 361: st->print_cr("Total Mutex count %d", _array_count); This prints the total number of mutexes, not those owned by a thread, is that intentional? src/hotspot/share/utilities/vmError.cpp line 1920: > 1918: MutexLocker ml(ErrorTest_lock, Mutex::_no_safepoint_check_flag); > 1919: assert(how == 0, "test assert with lock"); > 1920: break; I like this test. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From stuefe at openjdk.java.net Sat Oct 16 07:48:53 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 16 Oct 2021 07:48:53 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Sat, 16 Oct 2021 06:34:59 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Make mutex_array be a linked list. > > src/hotspot/share/runtime/mutex.cpp line 344: > >> 342: // Print all mutexes/monitors that are currently owned by a thread; called >> 343: // by fatal error handler. >> 344: void Mutex::print_owned_locks_on_error(outputStream* st) { > > Maybe a small comment that we explicitly avoid locking here, because we are in error handling and locking may block? That we rather risk a secondary crash instead of blocking in error handling? You also could assert for `VMError::is_error_reported()` to prevent future use outside error handling. 
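The review points above (a global list instead of an array, add/remove helpers, counting while printing, and no locking on the error path) can be put together in a small sketch. This is a simplified illustration with hypothetical names (`TrackedMutex`, `print_owned`, an opaque owner token), not the HotSpot `Mutex` code; `std::mutex` stands in for ThreadCritical as an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>   // only needed by callers that print into a string
#include <string>
#include <utility>

class TrackedMutex {
public:
    explicit TrackedMutex(std::string name) : name_(std::move(name)) {
        add_to_global_list();
    }
    ~TrackedMutex() { remove_from_global_list(); }
    TrackedMutex(const TrackedMutex&) = delete;
    TrackedMutex& operator=(const TrackedMutex&) = delete;

    // Records which thread owns the lock; nullptr means "not owned".
    void set_owner(const void* thread) { owner_.store(thread); }

    // Prints every owned mutex and returns how many were printed.
    // Deliberately does NOT take list_lock(): the assumed caller is an
    // error handler, which would rather risk a secondary crash than block.
    static int print_owned(std::ostream& out) {
        int owned = 0;   // counted during the walk, no separate counter kept
        for (const TrackedMutex* m = head(); m != nullptr; m = m->next_) {
            const void* owner = m->owner_.load();
            if (owner != nullptr) {
                out << m->name_ << " - owner thread: " << owner << '\n';
                ++owned;
            }
        }
        if (owned == 0) out << "None\n";
        return owned;
    }

private:
    static std::mutex& list_lock() { static std::mutex lock; return lock; }
    static TrackedMutex*& head() { static TrackedMutex* first = nullptr; return first; }

    void add_to_global_list() {
        std::lock_guard<std::mutex> guard(list_lock());
        next_ = head();
        head() = this;
    }

    void remove_from_global_list() {
        std::lock_guard<std::mutex> guard(list_lock());
        bool found = false;
        for (TrackedMutex** p = &head(); *p != nullptr; p = &(*p)->next_) {
            if (*p == this) { *p = next_; found = true; break; }
        }
        assert(found && "every constructed mutex is on the list");
        (void)found;
    }

    std::string name_;
    std::atomic<const void*> owner_{nullptr};
    TrackedMutex* next_ = nullptr;
};
```

Because new entries are pushed at the head, the tail of the list holds the mutexes created first, which matches the observation made about line 280.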
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From mcimadamore at openjdk.java.net Sat Oct 16 11:11:59 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Sat, 16 Oct 2021 11:11:59 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v6] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with two additional commits since the last revision: - * use `invokeWithArguments` to simplify new test - Add test for liveness check with high-arity downcalls (make sure that if an exception occurs in a downcall because of liveness, ref counts of other resources are left intact). ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/5ed94c30..a4db81fe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=04-05 Stats: 39 lines in 2 files changed: 39 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From zgu at openjdk.java.net Sat Oct 16 12:45:52 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sat, 16 Oct 2021 12:45:52 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 13:23:46 GMT, Thomas Stuefe wrote: > > > Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? > > > > > > GuardedMemory has a number of disadvantages, and I'd like to remove it in favor of NMT doing buffer overrun checks.
For my full reasoning, please see my reasoning in the umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301: > > Disadvantages of the current solution: > > > > * We have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without relegating to external tools or debug c-libs. > > * The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds. > > * The canary in the headers of the debug-only guards do not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes. > > * The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller. > > * The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It makes also statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build). > > * None of the current overflow checkers print out hex dumps of the violated memory. That is what the libc usually does and it is very useful. 
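To make the canary-placement point in the list above concrete, here is a minimal sketch of a guarded allocation whose header canary sits directly in front of the user bytes, so that even a one-byte underrun is detected, with a second canary directly behind them. The layout, the canary values and the function names are assumptions for illustration; this is neither NMT's malloc header nor GuardedMemory.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Arbitrary marker values, chosen so that no byte is zero.
constexpr std::uint64_t kHeadCanary = 0xE99EE99EE99EE99EULL;
constexpr std::uint64_t kTailCanary = 0xE88EE88EE88EE88EULL;

// The canary is the LAST header field, so it directly precedes user data.
// 16 bytes keeps the user pointer at malloc's usual alignment.
struct alignas(16) GuardHeader {
    std::uint64_t size;
    std::uint64_t canary;
};

enum class Check { ok, underrun, overrun };

void* guarded_malloc(std::size_t size) {
    auto* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(GuardHeader) + size + sizeof(kTailCanary)));
    if (raw == nullptr) return nullptr;
    const GuardHeader header{size, kHeadCanary};
    std::memcpy(raw, &header, sizeof header);
    // The tail may be unaligned, so it is written with memcpy.
    std::memcpy(raw + sizeof(GuardHeader) + size, &kTailCanary, sizeof kTailCanary);
    return raw + sizeof(GuardHeader);
}

// Verifies both canaries of a block returned by guarded_malloc.
Check check_block(const void* user) {
    assert(user != nullptr);
    const auto* p = static_cast<const unsigned char*>(user);
    GuardHeader header;
    std::memcpy(&header, p - sizeof(GuardHeader), sizeof header);
    if (header.canary != kHeadCanary) return Check::underrun;
    std::uint64_t tail;
    std::memcpy(&tail, p + header.size, sizeof tail);
    return tail == kTailCanary ? Check::ok : Check::overrun;
}

void guarded_free(void* user) {
    if (user != nullptr) {
        std::free(static_cast<unsigned char*>(user) - sizeof(GuardHeader));
    }
}
```

The overhead in this sketch is 24 bytes per block. That figure describes only this illustration and says nothing about what the actual patch costs.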
> > > > Thanks, Thomas > > p.s. I contemplated to do NMT overflow checks and removal of old guarding code in one RFE but was concerned that it would be too confusing and get stuck in review limbo. Maybe that was wrong. But this RFE here makes more sense when viewed as part of a whole. Thanks for explanation. So, buffer overrun detection is now only available when NMT is on, vs. always on with GuardedMemory in debug build. Right? ------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From stuefe at openjdk.java.net Sun Oct 17 05:50:44 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sun, 17 Oct 2021 05:50:44 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: References: Message-ID: <_r5qw_r-3Be7zUJuf4gcb10MFe9varAWAvix_CaJiYs=.758ad563-f2d2-466d-bcb5-1ccc6b547e94@github.com> On Sat, 16 Oct 2021 12:42:42 GMT, Zhengyu Gu wrote: > > Thanks for explanation. So, buffer overrun detection is now only available when NMT is on, vs. always on with GuardedMemory in debug build. Right? Well, not with this patch obviously. But yes, that would be my proposal. To get "always-on", we could switch NMT on by default in debug builds. "summary" level is not really expensive at all, it uses less memory than GuardedMemory does, and the per-flag accounting does not really add much overhead (GuardedMemory also does some accounting btw).
Though tbh my first priority is to give us overflow checks in release builds. If we only do that and leave GuardedMemory in place I would be happy already. I had two customer cases very recently with heap overwriters, for one of which I misused NMT to trigger a crash and analyze the core. A neighboring (non-VM-allocated) block was overwriting the following (VM allocated) heap block. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From dholmes at openjdk.java.net Sun Oct 17 13:02:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 17 Oct 2021 13:02:50 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. > > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. Sorry but it makes no sense to me to duplicate the `wait_without_safepoint_check` logic inside `wait` when for a NJT you can, and should, always call `wait_without_safepoint_check`.
Safepoint-checking is irrelevant to NJTs and this is just making thing confusingly complicated IMO. ------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From zgu at openjdk.java.net Sun Oct 17 13:32:46 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sun, 17 Oct 2021 13:32:46 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: <_r5qw_r-3Be7zUJuf4gcb10MFe9varAWAvix_CaJiYs=.758ad563-f2d2-466d-bcb5-1ccc6b547e94@github.com> References: <_r5qw_r-3Be7zUJuf4gcb10MFe9varAWAvix_CaJiYs=.758ad563-f2d2-466d-bcb5-1ccc6b547e94@github.com> Message-ID: <6HZO-x_TuA4XOmbGfIo7DCXFgZCSNU69FkpWT8WWtL8=.6c3180cf-ced1-43d3-966f-3f21e9d3bffe@github.com> On Sun, 17 Oct 2021 05:47:24 GMT, Thomas Stuefe wrote: > > > > > Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? > > > > > > > > > > > > GuardedMemory has a number of disadvantages, and I'd like to remove it in favor of NMT doing buffer overrun checks. For my full reasoning, please see my reasoning in the umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301: > > > > Disadvantages of the current solution: > > > > > > > > * We have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without relegating to external tools or debug c-libs. > > > > * The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. 
in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds. > > > > * The canary in the headers of the debug-only guards do not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes. > > > > * The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller. > > > > * The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It makes also statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build). > > > > * None of the current overflow checkers print out hex dumps of the violated memory. That is what the libc usually does and it is very useful. > > > > > > > > Thanks, Thomas > > > > > > > > > p.s. I contemplated to do NMT overflow checks and removal of old guarding code in one RFE but was concerned that it would be too confusing and get stuck in review limbo. Maybe that was wrong. But this RFE here makes more sense when viewed as part of a whole. > > > > > > Thanks for explanation. So, buffer overrun detection is now only available when NMT is on, vs. always on with GuardedMemory in debug build. Right? > > Well, not with this patch obviously. But yes, that would be my proposal. To get "always-on", we could switch NMT on by default in debug builds. "summary" level is not really expensive at all, it uses less memory than GuardedMemory does, and the per-flag accounting does not really add much overhead (GuardedMemory also does some accounting btw). > > Though tbh my first priority is to give us overflow checks in release builds. If we only do that and leave GuardedMemory in place I would be happy already. 
I had two customer cases very recently with heap overwriters, one of which I misused NMT to trigger a crash and analyze the core. A neighboring (non-VM-allocated) block was overwriting the following (VM allocated) heap block. > > Cheers, Thomas I have no problem on technical side. Changing NMT default value, I believe, needs CSR. Probably should start with a CSR to get a consensus. Thanks. -Zhengyu ------------- PR: https://git.openjdk.java.net/jdk/pull/5952 From dholmes at openjdk.java.net Mon Oct 18 07:03:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 18 Oct 2021 07:03:50 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Sat, 16 Oct 2021 06:15:39 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Make mutex_array be a linked list. > > src/hotspot/share/runtime/mutex.cpp line 280: > >> 278: ThreadCritical tc; >> 279: Mutex* old_next = _next_mutex; >> 280: assert(old_next != nullptr, "only static mutexes don't have a next"); > > This was a brainteaser till I realized that the end of the list is composed of the Mutexes created first, which we don't delete. Maybe add that as a comment? Isn't it only the very first Mutex we create (statically) that doesn't have a next? 
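
To make the list shape discussed in this review concrete, here is a minimal, self-contained sketch. It is not the code from the PR: `TrackedLock`, `head()` and `list_guard()` are hypothetical names, and a `std::mutex` stands in for ThreadCritical. It assumes new locks are pushed at the front of the list, as the thread describes; the removal strategy (walking to the predecessor link) is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a VM mutex; NOT HotSpot's Mutex class.
class TrackedLock {
 public:
  explicit TrackedLock(const char* name) : _name(name), _next(nullptr) {
    std::lock_guard<std::mutex> guard(list_guard());
    _next = head();   // push front: the newest lock becomes the head
    head() = this;
  }

  ~TrackedLock() {
    std::lock_guard<std::mutex> guard(list_guard());
    // Walk to the link that points at this node and splice the node out.
    TrackedLock** link = &head();
    while (*link != this) {
      assert(*link != nullptr && "lock must be on the list");
      link = &(*link)->_next;
    }
    *link = _next;
  }

  TrackedLock(const TrackedLock&) = delete;
  TrackedLock& operator=(const TrackedLock&) = delete;

  const char* name() const { return _name; }
  const TrackedLock* next() const { return _next; }

  // Unsynchronized read of the list head; good enough for this sketch.
  static const TrackedLock* first() { return head(); }

 private:
  // Function-local statics avoid static-initialization-order problems.
  static TrackedLock*& head() {
    static TrackedLock* h = nullptr;
    return h;
  }
  static std::mutex& list_guard() {
    static std::mutex m;
    return m;
  }

  const char*  _name;
  TrackedLock* _next;
};
```

With front insertion, only the first lock ever created ends up with a null next pointer; every lock created later points at an older one, so as long as that first lock is never destroyed the list head never becomes null.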
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From stefank at openjdk.java.net Mon Oct 18 07:05:53 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 18 Oct 2021 07:05:53 GMT Subject: RFR: 8275334: Move class loading Events to a separate section in hs_err files In-Reply-To: References: Message-ID: <-ks-uqCZWhxL0kxckGkVOEZocjSLyRCSlC85H7WrlSc=.5903174d-6212-4abd-a7e2-d0b517866e37@github.com> On Fri, 15 Oct 2021 13:49:58 GMT, Stefan Karlsson wrote: > The generic "Events" section in hs_err files tend to fill up with class loading events. I propose that we move class loading events to its own section. > > We already have a section for class unloading, so adding a class loading section is also good for symmetry reasons. > > This is what this will look like after the patch: > > Classes loaded (20 events): > Event: 4,426 Loading class com/sun/org/apache/xerces/internal/xinclude/MultipleScopeNamespaceSupport done > Event: 4,426 Loading class com/sun/org/apache/xerces/internal/xinclude/XIncludeNamespaceSupport done > Event: 4,428 Loading class com/sun/org/apache/xerces/internal/parsers/AbstractSAXParser$LocatorProxy > Event: 4,428 Loading class org/xml/sax/ext/Locator2 > Event: 4,428 Loading class org/xml/sax/ext/Locator2 done > Event: 4,428 Loading class com/sun/org/apache/xerces/internal/parsers/AbstractSAXParser$LocatorProxy done > Event: 4,430 Loading class com/sun/org/apache/xerces/internal/impl/XMLScanner$NameType > Event: 4,430 Loading class com/sun/org/apache/xerces/internal/impl/XMLScanner$NameType done > Event: 4,435 Loading class org/xml/sax/helpers/LocatorImpl > Event: 4,435 Loading class org/xml/sax/helpers/LocatorImpl done > Event: 4,498 Loading class java/util/Vector$ListItr > Event: 4,498 Loading class java/util/Vector$ListItr done > Event: 4,499 Loading class java/lang/StrictMath > Event: 4,499 Loading class java/lang/StrictMath done > Event: 4,623 Loading class java/util/BitSet > Event: 4,623 Loading class java/util/BitSet 
done > Event: 4,656 Loading class java/util/ArrayList$ListItr > Event: 4,657 Loading class java/util/ArrayList$ListItr done > Event: 4,693 Loading class java/util/ArrayList$SubList > Event: 4,694 Loading class java/util/ArrayList$SubList done > > Classes unloaded (20 events): > Event: 3,445 Thread 0x00007f061c2578d0 Unloading class 0x0000000100088000 'avrora/Main' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a3c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a3c00' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a3400 'java/lang/invoke/LambdaForm$MH+0x00000001000a3400' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a1c00' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1400 'java/lang/invoke/LambdaForm$MH+0x00000001000a1400' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a1000 'java/lang/invoke/LambdaForm$MH+0x00000001000a1000' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a0c00' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0800 'java/lang/invoke/LambdaForm$MH+0x00000001000a0800' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0400 'java/lang/invoke/LambdaForm$MH+0x00000001000a0400' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x00000001000a0000 'java/lang/invoke/LambdaForm$MH+0x00000001000a0000' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010012c000 'jdk/internal/reflect/GeneratedMethodAccessor1' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100123000 'java/lang/invoke/LambdaForm$MH+0x0000000100123000' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100121800 'java/lang/invoke/LambdaForm$MH+0x0000000100121800' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x0000000100121000 
'java/lang/invoke/LambdaForm$MH+0x0000000100121000' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009fc00 'java/lang/invoke/LambdaForm$MH+0x000000010009fc00' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009ec00 'java/lang/invoke/LambdaForm$MH+0x000000010009ec00' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009e800 'java/lang/invoke/LambdaForm$MH+0x000000010009e800' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009e400 'java/lang/invoke/LambdaForm$MH+0x000000010009e400' > Event: 4,864 Thread 0x00007f061c2578d0 Unloading class 0x000000010009cc00 'java/lang/invoke/LambdaForm$MH+0x000000010009cc00' > Event: 5,914 Thread 0x00007f061c2578d0 Unloading class 0x0000000100123000 'jdk/internal/reflect/GeneratedMethodAccessor2' > ... > Events (20 events): > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dada610 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dadd990 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dadfd90 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dae9e90 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daea490 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daeee90 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daf0710 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060daf2410 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060dafe810 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060db01d10 > Event: 5,565 Thread 0x00007f061c2f97e0 flushing nmethod 0x00007f060db63c90 > Event: 5,681 Thread 0x00007f05901b3bd0 Thread exited: 0x00007f05901b3bd0 > Event: 5,681 Thread 0x00007f058c2319d0 Thread exited: 0x00007f058c2319d0 > Event: 5,687 Thread 0x00007f05cc533030 Thread exited: 0x00007f05cc533030 > Event: 5,687 Thread 0x00007f058c229ff0 Thread exited: 0x00007f058c229ff0 > Event: 
5,715 Thread 0x00007f058401fb00 Thread exited: 0x00007f058401fb00 > Event: 5,715 Thread 0x00007f05cc449b80 Thread exited: 0x00007f05cc449b80 > Event: 5,715 Thread 0x00007f05cc45b6f0 Thread exited: 0x00007f05cc45b6f0 > Event: 5,872 Thread 0x00007f0584179750 Thread added: 0x00007f0584179750 > Event: 5,872 Protecting memory [0x00007f05f85f8000,0x00007f05f85fc000] with protection modes 0 > > > Before the patch: > > Classes unloaded (20 events): > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a3c00' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'java/lang/invoke/LambdaForm$MH+0x00000001000a3400' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a1c00' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1400 'java/lang/invoke/LambdaForm$MH+0x00000001000a1400' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a1000 'java/lang/invoke/LambdaForm$MH+0x00000001000a1000' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0c00 'java/lang/invoke/LambdaForm$MH+0x00000001000a0c00' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0800 'java/lang/invoke/LambdaForm$MH+0x00000001000a0800' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0400 'java/lang/invoke/LambdaForm$MH+0x00000001000a0400' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a0000 'java/lang/invoke/LambdaForm$MH+0x00000001000a0000' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010012c000 'jdk/internal/reflect/GeneratedMethodAccessor1' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x0000000100123000 'java/lang/invoke/LambdaForm$MH+0x0000000100123000' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x0000000100121800 'java/lang/invoke/LambdaForm$MH+0x0000000100121800' > Event: 4,911 Thread 
0x00007fe1282560f0 Unloading class 0x0000000100121000 'java/lang/invoke/LambdaForm$MH+0x0000000100121000' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009fc00 'java/lang/invoke/LambdaForm$MH+0x000000010009fc00' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009ec00 'java/lang/invoke/LambdaForm$MH+0x000000010009ec00' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009e800 'java/lang/invoke/LambdaForm$MH+0x000000010009e800' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009e400 'java/lang/invoke/LambdaForm$MH+0x000000010009e400' > Event: 4,911 Thread 0x00007fe1282560f0 Unloading class 0x000000010009cc00 'java/lang/invoke/LambdaForm$MH+0x000000010009cc00' > Event: 5,964 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'jdk/internal/reflect/GeneratedMethodAccessor2' > Event: 6,059 Thread 0x00007fe1282560f0 Unloading class 0x00000001000a3400 'jdk/internal/reflect/GeneratedMethodAccessor3' > ... > Events (20 events): > Event: 4,662 loading class java/util/BitSet > Event: 4,663 loading class java/util/BitSet done > Event: 4,697 loading class java/util/ArrayList$ListItr > Event: 4,697 loading class java/util/ArrayList$ListItr done > Event: 4,738 loading class java/util/ArrayList$SubList > Event: 4,739 loading class java/util/ArrayList$SubList done > Event: 4,745 Thread 0x00007fe0940ebe70 Thread added: 0x00007fe0940ebe70 > Event: 4,745 Protecting memory [0x00007fe109bb9000,0x00007fe109bbd000] with protection modes 0 > Event: 4,803 Thread 0x00007fe0c03f1850 Thread added: 0x00007fe0c03f1850 > Event: 4,803 Protecting memory [0x00007fe109ab8000,0x00007fe109abc000] with protection modes 0 > Event: 4,809 Thread 0x00007fe0c412be00 Thread added: 0x00007fe0c412be00 > Event: 4,809 Protecting memory [0x00007fe1097b5000,0x00007fe1097b9000] with protection modes 0 > Event: 5,839 Thread 0x00007fe0c412be00 Thread exited: 0x00007fe0c412be00 > Event: 5,846 Thread 0x00007fe0c03f1850 Thread 
exited: 0x00007fe0c03f1850 > Event: 5,874 Thread 0x00007fe0940ebe70 Thread exited: 0x00007fe0940ebe70 > Event: 5,874 Thread 0x00007fe0c03044d0 Thread exited: 0x00007fe0c03044d0 > Event: 5,874 Thread 0x00007fe0dc465540 Thread exited: 0x00007fe0dc465540 > Event: 5,894 Thread 0x00007fe0c02e7420 Thread exited: 0x00007fe0c02e7420 > Event: 6,024 Thread 0x00007fe09c184030 Thread added: 0x00007fe09c184030 > Event: 6,024 Protecting memory [0x00007fe1095b2000,0x00007fe1095b6000] with protection modes 0 Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/5968 From stefank at openjdk.java.net Mon Oct 18 07:05:53 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 18 Oct 2021 07:05:53 GMT Subject: Integrated: 8275334: Move class loading Events to a separate section in hs_err files In-Reply-To: References: Message-ID: On Fri, 15 Oct 2021 13:49:58 GMT, Stefan Karlsson wrote: > The generic "Events" section in hs_err files tend to fill up with class loading events. I propose that we move class loading events to its own section. > > We already have a section for class unloading, so adding a class loading section is also good for symmetry reasons. 
> > This is what this will look like after the patch: > > [...] This pull request has now been integrated.
Changeset: bb7dacdc Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/bb7dacdc78ad50797900e7e9610a1ed8e7ab1b00 Stats: 20 lines in 3 files changed: 19 ins; 0 del; 1 mod 8275334: Move class loading Events to a separate section in hs_err files Reviewed-by: stuefe, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/5968 From stuefe at openjdk.java.net Mon Oct 18 07:55:49 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 18 Oct 2021 07:55:49 GMT Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks In-Reply-To: <6HZO-x_TuA4XOmbGfIo7DCXFgZCSNU69FkpWT8WWtL8=.6c3180cf-ced1-43d3-966f-3f21e9d3bffe@github.com> References: <_r5qw_r-3Be7zUJuf4gcb10MFe9varAWAvix_CaJiYs=.758ad563-f2d2-466d-bcb5-1ccc6b547e94@github.com> <6HZO-x_TuA4XOmbGfIo7DCXFgZCSNU69FkpWT8WWtL8=.6c3180cf-ced1-43d3-966f-3f21e9d3bffe@github.com> Message-ID: On Sun, 17 Oct 2021 13:30:17 GMT, Zhengyu Gu wrote: > > > > > > Sorry, we already have GuardedMemory for detecting buffer overrun, why introduce a new one? > > > > > > > > > > > > > > > GuardedMemory has a number of disadvantages, and I'd like to remove it in favor of NMT doing buffer overrun checks. For my full reasoning, please see my reasoning in the umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301: > > > > > Disadvantages of the current solution: > > > > > > > > > > * We have no way to do C-heap checking in release builds. But there, it is sorely missed. We ship release VMs, and those VMs get used in myriad ways with a lot of faulty third-party native code. I would love to be able to flip a product switch at a customer site and have some basic C-heap checks done, without relegating to external tools or debug c-libs. > > > > > * The debug-only guards in os::malloc() are quite fat, really, a whopping 48 bytes per allocation on 64-bit, 40 bytes on 32-bit. That is for guarding alone. 
They distort the memory allocation picture, since blowing up every allocation this way causes the underlying libc to do different things. Therefore we have different memory layouts and allocation patterns between debug and release. In addition, we have different code paths too, e.g. in debug os::realloc calls os::malloc + os::free whereas in release builds it calls directly into libc ::realloc. All that means that in debug builds we test something different than what we ship in release builds. > > > > > * The canary in the headers of the debug-only guards do not directly precede the user portion of the data, so we won't catch negative buffer overflows of only a few bytes. > > > > > * The guarding added by CheckJNICalls is unnecessarily expensive too, since it copies the memory around, handing a copy of the guarded memory up to the caller. > > > > > * The fact that three different code sections all do malloc headers incurs unnecessary costs, and the code is unnecessarily complex. It makes also statistics difficult to understand since the silent overhead can be large (compare the rise in RSS with the rise in NMT allocations in a debug build). > > > > > * None of the current overflow checkers print out hex dumps of the violated memory. That is what the libc usually does and it is very useful. > > > > > > > > > > Thanks, Thomas > > > > > > > > > > > > p.s. I contemplated to do NMT overflow checks and removal of old guarding code in one RFE but was concerned that it would be too confusing and get stuck in review limbo. Maybe that was wrong. But this RFE here makes more sense when viewed as part of a whole. > > > > > > > > > Thanks for explanation. So, buffer overrun detection is now only available when NMT is on, vs. always on with GuardedMemory in debug build. Right? > > > > > > Well, not with this patch obviously. But yes, that would be my proposal. To get "always-on", we could switch NMT on by default in debug builds. 
"summary" level is not really expensive at all, it uses less memory than GuardedMemory does, and the per-flag accounting does not really add much overhead (GuardedMemory also does some accounting btw). > > Though tbh my first priority is to give us overflow checks in release builds. If we only do that and leave GuardedMemory in place I would be happy already. I had two customer cases very recently with heap overwriters, one of which I misused NMT to trigger a crash and analyze the core. A neighboring (non-VM-allocated) block was overwriting the following (VM allocated) heap block. > > Cheers, Thomas > > I have no problem on technical side. Changing NMT default value, I believe, needs CSR. Probably should start with a CSR to get a consensus. > > Thanks. > > -Zhengyu Hi Zhengyu, I would like to keep the discussion about removing GuardedMemory and potentially enabling NMT per default on debug builds from this issue here. And just concentrate on this first step, lined out in this PR, which is getting NMT to do heap bounds checking. I believe that it has merits on its own, even with no subsequent changes. And then, I would do some analysis (e.g. verify my assumption that NMT=summary is cheaper than what we do today with GuardedMemory) and use this as a base for a follow-up RFE to remove GuardedMemory. I'm not sure though we need a CSR. Since it would only affect debug code, it will never be shipped to the customer, so there can be no compatibility issues. But there would be a need for discussion of course. 
Cheers, Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/5952

From coleenp at openjdk.java.net  Mon Oct 18 13:00:56 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 18 Oct 2021 13:00:56 GMT
Subject: RFR: 8274794: Print all owned locks in hs_err file [v2]
In-Reply-To:
References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com>
Message-ID:

On Sat, 16 Oct 2021 06:36:07 GMT, Thomas Stuefe wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Make mutex_array be a linked list.
>
> src/hotspot/share/runtime/mutex.cpp line 273:
>
>> 271: // Array used to print owned locks on error.
>> 272: static Mutex* _mutex_array = NULL;
>> 273: static int _array_count = 0;
>
> Do we need this counter? Could we not just count the mutexes while printing them?

Yes, I could change it to count while iterating. I thought I wanted it for debugging purposes, that's why I made it static.

> src/hotspot/share/runtime/mutex.cpp line 322:
>
>> 320: }
>> 321: _array_count += 1;
>> 322: }
>
> Proposal: maybe factor this block (and its counterpart in ~Mutex) to an own private method, "add_to_global_mutex_list" and "remove_from_global_mutex_list"? That would also make the difference between `_next` and `_next_mutex` a bit clearer to the casual reader.

Ok, I almost did this but then didn't want to change the header file to add functions or accessors.

> src/hotspot/share/runtime/mutex.cpp line 335:
>
>> 333: #ifdef ASSERT
>> 334: if (_allow_vm_block) {
>> 335: st->print("%s", " allow_vm_block");
>
> nit: shorten to `print_raw(" allow_vm_block")` ?

ok

> src/hotspot/share/runtime/mutex.cpp line 345:
>
>> 343: // by fatal error handler.
>> 344: void Mutex::print_owned_locks_on_error(outputStream* st) {
>> 345: ResourceMark rm;
>
> Why is this mark needed?

rank_name() uses resource allocation.
> src/hotspot/share/runtime/mutex.cpp line 346:
>
>> 344: void Mutex::print_owned_locks_on_error(outputStream* st) {
>> 345: ResourceMark rm;
>> 346: st->print("VM Mutex/Monitor currently owned by a thread: ");
>
> nit: use plural?
> e.g. "All VM Mutexes/Monitors currently owned by any thread:"

ok.

> src/hotspot/share/runtime/mutex.cpp line 361:
>
>> 359: }
>> 360: if (none) st->print_cr("None");
>> 361: st->print_cr("Total Mutex count %d", _array_count);
>
> This prints the total number of mutexes, not those owned by a thread, is that intentional?

Yeah, I wanted to know how many we have in total in the array.

> src/hotspot/share/utilities/vmError.cpp line 1920:
>
>> 1918: MutexLocker ml(ErrorTest_lock, Mutex::_no_safepoint_check_flag);
>> 1919: assert(how == 0, "test assert with lock");
>> 1920: break;
>
> I like this test.

Thank you!

-------------

PR: https://git.openjdk.java.net/jdk/pull/5958

From coleenp at openjdk.java.net  Mon Oct 18 13:00:52 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 18 Oct 2021 13:00:52 GMT
Subject: RFR: 8274794: Print all owned locks in hs_err file [v2]
In-Reply-To:
References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com>
Message-ID:

On Fri, 15 Oct 2021 16:47:30 GMT, Coleen Phillimore wrote:

>> See CR for details. This saves each created mutex in a mutex_array during Mutex construction so that all Mutexes, not just the ones in mutexLocker.cpp, can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array and, to avoid UB, should also be used when printing the owned locks; but since printing is done during error handling, it seemed better not to lock there. There's 0 probability that another thread will be creating a lock during this error handling anyway.
>> Tested with tier1-8.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
>   Make mutex_array be a linked list.
Thanks for the comments, Thomas.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5958

From coleenp at openjdk.java.net  Mon Oct 18 13:00:57 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 18 Oct 2021 13:00:57 GMT
Subject: RFR: 8274794: Print all owned locks in hs_err file [v2]
In-Reply-To:
References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com>
Message-ID:

On Mon, 18 Oct 2021 07:00:54 GMT, David Holmes wrote:

>> src/hotspot/share/runtime/mutex.cpp line 280:
>>
>>> 278: ThreadCritical tc;
>>> 279: Mutex* old_next = _next_mutex;
>>> 280: assert(old_next != nullptr, "only static mutexes don't have a next");
>>
>> This was a brainteaser till I realized that the end of the list is composed of the Mutexes created first, which we don't delete. Maybe add that as a comment?
>
> Isn't it only the very first Mutex we create (statically) that doesn't have a next?

Yes, new entries are added to the front of the list, and the first Mutex created is a static one, so it is never deleted. The _mutex_array will never go to null. Suggestions for comment welcome.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5958

From coleenp at openjdk.java.net  Mon Oct 18 13:00:57 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 18 Oct 2021 13:00:57 GMT
Subject: RFR: 8274794: Print all owned locks in hs_err file [v2]
In-Reply-To:
References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com>
Message-ID: <1Vm7bhl1JtfagsBywABKYdS2StGlBL-Pgt76f4Gxc7U=.2940d10a-2cb4-4c08-9629-5abd861b7e9f@github.com>

On Sat, 16 Oct 2021 06:38:30 GMT, Thomas Stuefe wrote:

>> src/hotspot/share/runtime/mutex.cpp line 344:
>>
>>> 342: // Print all mutexes/monitors that are currently owned by a thread; called
>>> 343: // by fatal error handler.
>>> 344: void Mutex::print_owned_locks_on_error(outputStream* st) { >> >> Maybe a small comment that we explicitly avoid locking here, because we are in error handling and locking may block? That we rather risk a secondary crash instead of blocking in error handling? > > You also could assert for `VMError::is_error_reported()` to prevent future use outside error handling. Nice suggestion. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Mon Oct 18 13:08:48 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 13:08:48 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. > > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. So the code I added to wait is exactly like the code that's in lock, but I can add the checking into MonitorLocker instead, which makes us have to call Thread::current() again there. MonitorLocker ... 
bool wait(int64_t timeout = 0) { return _flag == Mutex::_safepoint_check_flag && Thread::current()->is_Java_thread() ? as_monitor()->wait(timeout) : as_monitor()->wait_without_safepoint_check(timeout); } ------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From coleenp at openjdk.java.net Mon Oct 18 13:24:27 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 13:24:27 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v3] In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: tstuefe suggestions. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5958/files - new: https://git.openjdk.java.net/jdk/pull/5958/files/665d5cc0..947b5658 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=01-02 Stats: 51 lines in 3 files changed: 21 ins; 12 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/5958.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5958/head:pull/5958 PR: https://git.openjdk.java.net/jdk/pull/5958 From stuefe at openjdk.java.net Mon Oct 18 15:58:50 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 18 Oct 2021 15:58:50 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v3] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Mon, 18 Oct 2021 13:24:27 GMT, Coleen Phillimore wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > tstuefe suggestions. Hi Coleen, this looks good to me now. My remaining remarks are bikeshedding. Thanks for taking my suggestions. ..Thomas src/hotspot/share/runtime/mutex.cpp line 290: > 288: ThreadCritical tc; > 289: Mutex* old_next = _next_mutex; > 290: assert(old_next != nullptr, "only static mutexes don't have a next"); How about "this list can never be empty"? 
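The invariant behind the assert discussed above can be sketched in isolation. This is a minimal, hypothetical model, not the HotSpot code: `TrackedLock`, `list_guard()`, `count_owned()` and `total()` are invented names, a `std::mutex` stands in for `ThreadCritical`, and the list is assumed to be doubly linked. New entries are pushed on the front, so the first entry ever created stays at the tail; as long as that entry is never destroyed, every entry that is destroyed still has a successor, and the list can never become empty.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a VM mutex; only the registry bookkeeping is modeled.
class TrackedLock {
 public:
  explicit TrackedLock(const char* name) : _name(name) {
    std::lock_guard<std::mutex> guard(list_guard());  // stand-in for ThreadCritical
    _next = _head;                                    // push on the front of the list
    if (_head != nullptr) _head->_prev = this;
    _head = this;
  }

  ~TrackedLock() {
    std::lock_guard<std::mutex> guard(list_guard());
    // The oldest entry sits at the tail and is never destroyed, so every
    // entry that does get destroyed still has a successor.
    assert(_next != nullptr && "this list can never be empty");
    _next->_prev = _prev;
    if (_prev != nullptr) {
      _prev->_next = _next;
    } else {
      _head = _next;
    }
  }

  TrackedLock(const TrackedLock&) = delete;
  TrackedLock& operator=(const TrackedLock&) = delete;

  void set_owned(bool owned) { _owned = owned; }
  const char* name() const { return _name; }

  // Walks the list without taking list_guard(), mirroring the trade-off
  // discussed for error reporting: accept a racy read rather than block.
  static int count_owned() {
    int n = 0;
    for (const TrackedLock* l = _head; l != nullptr; l = l->_next) {
      if (l->_owned) n++;
    }
    return n;
  }

  // Number of registered entries, also read without locking.
  static int total() {
    int n = 0;
    for (const TrackedLock* l = _head; l != nullptr; l = l->_next) n++;
    return n;
  }

 private:
  static std::mutex& list_guard() {
    static std::mutex m;
    return m;
  }

  static TrackedLock* _head;
  const char* _name;
  TrackedLock* _next = nullptr;
  TrackedLock* _prev = nullptr;
  bool _owned = false;
};

TrackedLock* TrackedLock::_head = nullptr;
```

In this sketch the first entry has to be created with `new` and deliberately never deleted; destroying it would trip the assert, which is the situation the proposed message describes.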
------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Mon Oct 18 15:58:51 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 15:58:51 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v3] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: <5Fj8GMtjtM1338wmifZp5YtQVO-N_ULfGObrCEnUzkA=.2398c350-2193-4dd5-9164-fd3c68b5d137@github.com> On Mon, 18 Oct 2021 15:48:07 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> tstuefe suggestions. > > src/hotspot/share/runtime/mutex.cpp line 290: > >> 288: ThreadCritical tc; >> 289: Mutex* old_next = _next_mutex; >> 290: assert(old_next != nullptr, "only static mutexes don't have a next"); > > How about "this list can never be empty"? I am perfectly happy to use your bikeshed colors! ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From stuefe at openjdk.java.net Mon Oct 18 15:58:51 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 18 Oct 2021 15:58:51 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Mon, 18 Oct 2021 12:56:58 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/mutex.cpp line 345: >> >>> 343: // by fatal error handler. >>> 344: void Mutex::print_owned_locks_on_error(outputStream* st) { >>> 345: ResourceMark rm; >> >> Why is this mark needed? > > rank_name() uses resource allocation. Could move it to print_on then, to limit the chance of actual mallocs happening during error reporting (but I leave that up to you). 
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Mon Oct 18 16:06:29 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 16:06:29 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v4] In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: tstuefe suggestions. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5958/files - new: https://git.openjdk.java.net/jdk/pull/5958/files/947b5658..c88f44c5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5958&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5958.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5958/head:pull/5958 PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Mon Oct 18 16:06:32 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 16:06:32 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v3] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Mon, 18 Oct 2021 13:24:27 GMT, Coleen Phillimore wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > tstuefe suggestions. Thomas, thank you for reviewing and all the good comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Mon Oct 18 16:06:33 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 16:06:33 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v2] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: <0vc4_Kr8hwNYmRXqrRN_Zpmuhbh2gimRlhF4ZyP-Eq4=.5902fb58-54b5-4f98-b7b1-64268bc9dabd@github.com> On Mon, 18 Oct 2021 15:53:42 GMT, Thomas Stuefe wrote: >> rank_name() uses resource allocation. > > Could move it to print_on then, to limit the chance of actual mallocs happening during error reporting (but I leave that up to you). I could do that but we usually have the ResourceMark used by the callers of the print_on functions. I don't want to be inconsistent. ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From duke at openjdk.java.net Mon Oct 18 16:09:44 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 18 Oct 2021 16:09:44 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v12] In-Reply-To: References: Message-ID: > This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). > > It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: > > - `none`: no implementation for spin pauses. This is the default value. > - `nop`: use `nop` instruction for spin pauses. > - `isb`: use `isb` instruction for spin pauses. > - `yield`: use `yield` instruction for spin pauses. > > And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. > > The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`.
> > Testing: > > - `make test TEST="gtest"`: Passed > - `make run-test TEST="tier1"`: Passed > - `make run-test TEST="tier2"`: Passed > - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed > > CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Add microbenchmarks for Thread.onSpinWait ThreadOnSpinWait: latency of Thread.onSpinWait() vs Thread.sleep(0) vs no pause. ThreadOnSpinWaitProducerConsumer: producer-consumer where consumer can go to 1ms sleep. ThreadOnSpinWaitSharedCounter: threads racing to count up (code provided by Andrew Haley, Red Hat). ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5562/files - new: https://git.openjdk.java.net/jdk/pull/5562/files/970d8d6f..e0331a79 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=10-11 Stats: 243 lines in 3 files changed: 204 ins; 29 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/5562.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5562/head:pull/5562 PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Mon Oct 18 16:16:58 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 18 Oct 2021 16:16:58 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: > This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). > > It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: > > - `none`: no implementation for spin pauses. This is the default value. > - `nop`: use `nop` instruction for spin pauses. > - `isb`: use `isb` instruction for spin pauses. > - `yield`: use `yield` instruction for spin pauses.
> > And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. > > The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. > > Testing: > > - `make test TEST="gtest"`: Passed > - `make run-test TEST="tier1"`: Passed > - `make run-test TEST="tier2"`: Passed > - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed > > CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Fix formatting and spelling ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5562/files - new: https://git.openjdk.java.net/jdk/pull/5562/files/e0331a79..42f0bb6d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=11-12 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/5562.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5562/head:pull/5562 PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Mon Oct 18 16:16:58 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Mon, 18 Oct 2021 16:16:58 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: On Fri, 15 Oct 2021 13:09:27 GMT, Andrew Haley wrote: > > @theRealAph, any comments on the microbenchmark I wrote? > > Something like this works well: Thank you for the microbenchmark. I added it. I also added a microbenchmark to measure the latency of `Thread.onSpinWait` vs `Thread.sleep(0)` vs no pause. And a microbenchmark `ThreadOnSpinWaitProducerConsumer`. 
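The rules quoted in the PR description for `OnSpinWaitInst` and `OnSpinWaitInstCount` can be modeled as a small validation function. This is only a sketch of the stated rules, not the HotSpot flag-processing or code-generation code: `spin_wait_sequence` and `kCountNotSet` are hypothetical names, and the default count of 1 when only `OnSpinWaitInst` is given is an assumption.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Sentinel meaning "OnSpinWaitInstCount was not given" (an assumption of this sketch).
const int kCountNotSet = 0;

// Returns the sequence of instructions a spin pause would consist of,
// or throws if the option combination violates the rules in the description.
std::vector<std::string> spin_wait_sequence(const std::string& inst,
                                            int count = kCountNotSet) {
  const bool known =
      inst == "none" || inst == "nop" || inst == "isb" || inst == "yield";
  if (!known) {
    throw std::invalid_argument("unknown OnSpinWaitInst value: " + inst);
  }
  if (inst == "none") {
    // Stated rule: a count may not be combined with `none`.
    if (count != kCountNotSet) {
      throw std::invalid_argument("OnSpinWaitInstCount requires OnSpinWaitInst != none");
    }
    return {};  // no spin-pause implementation
  }
  if (count == kCountNotSet) count = 1;  // assumed default, not taken from the source
  if (count < 1 || count > 99) {
    throw std::out_of_range("OnSpinWaitInstCount must be in 1..99");
  }
  return std::vector<std::string>(count, inst);  // `count` copies of the instruction
}
```

For example, `spin_wait_sequence("isb", 3)` yields three `isb` entries, while `spin_wait_sequence("none", 2)` is rejected.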
------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From zgu at openjdk.java.net Mon Oct 18 17:23:04 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 18 Oct 2021 17:23:04 GMT Subject: RFR: 8275413: Remove unused InstanceKlass::set_array_klasses() method Message-ID: Please review this trivial patch that removes unused InstanceKlass::set_array_klasses() method. ------------- Commit messages: - v0 Changes: https://git.openjdk.java.net/jdk/pull/5991/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5991&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275413 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5991.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5991/head:pull/5991 PR: https://git.openjdk.java.net/jdk/pull/5991 From coleenp at openjdk.java.net Mon Oct 18 17:44:51 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 17:44:51 GMT Subject: RFR: 8275413: Remove unused InstanceKlass::set_array_klasses() method In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 17:16:03 GMT, Zhengyu Gu wrote: > Please review this trivial patch that removes unused InstanceKlass::set_array_klasses() method. Oh yes, that's trivial. Thanks! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5991 From coleenp at openjdk.java.net Mon Oct 18 19:44:15 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 19:44:15 GMT Subject: RFR: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid [v2] In-Reply-To: References: Message-ID: > Add a lock in VirtualSpaceList::contains to protect the virtual space list for Metaspace::contains. See CR for more details. > Tested with tier2 linux-aarch64-debug where the failure happened, and tier1-3 linux-x64-debug,linux-x64,windows-x64-debug in progress (almost done but it's being slow today).
Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into jmethodid - 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5990/files - new: https://git.openjdk.java.net/jdk/pull/5990/files/ea62b5ce..cc2ae480 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5990&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5990&range=00-01 Stats: 2168 lines in 61 files changed: 1402 ins; 581 del; 185 mod Patch: https://git.openjdk.java.net/jdk/pull/5990.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5990/head:pull/5990 PR: https://git.openjdk.java.net/jdk/pull/5990 From iklam at openjdk.java.net Mon Oct 18 19:58:49 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 18 Oct 2021 19:58:49 GMT Subject: RFR: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid [v2] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 19:44:15 GMT, Coleen Phillimore wrote: >> Add a lock in VirtualSpaceList::contains to protect the virtual space list for Metaspace::contains. See CR for more details. >> Tested with tier2 linux-aarch64-debug where the failure happened, and tier1-3 linux-x64-debug,linux-x64,windows-x64-debug in progress (almost done but it's being slow today). > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into jmethodid > - 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid LGTM. 
I looked at the other functions in this class, and it seems like VirtualSpaceList::is_full() has a similar problem. I filed https://bugs.openjdk.java.net/browse/JDK-8275440 to remove that function. Other functions like reserved_words(), committed_words() and num_nodes() also seem suspicious. @tstuefe can you take a look? ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5990 From coleenp at openjdk.java.net Mon Oct 18 19:58:51 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 19:58:51 GMT Subject: RFR: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid [v2] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 19:44:15 GMT, Coleen Phillimore wrote: >> Add a lock in VirtualSpaceList::contains to protect the virtual space list for Metaspace::contains. See CR for more details. >> Tested with tier2 linux-aarch64-debug where the failure happened, and tier1-3 linux-x64-debug,linux-x64,windows-x64-debug in progress (almost done but it's being slow today). > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into jmethodid > - 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid Shenandoah test gets a lock inversion with the introduction of the Metaspace_lock. I'm reopening for more comments. Shenandoah gets: assert(false) failed: Attempting to acquire lock Metaspace_lock/nosafepoint-3 out of order with lock StackWatermark_lock/stackwatermark -- possible deadlock found via github actions. 
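The out-of-order assertion reported above is the kind of failure a rank-based lock ordering check produces. The following is a minimal sketch of such a check, not the HotSpot implementation: `RankedLock` and `HeldLocks` are hypothetical names, the numeric ranks are illustrative, and the rule assumed here is that a newly acquired lock must rank strictly below every lock the thread already holds.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical lock descriptor; the rank values used with it are illustrative only.
struct RankedLock {
  std::string name;
  int rank;
};

// Per-thread record of held locks (one instance per thread in this sketch).
class HeldLocks {
 public:
  // True if acquiring `lock` respects the ordering rule assumed here:
  // the new lock must rank strictly below every lock already held.
  bool can_acquire(const RankedLock& lock) const {
    for (const RankedLock* held : _held) {
      if (held->rank <= lock.rank) return false;
    }
    return true;
  }

  // Records the acquisition; returns false and records nothing on a violation.
  bool acquire(const RankedLock& lock) {
    if (!can_acquire(lock)) return false;
    _held.push_back(&lock);
    return true;
  }

  // Forgets a previously recorded lock; does nothing if it was not held.
  void release(const RankedLock& lock) {
    for (auto it = _held.begin(); it != _held.end(); ++it) {
      if (*it == &lock) {
        _held.erase(it);
        return;
      }
    }
  }

 private:
  std::vector<const RankedLock*> _held;
};
```

Under these assumptions, a thread that holds a low-ranked lock (standing in for `StackWatermark_lock`) and then asks for a higher-ranked one (standing in for `Metaspace_lock`) is rejected, while taking the two in the opposite order is accepted.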
------------- PR: https://git.openjdk.java.net/jdk/pull/5990 From coleenp at openjdk.java.net Mon Oct 18 20:02:02 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 20:02:02 GMT Subject: RFR: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid [v2] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 19:44:15 GMT, Coleen Phillimore wrote: >> Add a lock in VirtualSpaceList::contains to protect the virtual space list for Metaspace::contains. See CR for more details. >> Tested with tier2 linux-aarch64-debug where the failure happened, and tier1-3 linux-x64-debug,linux-x64,windows-x64-debug in progress (almost done but it's being slow today). > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into jmethodid > - 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid Reclosing this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/5990 From coleenp at openjdk.java.net Mon Oct 18 20:02:03 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 18 Oct 2021 20:02:03 GMT Subject: Withdrawn: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 17:04:16 GMT, Coleen Phillimore wrote: > Add a lock in VirtualSpaceList::contains to protect the virtual space list for Metaspace::contains. See CR for more details. > Tested with tier2 linux-aarch64-debug where the failure happened, and tier1-3 linux-x64-debug,linux-x64,windows-x64-debug in progress (almost done but it's being slow today). This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5990 From dholmes at openjdk.java.net Tue Oct 19 00:07:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 19 Oct 2021 00:07:50 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. > > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. This is effectively undoing some of the key changes to the API that you made in JDK-8222811! There you split wait() into two functions: wait() and wait_without_safepoint_check() and then had MonitorLocker::wait() call the right version based on the safepoint_check flag used at construction time. And now you are making wait() implement both safepoint checking and non-safepoint checking code paths, because you want to add an extra check based on the type of the thread? Sorry but I am not seeing an improvement here - quite the opposite. > Maybe this shouldn't be fixed, and we should just document the discrepancy. 
At the moment I'd vote for documentation (if actually needed). Cheers, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From sviswanathan at openjdk.java.net Tue Oct 19 01:11:52 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 01:11:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. >> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. >> >> No API enhancements were required and only a few additional tests were needed. > > Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge branch 'master' into JDK-8271515-vector-api > - Apply patch from https://github.com/openjdk/panama-vector/pull/152 > - Apply patch from https://github.com/openjdk/panama-vector/pull/142 > - Apply patch from https://github.com/openjdk/panama-vector/pull/139 > - Apply patch from https://github.com/openjdk/panama-vector/pull/151 > - Add new files. > - 8271515: Integration of JEP 417: Vector API (Third Incubator) src/hotspot/share/utilities/globalDefinitions.hpp line 36: > 34: > 35: #include COMPILER_HEADER(utilities/globalDefinitions) > 36: #include "utilities/globalDefinitions_vecApi.hpp" This change is not needed. 
src/hotspot/share/utilities/globalDefinitions_vecApi.hpp line 29: > 27: // the intent of this file to provide a header that can be included in .s files. > 28: > 29: #ifndef SHARE_VM_UTILITIES_GLOBALDEFINITIONS_VECAPI_HPP The file src/hotspot/share/utilities/globalDefinitions_vecApi.hpp is not needed. src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractMask.java line 67: > 65: > 66: @Override > 67: public boolean laneIsSet(int i) { Missing ForceInline. src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 278: > 276: @Override > 277: @ForceInline > 278: public Byte128Vector lanewise(Unary op, VectorMask m) { Should this method be final as well? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 290: > 288: @Override > 289: @ForceInline > 290: public Byte128Vector lanewise(Binary op, Vector v, VectorMask m) { Should this method be final as well? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 313: > 311: public final > 312: Byte128Vector > 313: lanewise(Ternary op, Vector v1, Vector v2) { For unary and binary operator above, we use VectorOperators.Unary and VectorOperators.Binary. Should we use VectorOperators.Ternary here as well then? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 321: > 319: public final > 320: Byte128Vector > 321: lanewise(Ternary op, Vector v1, Vector v2, VectorMask m) { Should we use VectorOperators.Ternary here? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 731: > 729: @Override > 730: @ForceInline > 731: public long toLong() { Should this and other mask operation methods be final methods? src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 603: > 601: if (opKind(op, VO_SPECIAL)) { > 602: if (op == ZOMO) { > 603: return blend(broadcast(-1), compare(NE, 0, m)); This doesn't look correct. 
The lanes where mask is false should get the original lane value in this vector. ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From njian at openjdk.java.net Tue Oct 19 01:43:52 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Tue, 19 Oct 2021 01:43:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Wed, 13 Oct 2021 19:33:12 GMT, Vladimir Kozlov wrote: > C2 and x86 changes seems fine. Could any reviewer please help to review AArch64 changes in this patch? @theRealAph @adinn @nick-arm ? The AArch64 changes include: 1. SVE scalable predicate register allocation support 2. SVE backend support for vector masking operations with predicate feature ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From dholmes at openjdk.java.net Tue Oct 19 02:04:50 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 19 Oct 2021 02:04:50 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v4] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Mon, 18 Oct 2021 16:06:29 GMT, Coleen Phillimore wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > tstuefe suggestions. 
Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From dholmes at openjdk.java.net Tue Oct 19 02:27:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 19 Oct 2021 02:27:48 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v6] In-Reply-To: References: Message-ID: On Sat, 16 Oct 2021 11:11:59 GMT, Maurizio Cimadamore wrote: >> This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. >> >> [1] - https://openjdk.java.net/jeps/419 > > Maurizio Cimadamore has updated the pull request incrementally with two additional commits since the last revision: > > - * use `invokeWithArguments` to simplify new test > - Add test for liveness check with high-arity downcalls > (make sure that if an exception occurs in a downcall because of liveness, > ref counts of other resources are left intact). Hotspot changes seem fine. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/5907 From stuefe at openjdk.java.net Tue Oct 19 05:18:52 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 19 Oct 2021 05:18:52 GMT Subject: RFR: 8271124: CTW tests crashed with assert(is_valid_method(o)) failed: should be valid jmethodid [v2] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 19:53:30 GMT, Ioi Lam wrote: > LGTM. > > I looked at the other functions in this class, and it seems like VirtualSpaceList::is_full() has a similar problem. I filed https://bugs.openjdk.java.net/browse/JDK-8275440 to remove that function. > > Other functions like reserved_words(), committed_words() and num_nodes() also seem suspicious. @tstuefe can you take a look? I'll take a look.
------------- PR: https://git.openjdk.java.net/jdk/pull/5990 From mgronlun at openjdk.java.net Tue Oct 19 09:47:06 2021 From: mgronlun at openjdk.java.net (Markus Grönlund) Date: Tue, 19 Oct 2021 09:47:06 GMT Subject: RFR: 8275445: RunThese30M.java failed "assert(ZAddress::is_marked(addr)) failed: Should be marked" Message-ID: Greetings, This fixes the issue seen in testing when accessing an oop as part of unloading. All oop accesses will be done outside of unloading and the result (the codesource) will be cached and reused in the FinalizerEntry. Testing: in progress Thanks Markus PS: One effect of this is that classes that unload before they have allocated anything will not have a codesource attribute. This can be fixed by letting the classes register with the table as part of class loading, instead of during allocation. I will follow up with a separate change for that. ------------- Commit messages: - move oop accesses outside of unload Changes: https://git.openjdk.java.net/jdk/pull/6001/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6001&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275445 Stats: 108 lines in 4 files changed: 63 ins; 44 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6001.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6001/head:pull/6001 PR: https://git.openjdk.java.net/jdk/pull/6001 From zgu at openjdk.java.net Tue Oct 19 12:03:59 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 19 Oct 2021 12:03:59 GMT Subject: Integrated: 8275413: Remove unused InstanceKlass::set_array_klasses() method In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 17:16:03 GMT, Zhengyu Gu wrote: > Please review this trivial patch that removes the unused InstanceKlass::set_array_klasses() method. This pull request has now been integrated.
Changeset: a4491f22 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/a4491f22533370f481de71b7ddc8f6105fa2d414 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8275413: Remove unused InstanceKlass::set_array_klasses() method Reviewed-by: coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/5991 From lkorinth at openjdk.java.net Tue Oct 19 12:23:55 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 12:23:55 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 05:49:24 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new. > > src/hotspot/share/asm/codeBuffer.cpp line 143: > >> 141: NOT_PRODUCT(clear_strings()); >> 142: >> 143: assert(_default_oop_recorder.allocated_on_stack_or_embedded(), "should be embedded object"); > > It might have been better to do this name change separately, to reduce distraction. I created https://bugs.openjdk.java.net/browse/JDK-8275506 and https://github.com/openjdk/jdk/pull/6004. I will merge that one. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From lkorinth at openjdk.java.net Tue Oct 19 12:26:10 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 12:26:10 GMT Subject: RFR: 8275506: Rename allocated_on_stack to allocated_on_stack_or_embedded Message-ID: In allocation.hpp, the name allocated_on_stack can be misleading, better rename the function to allocated_on_stack_or_embedded and it will match the name of the enum as a bonus. 
------------- Commit messages: - 8275506: Rename allocated_on_stack to allocated_on_stack_or_embedded Changes: https://git.openjdk.java.net/jdk/pull/6004/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6004&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275506 Stats: 16 lines in 6 files changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/6004.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6004/head:pull/6004 PR: https://git.openjdk.java.net/jdk/pull/6004 From lkorinth at openjdk.java.net Tue Oct 19 12:36:56 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 12:36:56 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 05:54:55 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new. > > src/hotspot/share/asm/codeBuffer.hpp line 369: > >> 367: // addresses in a sibling section. >> 368: >> 369: class CodeBuffer: public ResourceObj DEBUG_ONLY(COMMA private Scrubber) { > > Deriving from ResourceObj rather than the previous fakery seems good. I think the multiple-inheritance derivation from Scrubber could be removed, instead adding ~CodeBuffer to do the zapping. That was how it looked when I started this fix (and it was rebased out to the Scrubber class by JDK-8264207). I will not change it back in this fix as the multiple inheritance problem is not introduced here. Another reason is that the authors of JDK-8264207 must have thought it looked better the way it is now. Personally I do not know. > src/hotspot/share/memory/allocation.cpp line 154: > >> 152: >> 153: #ifdef ASSERT >> 154: thread_local ResourceObj::RecentAllocations ResourceObj::_recent_allocations; > > Don't use `thread_local`. See the style guide. Will fix. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From lkorinth at openjdk.java.net Tue Oct 19 13:06:56 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 13:06:56 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 06:12:07 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new. > > src/hotspot/share/memory/allocation.hpp line 398: > >> 396: class ResourceObj ALLOCATION_SUPER_CLASS_SPEC { >> 397: public: >> 398: enum allocation_type : uint8_t { STACK_OR_EMBEDDED, RESOURCE_AREA, C_HEAP, ARENA }; > > Consider adding an "empty" allocation-type, that indicates an entry in the RecentAllocations is unused. After thinking a bit, I will not add an "empty" value. It will make it harder to make functions "total" that matches on the enum, and the "empty" value does not really need to be reset in RecentAllocations. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From lkorinth at openjdk.java.net Tue Oct 19 13:06:57 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 13:06:57 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 13:01:10 GMT, Leo Korinth wrote: >> src/hotspot/share/memory/allocation.hpp line 398: >> >>> 396: class ResourceObj ALLOCATION_SUPER_CLASS_SPEC { >>> 397: public: >>> 398: enum allocation_type : uint8_t { STACK_OR_EMBEDDED, RESOURCE_AREA, C_HEAP, ARENA }; >> >> Consider adding an "empty" allocation-type, that indicates an entry in the RecentAllocations is unused. > > After thinking a bit, I will not add an "empty" value. It will make it harder to make functions "total" that matches on the enum, and the "empty" value does not really need to be reset in RecentAllocations. 
STACK_OR_EMBEDDED is kind of a nice neutral element "mempty" in my opinion. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From tobias.hartmann at oracle.com Tue Oct 19 13:17:21 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 19 Oct 2021 15:17:21 +0200 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> Message-ID: <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> Hi Cesar, First of all, very nice writeup! On 04.10.21 23:32, Cesar Soares Lucas wrote: > How much of a problem is control-flow/data-flow merge > points in a world where Inline Types are in place? From my (external) point of > view, I can imagine that this will still be a problem. If so, it looks to me > that this is an issue that once solved would benefit not only the current EA > implementation but also Valhalla. What do you [all] think? Some details on how inline types are currently implemented in C2: - We don't rely on or even use Escape Analysis for inline types. - Inline types are aggressively scalarized during parsing when created, casted to, loaded from an array/field, passed as an argument or returned from a call. - We implemented an optimized calling convention to allow passing/returning inline types in a scalarized form to/from C2 compiled code. - Buffering (= heap allocation) is only needed if an inline type is: - Passed as Object (or another supertype) and mixed with other types (no scalarization possible) - Stored into a non-flattened field/array - Returned from a method when there are not enough registers available to hold all fields As a result, inline types would not directly benefit from an improved EA. That said, support for stack/thread local allocation would be something that we could potentially use in Valhalla to avoid buffering on the heap. It's difficult though because in most cases we use buffering when storing to a non-flattened field that escapes to other threads.
Best regards, Tobias From lkorinth at openjdk.java.net Tue Oct 19 13:38:50 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 13:38:50 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 06:21:09 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new. > > src/hotspot/share/memory/allocation.hpp line 416: > >> 414: static const unsigned BufferSize = 5; >> 415: uintptr_t _begin[BufferSize]; >> 416: uintptr_t _past_end[BufferSize]; > > I think these should be `const void*` rather than `uintptr_t`. Why do you think it is better? Comparing integers might be preferable to comparing pointers to *different* objects, which is undefined if I remember correctly. Of course, casting them to integers probably does not make it that much better defined, but it feels at least a bit better... I will change it if you still prefer pointer comparison. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From ron.pressler at oracle.com Tue Oct 19 13:39:03 2021 From: ron.pressler at oracle.com (Ron Pressler) Date: Tue, 19 Oct 2021 13:39:03 +0000 Subject: RFC - Improving C2 Escape Analysis In-Reply-To: <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> Message-ID: <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> > On 19 Oct 2021, at 14:17, Tobias Hartmann wrote: > > > That said, support for stack/thread local allocation would be something that we could potentially > use in Valhalla to avoid buffering on the heap. It's difficult though because in most cases we use > buffering when storing to a non-flattened field that escapes to other threads.
Just a reminder, stacks can move in memory (virtual threads), so great care must be taken with pointers pointing into the stack. As a general rule, that's a no-no. -- Ron From lkorinth at openjdk.java.net Tue Oct 19 13:46:53 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 13:46:53 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: On Thu, 7 Oct 2021 06:21:52 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new. > > src/hotspot/share/memory/allocation.hpp line 414: > >> 412: // allocation is done in a recursive step. In that case an assert will trigger. >> 413: class RecentAllocations { >> 414: static const unsigned BufferSize = 5; > > "5" seems like an odd choice. What would you suggest? I could make it "2", but then a smaller code change (or compiler upgrade) would force us to change the size. Any size is quite arbitrary, I will update this number to your suggestion if you give me a suggestion. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From lkorinth at openjdk.java.net Tue Oct 19 13:59:50 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 19 Oct 2021 13:59:50 GMT Subject: RFR: 8269537: memset() is called after operator new [v3] In-Reply-To: References: Message-ID: <90ieb6LOCOVaQNJWtR4A--008sktDRtI3czSMeaK1dE=.563992aa-16ab-42de-a2e0-eede50eb4a20@github.com> On Thu, 7 Oct 2021 06:18:00 GMT, Kim Barrett wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a thread local buffer so that the compiler might reorder operator new.
> > src/hotspot/share/memory/memRegion.hpp line 110: > >> 108: // A ResourceObj version of MemRegionClosure >> 109: >> 110: class MemRegionClosureRO: public ResourceObj { > > I think this class could be deleted entirely, with the only derived class (`DirtyCardToOopClosure`) instead deriving from ResourceObj. I agree. ------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From weijun at openjdk.java.net Tue Oct 19 14:01:11 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Tue, 19 Oct 2021 14:01:11 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 Message-ID: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. ------------- Commit messages: - 8275512: Upgrade required version of jtreg to 6.1 Changes: https://git.openjdk.java.net/jdk/pull/6012/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6012&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275512 Stats: 7 lines in 6 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/6012.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6012/head:pull/6012 PR: https://git.openjdk.java.net/jdk/pull/6012 From ihse at openjdk.java.net Tue Oct 19 14:09:49 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Tue, 19 Oct 2021 14:09:49 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 In-Reply-To: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: <1Lt3IpTy-rZBwrcdQpbCaw1mCBR42UyLd0JIn_55qJ4=.cf17c8d5-ee9d-4ad1-8f21-ff7a4e5eeacd@github.com> On Tue, 19 Oct 2021 13:51:45 GMT, Weijun Wang wrote: > As a follow up of JEP 411, we will soon disallow security manager by 
default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. LGTM ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6012 From zgu at openjdk.java.net Tue Oct 19 15:17:05 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 19 Oct 2021 15:17:05 GMT Subject: RFR: 8275439: Remove PrintVtableStats Message-ID: PrintVtableStats has not been very useful, and the reported information may not be accurate either. Please see the CR and comments in JDK-8275413 for details. - PrintVtableStats is a non-product flag, so it does not require a CSR to remove it. - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use. - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well. ------------- Commit messages: - v0 Changes: https://git.openjdk.java.net/jdk/pull/6017/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6017&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275439 Stats: 125 lines in 9 files changed: 0 ins; 125 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6017.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6017/head:pull/6017 PR: https://git.openjdk.java.net/jdk/pull/6017 From coleenp at openjdk.java.net Tue Oct 19 15:39:50 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 19 Oct 2021 15:39:50 GMT Subject: RFR: 8275439: Remove PrintVtableStats In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 15:09:01 GMT, Zhengyu Gu wrote: > PrintVtableStats has not been very useful, and the reported information may not be accurate either. Please see the CR and comments in JDK-8275413 for details. > > - PrintVtableStats is a non-product flag, so it does not require a CSR to remove it. > - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use.
> - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well. Nice! thanks! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6017 From iignatyev at openjdk.java.net Tue Oct 19 15:44:51 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 19 Oct 2021 15:44:51 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 In-Reply-To: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 13:51:45 GMT, Weijun Wang wrote: > As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. LGTM ------------- Marked as reviewed by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6012 From rkennke at openjdk.java.net Tue Oct 19 16:26:05 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 19 Oct 2021 16:26:05 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access Message-ID: Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj(); or similar constructs. Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). I propose to extract and refactor forward pointer access in a way to better support the above pattern.
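The double header load in the is_forwarded()/forwardee() pattern can be illustrated with a toy model. This is only a sketch, written in Java rather than HotSpot C++, and everything in it is an assumption made for illustration: the `ToyObject` class, the `loads` counter, and the encoding of the forwarded state in the two low bits of a `long` header (tag value `0b11`). It is not the code in the pull request.

```java
public class ForwardingSketch {
    // Assumed encoding for this toy model: the two low header bits are a tag,
    // and the tag value 0b11 means "the remaining bits hold a forwarding address".
    static final long TAG_MASK = 0b11L;
    static final long FORWARDED_TAG = 0b11L;

    /** Hypothetical stand-in for an object header; counts how often it is read. */
    static final class ToyObject {
        private final long header;
        int loads;

        ToyObject(long header) { this.header = header; }

        long loadHeader() { loads++; return header; }

        // Each accessor performs its own header load, as in the pattern discussed.
        boolean isForwarded() { return (loadHeader() & TAG_MASK) == FORWARDED_TAG; }
        long forwardee() { return loadHeader() & ~TAG_MASK; }
    }

    /** Query-then-decode: two header loads when the object is forwarded. */
    static long resolveTwoLoads(ToyObject obj, long fallback) {
        return obj.isForwarded() ? obj.forwardee() : fallback;
    }

    /** Load once, then test and decode the same value. */
    static long resolveOneLoad(ToyObject obj, long fallback) {
        long h = obj.loadHeader();
        return (h & TAG_MASK) == FORWARDED_TAG ? (h & ~TAG_MASK) : fallback;
    }

    public static void main(String[] args) {
        ToyObject a = new ToyObject(0x1000L | FORWARDED_TAG);
        ToyObject b = new ToyObject(0x1000L | FORWARDED_TAG);
        System.out.println(Long.toHexString(resolveTwoLoads(a, -1L)) + " loads=" + a.loads);
        System.out.println(Long.toHexString(resolveOneLoad(b, -1L)) + " loads=" + b.loads);
    }
}
```

Both variants return the same address; the second simply makes the single header read explicit, which is the shape a combined accessor would encourage.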
In fact I originally wrote this with performance enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. Testing: - [x] tier - [ ] tier2 - [x] hotspot_gc ------------- Commit messages: - Update some copyright headers - Add missing includes - Merge branch 'master' into optimize-fwdptr - Add missing includes - Rename mwd -> fwd - Add missing file - Move remaining forwarding methods to OopForwarding - Renames, cleanups - Initial implementation of MarkWordDecoder Changes: https://git.openjdk.java.net/jdk/pull/5955/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275527 Stats: 351 lines in 27 files changed: 170 ins; 79 del; 102 mod Patch: https://git.openjdk.java.net/jdk/pull/5955.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5955/head:pull/5955 PR: https://git.openjdk.java.net/jdk/pull/5955 From tschatzl at openjdk.java.net Tue Oct 19 16:26:05 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 19 Oct 2021 16:26:05 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files).
> > I propose to extract and refactor forward pointer access in a way to better support the above pattern. In fact I originally wrote this with performance enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [ ] tier2 > - [x] hotspot_gc I like it. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5955 From hseigel at openjdk.java.net Tue Oct 19 17:19:07 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 19 Oct 2021 17:19:07 GMT Subject: RFR: 8272614: Unused parameters in MethodHandleNatives linking methods Message-ID: Please review this small fix for JDK-8272614 to remove the unused indexInCP argument to linkCallSite() and linkDynamicConstant(). The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-6 on Linux x64.
Thanks, Harold ------------- Commit messages: - 8272614: Unused parameters in MethodHandleNatives linking methods Changes: https://git.openjdk.java.net/jdk/pull/6021/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6021&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8272614 Stats: 8 lines in 3 files changed: 0 ins; 3 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6021.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6021/head:pull/6021 PR: https://git.openjdk.java.net/jdk/pull/6021 From weijun at openjdk.java.net Tue Oct 19 17:24:17 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Tue, 19 Oct 2021 17:24:17 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: > As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. 
Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: upgrade the version in GHA config only in patch2: unchanged: ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6012/files - new: https://git.openjdk.java.net/jdk/pull/6012/files/b86e799c..f5ffc49b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6012&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6012&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6012.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6012/head:pull/6012 PR: https://git.openjdk.java.net/jdk/pull/6012 From zgu at openjdk.java.net Tue Oct 19 17:27:51 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 19 Oct 2021 17:27:51 GMT Subject: RFR: 8275439: Remove PrintVtableStats In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 15:36:53 GMT, Coleen Phillimore wrote: >> PrintVtableStats has not been very useful, and the reported information may not be accurate either. Please see the CR and comments in JDK-8275413 for details. >> >> - PrintVtableStats is a non-product flag, so it does not require a CSR to remove it. >> - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use. >> - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well. > > Nice! thanks!
Thanks, @coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/6017 From joehw at openjdk.java.net Tue Oct 19 17:30:56 2021 From: joehw at openjdk.java.net (Joe Wang) Date: Tue, 19 Oct 2021 17:30:56 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 17:24:17 GMT, Weijun Wang wrote: >> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. > > Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: > > upgrade the version in GHA config > > only in patch2: > unchanged: Marked as reviewed by joehw (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From lancea at openjdk.java.net Tue Oct 19 20:19:33 2021 From: lancea at openjdk.java.net (Lance Andersen) Date: Tue, 19 Oct 2021 20:19:33 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 17:24:17 GMT, Weijun Wang wrote: >> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. > > Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: > > upgrade the version in GHA config > > only in patch2: > unchanged: Marked as reviewed by lancea (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From psandoz at openjdk.java.net Tue Oct 19 20:21:47 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 19 Oct 2021 20:21:47 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> On Mon, 18 Oct 2021 22:56:37 GMT, Sandhya Viswanathan wrote: >> Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge branch 'master' into JDK-8271515-vector-api >> - Apply patch from https://github.com/openjdk/panama-vector/pull/152 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/142 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/139 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/151 >> - Add new files. >> - 8271515: Integration of JEP 417: Vector API (Third Incubator) > > src/hotspot/share/utilities/globalDefinitions_vecApi.hpp line 29: > >> 27: // the intent of this file to provide a header that can be included in .s files. >> 28: >> 29: #ifndef SHARE_VM_UTILITIES_GLOBALDEFINITIONS_VECAPI_HPP > > The file src/hotspot/share/utilities/globalDefinitions_vecApi.hpp is not needed. I notice src/jdk.incubator.vector/windows/native/libsvml/globals_vectorApiSupport_windows.S.inc contains a reference in comments to that file; I presume I can remove that comment too? > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 278: > >> 276: @Override >> 277: @ForceInline >> 278: public Byte128Vector lanewise(Unary op, VectorMask m) { > > Should this method be final as well? It's actually redundant because the class is final.
Better to drop final from all declarations, at the risk of creating a larger diff. > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 321: > >> 319: public final >> 320: Byte128Vector >> 321: lanewise(Ternary op, Vector v1, Vector v2, VectorMask m) { > > Should we use VectorOperators.Ternary here? We static import `import static jdk.incubator.vector.VectorOperators.*;`, and the qualification of enclosing type `VectorOperators` seems inconsistently used. I think we can drop it at all declarations. > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 603: > >> 601: if (opKind(op, VO_SPECIAL)) { >> 602: if (op == ZOMO) { >> 603: return blend(broadcast(-1), compare(NE, 0, m)); > > This doesn't look correct. The lanes where mask is false should get the original lane value in this vector. That should work, since `compare(NE, 0, m) === compare(NE, 0).and(m)`, so when an `m` lane is unset the lane element of `this` vector will be selected. 
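The masked behaviour described above can be modelled with plain arrays. The following is a scalar sketch, not the Vector API implementation: `compareNonZero`, `blend` and `zomo` are hypothetical helper names chosen to mirror `compare(NE, 0, m)`, `blend` and the masked `ZOMO` operation, and the input values are the ones used in the jshell session in this message.

```java
import java.util.Arrays;

public class MaskedZomoSketch {
    // Scalar model of compare(NE, 0, m): a result lane is set only when the
    // value is non-zero AND the corresponding mask lane is set.
    static boolean[] compareNonZero(int[] v, boolean[] m) {
        boolean[] r = new boolean[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = m[i] && v[i] != 0;
        }
        return r;
    }

    // Scalar model of self.blend(other, mask): take `other` where the mask
    // lane is set, otherwise keep the lane from `self`.
    static int[] blend(int[] self, int[] other, boolean[] mask) {
        int[] r = new int[self.length];
        for (int i = 0; i < self.length; i++) {
            r[i] = mask[i] ? other[i] : self[i];
        }
        return r;
    }

    // Masked ZOMO ("zero or minus one") modelled as blend(broadcast(-1), compare(NE, 0, m)).
    static int[] zomo(int[] v, boolean[] m) {
        int[] minusOne = new int[v.length];
        Arrays.fill(minusOne, -1);
        return blend(v, minusOne, compareNonZero(v, m));
    }

    public static void main(String[] args) {
        int[] v = {0, 1, 0, -2, 0, 3, 0, -4};
        boolean[] m = {false, false, false, false, true, true, true, true};
        // prints [0, 1, 0, -2, 0, -1, 0, -1]
        System.out.println(Arrays.toString(zomo(v, m)));
    }
}
```

Lanes 0 to 3 are masked off and keep their original values; in lanes 4 to 7 only the non-zero values are replaced by -1, because a zero lane fails the compare and therefore also selects the original (zero) value.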
Running jshell against a build of the PR: $ ~/Projects/jdk/jdk/build/macosx-x86_64-server-release/images/jdk/bin/jshell --add-modules jdk.incubator.vector | Welcome to JShell -- Version 18-internal | For an introduction type: /help intro jshell> import jdk.incubator.vector.* jshell> var s = IntVector.SPECIES_256; s ==> Species[int, 8, S_256_BIT] jshell> var v = IntVector.fromArray(s, new int[]{0, 1, 0, -2, 0, 3, 0, -4}, 0); v ==> [0, 1, 0, -2, 0, 3, 0, -4] jshell> var z = v.lanewise(VectorOperators.ZOMO); z ==> [0, -1, 0, -1, 0, -1, 0, -1] jshell> z = v.lanewise(VectorOperators.ZOMO, s.loadMask(new boolean[]{false, false, false, false, true, true, true, true}, 0)); z ==> [0, 1, 0, -2, 0, -1, 0, -1] jshell> ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From sviswanathan at openjdk.java.net Tue Oct 19 20:21:52 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 20:21:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> Message-ID: On Tue, 19 Oct 2021 18:54:01 GMT, Paul Sandoz wrote: >> src/hotspot/share/utilities/globalDefinitions_vecApi.hpp line 29: >> >>> 27: // the intent of this file to provide a header that can be included in .s files. >>> 28: >>> 29: #ifndef SHARE_VM_UTILITIES_GLOBALDEFINITIONS_VECAPI_HPP >> >> The file src/hotspot/share/utilities/globalDefinitions_vecApi.hpp is not needed. > > I notice src/jdk.incubator.vector/windows/native/libsvml/globals_vectorApiSupport_windows.S.inc contains a reference in comments to that file; I presume I can remove that comment too? Yes, that comment can also be removed. It is a leftover from when svml was built as part of libjvm.so.
>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 278: >> >>> 276: @Override >>> 277: @ForceInline >>> 278: public Byte128Vector lanewise(Unary op, VectorMask m) { >> >> Should this method be final as well? > > It's actually redundant because the class is final. Better to drop final from all declarations, at the risk of creating a larger diff. Got it. I am ok with leaving things as is if it makes it easier. ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From psandoz at openjdk.java.net Tue Oct 19 20:21:55 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 19 Oct 2021 20:21:55 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> Message-ID: On Tue, 19 Oct 2021 18:57:59 GMT, Sandhya Viswanathan wrote: >> It's actually redundant because the class is final. Better to drop final from all declarations, at the risk of creating a larger diff. > > Got it. I am ok with leaving things as is if it makes it easier. For now to reduce changes in this PR i would prefer to do a follow cleanup for code consistency for that and for `final` methods. ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From duke at openjdk.java.net Tue Oct 19 20:34:55 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 19 Oct 2021 20:34:55 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. 
Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: refactoring to remove code duplication by using a common routine for UMulHiLNode and MulHiLNode ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5933/files - new: https://git.openjdk.java.net/jdk/pull/5933/files/cb30b268..a10a9fbe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5933&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5933&range=00-01 Stats: 25 lines in 2 files changed: 12 ins; 12 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5933.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5933/head:pull/5933 PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Tue Oct 19 20:35:03 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 19 Oct 2021 20:35:03 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> Message-ID: <5AqWGf-saY2vk4Z14pSF25waVdcU0j06jW3aSUH76dg=.bba122ee-cf23-44be-bdbc-8bf54b4a6b71@github.com> On Fri, 15 Oct 2021 21:04:12 GMT, Vamsi Parasa wrote: > > > How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. > > > > > > Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). Used that test to verify the correctness of the results. > > Good. It seems I have old version of the test. Did you run it with -Xcomp? How you verified that intrinsic is used? I have verified that the intrinsic is being used by looking at the x86 assembly code generated by using perfasm profiler. 
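For readers following the correctness discussion, here is a minimal sketch of the kind of check such a test can perform. It is not the code from `MultiplicationTests.java` and not the intrinsic itself; the class and method names (`UnsignedMultiplyHighSketch`, `unsignedHigh`, `referenceHigh`) are made up for illustration. It assumes only `Math.multiplyHigh` (Java 9+) and `BigInteger`, and derives the unsigned high word from the signed one with a sign correction, then compares against an arbitrary-precision reference.

```java
import java.math.BigInteger;

public class UnsignedMultiplyHighSketch {
    // High 64 bits of the unsigned 128-bit product a * b.
    // Starts from the signed high word and adds a correction for each
    // operand whose top bit is set (i.e. that is "negative" when signed).
    static long unsignedHigh(long a, long b) {
        return Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
    }

    // Slow reference: treat both operands as unsigned 64-bit values,
    // multiply exactly, and keep bits 64..127 of the product.
    static long referenceHigh(long a, long b) {
        BigInteger ua = new BigInteger(Long.toUnsignedString(a));
        BigInteger ub = new BigInteger(Long.toUnsignedString(b));
        return ua.multiply(ub).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 2L, -1L, Long.MIN_VALUE, Long.MAX_VALUE, 0x123456789ABCDEFL};
        for (long a : samples) {
            for (long b : samples) {
                if (unsignedHigh(a, b) != referenceHigh(a, b)) {
                    throw new AssertionError("mismatch for " + a + " * " + b);
                }
            }
        }
        System.out.println("all samples agree");
    }
}
```

Running the same comparison with `-Xcomp`, as suggested in the thread, would exercise the compiled path rather than the interpreter; whether the intrinsic is actually selected still has to be confirmed from the generated assembly.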
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Tue Oct 19 20:35:17 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 19 Oct 2021 20:35:17 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Fri, 15 Oct 2021 19:31:24 GMT, Vamsi Parasa wrote: >> src/hotspot/share/opto/mulnode.cpp line 468: >> >>> 466: } >>> 467: >>> 468: //============================================================================= >> >> MulHiLNode::Value() and UMulHiLNode::Value() seem to be identical. Perhaps some refactoring would be in order, maybe make a common shared routine. > > Sure, will do the refactoring to use a shared routine. Pushed the refactored code to use a common routine for MulHiLNode::Value() and UMulHiLNode::Value(). Please review... ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Tue Oct 19 20:35:26 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 19 Oct 2021 20:35:26 GMT Subject: Withdrawn: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: <0g7B5EoV8DdxjVj7jQpJJD7aAjvqbr_M0rl2GyA9ing=.0a6e9bdb-7eab-4fbf-911a-6db783b6882b@github.com> On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change shows a 1.87X improvement on a micro benchmark. This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From sviswanathan at openjdk.java.net Tue Oct 19 20:45:35 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 20:45:35 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <5Vyao27ssAe8dot0Z6Epc3KlRi57Aaf8j3UdroTSbFU=.1aae99e3-de3c-49b2-a14b-88fa24e9779b@github.com> Message-ID: On Tue, 19 Oct 2021 19:51:54 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java line 603: >> >>> 601: if (opKind(op, VO_SPECIAL)) { >>> 602: if (op == ZOMO) { >>> 603: return blend(broadcast(-1), compare(NE, 0, m)); >> >> This doesn't look correct. The lanes where mask is false should get the original lane value in this vector. > > That should work, since `compare(NE, 0, m) === compare(NE, 0).and(m)`, so when an `m` lane is unset the lane element of `this` vector will be selected. > > Running jshell against a build of PR: > > $ ~/Projects/jdk/jdk/build/macosx-x86_64-server-release/images/jdk/bin/jshell --add-modules jdk.incubator.vector > | Welcome to JShell -- Version 18-internal > | For an introduction type: /help intro > > jshell> import jdk.incubator.vector.* > > jshell> var s = IntVector.SPECIES_256; > s ==> Species[int, 8, S_256_BIT] > > jshell> var v = IntVector.fromArray(s, new int[]{0, 1, 0, -2, 0, 3, 0, -4}, 0); > v ==> [0, 1, 0, -2, 0, 3, 0, -4] > > jshell> var z = v.lanewise(VectorOperators.ZOMO); > z ==> [0, -1, 0, -1, 0, -1, 0, -1] > > jshell> z = v.lanewise(VectorOperators.ZOMO, s.loadMask(new boolean[]{false, false, false, false, true, true, true, true}, 0)); > z ==> [0, 1, 0, -2, 0, -1, 0, -1] > > jshell> Yes, you are correct. There is no problem here. 
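The equivalence discussed above, `compare(NE, 0, m) === compare(NE, 0).and(m)`, can be checked without the incubator module by modelling the lanes with plain arrays. The following is only a scalar sketch of the semantics, not the Vector API implementation; the class `MaskedZomoSketch` and its helper names are invented for this illustration. It uses the same input vector and mask as the jshell session.

```java
import java.util.Arrays;

public class MaskedZomoSketch {
    // Scalar stand-in for compare(NE, 0, m): a lane is set only when the
    // value is non-zero AND the corresponding lane of the mask m is set.
    static boolean[] compareNotZero(int[] v, boolean[] m) {
        boolean[] result = new boolean[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = (v[i] != 0) && m[i];
        }
        return result;
    }

    // Scalar stand-in for this.blend(broadcast(-1), mask): lanes where the
    // mask is set take -1, all other lanes keep the original value.
    static int[] blendMinusOne(int[] v, boolean[] mask) {
        int[] result = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = mask[i] ? -1 : v[i];
        }
        return result;
    }

    // Masked ZOMO ("zero or minus one") composed from the two steps above.
    static int[] maskedZomo(int[] v, boolean[] m) {
        return blendMinusOne(v, compareNotZero(v, m));
    }

    public static void main(String[] args) {
        int[] v = {0, 1, 0, -2, 0, 3, 0, -4};
        boolean[] m = {false, false, false, false, true, true, true, true};
        // Prints [0, 1, 0, -2, 0, -1, 0, -1], matching the jshell output.
        System.out.println(Arrays.toString(maskedZomo(v, m)));
    }
}
```

Lanes 0 to 3 keep their original values because the mask is unset there, which is the behaviour the review comment asked about.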
------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From jjg at openjdk.java.net Tue Oct 19 20:45:40 2021 From: jjg at openjdk.java.net (Jonathan Gibbons) Date: Tue, 19 Oct 2021 20:45:40 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 17:24:17 GMT, Weijun Wang wrote: >> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. > > Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: > > upgrade the version in GHA config > > only in patch2: > unchanged: Marked as reviewed by jjg (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From mchung at openjdk.java.net Tue Oct 19 20:54:09 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Tue, 19 Oct 2021 20:54:09 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 17:24:17 GMT, Weijun Wang wrote: >> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. > > Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: > > upgrade the version in GHA config > > only in patch2: > unchanged: Marked as reviewed by mchung (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/6012

From stefank at openjdk.java.net Tue Oct 19 21:02:05 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Tue, 19 Oct 2021 21:02:05 GMT
Subject: RFR: 8275527: Separate/refactor forward pointer access
In-Reply-To: References: Message-ID:

On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote:

> Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.:
> fwd = obj->forwarded() ? obj->forwardee() : promote_obj();
>
> or similar constructs.
>
> Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files).
>
> I propose to extract and refactor forward pointer access in a way to better support above pattern. In fact I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on cleanliness and separation-of-concerns basis.
>
> Testing:
> - [x] tier
> - [x] tier2
> - [x] hotspot_gc

I like how the mark/is_marked/decode_pointer code gets cleaned up. And having an OopForwarding class probably makes sense for parts of the code.

I'm less excited about the way the explicitly stack allocated OopForwarding objects make the code less readable. Take these two examples:

      assert(old->is_objArray(), "invariant");
    - assert(old->is_forwarded(), "invariant");
    + assert(OopForwarding(old).is_forwarded(), "invariant");

    oop new_obj = obj->is_forwarded() ? obj->forwardee() : _young_gen->copy_to_survivor_space(obj);

    OopForwarding fwd(obj);
    oop new_obj = fwd.is_forwarded() ? fwd.forwardee() : _young_gen->copy_to_survivor_space(obj);

I think both snippets were nicer to read in the earlier code. And there are a lot of those kinds of changes in this patch.

Maybe there's a hybrid approach where we could still have oopDesc::is_forwarded, oopDesc::forwardee, etc. but also have OopForwarding objects for those places that really need them? And then minimize the number of places we use `OopForwarding fwd(obj)` and `OopForwarding(obj)->forwardee()` in the code?

------------- Changes requested by stefank (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5955

From weijun at openjdk.java.net Tue Oct 19 21:09:22 2021
From: weijun at openjdk.java.net (Weijun Wang)
Date: Tue, 19 Oct 2021 21:09:22 GMT
Subject: Integrated: 8275512: Upgrade required version of jtreg to 6.1
In-Reply-To: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com>
References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com>
Message-ID:

On Tue, 19 Oct 2021 13:51:45 GMT, Weijun Wang wrote:

> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18.

This pull request has now been integrated.
Changeset: c24fb852 Author: Weijun Wang URL: https://git.openjdk.java.net/jdk/commit/c24fb852f20bf0fc2817dfed52ff1609a5bced59 Stats: 8 lines in 7 files changed: 0 ins; 0 del; 8 mod 8275512: Upgrade required version of jtreg to 6.1 Reviewed-by: ihse, iignatyev, joehw, lancea, jjg, mchung ------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From dholmes at openjdk.java.net Tue Oct 19 21:52:07 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 19 Oct 2021 21:52:07 GMT Subject: RFR: 8272614: Unused parameters in MethodHandleNatives linking methods In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 17:12:16 GMT, Harold Seigel wrote: > Please review this small fix for JDK-8272614 to remove the unused indexInCP argument to linkCallSite() and linkDynamicConstant(). The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-6 on Linux x64. > > Thanks, Harold LGTM! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6021 From dholmes at openjdk.java.net Tue Oct 19 21:57:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 19 Oct 2021 21:57:06 GMT Subject: RFR: 8275439: Remove PrintVtableStats In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 15:09:01 GMT, Zhengyu Gu wrote: > PrintVtableStats has not been very useful and reported information may not be accurate neither. Please see the CR and comments in JDK-8275413 for details. > > - PrintVtableStats is none product flag, so does not require a CSR to remove it. > - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use. > - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well. Looks good. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6017 From sviswanathan at openjdk.java.net Tue Oct 19 22:40:44 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 22:40:44 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v4] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Tue, 19 Oct 2021 22:37:10 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. >> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. >> >> No API enhancements were required and only a few additional tests were needed. > > Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: > > Resolve review comments. Marked as reviewed by sviswanathan (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From psandoz at openjdk.java.net Tue Oct 19 22:40:41 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 19 Oct 2021 22:40:41 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v4] In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. 
> > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: Resolve review comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5873/files - new: https://git.openjdk.java.net/jdk/pull/5873/files/32d58f17..101cd23a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=02-03 Stats: 50 lines in 4 files changed: 1 ins; 49 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From sviswanathan at openjdk.java.net Tue Oct 19 22:40:49 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 22:40:49 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: <3_mnZHJQE6OzJrloA48Fj6hlOu1plIUnHlWRPFpkc70=.e2d80d75-0bb5-4e48-a8da-80ddee06ffd5@github.com> On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. 
>> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. >> >> No API enhancements were required and only a few additional tests were needed. > > Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge branch 'master' into JDK-8271515-vector-api > - Apply patch from https://github.com/openjdk/panama-vector/pull/152 > - Apply patch from https://github.com/openjdk/panama-vector/pull/142 > - Apply patch from https://github.com/openjdk/panama-vector/pull/139 > - Apply patch from https://github.com/openjdk/panama-vector/pull/151 > - Add new files. > - 8271515: Integration of JEP 417: Vector API (Third Incubator) The Java changes look good to me. src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java line 574: > 572: * @throws ClassCastException if the species is wrong > 573: */ > 574: abstract VectorMask check(Class> maskClass, Vector vector); This is a package-private method so the java doc style comments are not needed here. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From psandoz at openjdk.java.net Tue Oct 19 22:40:52 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 19 Oct 2021 22:40:52 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: <3_mnZHJQE6OzJrloA48Fj6hlOu1plIUnHlWRPFpkc70=.e2d80d75-0bb5-4e48-a8da-80ddee06ffd5@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <3_mnZHJQE6OzJrloA48Fj6hlOu1plIUnHlWRPFpkc70=.e2d80d75-0bb5-4e48-a8da-80ddee06ffd5@github.com> Message-ID: <9jOILX8e6SyWk4LEd68IYSh_S9KRCaUwrz8GUIqO5VY=.18090bd3-c5de-4995-a8bd-b97fd2ed8fea@github.com> On Tue, 19 Oct 2021 21:14:08 GMT, Sandhya Viswanathan wrote: >> Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge branch 'master' into JDK-8271515-vector-api >> - Apply patch from https://github.com/openjdk/panama-vector/pull/152 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/142 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/139 >> - Apply patch from https://github.com/openjdk/panama-vector/pull/151 >> - Add new files. >> - 8271515: Integration of JEP 417: Vector API (Third Incubator) > > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java line 574: > >> 572: * @throws ClassCastException if the species is wrong >> 573: */ >> 574: abstract VectorMask check(Class> maskClass, Vector vector); > > This is a package-private method so the java doc style comments are not needed here. I think that is fine, documentation for us :-) I converted to package private since it only really makes sense for internal use right now. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From sviswanathan at openjdk.java.net Tue Oct 19 22:40:53 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 19 Oct 2021 22:40:53 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3] In-Reply-To: <9jOILX8e6SyWk4LEd68IYSh_S9KRCaUwrz8GUIqO5VY=.18090bd3-c5de-4995-a8bd-b97fd2ed8fea@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> <3_mnZHJQE6OzJrloA48Fj6hlOu1plIUnHlWRPFpkc70=.e2d80d75-0bb5-4e48-a8da-80ddee06ffd5@github.com> <9jOILX8e6SyWk4LEd68IYSh_S9KRCaUwrz8GUIqO5VY=.18090bd3-c5de-4995-a8bd-b97fd2ed8fea@github.com> Message-ID: On Tue, 19 Oct 2021 22:34:13 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java line 574: >> >>> 572: * @throws ClassCastException if the species is wrong >>> 573: */ >>> 574: abstract VectorMask check(Class> maskClass, Vector vector); >> >> This is a package-private method so the java doc style comments are not needed here. > > I think that is fine, documentation for us :-) I converted to package private since it only really makes sense for internal use right now. Sounds good. ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From psandoz at openjdk.java.net Wed Oct 20 00:22:30 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 20 Oct 2021 00:22:30 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v5] In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. 
> > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge branch 'master' into JDK-8271515-vector-api - Resolve review comments. - Merge branch 'master' into JDK-8271515-vector-api - Apply patch from https://github.com/openjdk/panama-vector/pull/152 - Apply patch from https://github.com/openjdk/panama-vector/pull/142 - Apply patch from https://github.com/openjdk/panama-vector/pull/139 - Apply patch from https://github.com/openjdk/panama-vector/pull/151 - Add new files. - 8271515: Integration of JEP 417: Vector API (Third Incubator) ------------- Changes: https://git.openjdk.java.net/jdk/pull/5873/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=04 Stats: 21956 lines in 105 files changed: 16182 ins; 2080 del; 3694 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From zjx001202 at gmail.com Wed Oct 20 03:08:40 2021 From: zjx001202 at gmail.com (Glavo) Date: Wed, 20 Oct 2021 11:08:40 +0800 Subject: How to ensure that a call to a virtual method in a single method is looked up in the virtual function table only once? Message-ID: (I am a novice in HotSpot, JIT compiler and assembly, if there is some misunderstanding, please forgive me.) I designed a very simple micro benchmark to test the performance of virtual function calls (Test.java: https://paste.ubuntu.com/p/VKc57mnWgp/). 
The method being tested is very simple:

    private static void fun0(Iterator it) {
        while (it.hasNext()) it.next();
    }

I know C2 will try to specialize the implementation for a few parameter types so that it can devirtualize the calls completely, so I call it with many different iterator types to prevent that devirtualization.

I installed hsdis for my JDK, ran it with `java -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,Test::fun0 Test.java`, and got this ASM code:

https://paste.ubuntu.com/p/k2mg2svdcx/

To my great surprise, it seems that every time a virtual method is called, it needs to be looked up in the virtual function table.

In my understanding, this is unnecessary: the parameter `it` remains unchanged, so when calling a virtual method we should only need to look it up in the virtual function table on the first call and cache the function pointer in a register or on the stack. Each subsequent call can then be made through the function pointer without a virtual call.

Unfortunately, it seems that C2 did not make this optimization for me, and I think this will cause a noticeable performance degradation in my code.

I tried to use MethodHandle instead of virtual calls in methods, and designed microbenchmarks using JMH. The result disappointed me. It is always slower than making a virtual call directly.

I have some questions about this:

Will the resulting large number of virtual calls significantly degrade function performance?

Can I use existing functions (like MethodHandle) in my code to avoid this overhead?

Why doesn't C2 do this optimization? Because it's hard to implement? Because there is a lot of analysis overhead? Or something else?
From ngasson at openjdk.java.net Wed Oct 20 03:16:12 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Wed, 20 Oct 2021 03:16:12 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v5] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Wed, 20 Oct 2021 00:22:30 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. >> >> On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. >> >> Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. >> >> No API enhancements were required and only a few additional tests were needed. > > Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - Merge branch 'master' into JDK-8271515-vector-api > - Resolve review comments. > - Merge branch 'master' into JDK-8271515-vector-api > - Apply patch from https://github.com/openjdk/panama-vector/pull/152 > - Apply patch from https://github.com/openjdk/panama-vector/pull/142 > - Apply patch from https://github.com/openjdk/panama-vector/pull/139 > - Apply patch from https://github.com/openjdk/panama-vector/pull/151 > - Add new files. > - 8271515: Integration of JEP 417: Vector API (Third Incubator) AArch64 changes look ok apart from some minor comments. 
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2057:

> 2055:
> 2056: const int num_of_regs = PRegisterImpl::number_of_saved_registers;
> 2057: unsigned char regs[num_of_regs];

Seems clearer to use `PRegisterImpl::number_of_saved_registers` directly rather than introducing `num_of_regs`.

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2078:

> 2076: }
> 2077:
> 2078: // Return the number of dwords poped

popped

src/hotspot/cpu/aarch64/register_aarch64.hpp line 248:

> 246: // AArch64 has 8 governing predicate registers, but p7 is used as an
> 247: // all-1s register so the predicates to save are from p0 to p6 if we
> 248: // don't have non-governing predicate registers support.

This comment is a bit difficult to read. How about:

    // AArch64 has 8 governing predicate registers, but p7 is used as an
    // all-1s register so the predicates to save are from p0 to p6. We
    // don't support non-governing predicate registers.

src/hotspot/cpu/aarch64/vmreg_aarch64.hpp line 42:

> 40:
> 41: inline Register as_Register() {
> 42: assert(is_Register(), "must be");

I think it's better to leave this file as it was if you're only making whitespace changes here.

------------- Marked as reviewed by ngasson (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5873

From rednaxelafx at gmail.com Wed Oct 20 07:09:39 2021
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Wed, 20 Oct 2021 00:09:39 -0700
Subject: How to ensure that a call to a virtual method in a single method is looked up in the virtual function table only once?
In-Reply-To: References: Message-ID:

Hi Glavo,

Happened to bump into this thread.
Some thoughts:

You're probably looking for a resulting code shape that looks like this:

    private static void fun0(Iterator it) {
        InstanceKlass kls = it._klass;
        if (kls != SomeExpectedIteratorImpl) uncommon_trap();
        // CheckCastPP it to SomeExpectedIteratorImpl in the rest of the method
        // CHA devirtualize it.hasNext() and it.next() to the concrete impl methods on SomeExpectedIteratorImpl
        while (it.hasNext()) {
            it.next();
        }
    }

C2's infrastructure is actually well capable of doing this. Just curious: what types are fed into this example method from the caller?

BTW, those call sites in the C2-compiled code don't seem like vtable dispatches. Rather, they're compiled inline-caches (CompileIC). The call site makes a call to the "unverified entry point" (UEP) of the target method, and the UEP code makes 1 direct type check and either rejects the call (bounces to the slowpath virtual/interface method lookup), or the type check passes and you're into the "verified entry point" (VEP), which performs the regular method entry point stuff like stack banging and stack frame setup.

    ; inline cache code
    0x00007fd4efcfea31: movabs $0x80000b318,%rax
    0x00007fd4efcfea3b: call 0x00007fd4e8129ce0

the target (UEP) likely looks something like this:

    0x00007fd4e8129ce0: mov 0x8(%rsi),%r10d ; load narrow klass (when UseCompressedClassPointers)
    0x00007fd4e8129ce4: shl $0x3,%r10 ; decode narrow klass to regular klass
    0x00007fd4e8129ce8: cmp %r10,%rax ; <- this is the direct type check in UEP
    0x00007fd4e8129ceb: jne 0x7fcff1045ca0 ; {runtime_call}

One thing worth mentioning is that there's a tradeoff made in C2 to encapsulate the virtual dispatch logic such that the GetKlass(o) operation (what I wrote in pseudocode above as it._klass) is not directly exposed in the IR, not even on the MachNode level. So C2 wouldn't have the chance to remove redundant computation of obj->_klass.vtable even without the compiled inline caches (i.e. directly generating vtable dispatch).
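As a source-level analogy for the guarded shape described above (one type check up front, then direct calls on a known concrete type), here is a small Java sketch. It is only an illustration of the idea and not what C2 emits: `RangeIterator` and `drain` are hypothetical names, and the `instanceof` test on a final class plays the role of the klass comparison, with the generic loop standing in for the slow path instead of an uncommon trap.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class GuardedDispatchSketch {
    // Hypothetical concrete iterator. It is final, so once the guard
    // passes, the receiver type of hasNext()/next() is known exactly.
    static final class RangeIterator implements Iterator<Integer> {
        private int next;
        private final int end;

        RangeIterator(int start, int end) {
            this.next = start;
            this.end = end;
        }

        @Override
        public boolean hasNext() {
            return next < end;
        }

        @Override
        public Integer next() {
            if (next >= end) throw new NoSuchElementException();
            return next++;
        }
    }

    // Drains the iterator and returns the number of elements consumed.
    static int drain(Iterator<Integer> it) {
        int count = 0;
        if (it instanceof RangeIterator) {
            // Fast path: a single type check, after which every call in the
            // loop targets RangeIterator's own methods.
            RangeIterator r = (RangeIterator) it;
            while (r.hasNext()) {
                r.next();
                count++;
            }
        } else {
            // Slow path: ordinary interface calls on an unknown receiver.
            while (it.hasNext()) {
                it.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(drain(new RangeIterator(0, 5)));              // 5
        System.out.println(drain(Arrays.asList(1, 2, 3).iterator()));    // 3
    }
}
```

Writing the guard by hand like this is rarely needed in practice; the sketch is meant only to make the "check once, then call directly" structure concrete.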
- Kris On Tue, Oct 19, 2021 at 8:09 PM Glavo wrote: > (I am a novice in HotSpot, JIT compiler and assembly, if there is some > misunderstanding, please forgive me.) > > I designed a very simple micro benchmark to test the performance of virtual > function calls (Test.java: https://paste.ubuntu.com/p/VKc57mnWgp/). > > The method be tested is a very simple method: > > private static void fun0(Iterator it) { > while (it.hasNext()) it.next(); > } > > I know C2 will try to specialize the implementation for a few parameter > types to completely de virtualize, so I call it with many different > iterator types to avoid being de virtualized. > > I installed hsdis for my JDK, ran it with `java > -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,Test::fun0 > Test.java`, and got this ASM code: > > https://paste.ubuntu.com/p/k2mg2svdcx/ > > To my great surprise, it seems that every time a virtual method is called, > it needs to be looked up in the virtual function table. > > In my understanding, this is unnecessary: The parameter `it` remains > unchanged, so for calling a virtual method, we should only lookup it in the > virtual function table on the first call and cache the function pointer on > the register or stack. Each subsequent call can be made through the > function pointer without virtual call. > > Unfortunately, it seems that C2 did not make this optimization for me, and > I think this will cause a noticeable performance degradation to my code. > > I tried to use MethodHandle instead of virtual calls in methods, and > designed microbenchmarks using JMH. The result disappointed me. It is > always slower than making a virtual call directly. > > I have some questions about this: > > Will the resulting large number of virtual calls significantly degrade > function performance? > > Can I use existing functions (like MethodHandle) in my code to avoid this > overhead? > > Why doesn't C2 do this optimization? Because it's hard to implement? 
> Because there is a lot of analysis overhead? Or something else? > From shade at openjdk.java.net Wed Oct 20 07:54:22 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 20 Oct 2021 07:54:22 GMT Subject: RFR: 8275586: Zero: Simplify interpreter initialization Message-ID: <8FCUBqssHqcaYRC6gnr37F8A9gGX1Hzvx8ny5BQblOY=.b54f2029-7a37-493a-bcc9-6fec9c29c943@github.com> The prolog in `BytecodeInterpreter` is hairy due to early initialization of interpreter statics. Previous rewrites make it mostly redundant, and we can now simplify it. This also implicitly fixes a initialization bug. If `JvmtiExport::can_post_interpreter_events()` changes at runtime, we will call into the uninitialized version: // Call the interpreter if (JvmtiExport::can_post_interpreter_events()) { BytecodeInterpreter::run(istate); } else { BytecodeInterpreter::run(istate); } Additional testing: - [ ] Linux x86_64 fastdebug `make bootcycle-images` ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/6029/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6029&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275586 Stats: 70 lines in 3 files changed: 7 ins; 48 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/6029.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6029/head:pull/6029 PR: https://git.openjdk.java.net/jdk/pull/6029 From stefank at openjdk.java.net Wed Oct 20 08:17:24 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 20 Oct 2021 08:17:24 GMT Subject: RFR: 8275405: Linking error for classes with lambda template parameters and virtual functions Message-ID: We encountered the following linking error when trying to build Generational ZGC on Windows: jvm.exp : error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: 
#include std::function = [](){}; I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks the mangled name, so a bit hard to find): else ifeq ($(call isTargetOs, windows), true) DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj FILTER_SYMBOLS_AWK_SCRIPT := \ '{ \ if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ }' The following line prints the vftable symbol if it doesn't contain the string 'type_info': if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. 
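
For readers who do not speak awk, the intent of the amended condition can be approximated in C++ as below. This is a sketch, not the build system's code: the function name `is_filtered_out` is made up, the substring matching only approximates the awk regular expression, and the symbol strings in the usage note are hypothetical examples rather than symbols taken from a real build.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when `sym` looks like a vftable symbol ("??_7" ... "@@6B@")
// that the filter would print, i.e. one that is neither type_info's vftable
// nor a vftable whose mangled name mentions a lambda.
// Approximation of the awk condition; hypothetical helper name.
bool is_filtered_out(const std::string& sym) {
    const std::size_t start = sym.find("??_7");
    if (start == std::string::npos) {
        return false;  // not a vftable symbol at all
    }
    if (sym.find("@@6B@", start + 4) == std::string::npos) {
        return false;  // vftable prefix without the expected suffix
    }
    // The two exceptions: keep type_info and lambda-related vftables.
    return sym.find("type_info") == std::string::npos &&
           sym.find("lambda") == std::string::npos;
}
```

With this predicate, a plain class vftable such as `??_7Foo@@6B@` is filtered, while `??_7type_info@@6B@` and a name containing `lambda` are left alone.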
------------- Commit messages: - 8275405: Linking error for classes with lambda template parameters and virtual functions Changes: https://git.openjdk.java.net/jdk/pull/6030/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6030&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275405 Stats: 18 lines in 1 file changed: 17 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6030.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6030/head:pull/6030 PR: https://git.openjdk.java.net/jdk/pull/6030 From lzhai at openjdk.java.net Wed Oct 20 09:31:10 2021 From: lzhai at openjdk.java.net (Leslie Zhai) Date: Wed, 20 Oct 2021 09:31:10 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Tue, 19 Oct 2021 17:24:17 GMT, Weijun Wang wrote: >> As a follow up of JEP 411, we will soon disallow security manager by default. jtreg 6.1 does not set its own security manager if JDK version is >= 18. > > Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: > > upgrade the version in GHA config > > only in patch2: > unchanged: Hi @wangweij But how to be suitable for jtreg version 6-dev+0? Running tests using JTREG control variable 'VM_OPTIONS=-XX:+UseZGC;TIMEOUT_FACTOR=20;VERBOSE=summary' Test selection 'tier1', will run: * jtreg:test/hotspot/jtreg:tier1 * jtreg:test/jdk:tier1 * jtreg:test/langtools:tier1 * jtreg:test/jaxp:tier1 * jtreg:test/lib-test:tier1 Running test 'jtreg:test/hotspot/jtreg:tier1' Error: The testsuite at /var/lib/jenkins/repos/openjdk/jdk/test/hotspot/jtreg requires jtreg version 6.1 b1 or higher and this is jtreg version 6-dev+0. 
Thanks, Leslie Zhai ------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From lkorinth at openjdk.java.net Wed Oct 20 09:36:38 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Wed, 20 Oct 2021 09:36:38 GMT Subject: RFR: 8269537: memset() is called after operator new [v4] In-Reply-To: References: Message-ID: > The basic problem is that we are relying on undefined behaviour, as documented in the code: > > // This whole business of passing information from ResourceObj::operator new > // to the ResourceObj constructor via fields in the "object" is technically UB. > // But it seems to work within the limitations of HotSpot usage (such as no > // multiple inheritance) with the compilers and compiler options we're using. > // And it gives some possibly useful checking for misuse of ResourceObj. > > > I am removing the undefined behaviour by passing the type of allocation through a thread local variable. > > This solution has some advantages: > 1) it is not UB > 2) it is simpler and easier to understand > 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8) > 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work. > > When doing the change, I also updated `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret. > > I also disallow to "fake" the memory type by explicitly calling `ResourceObj::set_allocation_type`. > > This forced me to change two places that is faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because the old naming of `allocated_on_stack()`. 
> > I have also tried to update the comments. In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete object you allocate with new. > > Testing on debug build tier1-3 > Testing on release build tier1 Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: review updates ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5387/files - new: https://git.openjdk.java.net/jdk/pull/5387/files/c9d45906..640f2d83 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5387&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5387&range=02-03 Stats: 11 lines in 4 files changed: 0 ins; 7 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5387.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5387/head:pull/5387 PR: https://git.openjdk.java.net/jdk/pull/5387 From tschatzl at openjdk.java.net Wed Oct 20 10:30:06 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 20 Oct 2021 10:30:06 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopeDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way to better support above pattern. 
In-fact I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc @rkennke : could you close the superfluous CRs for this issue? You filed this one 9 times in total... (probably due to network issues yesterday). Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From ihse at openjdk.java.net Wed Oct 20 10:33:12 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 20 Oct 2021 10:33:12 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: <3fR1ogAG0CXuhf9_gJZrD-GGi9RqG_lP5bGD7fd_GPY=.2ed9d3b9-3b80-4300-a90f-6d5fe954d945@github.com> On Wed, 20 Oct 2021 09:28:30 GMT, Leslie Zhai wrote: >> Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: >> >> upgrade the version in GHA config >> >> only in patch2: >> unchanged: > > Hi @wangweij > > But how to be suitable for jtreg version 6-dev+0? > > > Running tests using JTREG control variable 'VM_OPTIONS=-XX:+UseZGC;TIMEOUT_FACTOR=20;VERBOSE=summary' > Test selection 'tier1', will run: > * jtreg:test/hotspot/jtreg:tier1 > * jtreg:test/jdk:tier1 > * jtreg:test/langtools:tier1 > * jtreg:test/jaxp:tier1 > * jtreg:test/lib-test:tier1 > > Running test 'jtreg:test/hotspot/jtreg:tier1' > Error: The testsuite at /var/lib/jenkins/repos/openjdk/jdk/test/hotspot/jtreg requires jtreg version 6.1 b1 or higher and this is jtreg version 6-dev+0. > > > Thanks, > Leslie Zhai @xiangzhai The entire point of this PR is to *not* allow jtreg 6.0. You need to upgrade your local jtreg installation to 6.1. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From ihse at openjdk.java.net Wed Oct 20 10:38:13 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 20 Oct 2021 10:38:13 GMT Subject: RFR: 8275405: Linking error for classes with lambda template parameters and virtual functions In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 08:11:34 GMT, Stefan Karlsson wrote: > We encountered the following linking error when trying to build Generational ZGC on Windows: > > jvm.exp : error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) > > > I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: > > #include > std::function = [](){}; > > > I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks the mangled name, so a bit hard to find): > > else ifeq ($(call isTargetOs, windows), true) > DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj > FILTER_SYMBOLS_AWK_SCRIPT := \ > '{ \ > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ > }' > > > The following line prints the vftable symbol if it doesn't contain the string 'type_info': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; > > The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. > > I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; > > I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. Looks good to me. 
make/hotspot/lib/JvmMapfile.gmk line 109: > 107: # > 108: # Some usages of C++ lambdas require the vftable symbol of classes that use > 109: # the lambda type as a template parameter. The usage of those classes wont Suggestion: # the lambda type as a template parameter. The usage of those classes won't ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6030 From tschatzl at openjdk.java.net Wed Oct 20 10:41:08 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 20 Oct 2021 10:41:08 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopeDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way to better support above pattern. In-fact I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc Never mind, already did that. I did not check if any of the duplicates contain more information than the one used for this PR though. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From rkennke at openjdk.java.net Wed Oct 20 10:41:08 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 20 Oct 2021 10:41:08 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 10:36:17 GMT, Thomas Schatzl wrote: > Never mind, already did that. I did not check if any of the duplicates contain more information than the one used for this PR though. Oh dear! Thank you! Yes, when I tried to file the issue, bugs.o.j.n was down and/or very slow. Apparently every attempt actually went through, without actually sending back a confirmation. ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From duke at openjdk.java.net Wed Oct 20 11:05:11 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Wed, 20 Oct 2021 11:05:11 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v10] In-Reply-To: References: <8P-tWT-7UC9TMLz8zo5liDy2rOONBU864RUlRhthLeY=.05ef40f6-38ca-4e84-a54b-eb8dbae2b97f@github.com> Message-ID: <1zMQcfx5qebdex-GpeyLJd1UnRTJSH2udgAp9plFQGY=.66329bdf-ae95-439b-933e-67bfcb7ea39d@github.com> On Fri, 15 Oct 2021 13:09:27 GMT, Andrew Haley wrote: >> Can we have a simple (as simple as possible) JMH benchmark, please? It should be something like a couple of threads racing to count up to a million. > >> @theRealAph, any comments on the microbenchmark I wrote? 
>
> Something like this works well:
>
>
>     @Param({"1000000"})
>     public int maxNum;
>
>     @Param({"4"})
>     public int threadCount;
>
>     AtomicInteger theCounter;
>
>     Thread threads[];
>
>     void work() {
>         for (;;) {
>             int prev = theCounter.get();
>             if (prev >= maxNum) {
>                 break;
>             }
>             if (theCounter.compareAndExchange(prev, prev + 1) != prev) {
>                 Thread.onSpinWait();
>             }
>         }
>     }
>
>     @Setup(Level.Trial)
>     public void foo() {
>         theCounter = new AtomicInteger();
>     }
>
>     @Setup(Level.Invocation)
>     public void setup() {
>         theCounter.set(0);
>         threads = new Thread[threadCount];
>
>         for (int i = 0; i < threads.length; i++) {
>             threads[i] = new Thread(this::work);
>         }
>
>     }
>
>     @Benchmark
>     public void trial() throws Exception {
>         for (int i = 0; i < threads.length; i++) {
>             threads[i].start();
>         }
>         for (int i = 0; i < threads.length; i++) {
>             threads[i].join();
>         }
>     }
> }
>
> Before:
>
> Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
> ThreadOnSpinWait.trial   1000000              2  avgt    3  43.830 ± 32.543  ms/op
>
> With `-XX:OnSpinWaitInst=isb -XX:OnSpinWaitInstCount=4`
>
> Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
> ThreadOnSpinWait.trial   1000000              2  avgt    3  22.181 ± 11.592  ms/op
>
> With `-XX:OnSpinWaitInst=isb -XX:OnSpinWaitInstCount=1`
>
> Benchmark               (maxNum)  (threadCount)  Mode  Cnt   Score    Error  Units
> ThreadOnSpinWait.trial   1000000              2  avgt    3  36.281 ± 31.700  ms/op
>
>
> This is Apple M1, where you have to be very careful because there's some processor
> frequency scaling going on.

Hi @theRealAph,
Any comments on the microbenchmarks?

-------------

PR: https://git.openjdk.java.net/jdk/pull/5562

From mcimadamore at openjdk.java.net Wed Oct 20 11:14:19 2021
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Wed, 20 Oct 2021 11:14:19 GMT
Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v7]
In-Reply-To:
References:
Message-ID:

> This PR contains the API and implementation changes for JEP-419 [1].
> A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment.
>
> [1] - https://openjdk.java.net/jeps/419

Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:

  Fix failing microbenchmarks. Contributed by @FrauBoes (thanks!)

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5907/files
  - new: https://git.openjdk.java.net/jdk/pull/5907/files/a4db81fe..52189683

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=05-06

  Stats: 39 lines in 9 files changed: 1 ins; 10 del; 28 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5907.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907

PR: https://git.openjdk.java.net/jdk/pull/5907

From ayang at openjdk.java.net Wed Oct 20 11:33:06 2021
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Wed, 20 Oct 2021 11:33:06 GMT
Subject: RFR: 8275527: Separate/refactor forward pointer access
In-Reply-To:
References:
Message-ID:

On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote:

> sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files).

Overriding the markword to return null for non-forwarded objs is indeed a bit hacky. However, from a caller's perspective, this is the most intuitive one. Taking `G1FullGCCompactTask::G1CompactRegionClosure::apply` as an example:

    HeapWord* destination = cast_from_oop<HeapWord*>(obj->forwardee());
    if (destination == NULL) {
      // Object not moving
      return size;
    }
    ...
    Copy::aligned_conjoint_words(obj_addr, destination, size);

`destination` is either null (obj is not moved) or the new location after moving; rather smooth and straightforward control flow. I wonder if we can use this pattern without overriding the markword.
Here's a straw man proposal. oop oopDesc::forwardee() const { auto mark_word = mark(); bool is_forwarded = mark_word.is_marked(); if (!is_forwarded) { return nullptr; } return cast_to_oop(mark_word.decode_pointer()); } The intention is to load the markword only once. Additionally, callers can continue using the `is_forwarded() / forwardee()` pair, or switch to the `forwardee() == nullptr` pattern on a case-by-case basis if desired. ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From rkennke at openjdk.java.net Wed Oct 20 11:37:06 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 20 Oct 2021 11:37:06 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 20:58:49 GMT, Stefan Karlsson wrote: > I like how the mark/is_marked/decode_pointer code gets cleaned up. And having an OopForwading class probably makes sense for parts of the code. > > I'm less excited about the way the explicitly stack allocated OopForwarding objects make the code less readable. Take these two examples: > > ``` > assert(old->is_objArray(), "invariant"); > - assert(old->is_forwarded(), "invariant"); > + assert(OopForwarding(old).is_forwarded(), "invariant"); > ``` > > ``` > oop new_obj = obj->is_forwarded() ? obj->forwardee() > : _young_gen->copy_to_survivor_space(obj); > OopForwarding fwd(obj); > oop new_obj = fwd.is_forwarded() ? fwd.forwardee() > : _young_gen->copy_to_survivor_space(obj); > ``` > > I think both snippets were nicer to read in the earlier code. And there's a lot of those kind of changes in this patch. > > Maybe there's a hybrid approach where we could still have oopDesc::is_forwarded, oopDesc::forwardee, etc. but also have OopForwarding objects for those places that really need them? And then minimize the amount of places we use `OopForwarding fwd(obj)` and `OopForwarding(obj)->forwardee()` in the code? In the first snippet, there is only one usage of the mark-word, i.e. 
no problem to simplify that. However, in the 2nd snippet, the point is that we can avoid loading the mark-word twice, and if we want to do that, we need an enclosing scope. The alternative would be to do something like:

    markWord mark = obj->mark();
    oop fwd = mark->is_forwarded() ? mark->forwardee() : something_else();

... which is pretty much the same as what I proposed, except that the markWord and the logic are encapsulated in OopForwarding. Notice that in some cases we actually already do that - those are the 'messy' cases where we deliberately load the mark-word once, and drag it through the code that needs it (sometimes not only for performance but also for correctness in the face of concurrency) - and then use is_marked() and decode_pointer() on it.

Given that I have not been able to show performance benefits, I could revert cases like you mentioned to their original form, and consolidate the implementation into markWord, unless we prefer the most efficient solution everywhere.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5955

From zgu at openjdk.java.net Wed Oct 20 11:50:13 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Wed, 20 Oct 2021 11:50:13 GMT
Subject: RFR: 8275439: Remove PrintVtableStats
In-Reply-To:
References:
Message-ID:

On Tue, 19 Oct 2021 21:54:20 GMT, David Holmes wrote:

>> PrintVtableStats has not been very useful, and the reported information may not be accurate either. Please see the CR and comments in JDK-8275413 for details.
>>
>> - PrintVtableStats is a non-product flag, so does not require a CSR to remove it.
>> - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use.
>> - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well.
>
> Looks good.
> > Thanks, > David Thanks, @dholmes-ora ------------- PR: https://git.openjdk.java.net/jdk/pull/6017 From zgu at openjdk.java.net Wed Oct 20 11:50:13 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 20 Oct 2021 11:50:13 GMT Subject: Integrated: 8275439: Remove PrintVtableStats In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 15:09:01 GMT, Zhengyu Gu wrote: > PrintVtableStats has not been very useful and reported information may not be accurate neither. Please see the CR and comments in JDK-8275413 for details. > > - PrintVtableStats is none product flag, so does not require a CSR to remove it. > - The patch also removes Klass::array_klasses_do() and families, as PrintVtableStats is the last use. > - InstanceKlass/ArrayKlass::array_klasses_do(void f(Klass* k, TRAPS), TRAPS) have no use, remove them as well. This pull request has now been integrated. Changeset: 135cf3c9 Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/135cf3c94d4bce1b23c4dd7697030f558a5f682b Stats: 125 lines in 9 files changed: 0 ins; 125 del; 0 mod 8275439: Remove PrintVtableStats Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/6017 From stefank at openjdk.java.net Wed Oct 20 12:23:28 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 20 Oct 2021 12:23:28 GMT Subject: RFR: 8275405: Linking error for classes with lambda template parameters and virtual functions [v2] In-Reply-To: References: Message-ID: > We encountered the following linking error when trying to build Generational ZGC on Windows: > > jvm.exp : error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) > > > I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: > > #include > std::function = [](){}; > > > I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks 
the mangled name, so a bit hard to find): > > else ifeq ($(call isTargetOs, windows), true) > DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj > FILTER_SYMBOLS_AWK_SCRIPT := \ > '{ \ > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ > }' > > > The following line prints the vftable symbol if it doesn't contain the string 'type_info': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; > > The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. > > I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; > > I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Fix typo ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6030/files - new: https://git.openjdk.java.net/jdk/pull/6030/files/54a353f3..fddb8be6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6030&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6030&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6030.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6030/head:pull/6030 PR: https://git.openjdk.java.net/jdk/pull/6030 From pliden at openjdk.java.net Wed Oct 20 12:57:09 2021 From: pliden at openjdk.java.net (Per Liden) Date: Wed, 20 Oct 2021 12:57:09 GMT Subject: RFR: 8275405: Linking error for classes with lambda template parameters and virtual functions [v2] In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 12:23:28 GMT, Stefan Karlsson wrote: >> We encountered the following linking error when trying to build Generational ZGC on Windows: >> >> jvm.exp : 
error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) >> >> >> I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: >> >> #include >> std::function = [](){}; >> >> >> I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks the mangled name, so a bit hard to find): >> >> else ifeq ($(call isTargetOs, windows), true) >> DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj >> FILTER_SYMBOLS_AWK_SCRIPT := \ >> '{ \ >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ >> }' >> >> >> The following line prints the vftable symbol if it doesn't contain the string 'type_info': >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; >> >> The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. >> >> I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; >> >> I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo Looks good to me! ------------- Marked as reviewed by pliden (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6030 From stefank at openjdk.java.net Wed Oct 20 12:59:08 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 20 Oct 2021 12:59:08 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. 
oopDesc::forwardee() and oopeDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*, sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way to better support above pattern. In-fact I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc I don't think we should make the code more complicated by using the micro optimized version unless we can measure the benefits. IMHO, it is harder to read and it makes it more difficult to know if the read-once of was done for correctness purposes or not. I understand that that other reviewers might disagree with my view, and if that's the case, then I'll back off. Some concrete feedback on the current patch: I don't like that we store the oop in the OopForwarding. That oop gets snuck into functions via the fwd variable. That makes it less obvious that the functions depend on the old oop. Code like the following becomes more obscure, when you don't immediately see that it operates on the original (old) object: const oop forward_ptr = fwd.forward_to_atomic(obj, memory_order_relaxed); if (forward_ptr == NULL) { I wonder if it wouldn't make sense to make forward_to_atomic static, just like the forward_to is static? 
That way we wouldn't have to store _obj in OopForwarding, and I think the code would be more decoupled and easier to read. ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From mcimadamore at openjdk.java.net Wed Oct 20 14:00:30 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 20 Oct 2021 14:00:30 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v8] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Fix copyright header in TestArrayCopy ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/52189683..414952ad Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=06-07 Stats: 16 lines in 1 file changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From stefank at openjdk.java.net Wed Oct 20 14:01:12 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 20 Oct 2021 14:01:12 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopeDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? 
obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way to better support the above pattern. In fact, I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc Here's what this would look like with a static OopForwarding. The OopForwarding objects are kept local to the function they are created in: https://github.com/stefank/jdk/commit/58732bcfbc5b86ba872ea86cb224d6007be70da5 and here it is on top of your branch: https://github.com/openjdk/jdk/compare/master...stefank:pr_5955 ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From rkennke at openjdk.java.net Wed Oct 20 14:17:05 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Wed, 20 Oct 2021 14:17:05 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. 
> > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way to better support the above pattern. In fact, I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc > Here's what this would look like with a static OopForwarding. The OopForwarding objects are kept local to the function they are created in: [stefank at 58732bc](https://github.com/stefank/jdk/commit/58732bcfbc5b86ba872ea86cb224d6007be70da5) > > and here it is on top of your branch: [master...stefank:pr_5955](https://github.com/openjdk/jdk/compare/master...stefank:pr_5955) Thanks! Yes this is indeed better. I will merge into this PR as soon as I get to it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From psandoz at openjdk.java.net Wed Oct 20 15:28:34 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 20 Oct 2021 15:28:34 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v6] In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. > > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request incrementally with two additional commits since the last revision: - Merge pull request #1 from nsjian/JDK-8271515 Address AArch64 review comments from Nick. - Address review comments from Nick. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5873/files - new: https://git.openjdk.java.net/jdk/pull/5873/files/9945ce8d..2919465a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=04-05 Stats: 35 lines in 5 files changed: 3 ins; 2 del; 30 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From lfoltan at openjdk.java.net Wed Oct 20 15:46:07 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Wed, 20 Oct 2021 15:46:07 GMT Subject: RFR: 8272614: Unused parameters in MethodHandleNatives linking methods In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 17:12:16 GMT, Harold Seigel wrote: > Please review this small fix for JDK-8272614 to remove the unused indexInCP argument to linkCallSite() and linkDynamicConstant(). The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-6 on Linux x64. > > Thanks, Harold LGTM! ------------- Marked as reviewed by lfoltan (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6021 From hseigel at openjdk.java.net Wed Oct 20 15:51:12 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 20 Oct 2021 15:51:12 GMT Subject: RFR: 8272614: Unused parameters in MethodHandleNatives linking methods In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 17:12:16 GMT, Harold Seigel wrote: > Please review this small fix for JDK-8272614 to remove the unused indexInCP argument to linkCallSite() and linkDynamicConstant(). The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-6 on Linux x64. > > Thanks, Harold Thanks David and Lois! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6021 From hseigel at openjdk.java.net Wed Oct 20 15:51:12 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 20 Oct 2021 15:51:12 GMT Subject: Integrated: 8272614: Unused parameters in MethodHandleNatives linking methods In-Reply-To: References: Message-ID: On Tue, 19 Oct 2021 17:12:16 GMT, Harold Seigel wrote: > Please review this small fix for JDK-8272614 to remove the unused indexInCP argument to linkCallSite() and linkDynamicConstant(). The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-6 on Linux x64. > > Thanks, Harold This pull request has now been integrated. Changeset: bbc60611 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/bbc606117fcd8b48fc8f830c50cf7eb573da1c4c Stats: 8 lines in 3 files changed: 0 ins; 3 del; 5 mod 8272614: Unused parameters in MethodHandleNatives linking methods Reviewed-by: dholmes, lfoltan ------------- PR: https://git.openjdk.java.net/jdk/pull/6021 From aph at openjdk.java.net Wed Oct 20 16:58:07 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 20 Oct 2021 16:58:07 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: On Mon, 18 Oct 2021 16:16:58 GMT, Evgeny Astigeevich wrote: >> This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). >> >> It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: >> >> - `none`: no implementation for spin pauses. This is the default value. >> - `nop`: use `nop` instruction for spin pauses. >> - `isb`: use `isb` instruction for spin pauses. >> - `yield`: use `yield` instruction for spin pauses. >> >> And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. 
It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. >> >> The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. >> >> Testing: >> >> - `make test TEST="gtest"`: Passed >> - `make run-test TEST="tier1"`: Passed >> - `make run-test TEST="tier2"`: Passed >> - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed >> >> CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 > > Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: > > Fix formatting and spelling test/micro/org/openjdk/bench/java/lang/ThreadOnSpinWaitProducerConsumer.java line 47: > 45: @State(Scope.Benchmark) > 46: public class ThreadOnSpinWaitProducerConsumer { > 47: @Param({"100"}) This test seems rather artificial and unrealistic to me. You can get improved performance simply by increasing the `spinNum` so that it waits for longer. The way to get some advantage from `onSpinWait` is to have multiple threads racing to update the same thing, e.g. to acquire a lock. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From sviswanathan at openjdk.java.net Wed Oct 20 17:36:05 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 20 Oct 2021 17:36:05 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Tue, 19 Oct 2021 20:34:55 GMT, Vamsi Parasa wrote: >> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > refactoring to remove code duplication by using a common routine for UMulHiLNode and MulHiLNode Marked as reviewed by sviswanathan (Reviewer). 
The patch looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From sviswanathan at openjdk.java.net Wed Oct 20 17:36:06 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 20 Oct 2021 17:36:06 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> <8MuiklM5Nt3VkzyVHbWqwMh_LkVvVY2Mf65_0zTx4Kw=.9351008f-b489-4103-be9d-87e6fc4a8f39@github.com> Message-ID: On Fri, 15 Oct 2021 20:19:31 GMT, Vladimir Kozlov wrote: >>> How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. >> >> Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). Used that test to verify the correctness of the results. > >> > How you verified correctness of results? I suggest to extend `test/jdk//java/lang/Math/MultiplicationTests.java` test to cover unsigned method. >> >> Tests for unsignedMultiplyHigh were already added in test/jdk//java/lang/Math/MultiplicationTests.java (in July 2021 by Brian Burkhalter). Used that test to verify the correctness of the results. > > Good. It seems I have old version of the test. > Did you run it with -Xcomp? How you verified that intrinsic is used? @vnkozlov if the patch looks ok to you, could you please run this through your testing? 
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From kvn at openjdk.java.net Wed Oct 20 19:11:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 20 Oct 2021 19:11:09 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Tue, 19 Oct 2021 20:34:55 GMT, Vamsi Parasa wrote: >> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > refactoring to remove code duplication by using a common routine for UMulHiLNode and MulHiLNode Looks good. And I submitted testing. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5933 From kvn at openjdk.java.net Wed Oct 20 22:18:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 20 Oct 2021 22:18:09 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Tue, 19 Oct 2021 20:34:55 GMT, Vamsi Parasa wrote: >> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > refactoring to remove code duplication by using a common routine for UMulHiLNode and MulHiLNode Tests passed. You can integrate changes. 
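For context on what the intrinsic computes: `Math.unsignedMultiplyHigh(a, b)` returns the upper 64 bits of the 128-bit unsigned product, which a single x86 `mul` leaves in RDX. The following is only a portable C++ sketch of the same arithmetic built from 32-bit halves. The name `umul_high` is hypothetical, and this is not the HotSpot or JDK implementation.

```cpp
#include <cstdint>

// Hypothetical helper: upper 64 bits of the 128-bit product a * b,
// computed without any 128-bit type.
std::uint64_t umul_high(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t mask = 0xFFFFFFFFu;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

  // Four 32x32 -> 64 bit partial products.
  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;

  // Carry out of bits 32..63: at most 3 * (2^32 - 1), so no 64-bit overflow.
  const std::uint64_t mid = (lo_lo >> 32) + (hi_lo & mask) + (lo_hi & mask);

  return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}
```

A software fallback of this shape needs four multiplies plus carry handling, which is the kind of work a hardware widening multiply avoids.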
------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Wed Oct 20 22:22:01 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 20 Oct 2021 22:22:01 GMT Subject: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2] In-Reply-To: References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 20 Oct 2021 22:14:33 GMT, Vladimir Kozlov wrote: > Tests passed. You can integrate changes. Thanks Vladimir! What are the next steps to integrate the change? ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From duke at openjdk.java.net Wed Oct 20 22:44:06 2021 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 20 Oct 2021 22:44:06 GMT Subject: Integrated: 8275167: x86 intrinsic for unsignedMultiplyHigh In-Reply-To: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> References: <7IzrZdL0elgXbuisyLNYC2wkyOTe1RHUPuGRI7YsAQ4=.aed9dea3-4775-4592-b43e-c3e08e167f90@github.com> Message-ID: On Wed, 13 Oct 2021 18:55:10 GMT, Vamsi Parasa wrote: > Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change show 1.87X improvement on a micro benchmark. This pull request has now been integrated. 
Changeset: af7c56b8 Author: vamsi-parasa Committer: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/af7c56b85bb2828a9d68f9e1c753a4adfa7ebb4f Stats: 63 lines in 11 files changed: 61 ins; 2 del; 0 mod 8275167: x86 intrinsic for unsignedMultiplyHigh Reviewed-by: kvn, sviswanathan ------------- PR: https://git.openjdk.java.net/jdk/pull/5933 From njian at openjdk.java.net Thu Oct 21 01:59:09 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Thu, 21 Oct 2021 01:59:09 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v5] In-Reply-To: References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: On Wed, 20 Oct 2021 03:13:07 GMT, Nick Gasson wrote: > AArch64 changes look ok apart from some minor comments. Thank you @nick-arm for the review! All your comments have been addressed. ------------- PR: https://git.openjdk.java.net/jdk/pull/5873 From stefank at openjdk.java.net Thu Oct 21 07:47:05 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 21 Oct 2021 07:47:05 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 16:37:02 GMT, Roman Kennke wrote: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places, the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). 
> > I propose to extract and refactor forward pointer access in a way to better support the above pattern. In fact, I originally wrote this with performance-enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc A few more comments: src/hotspot/share/gc/g1/g1FullGCCompactionPoint.cpp line 112: > 110: // since it will be restored by preserved marks. > 111: object->init_mark(); > 112: } else { Could you explain why this was removed? src/hotspot/share/gc/parallel/psPromotionManager.inline.hpp line 142: > 140: OopForwarding fwd(o); > 141: if (!fwd.is_forwarded()) { > 142: return copy_unmarked_to_survivor_space(o, fwd); I wonder if this copy_unmarked should be renamed to copy_unforwarded? src/hotspot/share/gc/serial/defNewGeneration.cpp line 56: > 54: #include "oops/instanceRefKlass.hpp" > 55: #include "oops/oop.inline.hpp" > 56: #include "oops/oopForwarding.hpp" The patch is inconsistent w.r.t. the include order between oop.inline.hpp and oopForwarding.hpp. src/hotspot/share/oops/oopForwarding.hpp line 83: > 81: #endif > 82: #endif > 83: } This should preferably be in a .cpp file, though it seems excessive to introduce one just for this case. Since it is only used by the forward_to functions, maybe add it to the suggested oopForwarding.inline.hpp file? src/hotspot/share/oops/oopForwarding.hpp line 125: > 123: return cast_to_oop(old_mark.decode_pointer()); > 124: } > 125: } The forward_to functions use functions in oop.inline.hpp. They need to be split out into an oopForwarding.inline.hpp file. 
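To make the shape of this discussion concrete, here is a minimal sketch of the read-once pattern combined with a static `forward_to_atomic`, so that the old object is always passed explicitly rather than hidden inside the helper. Everything in it is an assumption for illustration: `ToyObject`, `ToyForwarding`, the tag encoding and the signatures are made up and do not reflect the real `oopDesc`, `markWord` or `OopForwarding` code in the PR.

```cpp
#include <atomic>
#include <cstdint>

// Toy stand-in for an object with a header word; NOT HotSpot's oopDesc.
struct alignas(8) ToyObject {
  std::atomic<std::uintptr_t> header{0};
};

// Assumed encoding for this sketch only: low two bits set means "forwarded",
// the remaining bits hold the (8-byte aligned) forwardee address.
constexpr std::uintptr_t kForwardedTag = 0x3;

class ToyForwarding {
  const std::uintptr_t _header;  // snapshot of the header, loaded exactly once

public:
  explicit ToyForwarding(const ToyObject* obj)
      : _header(obj->header.load(std::memory_order_relaxed)) {}

  bool is_forwarded() const {
    return (_header & kForwardedTag) == kForwardedTag;
  }

  ToyObject* forwardee() const {
    return reinterpret_cast<ToyObject*>(_header & ~kForwardedTag);
  }

  // Static, so the call site names the old object explicitly.
  // Returns nullptr if this call installed the forwarding pointer, otherwise
  // the forwardee installed by whoever won the race. The toy assumes that
  // forwarding is the only thing that ever changes the header.
  static ToyObject* forward_to_atomic(ToyObject* old_obj,
                                      const ToyForwarding& fwd,
                                      ToyObject* new_obj) {
    std::uintptr_t expected = fwd._header;
    const std::uintptr_t desired =
        reinterpret_cast<std::uintptr_t>(new_obj) | kForwardedTag;
    if (old_obj->header.compare_exchange_strong(expected, desired,
                                                std::memory_order_relaxed)) {
      return nullptr;
    }
    return reinterpret_cast<ToyObject*>(expected & ~kForwardedTag);
  }
};
```

With this shape, a call such as `ToyForwarding::forward_to_atomic(obj, fwd, copy)` shows at a glance which object is being forwarded, which is the readability point raised in the review.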
------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From stefank at openjdk.java.net Thu Oct 21 07:52:07 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 21 Oct 2021 07:52:07 GMT Subject: RFR: 8275405: Linking error for classes with lambda template parameters and virtual functions [v2] In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 12:23:28 GMT, Stefan Karlsson wrote: >> We encountered the following linking error when trying to build Generational ZGC on Windows: >> >> jvm.exp : error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) >> >> >> I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: >> >> #include >> std::function = [](){}; >> >> >> I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks the mangled name, so a bit hard to find): >> >> else ifeq ($(call isTargetOs, windows), true) >> DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj >> FILTER_SYMBOLS_AWK_SCRIPT := \ >> '{ \ >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ >> }' >> >> >> The following line prints the vftable symbol if it doesn't contain the string 'type_info': >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; >> >> The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. >> >> I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': >> if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; >> >> I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. 
> > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/6030 From stefank at openjdk.java.net Thu Oct 21 07:52:08 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 21 Oct 2021 07:52:08 GMT Subject: Integrated: 8275405: Linking error for classes with lambda template parameters and virtual functions In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 08:11:34 GMT, Stefan Karlsson wrote: > We encountered the following linking error when trying to build Generational ZGC on Windows: > > jvm.exp : error LNK2001: unresolved external symbol "const ZBasicOopIterateClosure >::`vftable'" (??_7?$ZBasicOopIterateClosure at V@@@@6B@) > > > I narrowed this down to a simple reproducer, which doesn't link when built through the HotSpot build system: > > #include > std::function = [](){}; > > > I found that we have a line in our make files that filters out symbols that contain the string vftable (though it checks the mangled name, so a bit hard to find): > > else ifeq ($(call isTargetOs, windows), true) > DUMP_SYMBOLS_CMD := $(DUMPBIN) -symbols *.obj > FILTER_SYMBOLS_AWK_SCRIPT := \ > '{ \ > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; \ > }' > > > The following line prints the vftable symbol if it doesn't contain the string 'type_info': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/) print $$7; > > The printed values are added to a list of symbols that get filtered out of the mapfile, which is then passed to the linker. > > I can get the code to link if I add a second exception for vftable symbols containing the string 'lambda': > if ($$7 ~ /??_7.*@@6B@/ && $$7 !~ /type_info/ && $$7 !~ /lambda/) print $$7; > > I did an additional experiment where I completely removed this filtering of vftable symbols. When I did that the linker complained that we used more than 64K symbols. 
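As an illustration of what the amended awk condition decides, the check can be written out as a small predicate. This is a sketch only: it approximates the awk regular expression with plain substring searches, `is_filtered_vftable` is a hypothetical name, and nothing like it exists in the build.

```cpp
#include <cstddef>
#include <string>

// Returns true when the symbol would be printed by the awk script, i.e. when
// it ends up on the list of symbols filtered out of the mapfile.
// Approximation: "??_7" followed later by "@@6B@" stands in for the
// vftable pattern matched by the awk regex.
bool is_filtered_vftable(const std::string& symbol) {
  const std::size_t prefix = symbol.find("??_7");
  if (prefix == std::string::npos) {
    return false;  // not a vftable symbol at all
  }
  if (symbol.find("@@6B@", prefix + 4) == std::string::npos) {
    return false;  // vftable prefix without the expected suffix
  }
  // Existing exception: type_info vftables are kept.
  if (symbol.find("type_info") != std::string::npos) {
    return false;
  }
  // Exception added by the fix: vftables involving lambdas are kept too.
  if (symbol.find("lambda") != std::string::npos) {
    return false;
  }
  return true;
}
```

Under this reading, a vftable whose mangled name mentions a lambda is no longer dropped from the mapfile, so the linker can resolve it.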
This pull request has now been integrated. Changeset: 09f5235c Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/09f5235c65de546640d5f923fa9369e28643c6ed Stats: 18 lines in 1 file changed: 17 ins; 0 del; 1 mod 8275405: Linking error for classes with lambda template parameters and virtual functions Reviewed-by: ihse, pliden ------------- PR: https://git.openjdk.java.net/jdk/pull/6030 From duke at openjdk.java.net Thu Oct 21 09:02:12 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 21 Oct 2021 09:02:12 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 16:16:45 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix formatting and spelling > > test/micro/org/openjdk/bench/java/lang/ThreadOnSpinWaitProducerConsumer.java line 47: > >> 45: @State(Scope.Benchmark) >> 46: public class ThreadOnSpinWaitProducerConsumer { >> 47: @Param({"100"}) > > This test seems rather artificial and unrealistic to me. You can get improved performance simply by increasing the `spinNum` so that it waits for longer. The way to get some advantage from `onSpinWait` is to have multiple threads racing to update the same thing, e.g. to acquire a lock. I agree with you. I am removing it. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Thu Oct 21 09:06:06 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 21 Oct 2021 09:06:06 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v14] In-Reply-To: References: Message-ID: > This PR is a follow-up on the discussion ["RFC: AArch64: Implementing spin pauses with ISB"](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-August/054033.html). 
> > It adds DIAGNOSTIC options `OnSpinWaitInst=inst`, where `inst` can be: > > - `none`: no implementation for spin pauses. This is the default value. > - `nop`: use `nop` instruction for spin pauses. > - `isb`: use `isb` instruction for spin pauses. > - `yield`: use `yield` instruction for spin pauses. > > And `OnSpinWaitInstCount=count`, where `count` specifies a number of `OnSpinWaitInst` and can be in `1..99` range. It is an error to use `OnSpinWaitInstCount` when `OnSpinWaitInst` is `none`. > > The code for the `Thread.onSpinWait` intrinsic is generated based on the values of `OnSpinWaitInst` and `OnSpinWaitInstCount`. > > Testing: > > - `make test TEST="gtest"`: Passed > - `make run-test TEST="tier1"`: Passed > - `make run-test TEST="tier2"`: Passed > - `make run-test TEST=hotspot/jtreg/compiler/onSpinWait`: Passed > > CSR: https://bugs.openjdk.java.net/browse/JDK-8274564 Evgeny Astigeevich has updated the pull request incrementally with one additional commit since the last revision: Remove ThreadOnSpinWaitProducerConsumer microbenchmark as unrealistic ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5562/files - new: https://git.openjdk.java.net/jdk/pull/5562/files/42f0bb6d..a06b4821 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5562&range=12-13 Stats: 116 lines in 1 file changed: 0 ins; 116 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/5562.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5562/head:pull/5562 PR: https://git.openjdk.java.net/jdk/pull/5562 From duke at openjdk.java.net Thu Oct 21 09:06:09 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 21 Oct 2021 09:06:09 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 16:16:45 GMT, Andrew Haley wrote: >> Evgeny Astigeevich has updated the pull request incrementally with one 
additional commit since the last revision: >> >> Fix formatting and spelling > > test/micro/org/openjdk/bench/java/lang/ThreadOnSpinWaitProducerConsumer.java line 47: > >> 45: @State(Scope.Benchmark) >> 46: public class ThreadOnSpinWaitProducerConsumer { >> 47: @Param({"100"}) > > This test seems rather artificial and unrealistic to me. You can get improved performance simply by increasing the `spinNum` so that it waits for longer. The way to get some advantage from `onSpinWait` is to have multiple threads racing to update the same thing, e.g. to acquire a lock. @theRealAph, done ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From coleenp at openjdk.java.net Thu Oct 21 11:29:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 21 Oct 2021 11:29:12 GMT Subject: RFR: 8274794: Print all owned locks in hs_err file [v4] In-Reply-To: References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Mon, 18 Oct 2021 16:06:29 GMT, Coleen Phillimore wrote: >> See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. >> Tested with tier1-8. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > tstuefe suggestions. Thank you for the review, David and Thomas. 
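The registration scheme described in the quoted PR text (record every mutex at construction, guard the array while registering, read it without locking on the error path) could look roughly like the sketch below. It is a toy under stated assumptions: `ToyMutex`, its fixed capacity, the opaque owner pointer and `names_owned_by` are all invented here, and `std::mutex` merely stands in for ThreadCritical. It is not the code from the change.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class ToyMutex {
  struct Registry {
    std::array<ToyMutex*, 128> slots{};  // capacity picked arbitrarily
    std::size_t count = 0;
    std::mutex lock;                     // stands in for ThreadCritical
  };
  static Registry& registry() {
    static Registry r;
    return r;
  }

  std::string _name;
  const void* _owner = nullptr;  // opaque "thread" identity for the sketch

public:
  // Every mutex registers itself, wherever it is declared.
  explicit ToyMutex(std::string name) : _name(std::move(name)) {
    Registry& r = registry();
    std::lock_guard<std::mutex> guard(r.lock);
    if (r.count < r.slots.size()) {
      r.slots[r.count++] = this;
    }
  }

  // Clear the slot so the reporting loop never sees a dangling pointer.
  ~ToyMutex() {
    Registry& r = registry();
    std::lock_guard<std::mutex> guard(r.lock);
    for (std::size_t i = 0; i < r.count; i++) {
      if (r.slots[i] == this) r.slots[i] = nullptr;
    }
  }

  ToyMutex(const ToyMutex&) = delete;
  ToyMutex& operator=(const ToyMutex&) = delete;

  void set_owner(const void* thread) { _owner = thread; }

  // Error-path style read: deliberately takes no lock, mirroring the
  // trade-off described in the PR text.
  static std::vector<std::string> names_owned_by(const void* thread) {
    std::vector<std::string> names;
    const Registry& r = registry();
    for (std::size_t i = 0; i < r.count; i++) {
      const ToyMutex* m = r.slots[i];
      if (m != nullptr && m->_owner == thread) names.push_back(m->_name);
    }
    return names;
  }
};
```

The unlocked read is only tolerable on the assumption stated in the PR, namely that no other thread is creating a lock while the error report is being written.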
------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From coleenp at openjdk.java.net Thu Oct 21 11:30:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 21 Oct 2021 11:30:12 GMT Subject: RFR: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. > > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. So one 'no' vote and no 'yes' votes so I'm closing as WNF. It's a minor inconsistency and we'll see if it impedes understanding this code in the future. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From coleenp at openjdk.java.net Thu Oct 21 11:30:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 21 Oct 2021 11:30:12 GMT Subject: Withdrawn: 8274301: Locking with _no_safepoint_check_flag does not check NJT In-Reply-To: References: Message-ID: On Thu, 14 Oct 2021 21:04:29 GMT, Coleen Phillimore wrote: > There are Mutexes that have a safepoint checking rank, that are locked via: > MutexLocker ml(safepoint_lock, Mutex::_no_safepoint_check_flag); > > These don't assert for the inconsistency because the thread that calls this is a NonJavaThread. > > But if you lock > MutexLocker ml(nosafepoint_lock); // default safepoint check > > These *will* assert for the safepoint checking inconsistency even for NonJavaThreads. > > NonJavaThreads don't participate in the safepoint checking protocol. And some of these safepoint checking locks are shared between java and nonjava threads. So in the existing code, there's a mix of locking with _no_safepoint_check_flag and not. > > This has a wrinkle because if you have a safepoint checking lock, and call wait on it, we now have to handle the case that the caller is a NonJavaThread and should not safepoint check (just like the Mutex::lock call). > > Maybe this shouldn't be fixed, and we should just document the discrepancy. > > Tested with tier1-3. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/5959 From coleenp at openjdk.java.net Thu Oct 21 11:32:14 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 21 Oct 2021 11:32:14 GMT Subject: Integrated: 8274794: Print all owned locks in hs_err file In-Reply-To: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> References: <4nMLgAd3gKQPaGKe4R72cilrult63gFya_R_1k4WEJw=.37026a8c-70c0-4efe-8bda-e222da4574af@github.com> Message-ID: On Thu, 14 Oct 2021 20:51:44 GMT, Coleen Phillimore wrote: > See CR for details. This saves the created mutex in a mutex_array during Mutex construction so that all Mutex, not just the ones in mutexLocker.cpp can be printed in the hs_err file. ThreadCritical is used to protect the mutex_array, and to avoid UB should be used when printing the owned locks, but since that is done during error handling, this seemed better not to lock during error handling. There's 0 probability that another thread will be creating a lock during this error handling anyway. > Tested with tier1-8. This pull request has now been integrated. Changeset: 819d2df8 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/819d2df8b01e04bcc89a0a995e21b68799f890be Stats: 338 lines in 8 files changed: 228 ins; 96 del; 14 mod 8274794: Print all owned locks in hs_err file Reviewed-by: stuefe, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/5958 From stefank at openjdk.java.net Thu Oct 21 12:23:19 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 21 Oct 2021 12:23:19 GMT Subject: RFR: 8275712: Hashtable literal_size functions are broken Message-ID: The literal_size functions are used to estimate the size of held objects in some of our hashtables. Two bugs: 1) They return int, but the returned value is in bytes and could be larger than MAX_INT. 2) Non-String objects report words instead of bytes. Manually tested by running JFR and checking the output in JMC. 
------------- Commit messages: - Remove tab - 8275712: Hashtable literal_size functions are broken Changes: https://git.openjdk.java.net/jdk/pull/6063/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6063&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275712 Stats: 33 lines in 2 files changed: 15 ins; 10 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/6063.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6063/head:pull/6063 PR: https://git.openjdk.java.net/jdk/pull/6063 From coleenp at openjdk.java.net Thu Oct 21 12:56:03 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 21 Oct 2021 12:56:03 GMT Subject: RFR: 8275712: Hashtable literal_size functions are broken In-Reply-To: References: Message-ID: On Thu, 21 Oct 2021 11:26:36 GMT, Stefan Karlsson wrote: > The literal_size functions are used to estimate the size of held objects in some of our hashtables. > > Two bugs: > 1) They return int, but the returned value is in bytes and could be larger than MAX_INT. > 2) Non-String objects report words instead of bytes. > > Manually tested by running JFR and checking the output in JMC. Looks good. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6063 From zgu at openjdk.java.net Thu Oct 21 13:36:05 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 21 Oct 2021 13:36:05 GMT Subject: RFR: 8275712: Hashtable literal_size functions are broken In-Reply-To: References: Message-ID: <0OtP_IrZeXNLaHwnyJx0bjwB0mvlLJfYHKHqCkOf0TE=.b52e246d-ebc4-4d0c-8054-7f7a346b3c74@github.com> On Thu, 21 Oct 2021 11:26:36 GMT, Stefan Karlsson wrote: > The literal_size functions are used to estimate the size of held objects in some of our hashtables. > > Two bugs: > 1) They return int, but the returned value is in bytes and could be larger than MAX_INT. > 2) Non-String objects report words instead of bytes. > > Manually tested by running JFR and checking the output in JMC. 
LGTM ------------- Marked as reviewed by zgu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6063 From aph at openjdk.java.net Thu Oct 21 14:40:17 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 21 Oct 2021 14:40:17 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: On Thu, 21 Oct 2021 09:03:15 GMT, Evgeny Astigeevich wrote: >> test/micro/org/openjdk/bench/java/lang/ThreadOnSpinWaitProducerConsumer.java line 47: >> >>> 45: @State(Scope.Benchmark) >>> 46: public class ThreadOnSpinWaitProducerConsumer { >>> 47: @Param({"100"}) >> >> This test seems rather artificial and unrealistic to me. You can get improved performance simply by increasing the `spinNum` so that it waits for longer. The way to get some advantage from `onSpinWait` is to have multiple threads racing to update the same thing, e.g. to acquire a lock. > > @theRealAph, done Looks good. I'm not entirely sure whether this test is truly representative of the real-world cases that people have seen, but if we find out more we can always add another JMH test. ------------- PR: https://git.openjdk.java.net/jdk/pull/5562 From mcimadamore at openjdk.java.net Thu Oct 21 15:20:16 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Thu, 21 Oct 2021 15:20:16 GMT Subject: RFR: 8275063: Implementation of Foreign Function & Memory API (Second incubator) [v9] In-Reply-To: References: Message-ID: > This PR contains the API and implementation changes for JEP-419 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/419 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 15 additional commits since the last revision: - Fix regression in VaList treatment on AArch64 (contributed by @nick-arm) - Merge branch 'master' into JEP-419 - Fix copyright header in TestArrayCopy - Fix failing microbenchmarks. Contributed by @FrauBoes (thanks!) - * use `invokeWithArguments` to simplify new test - Add test for liveness check with high-aririty downcalls (make sure that if an exception occurs in a downcall because of liveness, ref count of other resources are left intact). - * Fix javadoc issue in VaList * Fix bug in concurrent logic for shared scope acquire - Address review comments - Apply suggestions from code review Co-authored-by: Paul Sandoz - Fix TestLinkToNativeRBP test - ... and 5 more: https://git.openjdk.java.net/jdk/compare/aeabb3df...5c69eabf ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5907/files - new: https://git.openjdk.java.net/jdk/pull/5907/files/414952ad..5c69eabf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5907&range=07-08 Stats: 25202 lines in 587 files changed: 18962 ins; 4240 del; 2000 mod Patch: https://git.openjdk.java.net/jdk/pull/5907.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5907/head:pull/5907 PR: https://git.openjdk.java.net/jdk/pull/5907 From duke at openjdk.java.net Thu Oct 21 15:23:11 2021 From: duke at openjdk.java.net (Evgeny Astigeevich) Date: Thu, 21 Oct 2021 15:23:11 GMT Subject: RFR: 8186670: Implement _onSpinWait() intrinsic for AArch64 [v13] In-Reply-To: References: Message-ID: On Thu, 21 Oct 2021 14:37:18 GMT, Andrew Haley wrote: >> @theRealAph, done > > Looks good. I'm not entirely sure whether this test is truly representative of the real-world cases that people have seen, but if we find out more we can always add another JMH test. This test is too artificial. Going through my records I've found I have a microbenchmark for `java.util.concurrent. 
SynchronousQueue` which shows good improvements on jdk11. `SynchronousQueue` uses `onSpinWait`. Since jdk17 `SynchronousQueue` has not been using `onSpinWait` any more (See https://bugs.openjdk.java.net/browse/JDK-8267502). Maybe I can come up with a microbenchmark based on `SynchronousQueue` [code](https://github.com/openjdk/jdk11u-dev/blob/master/src/java.base/share/classes/java/util/concurrent/SynchronousQueue.java#L412):

    SNode awaitFulfill(SNode s, boolean timed, long nanos) {
        /*
         * When a node/thread is about to block, it sets its waiter
         * field and then rechecks state at least one more time
         * before actually parking, thus covering race vs
         * fulfiller noticing that waiter is non-null so should be
         * woken.
         *
         * When invoked by nodes that appear at the point of call
         * to be at the head of the stack, calls to park are
         * preceded by spins to avoid blocking when producers and
         * consumers are arriving very close in time. This can
         * happen enough to bother only on multiprocessors.
         *
         * The order of checks for returning out of main loop
         * reflects fact that interrupts have precedence over
         * normal returns, which have precedence over
         * timeouts. (So, on timeout, one last check for match is
         * done before giving up.) Except that calls from untimed
         * SynchronousQueue.{poll/offer} don't check interrupts
         * and don't wait at all, so are trapped in transfer
         * method rather than calling awaitFulfill.
         */
        final long deadline = timed ? System.nanoTime() + nanos : 0L;
        Thread w = Thread.currentThread();
        int spins = shouldSpin(s)
            ? (timed ? MAX_TIMED_SPINS : MAX_UNTIMED_SPINS)
            : 0;
        for (;;) {
            if (w.isInterrupted())
                s.tryCancel();
            SNode m = s.match;
            if (m != null)
                return m;
            if (timed) {
                nanos = deadline - System.nanoTime();
                if (nanos <= 0L) {
                    s.tryCancel();
                    continue;
                }
            }
            if (spins > 0) {
                Thread.onSpinWait();
                spins = shouldSpin(s) ? (spins - 1) : 0;
            }
            else if (s.waiter == null)
                s.waiter = w; // establish waiter so can park next iter
            else if (!timed)
                LockSupport.park(this);
            else if (nanos > SPIN_FOR_TIMEOUT_THRESHOLD)
                LockSupport.parkNanos(this, nanos);
        }
    }

I've created https://bugs.openjdk.java.net/browse/JDK-8275728 to write such a microbenchmark.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5562

From zgu at openjdk.java.net Thu Oct 21 15:25:29 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 21 Oct 2021 15:25:29 GMT
Subject: RFR: 8275718: Relax memory constraint on exception counter updates
Message-ID:

This is another instance of counter updates that only need atomic guarantee.

-------------

Commit messages:
 - v0

Changes: https://git.openjdk.java.net/jdk/pull/6065/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6065&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8275718
Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod
Patch: https://git.openjdk.java.net/jdk/pull/6065.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6065/head:pull/6065

PR: https://git.openjdk.java.net/jdk/pull/6065

From Divino.Cesar at microsoft.com Thu Oct 21 23:07:39 2021
From: Divino.Cesar at microsoft.com (Cesar Soares Lucas)
Date: Thu, 21 Oct 2021 23:07:39 +0000
Subject: RFC - Improving C2 Escape Analysis
In-Reply-To: <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com>
References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com>
Message-ID:

Hi Tobias and Ron,

Thank you for providing more details about this. Really helpful! We decided to direct our immediate efforts to improving EA for heap objects. AFAIU inline types will bring a lot of performance improvements but that will require the programmer to rewrite part(s) of the application - please correct if I'm wrong here.
We are investigating how we can improve the performance of applications that aren't updated to use inline types.

About the write-up, may I ask you what do you think is the most critical issue raised there? Or perhaps there is something else that we missed?

Cheers,
Cesar

________________________________
From: Ron Pressler
Sent: October 19, 2021 6:39 AM
To: Tobias Hartmann
Cc: Cesar Soares Lucas; John Rose; Mark Reinhold; hotspot-dev at openjdk.java.net; Brian Stafford; Martijn Verburg
Subject: Re: RFC - Improving C2 Escape Analysis

> On 19 Oct 2021, at 14:17, Tobias Hartmann wrote:
>
> That said, support for stack/thread local allocation would be something that we could potentially
> use in Valhalla to avoid buffering on the heap. It's difficult though because in most cases we use
> buffering when storing to a non-flattened field that escapes to other threads.
>

Just a reminder, stacks can move in memory (virtual threads), so great care must be taken with pointers pointing into the stack. As a general rule, that's a no-no.

-- Ron

From dholmes at openjdk.java.net Fri Oct 22 04:51:31 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 22 Oct 2021 04:51:31 GMT
Subject: Integrated: 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file
Message-ID:

Revert "8274794: Print all owned locks in hs_err file"

This reverts commit 819d2df8b01e04bcc89a0a995e21b68799f890be.

We are seeing a large number of failures on some of our internal tier5 tests and so are backing this out while it is investigated further.
Thanks, David ------------- Commit messages: - 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file Changes: https://git.openjdk.java.net/jdk/pull/6074/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6074&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275761 Stats: 338 lines in 8 files changed: 96 ins; 228 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/6074.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6074/head:pull/6074 PR: https://git.openjdk.java.net/jdk/pull/6074 From mikael at openjdk.java.net Fri Oct 22 04:51:31 2021 From: mikael at openjdk.java.net (Mikael Vidstedt) Date: Fri, 22 Oct 2021 04:51:31 GMT Subject: Integrated: 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 04:42:37 GMT, David Holmes wrote: > Revert "8274794: Print all owned locks in hs_err file" > > This reverts commit 819d2df8b01e04bcc89a0a995e21b68799f890be. > > We are seeing a large number of failures on some of our internal tier5 tests and so are backing this out while it is investigated further. > > Thanks, > David Marked as reviewed by mikael (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6074 From dholmes at openjdk.java.net Fri Oct 22 04:51:31 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 22 Oct 2021 04:51:31 GMT Subject: Integrated: 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 04:43:46 GMT, Mikael Vidstedt wrote: >> Revert "8274794: Print all owned locks in hs_err file" >> >> This reverts commit 819d2df8b01e04bcc89a0a995e21b68799f890be. >> >> We are seeing a large number of failures on some of our internal tier5 tests and so are backing this out while it is investigated further. >> >> Thanks, >> David > > Marked as reviewed by mikael (Reviewer). Thanks @vidmik ! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6074 From dholmes at openjdk.java.net Fri Oct 22 04:51:31 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 22 Oct 2021 04:51:31 GMT Subject: Integrated: 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 04:42:37 GMT, David Holmes wrote: > Revert "8274794: Print all owned locks in hs_err file" > > This reverts commit 819d2df8b01e04bcc89a0a995e21b68799f890be. > > We are seeing a large number of failures on some of our internal tier5 tests and so are backing this out while it is investigated further. > > Thanks, > David This pull request has now been integrated. Changeset: fab3d6c6 Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/fab3d6c6122b2ce23dfa12db489923d8261f8f35 Stats: 338 lines in 8 files changed: 96 ins; 228 del; 14 mod 8275761: Backout: JDK-8274794 Print all owned locks in hs_err file Reviewed-by: mikael ------------- PR: https://git.openjdk.java.net/jdk/pull/6074 From tobias.hartmann at oracle.com Fri Oct 22 08:12:14 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 22 Oct 2021 10:12:14 +0200 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> Message-ID: <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> Hi Cesar, On 22.10.21 01:07, Cesar Soares Lucas wrote: > We decided to direct our immediate efforts to improving EA for heap objects. > AFAIU inline types will bring a lot of performance improvements but that will > require the programmer to rewrite part(s) of the application - please correct > if I'm wrong here. We are investigating how we can improve the performance of > applications that aren't updated to use inline types. That makes perfect sense to me. 
> About the write-up, may I ask you what do you think is the most critical
> issue raised there?

I think the most critical issue is that current EA does not properly handle even "simple" control flow merges (including the cases you described in the "Re-use Stack Slots" section). I.e., cases where the object does not escape on any paths but control flow is too complicated for EA to prove that. Often, it's not even multiple objects of the same type that are merged but just a single object that is used in different paths.

> Or perhaps there is something else that we missed?

Maybe the inability to handle `null` could be described/handled as a separate issue. IIRC both EA and scalarization do not support `null` being merged with a non-escaping object.

Also, I think there are (or will be) cases when EA is able to prove that an object is non-escaping but scalarization still fails. For example, if not all loads can be folded or there are other uses left (see `PhaseMacroExpand::can_eliminate_allocation`). Since scalarization is the main benefit of EA, improvements to EA need to be made to scalarization as well.

Best regards,
Tobias

> ----------------------------------------------------------------------------------------------------
> *From:* Ron Pressler
> *Sent:* October 19, 2021 6:39 AM
> *To:* Tobias Hartmann
> *Cc:* Cesar Soares Lucas; John Rose; Mark Reinhold; hotspot-dev at openjdk.java.net;
> Brian Stafford; Martijn Verburg
> *Subject:* Re: RFC - Improving C2 Escape Analysis
>
>> On 19 Oct 2021, at 14:17, Tobias Hartmann wrote:
>>
>> That said, support for stack/thread local allocation would be something that we could potentially
>> use in Valhalla to avoid buffering on the heap. It's difficult though because in most cases we use
>> buffering when storing to a non-flattened field that escapes to other threads.
>>
>
> Just a reminder, stacks can move in memory (virtual threads), so great care must be
> taken with pointers pointing into the stack. As a general rule, that's a no-no.
>
> -- Ron

From stefank at openjdk.java.net Fri Oct 22 08:23:13 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Fri, 22 Oct 2021 08:23:13 GMT
Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert
Message-ID:

We currently have a bespoke implementation of STATIC_ASSERT, which was added before we transitioned over to C++14. I recently saw some compiler warnings (macos-aarch64) about that implementation. I propose that we reimplement STATIC_ASSERT to use static_assert. FWIW, some HotSpot code already uses static_assert instead of STATIC_ASSERT.

An alternative would be to completely remove STATIC_ASSERT and always use static_assert. That would require us to add a message string to all of the > 400 usages. Maybe we should wait with that until we can use C++17, which removes the requirement to provide a message?

-------------

Commit messages:
 - 8275717: Reimplement STATIC_ASSERT to use static_assert

Changes: https://git.openjdk.java.net/jdk/pull/6077/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6077&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8275717
Stats: 16 lines in 1 file changed: 0 ins; 15 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/6077.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6077/head:pull/6077

PR: https://git.openjdk.java.net/jdk/pull/6077

From stefank at openjdk.java.net Fri Oct 22 08:24:08 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Fri, 22 Oct 2021 08:24:08 GMT
Subject: RFR: 8275712: Hashtable literal_size functions are broken
In-Reply-To:
References:
Message-ID:

On Thu, 21 Oct 2021 11:26:36 GMT, Stefan Karlsson wrote:

> The literal_size functions are used to estimate the size of held objects in some of our hashtables.
> > Two bugs: > 1) They return int, but the returned value is in bytes and could be larger than MAX_INT. > 2) Non-String objects report words instead of bytes. > > Manually tested by running JFR and checking the output in JMC. Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/6063 From stefank at openjdk.java.net Fri Oct 22 08:24:09 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 22 Oct 2021 08:24:09 GMT Subject: Integrated: 8275712: Hashtable literal_size functions are broken In-Reply-To: References: Message-ID: On Thu, 21 Oct 2021 11:26:36 GMT, Stefan Karlsson wrote: > The literal_size functions are used to estimate the size of held objects in some of our hashtables. > > Two bugs: > 1) They return int, but the returned value is in bytes and could be larger than MAX_INT. > 2) Non-String objects report words instead of bytes. > > Manually tested by running JFR and checking the output in JMC. This pull request has now been integrated. Changeset: 1efe946d Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/1efe946db77e38507511a9c898b8b59fe9ba1aeb Stats: 33 lines in 2 files changed: 15 ins; 10 del; 8 mod 8275712: Hashtable literal_size functions are broken Reviewed-by: coleenp, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/6063 From stuefe at openjdk.java.net Fri Oct 22 13:00:03 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 22 Oct 2021 13:00:03 GMT Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 08:15:52 GMT, Stefan Karlsson wrote: > We currently have a bespoke implementation of STATIC_ASSERT, which was added before we transitioned over to C++14. I recently saw some compiler warnings (macos-aarch64) about that implementation. I propose that we reimplement STATIC_ASSERT to use static_assert. FWIW, some HotSpot code already uses static_assert instead of STATIC_ASSERT. 
> > An alternative would be to completely remove STATIC_ASSERT and always use static_assert. That would require us to add a message string to all of the > 400 usages. Maybe we should wait with that until we can use C++17, which removes the requirement to provide a message? LGTM. How do the asserts at build time look like? I'm in favor of retaining the macro since it enables us to add support for non-compliant compilers and keeps backporting simple. Thanks, Thomas ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6077 From eosterlund at openjdk.java.net Fri Oct 22 14:20:06 2021 From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Fri, 22 Oct 2021 14:20:06 GMT Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 08:15:52 GMT, Stefan Karlsson wrote: > We currently have a bespoke implementation of STATIC_ASSERT, which was added before we transitioned over to C++14. I recently saw some compiler warnings (macos-aarch64) about that implementation. I propose that we reimplement STATIC_ASSERT to use static_assert. FWIW, some HotSpot code already uses static_assert instead of STATIC_ASSERT. > > An alternative would be to completely remove STATIC_ASSERT and always use static_assert. That would require us to add a message string to all of the > 400 usages. Maybe we should wait with that until we can use C++17, which removes the requirement to provide a message? Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6077 From stefank at openjdk.java.net Fri Oct 22 14:58:04 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 22 Oct 2021 14:58:04 GMT Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 12:57:06 GMT, Thomas Stuefe wrote: > LGTM. 
> > How do the asserts at build time look like? > > I'm in favor of retaining the macro since it enables us to add support for non-compliant compilers and keeps backporting simple. > > Thanks, Thomas Thanks for reviewing! This is one of the warnings: In file included from /Users/runner/work/jdk/jdk/jdk/src/hotspot/share/asm/register.cpp:1: In file included from /Users/runner/work/jdk/jdk/jdk/src/hotspot/share/precompiled/precompiled.hpp:34: In file included from /Users/runner/work/jdk/jdk/jdk/src/hotspot/share/classfile/classLoaderData.hpp:28: In file included from /Users/runner/work/jdk/jdk/jdk/src/hotspot/share/memory/allocation.hpp:29: ------------- PR: https://git.openjdk.java.net/jdk/pull/6077 From kbarrett at openjdk.java.net Fri Oct 22 15:44:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 22 Oct 2021 15:44:09 GMT Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 08:15:52 GMT, Stefan Karlsson wrote: > We currently have a bespoke implementation of STATIC_ASSERT, which was added before we transitioned over to C++14. I recently saw some compiler warnings (macos-aarch64) about that implementation. I propose that we reimplement STATIC_ASSERT to use static_assert. FWIW, some HotSpot code already uses static_assert instead of STATIC_ASSERT. > > An alternative would be to completely remove STATIC_ASSERT and always use static_assert. That would require us to add a message string to all of the > 400 usages. Maybe we should wait with that until we can use C++17, which removes the requirement to provide a message? Looks good. The "unused typedef" warnings suggest a build with `-Wunused-local-typedefs` enabled. I thought we were not enabling that (because it's not actually useful), but don't see such disabling and it's enabled by `-Wall`. So why aren't we getting those warnings in any function where we use STATIC_ASSERT, for any gcc-based platform? 
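
As background for the warning question, here is a minimal sketch of the two styles, assuming a typedef-based pre-C++11 idiom. The old HotSpot macro is not shown in this thread, so `OLD_STATIC_ASSERT` and `long_long_bits` are hypothetical, and the new `STATIC_ASSERT` definition is a guess at the shape of the change rather than a quote of the patch.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical pre-C++11 idiom: a false condition yields a negative array
// size, which is ill-formed. The typedef itself is never used; inside a
// function body that is what -Wunused-local-typedefs reports.
#define OLD_STATIC_ASSERT(cond) \
  typedef char old_static_assert_dummy[(cond) ? 1 : -1]

// Sketch of the proposed style: forward to the language feature. C++11/14
// require a message argument, so the condition is stringified.
#define STATIC_ASSERT(cond) static_assert((cond), #cond)

// Used at namespace scope here, where the local-typedef warning does not apply.
OLD_STATIC_ASSERT(sizeof(long long) * CHAR_BIT >= 64);

inline std::size_t long_long_bits() {
  STATIC_ASSERT(sizeof(long long) * CHAR_BIT >= 64);  // no typedef is introduced
  return sizeof(long long) * CHAR_BIT;
}
```

A runtime check such as `assert(long_long_bits() >= 64);` then always holds, since the same condition was already verified at compile time.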
------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6077 From duke at openjdk.java.net Sat Oct 23 06:39:10 2021 From: duke at openjdk.java.net (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Sat, 23 Oct 2021 06:39:10 GMT Subject: RFR: 8275512: Upgrade required version of jtreg to 6.1 [v2] In-Reply-To: References: <8ZF_bktOd9u8s06pS5q_VfPnDplwwjhlpz8pdLyHnvw=.67c1b571-5f34-4251-9872-34ccb89dbd5a@github.com> Message-ID: On Wed, 20 Oct 2021 09:28:30 GMT, Leslie Zhai wrote: > requires jtreg version 6.1 b1 or higher This confused me a bit too; I was using jtreg-6+1.tar.gz from [Adoption Group build](https://ci.adoptopenjdk.net/view/Dependencies/job/dependency_pipeline/lastSuccessfulBuild/artifact/jtreg/), and apparently 6+1 is 6.0b1, not 6.1. For now I'm using jtregtip. ------------- PR: https://git.openjdk.java.net/jdk/pull/6012 From dholmes at openjdk.java.net Mon Oct 25 00:26:02 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 25 Oct 2021 00:26:02 GMT Subject: RFR: 8275718: Relax memory constraint on exception counter updates In-Reply-To: References: Message-ID: On Thu, 21 Oct 2021 15:16:28 GMT, Zhengyu Gu wrote: > This is another instance of counter updates that only need atomic guarantee. What does this achieve? These are not "hot" counters so performance is of zero concern. The conservative memory ordering doesn't guarantee cross-thread visibility but it does seem to me there is less likelihood of error reporting seeing a stale value from another thread if we have "conservative" memory ordering (which traditionally should have been a full bi-directional fence for atomic r-m-w operations). 
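
To make the trade-off concrete, a minimal sketch using `std::atomic` as a stand-in for HotSpot's own atomics (an assumption; `bump_relaxed` and `bump_conservative` are hypothetical names, and `memory_order_seq_cst` only approximates the "conservative" ordering discussed above):

```cpp
#include <atomic>
#include <cassert>

// Relaxed increment: the read-modify-write is still atomic, so no updates
// are lost, but it imposes no ordering on surrounding memory accesses.
inline unsigned bump_relaxed(std::atomic<unsigned>& counter) {
  return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}

// Stronger ordering: same counting result, plus ordering guarantees with
// respect to other sequentially consistent operations.
inline unsigned bump_conservative(std::atomic<unsigned>& counter) {
  return counter.fetch_add(1, std::memory_order_seq_cst) + 1;
}

// Both variants count identically; they differ only in ordering guarantees.
inline void example_usage() {
  std::atomic<unsigned> counter{0};
  assert(bump_relaxed(counter) == 1);
  assert(bump_conservative(counter) == 2);
}
```

Neither variant by itself guarantees that a reader on another thread sees the latest value at a given moment, which is the visibility concern raised above.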
David ------------- PR: https://git.openjdk.java.net/jdk/pull/6065 From shade at openjdk.java.net Mon Oct 25 08:22:05 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 25 Oct 2021 08:22:05 GMT Subject: RFR: 8275586: Zero: Simplify interpreter initialization In-Reply-To: <8FCUBqssHqcaYRC6gnr37F8A9gGX1Hzvx8ny5BQblOY=.b54f2029-7a37-493a-bcc9-6fec9c29c943@github.com> References: <8FCUBqssHqcaYRC6gnr37F8A9gGX1Hzvx8ny5BQblOY=.b54f2029-7a37-493a-bcc9-6fec9c29c943@github.com> Message-ID: On Wed, 20 Oct 2021 07:44:36 GMT, Aleksey Shipilev wrote: > The prolog in `BytecodeInterpreter` is hairy due to early initialization of interpreter statics. Previous rewrites make it mostly redundant, and we can now simplify it. > > This also implicitly fixes a initialization bug. If `JvmtiExport::can_post_interpreter_events()` changes at runtime, we will call into the uninitialized version: > > > // Call the interpreter > if (JvmtiExport::can_post_interpreter_events()) { > BytecodeInterpreter::run(istate); > } else { > BytecodeInterpreter::run(istate); > } > > > Additional testing: > - [x] Linux x86_64 fastdebug `make bootcycle-images` Any takers? :) ------------- PR: https://git.openjdk.java.net/jdk/pull/6029 From stefank at openjdk.java.net Mon Oct 25 08:48:16 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 25 Oct 2021 08:48:16 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned Message-ID: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. 
I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expect the size to fit in 4 bytes. I've listed a couple of them below.

I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but given the age of the RFE, I decided to open the PR with its current name. :)

Comments on the patch:
* There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out.
* JFR stores field offsets in bytes in 4-byte variables. This could be a bug. I added an assert.
* java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration.
* I left array object size calculation as-is for now.

I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal.

-------------

Commit messages:
 - Use size_t for Object sizes

Changes: https://git.openjdk.java.net/jdk/pull/6100/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6100&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-4718400
Stats: 113 lines in 52 files changed: 3 ins; 18 del; 92 mod
Patch: https://git.openjdk.java.net/jdk/pull/6100.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6100/head:pull/6100

PR: https://git.openjdk.java.net/jdk/pull/6100

From stefank at openjdk.java.net Mon Oct 25 09:11:06 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Mon, 25 Oct 2021 09:11:06 GMT
Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert
In-Reply-To:
References:
Message-ID:

On Fri, 22 Oct 2021 15:39:32 GMT, Kim Barrett wrote:

> The "unused typedef" warnings suggest a build with `-Wunused-local-typedefs` enabled.
I thought we were not enabling that (because it's not actually useful), but don't see such disabling and it's enabled by `-Wall`. So why aren't we getting those warnings in any function where we use STATIC_ASSERT, for any gcc-based platform? The failures were seen with clang. I haven't tried to dig deeper. Thanks all for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/6077 From stefank at openjdk.java.net Mon Oct 25 09:11:07 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 25 Oct 2021 09:11:07 GMT Subject: Integrated: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: On Fri, 22 Oct 2021 08:15:52 GMT, Stefan Karlsson wrote: > We currently have a bespoke implementation of STATIC_ASSERT, which was added before we transitioned over to C++14. I recently saw some compiler warnings (macos-aarch64) about that implementation. I propose that we reimplement STATIC_ASSERT to use static_assert. FWIW, some HotSpot code already uses static_assert instead of STATIC_ASSERT. > > An alternative would be to completely remove STATIC_ASSERT and always use static_assert. That would require us to add a message string to all of the > 400 usages. Maybe we should wait with that until we can use C++17, which removes the requirement to provide a message? This pull request has now been integrated. 
Changeset: 0bcc1749 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/0bcc1749eaea20cb983a983073ad33d305681879 Stats: 16 lines in 1 file changed: 0 ins; 15 del; 1 mod 8275717: Reimplement STATIC_ASSERT to use static_assert Reviewed-by: stuefe, eosterlund, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/6077 From coleenp at openjdk.java.net Mon Oct 25 12:08:04 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 25 Oct 2021 12:08:04 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: On Mon, 25 Oct 2021 08:42:20 GMT, Stefan Karlsson wrote: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. :) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. 
> * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. > * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal The runtime changes look great! src/hotspot/share/classfile/javaClasses.cpp line 1383: > 1381: assert(_oop_size_offset != 0, "must be set"); > 1382: assert(size > 0, "Oop size must be greater than zero, not " SIZE_FORMAT, size); > 1383: assert(size <= INT_MAX, "Lossy conversion: " SIZE_FORMAT, size); Yes, can you add a small comment here that the size in the java.lang.Class instance is only 4 bytes? It probably doesn't need to be but not something to change in this change. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6100 From rbackman at openjdk.java.net Mon Oct 25 12:36:06 2021 From: rbackman at openjdk.java.net (Rickard =?UTF-8?B?QsOkY2ttYW4=?=) Date: Mon, 25 Oct 2021 12:36:06 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: On Mon, 25 Oct 2021 08:42:20 GMT, Stefan Karlsson wrote: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. 
I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. :) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. > * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. > * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal Looks good. ------------- Marked as reviewed by rbackman (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6100 From coleenp at openjdk.java.net Mon Oct 25 12:59:24 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 25 Oct 2021 12:59:24 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock Message-ID: The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. 
I tested this change with that patch though and it passes tier1-6, ------------- Commit messages: - 8275800: Redefinition leaks MethodData::_extra_data_lock Changes: https://git.openjdk.java.net/jdk/pull/6105/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6105&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275800 Stats: 17 lines in 2 files changed: 8 ins; 8 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6105/head:pull/6105 PR: https://git.openjdk.java.net/jdk/pull/6105 From dholmes at openjdk.java.net Mon Oct 25 21:40:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 25 Oct 2021 21:40:40 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock In-Reply-To: References: Message-ID: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> On Mon, 25 Oct 2021 12:51:32 GMT, Coleen Phillimore wrote: > The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. > I tested this change with that patch though and it passes tier1-6, src/hotspot/share/oops/instanceKlass.cpp line 2692: > 2690: > 2691: // Called also by InstanceKlass::deallocate_contents > 2692: void InstanceKlass::release_C_heap_structures_internal() { Why do we have two different cleanup functions here? It seems overly subtle that `deallocate_contents` only calls `release_C_heap_structures_internal`, rather than `release_C_heap_structures`. And now the only difference is whether `constants()->release_C_heap_structures()` is called, it seems to me that distinction should perhaps be more explicit in the caller. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Mon Oct 25 22:20:16 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 25 Oct 2021 22:20:16 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock In-Reply-To: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> Message-ID: <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> On Mon, 25 Oct 2021 21:33:33 GMT, David Holmes wrote: >> The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. >> I tested this change with that patch though and it passes tier1-6, > > src/hotspot/share/oops/instanceKlass.cpp line 2692: > >> 2690: >> 2691: // Called also by InstanceKlass::deallocate_contents >> 2692: void InstanceKlass::release_C_heap_structures_internal() { > > Why do we have two different cleanup functions here? It seems overly subtle that `deallocate_contents` only calls `release_C_heap_structures_internal`, rather than `release_C_heap_structures`. And now the only difference is whether `constants()->release_C_heap_structures()` is called, it seems to me that distinction should perhaps be more explicit in the caller. // Can't release the constant pool here because the constant pool can be // deallocated separately from the InstanceKlass for default methods and // redefine classes. Because of this comment. We have two ways that we can get to ConstantPool::deallocate_contents. We don't have two ways to get to the Method::deallocate_contents. 
The deallocate list contains unloaded/unattached Methods (from the relocator for jsr/ret), scratch classes from redefinition, fully formed constant pools from default methods, or fully formed InstanceKlass from classfile parsing errors or class definition errors. Only the constant pool case should be separated out in deallocate_contents, which is why it's excluded. ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Mon Oct 25 22:20:17 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 25 Oct 2021 22:20:17 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock In-Reply-To: <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> Message-ID: On Mon, 25 Oct 2021 22:16:05 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 2692: >> >>> 2690: >>> 2691: // Called also by InstanceKlass::deallocate_contents >>> 2692: void InstanceKlass::release_C_heap_structures_internal() { >> >> Why do we have two different cleanup functions here? It seems overly subtle that `deallocate_contents` only calls `release_C_heap_structures_internal`, rather than `release_C_heap_structures`. And now the only difference is whether `constants()->release_C_heap_structures()` is called, it seems to me that distinction should perhaps be more explicit in the caller. > > // Can't release the constant pool here because the constant pool can be > // deallocated separately from the InstanceKlass for default methods and > // redefine classes. > > Because of this comment. We have two ways that we can get to ConstantPool::deallocate_contents. We don't have two ways to get to the Method::deallocate_contents. 
> > The deallocate list contains unloaded/unattached Methods (from the relocator for jsr/ret), scratch classes from redefinition, fully formed constant pools from default methods, or fully formed InstanceKlass from classfile parsing errors or class definition errors. > > Only the constant pool case should be separated out in deallocate_contents, which is why it's excluded. Is the suggestion to have a default parameter to free CHEAP structures that's false for deallocate_contents? ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Mon Oct 25 22:49:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 25 Oct 2021 22:49:39 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: Message-ID: > The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. > I tested this change with that patch though and it passes tier1-6, Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add parameter to release_C_heap_structures. 
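The shape of that change can be sketched with a toy model. This is not the HotSpot code: the ReleaseLog type and the simplified ConstantPool and InstanceKlass structs below are assumptions made for illustration, and only the idea of a defaulted release_constant_pool flag that deallocate_contents turns off is taken from the discussion.

```cpp
#include <string>
#include <vector>

// Toy model, not HotSpot code: records which releases happen, in order.
struct ReleaseLog {
  std::vector<std::string> events;
};

struct ConstantPool {
  ReleaseLog* log;
  void release_C_heap_structures() { log->events.push_back("constant pool"); }
};

struct InstanceKlass {
  ReleaseLog* log;
  ConstantPool* cp;

  ConstantPool* constants() { return cp; }

  // Single entry point; the flag takes the place of a separate *_internal
  // function. By default the constant pool is released as well.
  void release_C_heap_structures(bool release_constant_pool = true) {
    log->events.push_back("klass structures");
    if (release_constant_pool) {
      constants()->release_C_heap_structures();
    }
  }

  // The constant pool can be deallocated separately from the InstanceKlass,
  // so this path leaves it alone.
  void deallocate_contents() {
    release_C_heap_structures(/* release_constant_pool */ false);
  }
};
```

The trade-off raised in the thread is visible here: callers that own the whole class get the constant pool released without having to remember it, at the cost of one extra parameter on the function.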
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6105/files - new: https://git.openjdk.java.net/jdk/pull/6105/files/09b9bfcb..22b38b03 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6105&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6105&range=00-01 Stats: 22 lines in 4 files changed: 4 ins; 10 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/6105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6105/head:pull/6105 PR: https://git.openjdk.java.net/jdk/pull/6105 From dholmes at openjdk.java.net Mon Oct 25 22:54:14 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 25 Oct 2021 22:54:14 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> Message-ID: On Mon, 25 Oct 2021 22:17:33 GMT, Coleen Phillimore wrote: >> // Can't release the constant pool here because the constant pool can be >> // deallocated separately from the InstanceKlass for default methods and >> // redefine classes. >> >> Because of this comment. We have two ways that we can get to ConstantPool::deallocate_contents. We don't have two ways to get to the Method::deallocate_contents. >> >> The deallocate list contains unloaded/unattached Methods (from the relocator for jsr/ret), scratch classes from redefinition, fully formed constant pools from default methods, or fully formed InstanceKlass from classfile parsing errors or class definition errors. >> >> Only the constant pool case should be separated out in deallocate_contents, which is why it's excluded. > > Is the suggestion to have a default parameter to free CHEAP structures that's false for deallocate_contents? 
> I'm not really sure it's an improvement because I have to have an extra parameter to Klass::release_C_heap_structures that's unused. That is one option but I was thinking perhaps remove `constants()->release_C_heap_structures()` from `instanceKlass::release_C_heap_structures()` and have the callers that should remove the CP call `ik->constants()->release_C_heap_structures()` directly? Or perhaps just change this comment: // Release C heap allocated data that this points to, which includes // reference counting symbol names. release_C_heap_structures_internal(); to ` // reference counting symbol names, but excludes the constant pool.` ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From sspitsyn at openjdk.java.net Mon Oct 25 23:00:13 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Mon, 25 Oct 2021 23:00:13 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> Message-ID: On Mon, 25 Oct 2021 22:51:37 GMT, David Holmes wrote: >> Is the suggestion to have a default parameter to free CHEAP structures that's false for deallocate_contents? >> I'm not really sure it's an improvement because I have to have an extra parameter to Klass::release_C_heap_structures that's unused. > > That is one option but I was thinking perhaps remove `constants()->release_C_heap_structures()` from `instanceKlass::release_C_heap_structures()` and have the callers that should remove the CP call `ik->constants()->release_C_heap_structures()` directly? > Or perhaps just change this comment: > > > // Release C heap allocated data that this points to, which includes > // reference counting symbol names. 
> release_C_heap_structures_internal(); > > > to > > ` // reference counting symbol names, but excludes the constant pool.` Yes, I think that combining release_C_heap_structures and release_C_heap_structures_internal into single release_C_heap_structures with a default parameter will simplify code a little bit. But I'm not sure it is that important and would leave it for you to decide. :) ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From sspitsyn at openjdk.java.net Mon Oct 25 23:13:14 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Mon, 25 Oct 2021 23:13:14 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: Message-ID: On Mon, 25 Oct 2021 22:49:39 GMT, Coleen Phillimore wrote: >> The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. >> I tested this change with that patch though and it passes tier1-6, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add parameter to release_C_heap_structures. Coleen, It looks good to me. Thanks, Serguei src/hotspot/share/oops/instanceKlass.cpp line 2693: > 2691: Klass::release_C_heap_structures(); > 2692: if (release_constant_pool) { > 2693: constants()->release_C_heap_structures(); Just wanted to note that the order of de-allocations has been changed with this fix. It should not be important though. To be fully safe this deallocation can be moved to the end of `release_C_heap_structures`. ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6105 From sspitsyn at openjdk.java.net Mon Oct 25 23:24:11 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Mon, 25 Oct 2021 23:24:11 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> Message-ID: <7Xnvst8kX4nfCI_iHxLYpd7t2cZMR2qd6SJaXEeZd0M=.f1f7f120-05f4-4750-8907-79af6018f59f@github.com> On Mon, 25 Oct 2021 22:57:35 GMT, Serguei Spitsyn wrote: >> That is one option but I was thinking perhaps remove `constants()->release_C_heap_structures()` from `instanceKlass::release_C_heap_structures()` and have the callers that should remove the CP call `ik->constants()->release_C_heap_structures()` directly? >> Or perhaps just change this comment: >> >> >> // Release C heap allocated data that this points to, which includes >> // reference counting symbol names. >> release_C_heap_structures_internal(); >> >> >> to >> >> ` // reference counting symbol names, but excludes the constant pool.` > > Yes, I think that combining release_C_heap_structures and release_C_heap_structures_internal into single release_C_heap_structures with a default parameter will simplify code a little bit. But I'm not sure it is that important and would leave it for you to decide. :) I did not see David's last comment when I added mine. In fact, I like David's suggestion to call `ik->constants()->release_C_heap_structures()` directly. It looks simpler to me.
------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From dholmes at openjdk.java.net Tue Oct 26 00:59:11 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 26 Oct 2021 00:59:11 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: <11KqFpdd_IyifucriPTm5vZhHoicVscbTT2JMcRwKu8=.97635141-aa7b-430b-9fbb-eda3ec99f395@github.com> On Mon, 25 Oct 2021 08:42:20 GMT, Stefan Karlsson wrote: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. :) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. > * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. 
> * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal Looks good! Thanks for taking this on. David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6100 From duke at openjdk.java.net Tue Oct 26 02:06:32 2021 From: duke at openjdk.java.net (Fei Gao) Date: Tue, 26 Oct 2021 02:06:32 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates Message-ID: for(int i = 0; i < LENGTH; i++) { c[i] = a[i] + 2; } For the case showed above, after superword optimization with SVE, without the patch, the vector add operation always has 2 z-reg inputs, like: mov z16.s, #2 add z17.s, z17.s, z16.s Considering sve has supported basic binary operations with immediate, this pattern could be further optimized to: add z16.s, z16.s, #2 To implement it, we added some new match rules and assembler rules in the aarch64 backend. We also made some extensions on immediate types and functions to keep backward compatible. With the patch, only these binary integer vector operations, +(add), -(sub), &(and), |(orr), and ^(eor) with immediate are supported for the optimization. Other vector operations are not supported currently. Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 CPU, no new failure. There is no obvious performance uplift but it can help remove one redundant mov instruction. 
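As a rough illustration of what "encodable immediate" means for the add/sub case, the sketch below assumes the SVE ADD/SUB (immediate) form, which takes an unsigned 8-bit value that may optionally be shifted left by 8 for element sizes of 16 bits and wider. The function name is hypothetical and is not the helper added by this patch; the logical operations (and, orr, eor) use a different bitmask-style immediate encoding and are not covered here.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// elem_bits is the vector element size in bits: 8, 16, 32 or 64.
inline bool sve_addsub_imm_encodable(uint64_t imm, unsigned elem_bits) {
  if (imm <= 0xFF) {
    return true;   // fits in the unshifted 8-bit field
  }
  if (elem_bits == 8) {
    return false;  // byte elements have no shifted form
  }
  // Shifted form: an 8-bit value shifted left by 8, that is, a multiple
  // of 256 no larger than 65280.
  return (imm & 0xFF) == 0 && (imm >> 8) <= 0xFF;
}
```

Under this rule the constant 2 from the loop above is encodable, which is what allows the separate mov to be dropped; a constant such as 257 is not, and would still need to be materialized in a register first.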
------------- Commit messages: - 8274179: AArch64: Support SVE operations with encodable immediates Changes: https://git.openjdk.java.net/jdk/pull/6115/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8274179 Stats: 1496 lines in 14 files changed: 1312 ins; 50 del; 134 mod Patch: https://git.openjdk.java.net/jdk/pull/6115.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115 PR: https://git.openjdk.java.net/jdk/pull/6115 From dholmes at openjdk.java.net Tue Oct 26 04:32:11 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 26 Oct 2021 04:32:11 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 01:58:40 GMT, Fei Gao wrote: > for(int i = 0; i < LENGTH; i++) { > c[i] = a[i] + 2; > } > > For the case showed above, after superword optimization with SVE, > without the patch, the vector add operation always has 2 z-reg inputs, > like: > mov z16.s, #2 > add z17.s, z17.s, z16.s > > Considering sve has supported basic binary operations with immediate, > this pattern could be further optimized to: > add z16.s, z16.s, #2 > > To implement it, we added some new match rules and assembler rules in > the aarch64 backend. We also made some extensions on immediate types > and functions to keep backward compatible. > > With the patch, only these binary integer vector operations, +(add), > -(sub), &(and), |(orr), and ^(eor) with immediate are supported for > the optimization. Other vector operations are not supported currently. > > Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 > CPU, no new failure. > > There is no obvious performance uplift but it can help remove one > redundant mov instruction. @fg1417 I'm running this through our internal testing as we have had problems with other recent SVE changes. 
Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From duke at openjdk.java.net Tue Oct 26 05:51:09 2021 From: duke at openjdk.java.net (Fei Gao) Date: Tue, 26 Oct 2021 05:51:09 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 04:29:09 GMT, David Holmes wrote: > @fg1417 I'm running this through our internal testing as we have had problems with other recent SVE changes. > > Thanks, David Thanks, @dholmes-ora . If you have any problems, please contact me. ------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From dholmes at openjdk.java.net Tue Oct 26 07:04:11 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 26 Oct 2021 07:04:11 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 01:58:40 GMT, Fei Gao wrote: > for(int i = 0; i < LENGTH; i++) { > c[i] = a[i] + 2; > } > > For the case showed above, after superword optimization with SVE, > without the patch, the vector add operation always has 2 z-reg inputs, > like: > mov z16.s, #2 > add z17.s, z17.s, z16.s > > Considering sve has supported basic binary operations with immediate, > this pattern could be further optimized to: > add z16.s, z16.s, #2 > > To implement it, we added some new match rules and assembler rules in > the aarch64 backend. We also made some extensions on immediate types > and functions to keep backward compatible. > > With the patch, only these binary integer vector operations, +(add), > -(sub), &(and), |(orr), and ^(eor) with immediate are supported for > the optimization. Other vector operations are not supported currently. > > Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 > CPU, no new failure. > > There is no obvious performance uplift but it can help remove one > redundant mov instruction. Test run was successful - tiers 1-3. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From duke at openjdk.java.net Tue Oct 26 08:04:10 2021 From: duke at openjdk.java.net (Fei Gao) Date: Tue, 26 Oct 2021 08:04:10 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 07:00:57 GMT, David Holmes wrote: > Test run was successful - tiers 1-3. Thanks for your effort :) ------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From magnus.ihse.bursie at oracle.com Tue Oct 26 08:59:28 2021 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 26 Oct 2021 10:59:28 +0200 Subject: RFR: 8275717: Reimplement STATIC_ASSERT to use static_assert In-Reply-To: References: Message-ID: <96bde719-9c4c-b41c-8acb-b893558ccb32@oracle.com> On 2021-10-22 17:44, Kim Barrett wrote: > The "unused typedef" warnings suggest a build with `-Wunused-local-typedefs` enabled. I thought we were not enabling that (because it's not actually useful), but don't see such disabling and it's enabled by `-Wall`. So why aren't we getting those warnings in any function where we use STATIC_ASSERT, for any gcc-based platform? That's an interesting question. I see we have a global disable for "unused" on both gcc and clang. (I'm not even sure why..?) I think that is a shorthand for several unused-* warnings. Maybe gcc and clang interpret this differently. I agree that most of the "unused" warnings are not useful, but some definitely help to keep the code base free of dead code.
/Magnus From aph at openjdk.java.net Tue Oct 26 09:42:18 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 26 Oct 2021 09:42:18 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 01:58:40 GMT, Fei Gao wrote: > for(int i = 0; i < LENGTH; i++) { > c[i] = a[i] + 2; > } > > For the case showed above, after superword optimization with SVE, > without the patch, the vector add operation always has 2 z-reg inputs, > like: > mov z16.s, #2 > add z17.s, z17.s, z16.s > > Considering sve has supported basic binary operations with immediate, > this pattern could be further optimized to: > add z16.s, z16.s, #2 > > To implement it, we added some new match rules and assembler rules in > the aarch64 backend. We also made some extensions on immediate types > and functions to keep backward compatible. > > With the patch, only these binary integer vector operations, +(add), > -(sub), &(and), |(orr), and ^(eor) with immediate are supported for > the optimization. Other vector operations are not supported currently. > > Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 > CPU, no new failure. > > There is no obvious performance uplift but it can help remove one > redundant mov instruction. src/hotspot/cpu/aarch64/aarch64.ad line 2669: > 2667: if (!imm_node->is_Con() || > 2668: !(imm_node->bottom_type()->isa_int() || imm_node->bottom_type()->isa_long())) { > 2669: return false; I'd split this up a bit. Please consider if (!imm_node->is_Con()) return false; const Type* t = imm_node->bottom_type(); if (!
(t->isa_int() || t->isa_long())) return false; ------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From coleenp at openjdk.java.net Tue Oct 26 11:37:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 26 Oct 2021 11:37:11 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v2] In-Reply-To: References: Message-ID: On Mon, 25 Oct 2021 23:09:03 GMT, Serguei Spitsyn wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add parameter to release_C_heap_structures. > > src/hotspot/share/oops/instanceKlass.cpp line 2693: > >> 2691: Klass::release_C_heap_structures(); >> 2692: if (release_constant_pool) { >> 2693: constants()->release_C_heap_structures(); > > Just wanted to note that the order of de-allocations has been changed with this fix. It should not be important though. To be fully safe this deallocation can be moved to the end of `release_C_heap_structures`. It shouldn't make a difference but to be perfectly conservative, I'll move it to the end of the function. Thank you for the suggestion. ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From aph at openjdk.java.net Tue Oct 26 11:40:10 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 26 Oct 2021 11:40:10 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 01:58:40 GMT, Fei Gao wrote: > for(int i = 0; i < LENGTH; i++) { > c[i] = a[i] + 2; > } > > For the case showed above, after superword optimization with SVE, > without the patch, the vector add operation always has 2 z-reg inputs, > like: > mov z16.s, #2 > add z17.s, z17.s, z16.s > > Considering sve has supported basic binary operations with immediate, > this pattern could be further optimized to: > add z16.s, z16.s, #2 > > To implement it, we added some new match rules and assembler rules in > the aarch64 backend.
We also made some extensions on immediate types > and functions to keep backward compatible. > > With the patch, only these binary integer vector operations, +(add), > -(sub), &(and), |(orr), and ^(eor) with immediate are supported for > the optimization. Other vector operations are not supported currently. > > Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 > CPU, no new failure. > > There is no obvious performance uplift but it can help remove one > redundant mov instruction. I'd like you to split this patch into two parts, please. First, please use the new functions such as `Assembler::operand_valid_for_logical_immediate(bool is32, uint64_t imm)` only for SVE, leaving the existing logic in `Assembler` entirely untouched. This will cause some duplication, but that's OK. We can review changes to merge functionality in a separate patch. This will be much easier. ------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From coleenp at openjdk.java.net Tue Oct 26 11:43:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 26 Oct 2021 11:43:42 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v3] In-Reply-To: References: Message-ID: > The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. > I tested this change with that patch though and it passes tier1-6, Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Move constant pool release_C_heap_structures to the end. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6105/files - new: https://git.openjdk.java.net/jdk/pull/6105/files/22b38b03..4b52907f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6105&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6105&range=01-02 Stats: 7 lines in 1 file changed: 4 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6105/head:pull/6105 PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Tue Oct 26 11:47:18 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 26 Oct 2021 11:47:18 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v3] In-Reply-To: <7Xnvst8kX4nfCI_iHxLYpd7t2cZMR2qd6SJaXEeZd0M=.f1f7f120-05f4-4750-8907-79af6018f59f@github.com> References: <8vVneUUEGRU697sJgAYHWcqguc_1cZnxtlchPr8npiE=.d4099019-f0da-428e-9293-a4fd1a6237e4@github.com> <1UcQ7ZQtc-WXG46d96s1oxGWTkdot8jdy3bUEVgQAYw=.a3356dab-3c11-478e-a11d-9000064f5aeb@github.com> <7Xnvst8kX4nfCI_iHxLYpd7t2cZMR2qd6SJaXEeZd0M=.f1f7f120-05f4-4750-8907-79af6018f59f@github.com> Message-ID: On Mon, 25 Oct 2021 23:21:02 GMT, Serguei Spitsyn wrote: >> Yes, I think that combining release_C_heap_structures and release_C_heap_structures_internal into single release_C_heap_structures with a default parameter will simplify code a little bit. But I'm not sure it is that important and would leave it for you to decide. :) > > I did not see the last comment from David when added mine. In fact, I like the David's suggestion to call `ik->constants()->release_C_heap_structures()` directly. It looks more simple to me. I'd prefer that the callers do NOT release the constant pool CHeap structures directly. The main caller, during class unloading has the InstanceKlass as a handle to delete and it would be really easy to miss the constant pool case. The constant pool is contained by the InstanceKlass in this case. 
Deallocate_contents is the only exception for the exceptional case where we have added the InstanceKlass to the deallocate list. ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From sspitsyn at openjdk.java.net Tue Oct 26 16:54:10 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Tue, 26 Oct 2021 16:54:10 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v3] In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 11:43:42 GMT, Coleen Phillimore wrote: >> The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. >> I tested this change with that patch though and it passes tier1-6, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move constant pool release_C_heap_structures to the end. Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Tue Oct 26 19:51:27 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 26 Oct 2021 19:51:27 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block Message-ID: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. Tested with tier1-6. 
------------- Commit messages: - 8275917: Some locks shouldn't allow_vm_block Changes: https://git.openjdk.java.net/jdk/pull/6123/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6123&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275917 Stats: 23 lines in 3 files changed: 1 ins; 0 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/6123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6123/head:pull/6123 PR: https://git.openjdk.java.net/jdk/pull/6123 From dholmes at openjdk.java.net Wed Oct 27 00:29:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 27 Oct 2021 00:29:08 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v3] In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 11:43:42 GMT, Coleen Phillimore wrote: >> The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. >> I tested this change with that patch though and it passes tier1-6, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move constant pool release_C_heap_structures to the end. Updates look good. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6105 From dholmes at openjdk.java.net Wed Oct 27 01:40:07 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 27 Oct 2021 01:40:07 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block In-Reply-To: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: On Tue, 26 Oct 2021 19:43:46 GMT, Coleen Phillimore wrote: > There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. > > Tested with tier1-6. Hi Coleen, So, IIUC the `allow_vm_block` flag leads to two checks: 1. If the VMThread tries to acquire any lock, then `allow_vm_block` must be true else it is a fatal error. (I have to keep reminding myself that the "block" in `allow_vm_block` is wrong, it should really just be "lock" as we do not check only when there is contention!) 2. For a non-safepoint lock, `allow_vm_block` must be true. (asserted at construction time) So from that we can deduce that if a safepoint lock is never taken by the VMThread, then `allow_vm_block` doesn't need to be true and so should be set to false. And I can see that change, for example, with the `BeforeExit_lock`. So in looking at this PR for every safepoint lock where `allow_vm_block` has been changed from true to false, we have to confirm that the VMThread will never take that lock. From a reviewing perspective I'll rely on your analysis and testing on that. (I did look at a couple of locks out of interest :) ). Regarding the change to the `ClassInitError_lock` ... 
it is valid because when a JavaThread holds that lock it doesn't execute code that can safepoint; so the VMThread would always find the lock free at a safepoint. So `allow_vm_block` being true works here. But instead you skip locking at the safepoint altogether, as done with the other locks, and so can now make `allow_vm_block` false. Really, all code that takes such locks by a JavaThread should have an NSV covering the critical section to ensure it is actually safe for the VMThread to skip locking at a safepoint. BTW it also seems to me that the `Terminator_lock`, which can be taken by a NJT (WatcherThread) but not the VMThread, can also have `allow_vm_block` set to false. (I just happened to pick that one to do some code analysis and noticed the VMThread never acquires it.) Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6123 From mchung at openjdk.java.net Wed Oct 27 02:32:31 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Wed, 27 Oct 2021 02:32:31 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem Message-ID: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such a check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path.
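The proposal above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: the class `LibraryPathSketch` and the method `candidatePath` are hypothetical, the flag name `loadLibraryOnlyIfPresent` is borrowed from the review comments on this PR, and the fallback to the absolute path on `IOException` is a simplification chosen only to keep the example self-contained.

```java
import java.io.File;
import java.io.IOException;

public class LibraryPathSketch {

    // Hypothetical helper: returns the path that would be handed to the
    // native loader, or null when this candidate should be skipped.
    // When loadLibraryOnlyIfPresent is false (the macOS >= 11 case in the
    // proposal), the existence check is skipped and the path is passed on.
    static String candidatePath(File file, boolean loadLibraryOnlyIfPresent) {
        if (loadLibraryOnlyIfPresent && !file.exists()) {
            return null;
        }
        try {
            return file.getCanonicalPath();
        } catch (IOException e) {
            // Assumption for this sketch only: fall back to the absolute path.
            return file.getAbsolutePath();
        }
    }

    public static void main(String[] args) {
        File missing = new File("no-such-libexample.dylib");
        // With the existence check, a missing file is skipped.
        System.out.println(candidatePath(missing, true) == null);
        // Without it, the path is still produced and left to the loader.
        System.out.println(candidatePath(missing, false) != null);
    }
}
```

With this shape, the decision about whether a missing file is an error moves from the Java side to the native loader, which is the behaviour the proposal describes for Big Sur.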
------------- Commit messages: - trim whitespaces - add copyright header - fix typo - JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem Changes: https://git.openjdk.java.net/jdk/pull/6127/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6127&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275703 Stats: 198 lines in 9 files changed: 170 ins; 2 del; 26 mod Patch: https://git.openjdk.java.net/jdk/pull/6127.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6127/head:pull/6127 PR: https://git.openjdk.java.net/jdk/pull/6127 From dholmes at openjdk.java.net Wed Oct 27 02:44:13 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 27 Oct 2021 02:44:13 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: On Tue, 26 Oct 2021 22:51:29 GMT, Mandy Chung wrote: > On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such a check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path. Hi Mandy, Hotspot changes are fine. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/6127 From jpai at openjdk.java.net Wed Oct 27 02:54:08 2021 From: jpai at openjdk.java.net (Jaikiran Pai) Date: Wed, 27 Oct 2021 02:54:08 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: On Tue, 26 Oct 2021 22:51:29 GMT, Mandy Chung wrote: > On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such a check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path. src/java.base/macosx/classes/jdk/internal/loader/ClassLoaderHelper.java line 44: > 42: } catch (NumberFormatException e) {} > 43: } > 44: hasDynamicLoaderCache = major >= 11; Hello Mandy, I'm not too familiar with macOS versioning schemes. However, in this specific logic, if the `os.version` value doesn't contain a dot character, then `major` is initialized to `11`, so `hasDynamicLoaderCache` evaluates to `true`. That would mean that if `os.version` is (for example) `10`, then `hasDynamicLoaderCache` will be incorrectly set to `true` here, wouldn't it?
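The concern above can be made concrete with a small sketch. It is not the code under review: the class `OsVersionSketch` and both methods are hypothetical, and the fallback of 11 only mirrors the initial value described in the comment. The sketch parses the leading component of the version string whether or not a dot is present, so a bare `10` is not treated as Big Sur or later.

```java
public class OsVersionSketch {

    // Hypothetical helper: returns the leading numeric component of a
    // version string such as "11.6" or "10", or `fallback` when that
    // component cannot be parsed.
    static int majorVersion(String osVersion, int fallback) {
        if (osVersion == null) {
            return fallback;
        }
        int dot = osVersion.indexOf('.');
        // Use the text before the first dot, or the whole string if no dot.
        String head = dot >= 0 ? osVersion.substring(0, dot) : osVersion;
        try {
            return Integer.parseInt(head.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Assumption taken from the quoted snippet: versions >= 11 use the
    // dynamic linker cache, and 11 is the default when parsing fails.
    static boolean hasDynamicLoaderCache(String osVersion) {
        return majorVersion(osVersion, 11) >= 11;
    }

    public static void main(String[] args) {
        System.out.println(hasDynamicLoaderCache("10"));      // false
        System.out.println(hasDynamicLoaderCache("10.15.7")); // false
        System.out.println(hasDynamicLoaderCache("11.6"));    // true
        System.out.println(hasDynamicLoaderCache("12"));      // true
    }
}
```

Whether a dot-less `os.version` can actually occur on macOS is a separate question that this sketch does not answer; it only shows one way to make the no-dot case behave the same as the dotted one.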
------------- PR: https://git.openjdk.java.net/jdk/pull/6127 From jpai at openjdk.java.net Wed Oct 27 03:36:11 2021 From: jpai at openjdk.java.net (Jaikiran Pai) Date: Wed, 27 Oct 2021 03:36:11 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: On Tue, 26 Oct 2021 22:51:29 GMT, Mandy Chung wrote: > On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such a check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path. src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 166: > 164: return null; > 165: } > 166: return file.getCanonicalPath(); I think a `!file.exists()` check would still be needed here to handle the case when `loadLibraryOnlyIfPresent` is `false`, isn't it? ------------- PR: https://git.openjdk.java.net/jdk/pull/6127 From jpai at openjdk.java.net Wed Oct 27 03:42:08 2021 From: jpai at openjdk.java.net (Jaikiran Pai) Date: Wed, 27 Oct 2021 03:42:08 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: <5BAH1dAVPdl7UnZRnSpe_7xOaX8Whsu_E0drRbcia-0=.449e99a7-ac53-4ffa-929b-b3ddc21c2f31@github.com> On Wed, 27 Oct 2021 03:31:00 GMT, Jaikiran Pai wrote: >> On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem.
`NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path. > > src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 166: > >> 164: return null; >> 165: } >> 166: return file.getCanonicalPath(); > > I think a `!file.exists()` check would still be needed here to handle the case when `loadLibraryOnlyIfPresent` is `false`, isn't it? Ignore this previous comment. I read your JBS comment just now which says: > The JDK side determines the path of the given library name and checks if it exists before calling JVM_LoadLibrary. A potential fix for MacOS might be to skip the file presence check and pass it to JVM. So, I think, the whole point of this change in this block is to skip the file existence check and return back a file path. ------------- PR: https://git.openjdk.java.net/jdk/pull/6127 From Divino.Cesar at microsoft.com Wed Oct 27 06:20:53 2021 From: Divino.Cesar at microsoft.com (Cesar Soares Lucas) Date: Wed, 27 Oct 2021 06:20:53 +0000 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> Message-ID: Hi Tobias, Thank you for the feedback. Very much appreciated. > I think the most critical issue is that current EA does not properly handle even "simple" control > flow merges (including the cases you described in the "Re-use Stack Slots" section). I.e., cases > where the object does not escape on any paths but control flow is too complicated for EA to prove > that. 
Often, it's not even multiple objects of the same type that are merged but just a single > object that is used in different paths. Right. I was suspecting this to be the most critical issue indeed. However, I didn't know there was a case where "... the object does not escape on any paths but control flow is too complicated for EA to prove that." Is this an issue tracked in JBS or perhaps you can show me an example where this happens? > Maybe the inability to handle `null` could be described/handled as a separate issue. IIRC both EA > and scalarization do not support `null` being merged with a non-escaping object. Got it. I stumbled on this recently; at least I think this is what I found. This seems to be a big pain point. > Also, I think there are (or will be) cases when EA is able to prove that an object is non-escaping > but scalarization still fails. For example, if not all loads can be folded or there are other uses > left (see `PhaseMacroExpand::can_eliminate_allocation`). Since scalarization is the main benefit of > EA, improvements to EA need to be made to scalarization as well. Got it, makes perfect sense to me. So far, this didn't show up in my "quantitative analysis" as a big problem because most of the objects are considered "GlobalEscape". Regards, Cesar ________________________________ From: Tobias Hartmann Sent: October 22, 2021 1:12 AM To: Cesar Soares Lucas ; Ron Pressler Cc: John Rose ; Mark Reinhold ; hotspot-dev at openjdk.java.net ; Brian Stafford ; Martijn Verburg Subject: Re: [External] : Re: RFC - Improving C2 Escape Analysis Hi Cesar, On 22.10.21 01:07, Cesar Soares Lucas wrote: > We decided to direct our immediate efforts to improving EA for heap objects. > AFAIU inline types will bring a lot of performance improvements but that will > require the programmer to rewrite part(s) of the application - please correct > if I'm wrong here. 
We are investigating how we can improve the performance of > applications that aren't updated to use inline types. That makes perfect sense to me. > About the write-up, may I ask you what do you think is the most critical > issue raised there? I think the most critical issue is that current EA does not properly handle even "simple" control flow merges (including the cases you described in the "Re-use Stack Slots" section). I.e., cases where the object does not escape on any paths but control flow is too complicated for EA to prove that. Often, it's not even multiple objects of the same type that are merged but just a single object that is used in different paths. > Or perhaps there is something else that we missed? Maybe the inability to handle `null` could be described/handled as a separate issue. IIRC both EA and scalarization do not support `null` being merged with a non-escaping object. Also, I think there are (or will be) cases when EA is able to prove that an object is non-escaping but scalarization still fails. For example, if not all loads can be folded or there are other uses left (see `PhaseMacroExpand::can_eliminate_allocation`). Since scalarization is the main benefit of EA, improvements to EA need to be made to scalarization as well. Best regards, Tobias > ---------------------------------------------------------------------------------------------------- > *From:* Ron Pressler > *Sent:* October 19, 2021 6:39 AM > *To:* Tobias Hartmann > *Cc:* Cesar Soares Lucas ; John Rose ; Mark > Reinhold ; hotspot-dev at openjdk.java.net ; > Brian Stafford ; Martijn Verburg > *Subject:* Re: RFC - Improving C2 Escape Analysis >
> > >> On 19 Oct 2021, at 14:17, Tobias Hartmann wrote: >> >> >> That said, support for stack/thread local allocation would be something that we could potentially >> use in Valhalla to avoid buffering on the heap. It's difficult though because in most cases we use >> buffering when storing to a non-flattened field that escapes to other threads. >> > > > Just a reminder, stacks can move in memory (virtual threads), so great care must be > taken with pointers pointing into the stack. As a general rule, that's a no-no. > > - Ron From alanb at openjdk.java.net Wed Oct 27 09:54:09 2021 From: alanb at openjdk.java.net (Alan Bateman) Date: Wed, 27 Oct 2021 09:54:09 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: <7H9sfqoYxuFgFvRWHuB8yDkJmsTgPg2XTBLnAb3Xt_s=.5f584e17-d492-414c-ba47-d4d1ca583291@github.com> On Tue, 26 Oct 2021 22:51:29 GMT, Mandy Chung wrote: > On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such a check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path. The change is unfortunate but looks okay. ------------- Marked as reviewed by alanb (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/6127 From tobias.hartmann at oracle.com Wed Oct 27 10:04:36 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 Oct 2021 12:04:36 +0200 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> Message-ID: <0f30507c-e0f0-c380-568b-ac441611e116@oracle.com> Hi Cesar, On 27.10.21 08:20, Cesar Soares Lucas wrote: > Right. I was suspecting this to be the most critical issue indeed. However, I > didn't know there was a case where "... the object does not escape on any paths > but control flow is too complicated for EA to prove that." Is this an issue > tracked in JBS or perhaps you can show me an example where this happens? Sure, here are four examples of EA and/or scalarization failing due to complicated control/data flow: https://cr.openjdk.java.net/~thartmann/EA_examples All examples would completely fold with inline types (Valhalla). I'm not sure if these issues are tracked by JBS issues but there's most likely an overlap with some of the issues you already described. Best regards, Tobias From stefank at openjdk.java.net Wed Oct 27 11:34:26 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 27 Oct 2021 11:34:26 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned [v2] In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. 
The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. :) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. > * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. > * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Merge branch 'master' into 4718400_use_size_t_for_object_sizes - Use size_t for Object sizes ------------- Changes: https://git.openjdk.java.net/jdk/pull/6100/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6100&range=01 Stats: 113 lines in 52 files changed: 3 ins; 18 del; 92 mod Patch: https://git.openjdk.java.net/jdk/pull/6100.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6100/head:pull/6100 PR: https://git.openjdk.java.net/jdk/pull/6100 From stefank at openjdk.java.net Wed Oct 27 11:34:27 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 27 Oct 2021 11:34:27 GMT Subject: RFR: 4718400: Many quantities are held as signed that should be unsigned In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: On Mon, 25 Oct 2021 08:42:20 GMT, Stefan Karlsson wrote: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. 
:) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. > * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. > * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal Thanks for reviewing! I've also verified that this passes with tier4-7 ------------- PR: https://git.openjdk.java.net/jdk/pull/6100 From coleenp at openjdk.java.net Wed Oct 27 12:12:18 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 27 Oct 2021 12:12:18 GMT Subject: Integrated: 8275800: Redefinition leaks MethodData::_extra_data_lock In-Reply-To: References: Message-ID: On Mon, 25 Oct 2021 12:51:32 GMT, Coleen Phillimore wrote: > The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. > I tested this change with that patch though and it passes tier1-6, This pull request has now been integrated. 
Changeset: 40606021 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/40606021ee6b7d18674e36b3f6249f1ca8a7647e Stats: 35 lines in 5 files changed: 11 ins; 16 del; 8 mod 8275800: Redefinition leaks MethodData::_extra_data_lock Reviewed-by: sspitsyn, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From coleenp at openjdk.java.net Wed Oct 27 12:12:18 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 27 Oct 2021 12:12:18 GMT Subject: RFR: 8275800: Redefinition leaks MethodData::_extra_data_lock [v3] In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 11:43:42 GMT, Coleen Phillimore wrote: >> The Mutex destructor isn't called for the MDO lock. We found this leak with crashes for the patch to print all mutex locks on error, which is now withdrawn. It was easily reproduced with that patch. It is not easily reproduced otherwise. >> I tested this change with that patch though and it passes tier1-6, > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move constant pool release_C_heap_structures to the end. Thanks David and Serguei! ------------- PR: https://git.openjdk.java.net/jdk/pull/6105 From shade at openjdk.java.net Wed Oct 27 12:20:40 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 27 Oct 2021 12:20:40 GMT Subject: RFR: 8276054: JMH benchmarks for Fences Message-ID: While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandle`, as this captures the real-world uses too. Sample output for `make test TEST=micro:vm.fences` on `x86_64`:

Benchmark                      Mode  Cnt  Score   Error  Units
Multiple.acquire               avgt    9  0.408 ± 0.005  ns/op
Multiple.full                  avgt    9  4.694 ± 0.003  ns/op
Multiple.plain                 avgt    9  0.407 ± 0.004  ns/op
Multiple.release               avgt    9  0.407 ± 0.003  ns/op
Multiple.storeStore            avgt    9  0.409 ± 0.006  ns/op
MultipleWithLoads.acquire      avgt    9  4.600 ± 0.002  ns/op
MultipleWithLoads.full         avgt    9  8.322 ± 0.003  ns/op
MultipleWithLoads.plain        avgt    9  4.328 ± 0.001  ns/op
MultipleWithLoads.release      avgt    9  4.600 ± 0.002  ns/op
MultipleWithLoads.storeStore   avgt    9  4.600 ± 0.002  ns/op
MultipleWithStores.acquire     avgt    9  0.812 ± 0.001  ns/op
MultipleWithStores.full        avgt    9  5.291 ± 0.001  ns/op
MultipleWithStores.plain       avgt    9  0.812 ± 0.001  ns/op
MultipleWithStores.release     avgt    9  0.812 ± 0.001  ns/op
MultipleWithStores.storeStore  avgt    9  0.812 ± 0.001  ns/op
SafePublishing.plain           avgt    9  6.378 ± 0.011  ns/op
SafePublishing.release         avgt    9  6.368 ± 0.010  ns/op
SafePublishing.storeStore      avgt    9  6.376 ± 0.013  ns/op
Single.acquireFence            avgt    9  0.410 ± 0.009  ns/op
Single.fullFence               avgt    9  4.689 ± 0.002  ns/op
Single.plain                   avgt    9  0.409 ± 0.010  ns/op
Single.releaseFence            avgt    9  0.410 ± 0.006  ns/op
Single.storeStoreFence         avgt    9  0.410 ± 0.009  ns/op

------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/6138/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6138&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276054 Stats: 415 lines in 5 files changed: 415 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6138.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6138/head:pull/6138 PR: https://git.openjdk.java.net/jdk/pull/6138 From coleenp at openjdk.java.net Wed Oct 27 12:37:13 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 27 Oct 2021 12:37:13 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block In-Reply-To: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: On Tue, 26 Oct 2021 19:43:46 GMT, Coleen Phillimore wrote: > There were a few safepoint checking locks that passed allow_vm_block as true,
that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. > > Tested with tier1-6. David, everything you wrote above is accurate. You might be right about the Terminator_lock. My logging and testing only printed for locks taken by NJT in a safepoint and it appeared in this list, but it may not have been the VMThread. For reasons that I put in the CR, I can't expand the check for allow_vm_block to include this condition, only the VMThread. Another pass of logging just for the VMThread shows that the Terminator_lock isn't taken. I'll test with changing this one also. This is my new set if I change logging to print for VMThread only. ExpandHeap_lock G1OldGCCount_lock Notify_lock OopMapCacheAlloc_lock Threads_lock VMOperation_lock ------------- PR: https://git.openjdk.java.net/jdk/pull/6123 From stefank at openjdk.java.net Wed Oct 27 13:28:20 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 27 Oct 2021 13:28:20 GMT Subject: Integrated: 4718400: Many quantities are held as signed that should be unsigned In-Reply-To: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> References: <8w0hrPXhr8NHtRq-phr7oH7pa-X8xpJolTcrRxDrDf4=.4113324d-699f-410c-8db3-ef114d812c03@github.com> Message-ID: On Mon, 25 Oct 2021 08:42:20 GMT, Stefan Karlsson wrote: > The GC code tends to use the size_t type to describe sizes of memory and objects, while the oopDesc class and *Klasses use int. The unit of the object size is in "words" (8 bytes on 64-bits, 4 bytes on 32-bits), and int variables can currently hold the largest object size in words. However, when/if we start to experiment with larger objects, using int could be limiting. 
I propose that we convert oopDesc::size to return size_t to describe the "objects size"-in-words quantities, and update calling code to also use size_t. Note that there will still be places in the JVM that expects the size to fit in 4 bytes. I've listed a couple of them below. > > I'm reusing this old bug, which more or less covers what this patch does. It would probably be good to update the name of the bug if we decide this patch should be integrated, but I given the age of the RFE, I decided to open the PR with its current name. :) > > Comments on the patch: > * There are some JVMCI usages of 'long' (not jlong). It would be good to figure out why that is, and if that's something that should be cleaned out. > * JFR stores field offsets in bytes in 4 bytes variables. This could be a bug. I added an assert. > * java_lang_Class::set_oop_size/oop_size uses size_t but the stored field is only 4 bytes. Leaving this for future consideration. > * I left array object size calculation as-is for now. > > I've run this through tier1-3. I'll run more tiers after initial feedback on this proposal This pull request has now been integrated. Changeset: 93be099c Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/93be099ccb73c88532866ae6d0c288c12a592cc4 Stats: 113 lines in 52 files changed: 3 ins; 18 del; 92 mod 4718400: Many quantities are held as signed that should be unsigned Reviewed-by: coleenp, rbackman, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/6100 From shade at openjdk.java.net Wed Oct 27 14:46:21 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 27 Oct 2021 14:46:21 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence Message-ID: `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. 
Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. This node is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends.

Motivational performance difference on benchmarks from JDK-8276054 on ARM32:

Benchmark                      Mode  Cnt   Score   Error  Units
Multiple.plain                 avgt    3   2.669 ± 0.004  ns/op
Multiple.release               avgt    3  16.688 ± 0.057  ns/op
Multiple.storeStore            avgt    3  14.021 ± 0.144  ns/op // Better

MultipleWithLoads.plain        avgt    3   4.672 ± 0.053  ns/op
MultipleWithLoads.release      avgt    3  16.689 ± 0.044  ns/op
MultipleWithLoads.storeStore   avgt    3  14.012 ± 0.010  ns/op // Better

MultipleWithStores.plain       avgt    3  14.687 ± 0.009  ns/op
MultipleWithStores.release     avgt    3  45.393 ± 0.192  ns/op
MultipleWithStores.storeStore  avgt    3  38.048 ± 0.033  ns/op // Better

Publishing.plain               avgt    3  27.079 ± 0.201  ns/op
Publishing.release             avgt    3  27.088 ± 0.241  ns/op
Publishing.storeStore          avgt    3  27.009 ± 0.259  ns/op // Within error, hidden by allocation

Single.plain                   avgt    3   2.670 ± 0.002  ns/op
Single.releaseFence            avgt    3   6.675 ± 0.001  ns/op
Single.storeStoreFence         avgt    3   8.012 ± 0.027  ns/op // Worse, seems to be ARM32 implementation artifact

As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations:

Benchmark                      Mode  Cnt  Score   Error  Units
Multiple.plain                 avgt    3  0.406 ± 0.002  ns/op
Multiple.release               avgt    3  0.409 ± 0.018  ns/op
Multiple.storeStore            avgt    3  0.406 ± 0.001  ns/op

MultipleWithLoads.plain        avgt    3  4.328 ± 0.006  ns/op
MultipleWithLoads.release      avgt    3  4.600 ± 0.014  ns/op
MultipleWithLoads.storeStore   avgt    3  4.602 ± 0.006  ns/op

MultipleWithStores.plain       avgt    3  0.812 ± 0.001  ns/op
MultipleWithStores.release     avgt    3  0.812 ± 0.002  ns/op
MultipleWithStores.storeStore  avgt    3  0.812 ± 0.002  ns/op

Publishing.plain               avgt    3  6.370 ± 0.059  ns/op
Publishing.release             avgt    3  6.358 ± 0.436  ns/op
Publishing.storeStore          avgt    3  6.367 ± 0.054  ns/op

Single.plain                   avgt    3  0.407 ± 0.039  ns/op
Single.releaseFence            avgt    3  0.406 ± 0.001  ns/op
Single.storeStoreFence         avgt    3  0.406 ± 0.001  ns/op

Additional testing:
 - [x] Linux x86_64 fastdebug `tier1`

-------------

Commit messages:
 - Formatting
 - Little cleanup
 - 8252990: Intrinsify Unsafe.storeStoreFence

Changes: https://git.openjdk.java.net/jdk/pull/6136/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6136&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8252990
Stats: 39 lines in 16 files changed: 33 ins; 5 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/6136.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6136/head:pull/6136

PR: https://git.openjdk.java.net/jdk/pull/6136

From pchilanomate at openjdk.java.net Wed Oct 27 15:19:15 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Wed, 27 Oct 2021 15:19:15 GMT
Subject: RFR: 8275917: Some locks shouldn't allow_vm_block
In-Reply-To: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com>
References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com>
Message-ID:

On Tue, 26 Oct 2021 19:43:46 GMT, Coleen Phillimore wrote:

> There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details.
> 
> Tested with tier1-6.

LGTM.

Thanks,
Patricio

src/hotspot/share/runtime/mutexLocker.cpp line 281:

> 279: 
> 280: def(Heap_lock , PaddedMonitor, safepoint); // Doesn't safepoint check during termination.
> 281: def(JfieldIdCreation_lock , PaddedMutex , safepoint); // jfieldID, Used in VM_Operation

Is "Used in VM_Operation" still true?

-------------

Marked as reviewed by pchilanomate (Committer).
PR: https://git.openjdk.java.net/jdk/pull/6123 From shade at openjdk.java.net Wed Oct 27 15:44:24 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 27 Oct 2021 15:44:24 GMT Subject: RFR: 8255286: Implement ParametersTypeData::print_data_on fully Message-ID: There is a little FIXME leftover since JDK 9 build warning fixes. This basically implements `ParametersTypeData::print_data_on` like `SpeculativeTrapData::print_data_on` is currently implemented. Additional testing: - [ ] Linux x86_64 fastdebug `tier1` ------------- Commit messages: - 8255286: Implement ParametersTypeData::print_data_on fully Changes: https://git.openjdk.java.net/jdk/pull/6143/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6143&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8255286 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6143.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6143/head:pull/6143 PR: https://git.openjdk.java.net/jdk/pull/6143 From mchung at openjdk.java.net Wed Oct 27 16:16:14 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Wed, 27 Oct 2021 16:16:14 GMT Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <5BAH1dAVPdl7UnZRnSpe_7xOaX8Whsu_E0drRbcia-0=.449e99a7-ac53-4ffa-929b-b3ddc21c2f31@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> <5BAH1dAVPdl7UnZRnSpe_7xOaX8Whsu_E0drRbcia-0=.449e99a7-ac53-4ffa-929b-b3ddc21c2f31@github.com> Message-ID: On Wed, 27 Oct 2021 03:39:39 GMT, Jaikiran Pai wrote: > So, I think, the whole point of this change in this block is to skip the file existence check and return back a file path. exactly. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/6127

From mchung at openjdk.java.net Wed Oct 27 16:28:16 2021
From: mchung at openjdk.java.net (Mandy Chung)
Date: Wed, 27 Oct 2021 16:28:16 GMT
Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem
In-Reply-To:
References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com>
Message-ID:

On Wed, 27 Oct 2021 02:51:24 GMT, Jaikiran Pai wrote:

>> On macOS 11.x, system libraries are loaded from dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path.
> 
> src/java.base/macosx/classes/jdk/internal/loader/ClassLoaderHelper.java line 44:
> 
>> 42:         } catch (NumberFormatException e) {}
>> 43:     }
>> 44:     hasDynamicLoaderCache = major >= 11;
> 
> Hello Mandy,
> I'm not too familiar with MacOS versioning schemes. However, in this specific logic, if the `os.version` value doesn't contain a dot character, then the `major` is initialized to `11`, which would then evaluate this `hasDynamicLoaderCache` to `true`. That would mean if the `os.version` is (for example) `10`, then `hasDynamicLoaderCache` will be incorrectly set to `true` here, isn't it?

macOS product version contains 3 parts: major, minor, and patch version. But it does not hurt to handle the case where no dot exists in the string.

    int major = 11;
    int i = osVersion.indexOf('.');
    try {
        major = Integer.parseInt(i < 0 ? osVersion : osVersion.substring(0, i));
    } catch (NumberFormatException e) {}

-------------

PR: https://git.openjdk.java.net/jdk/pull/6127

From redestad at openjdk.java.net Wed Oct 27 16:35:10 2021
From: redestad at openjdk.java.net (Claes Redestad)
Date: Wed, 27 Oct 2021 16:35:10 GMT
Subject: RFR: 8276054: JMH benchmarks for Fences
In-Reply-To:
References:
Message-ID:

On Wed, 27 Oct 2021 12:11:37 GMT, Aleksey Shipilev wrote:

> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too.
> 
> Sample output for `make test TEST=micro:vm.fences` on `x86_64`:
> 
> Benchmark                      Mode  Cnt  Score   Error  Units
> Multiple.acquire               avgt    9  0.408 ± 0.005  ns/op
> Multiple.full                  avgt    9  4.694 ± 0.003  ns/op
> Multiple.plain                 avgt    9  0.407 ± 0.004  ns/op
> Multiple.release               avgt    9  0.407 ± 0.003  ns/op
> Multiple.storeStore            avgt    9  0.409 ± 0.006  ns/op
> MultipleWithLoads.acquire      avgt    9  4.600 ± 0.002  ns/op
> MultipleWithLoads.full         avgt    9  8.322 ± 0.003  ns/op
> MultipleWithLoads.plain        avgt    9  4.328 ± 0.001  ns/op
> MultipleWithLoads.release      avgt    9  4.600 ± 0.002  ns/op
> MultipleWithLoads.storeStore   avgt    9  4.600 ± 0.002  ns/op
> MultipleWithStores.acquire     avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.full        avgt    9  5.291 ± 0.001  ns/op
> MultipleWithStores.plain       avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.release     avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.storeStore  avgt    9  0.812 ± 0.001  ns/op
> SafePublishing.plain           avgt    9  6.378 ± 0.011  ns/op
> SafePublishing.release         avgt    9  6.368 ± 0.010  ns/op
> SafePublishing.storeStore      avgt    9  6.376 ± 0.013  ns/op
> Single.acquireFence            avgt    9  0.410 ± 0.009  ns/op
> Single.fullFence               avgt    9  4.689 ± 0.002  ns/op
> Single.plain                   avgt    9  0.409 ± 0.010  ns/op
> Single.releaseFence            avgt    9  0.410 ± 0.006  ns/op
> Single.storeStoreFence         avgt    9  0.410 ± 0.009  ns/op

Benchmarks look good.
I'm surprised the build handles omitting the `org.openjdk.bench` package prefix. I don't particularly mind either way (and maybe removing this prefix would be an overall win), but I think we should keep it consistent for now and use `package org.openjdk.bench...`. ------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Wed Oct 27 16:43:53 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 27 Oct 2021 16:43:53 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v2] In-Reply-To: References: Message-ID: <_qNpquRxIFzLujEADaDOxFdLU7xuCgQSe79yvhbpYRk=.f755a276-e1fb-4e64-bfda-5e4aec623df7@github.com> > While working on JDK-8252990, I realized there are no microbenchmarks for Fences. I would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. > > Sample output for `make test TEST=micro:vm.fences` on `x86_64`: > > > Benchmark Mode Cnt Score Error Units > Multiple.acquire avgt 9 0.408 ? 0.005 ns/op > Multiple.full avgt 9 4.694 ? 0.003 ns/op > Multiple.plain avgt 9 0.407 ? 0.004 ns/op > Multiple.release avgt 9 0.407 ? 0.003 ns/op > Multiple.storeStore avgt 9 0.409 ? 0.006 ns/op > MultipleWithLoads.acquire avgt 9 4.600 ? 0.002 ns/op > MultipleWithLoads.full avgt 9 8.322 ? 0.003 ns/op > MultipleWithLoads.plain avgt 9 4.328 ? 0.001 ns/op > MultipleWithLoads.release avgt 9 4.600 ? 0.002 ns/op > MultipleWithLoads.storeStore avgt 9 4.600 ? 0.002 ns/op > MultipleWithStores.acquire avgt 9 0.812 ? 0.001 ns/op > MultipleWithStores.full avgt 9 5.291 ? 0.001 ns/op > MultipleWithStores.plain avgt 9 0.812 ? 0.001 ns/op > MultipleWithStores.release avgt 9 0.812 ? 0.001 ns/op > MultipleWithStores.storeStore avgt 9 0.812 ? 0.001 ns/op > SafePublishing.plain avgt 9 6.378 ? 0.011 ns/op > SafePublishing.release avgt 9 6.368 ? 0.010 ns/op > SafePublishing.storeStore avgt 9 6.376 ? 
0.013 ns/op > Single.acquireFence avgt 9 0.410 ? 0.009 ns/op > Single.fullFence avgt 9 4.689 ? 0.002 ns/op > Single.plain avgt 9 0.409 ? 0.010 ns/op > Single.releaseFence avgt 9 0.410 ? 0.006 ns/op > Single.storeStoreFence avgt 9 0.410 ? 0.009 ns/op Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Proper package name ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6138/files - new: https://git.openjdk.java.net/jdk/pull/6138/files/183e70ac..c1e7905d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6138&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6138&range=00-01 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6138.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6138/head:pull/6138 PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Wed Oct 27 16:43:55 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 27 Oct 2021 16:43:55 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v2] In-Reply-To: References: Message-ID: <0QE_ZidQus7OJs1JKQjEbN0a40fzENXKhWlGAl_r55w=.d8efc120-e271-49ae-8fc0-387f53351f9e@github.com> On Wed, 27 Oct 2021 16:32:00 GMT, Claes Redestad wrote: > I'm surprised the build handles omitting the `org.openjdk.bench` package prefix. I don't particularly mind either way (and maybe removing this prefix would be an overall win), but I think we should keep it consistent for now and use `package org.openjdk.bench...`. Ah yes, good spot. Fixed. 
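
For readers following the fence benchmarks in this thread: the flavors named in the benchmark output (full, acquire, release, storeStore) correspond to static fence methods on the public `java.lang.invoke.VarHandle` API. The sketch below is only an illustration under stated assumptions: the `FenceSketch` class, the `Box` type, and the `publish` helper are hypothetical names that do not come from the patch or the benchmark sources. It shows where a storeStore fence sits in a publication pattern, not how the benchmarks themselves are written.

```java
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; not part of the JDK or of the PR.
final class FenceSketch {
    // Hypothetical payload type with plain (non-final, non-volatile) fields.
    static final class Box {
        int x;
        int y;
    }

    // Plain static field used as the publication point.
    static Box shared;

    // Initializes a Box, then issues a storeStore fence before the
    // publishing store, so the field stores above the fence are not
    // reordered with the store to 'shared' below it.
    static Box publish(int x, int y) {
        Box b = new Box();
        b.x = x;
        b.y = y;
        VarHandle.storeStoreFence();
        shared = b;
        return b;
    }

    public static void main(String[] args) {
        Box b = publish(1, 2);
        System.out.println(b.x + "," + shared.y);

        // The other fence flavors are called the same way.
        VarHandle.acquireFence();
        VarHandle.releaseFence();
        VarHandle.fullFence();
    }
}
```

Run on a single thread, this only exercises the API calls; it does not demonstrate any ordering effect, and a real publication pattern also has to consider the reading side.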
-------------

PR: https://git.openjdk.java.net/jdk/pull/6138

From vladimir.kozlov at oracle.com Wed Oct 27 17:26:53 2021
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 27 Oct 2021 10:26:53 -0700
Subject: [External] : Re: RFC - Improving C2 Escape Analysis
In-Reply-To: <0f30507c-e0f0-c380-568b-ac441611e116@oracle.com>
References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> <0f30507c-e0f0-c380-568b-ac441611e116@oracle.com>
Message-ID: <787f8fbb-83e6-0867-1c97-ae2516df114b@oracle.com>

First, thank you, Cesar, for collecting data about C2 EA shortcomings.

I agree with the cases Tobias pointed out as possible starting points to improve EA.

Yes, finding a solution for allocation merges (or NULL) is a pain. I spent some time investigating possible solutions for it but "no cigar". Maybe we do indeed need control flow analysis to resolve this.

I looked through JBS and found a few issues that do not require writing a new EA:

https://bugs.openjdk.java.net/browse/JDK-7149991
https://bugs.openjdk.java.net/browse/JDK-8059378
https://bugs.openjdk.java.net/browse/JDK-8073358
https://bugs.openjdk.java.net/browse/JDK-8155769

Tobias also has a fix prototype for the following bug, which has not been fixed yet:

https://bugs.openjdk.java.net/browse/JDK-8236493

There are 2 test files with small methods for different EA cases I used to see how EA works:

test/hotspot/jtreg/compiler/escapeAnalysis/Test6726999.java
test/hotspot/jtreg/compiler/escapeAnalysis/Test6689060.java

You can start looking at the above RFEs/bugs or run these tests and see why scalarization failed for some cases. Except for known merge issue:

https://bugs.openjdk.java.net/browse/JDK-6853701

I am currently looking at iterative EA. Do more EA rounds if we can eliminate more connected allocations. It was proposed by Vladimir Ivanov and I have a working prototype.
There is also a suggestion from Amazon Java group about "C2 Partial Escape Analysis" which needs more discussion:

https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2021-May/047486.html

Thanks,
Vladimir K

On 10/27/21 3:04 AM, Tobias Hartmann wrote:
> Hi Cesar,
> 
> On 27.10.21 08:20, Cesar Soares Lucas wrote:
>> Right. I was suspecting this to be the most critical issue indeed. However, I
>> didn't know there was a case where "... the object does not escape on any paths
>> but control flow is too complicated for EA to prove that." Is this an issue
>> tracked in JBS or perhaps you can show me an example where this happens?
> 
> Sure, here are four examples of EA and/or scalarization failing due to complicated control/data
> flow: https://cr.openjdk.java.net/~thartmann/EA_examples
> 
> All examples would completely fold with inline types (Valhalla).
> 
> I'm not sure if these issues are tracked by JBS issues but there's most likely an overlap with some
> of the issues you already described.
> 
> Best regards,
> Tobias
> 

From mchung at openjdk.java.net Wed Oct 27 17:27:41 2021
From: mchung at openjdk.java.net (Mandy Chung)
Date: Wed, 27 Oct 2021 17:27:41 GMT
Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem [v2]
In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com>
References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com>
Message-ID:

> On macOS 11.x, system libraries are loaded from dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path.
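
The `os.version` check discussed earlier in this thread can be tried out in isolation. The following is a minimal sketch, assuming a hypothetical `MacOsVersionCheck` class with hypothetical `parseMajor` and `hasDynamicLoaderCache` helpers; it is not the code in `ClassLoaderHelper.java`. It mirrors the snippet posted in the review: take the text before the first dot (or the whole string when there is no dot), parse it as the major version, and keep the default of 11 when parsing fails.

```java
// Hypothetical illustration class; the real logic lives in
// jdk.internal.loader.ClassLoaderHelper and may differ in detail.
final class MacOsVersionCheck {
    private MacOsVersionCheck() {}

    // Returns the major component of an os.version-style string, or
    // fallbackMajor when that component is not a valid integer.
    static int parseMajor(String osVersion, int fallbackMajor) {
        int dot = osVersion.indexOf('.');
        String head = dot < 0 ? osVersion : osVersion.substring(0, dot);
        try {
            return Integer.parseInt(head);
        } catch (NumberFormatException e) {
            return fallbackMajor;
        }
    }

    // Same default as the snippet in the review thread: assume 11
    // when the version string cannot be parsed.
    static boolean hasDynamicLoaderCache(String osVersion) {
        return parseMajor(osVersion, 11) >= 11;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"10.15.7", "11.6", "10", "12", "unknown"}) {
            System.out.println(v + " -> " + hasDynamicLoaderCache(v));
        }
    }
}
```

With this shape, a dot-less value such as `10` is parsed as major version 10 rather than silently keeping the default, which is the case raised in the review comment.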
Mandy Chung has updated the pull request incrementally with two additional commits since the last revision: - Adjust parsing os.version to handle no dot version in case it's allowed - Exclude building exeLibraryCache.c on other platforms except macOS ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6127/files - new: https://git.openjdk.java.net/jdk/pull/6127/files/e034029f..710925b5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6127&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6127&range=00-01 Stats: 12 lines in 5 files changed: 2 ins; 3 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/6127.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6127/head:pull/6127 PR: https://git.openjdk.java.net/jdk/pull/6127 From coleenp at openjdk.java.net Wed Oct 27 17:53:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 27 Oct 2021 17:53:43 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block [v2] In-Reply-To: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: > There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. > > Tested with tier1-6. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove incorrect comment. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6123/files - new: https://git.openjdk.java.net/jdk/pull/6123/files/767ff817..aa68f26d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6123&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6123&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6123/head:pull/6123 PR: https://git.openjdk.java.net/jdk/pull/6123 From coleenp at openjdk.java.net Wed Oct 27 17:53:45 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 27 Oct 2021 17:53:45 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block [v2] In-Reply-To: References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: On Wed, 27 Oct 2021 01:37:23 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove incorrect comment. > > Hi Coleen, > > So, IIUC the `allow_vm_block` flag leads to two checks: > > 1. If the VMThread tries to acquire any lock, then `allow_vm_block` must be true else it is a fatal error. (I have to keep reminding myself that the "block" in `allow_vm_block` is wrong, it should really just be "lock" as we do not check only when there is contention!) > > 2. For a non-safepoint lock, `allow_vm_block` must be true. (asserted at construction time) > > So from that we can deduce that if a safepoint lock is never taken by the VMThread, then `allow_vm_block` doesn't need to be true and so should be set to false. And I can see that change, for example, with the `BeforeExit_lock`. So in looking at this PR for every safepoint lock where `allow_vm_block` has been changed from true to false, we have to confirm that the VMThread will never take that lock. 
From a reviewing perspective I'll rely on your analysis and testing on that. (I did look at a couple of locks out of interest :) ). > > Regarding the change to the `ClassInitError_lock` ... it is valid because when a JavaThread holds that lock it doesn't execute code that can safepoint; so the VMThread would always find the lock free at a safepoint. So `allow_vm_block` being true works here. But instead you skip locking at the safepoint altogether, as done with the other locks, and so can now make `allow_vm_block` false. Really all code that takes such locks by a JavaThread should have a NSV covering the critical section to ensure it is actually safe for the VMThread to skip locking at a safepoint. > > BTW it also seems to me that the `Terminator_lock`, which can be taken by a NJT (WatcherThread) but not the VMThread, can also have `allow_vm_block` set to false. (I just happened to pick that one to do some code analysis and noticed the VMThread never acquires it.) > > Thanks, > David @dholmes-ora I'm going to leave Terminator_lock for now because of this and in case anyone changes this. // Wait until VM thread is terminated // Note: it should be OK to use Terminator_lock here. But this is called // at a very delicate time (VM shutdown) and we are operating in non- VM // thread at Safepoint. It's safer to not share lock with other threads. { MonitorLocker ml(_terminate_lock, Mutex::_no_safepoint_check_flag); ... 
-------------

PR: https://git.openjdk.java.net/jdk/pull/6123

From coleenp at openjdk.java.net Wed Oct 27 17:53:50 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 27 Oct 2021 17:53:50 GMT
Subject: RFR: 8275917: Some locks shouldn't allow_vm_block [v2]
In-Reply-To:
References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com>
Message-ID:

On Wed, 27 Oct 2021 15:16:07 GMT, Patricio Chilano Mateo wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>> 
>> Remove incorrect comment.
> 
> src/hotspot/share/runtime/mutexLocker.cpp line 281:
> 
>> 279: 
>> 280: def(Heap_lock , PaddedMonitor, safepoint); // Doesn't safepoint check during termination.
>> 281: def(JfieldIdCreation_lock , PaddedMutex , safepoint); // jfieldID, Used in VM_Operation
> 
> Is "Used in VM_Operation" still true?

Nope, it's only used by JavaThreads in jni and jvmti (just checked) and not by the VMThread in a vm operation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6123

From mcimadamore at openjdk.java.net Wed Oct 27 18:32:08 2021
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Wed, 27 Oct 2021 18:32:08 GMT
Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem [v2]
In-Reply-To:
References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com>
Message-ID:

On Wed, 27 Oct 2021 17:27:41 GMT, Mandy Chung wrote:

>> On macOS 11.x, system libraries are loaded from dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for the file existence before calling `JVM_LoadLibrary`. Such check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library for each path from java.library.path and system library path.
> 
> Mandy Chung has updated the pull request incrementally with two additional commits since the last revision:
> 
> - Adjust parsing os.version to handle no dot version in case it's allowed
> - Exclude building exeLibraryCache.c on other platforms except macOS

This came up on panama-dev a bunch of times now. In fact, in the panama use case this would make sense for other systems as well. For instance on Linux systems, there's a bunch of known library names in gnu/lib-names.h which programs can depend on, but for which System::load/loadLibrary cannot be used - as they are neither library names, nor paths - for instance "libm.so.6".

-------------

Marked as reviewed by mcimadamore (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6127

From mchung at openjdk.java.net Wed Oct 27 19:05:12 2021
From: mchung at openjdk.java.net (Mandy Chung)
Date: Wed, 27 Oct 2021 19:05:12 GMT
Subject: RFR: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem
In-Reply-To: <0FE666F4-659B-4B1E-A4B9-B3FC658B5FD8@cbfiddle.com>
References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> <0FE666F4-659B-4B1E-A4B9-B3FC658B5FD8@cbfiddle.com>
Message-ID:

On Wed, 27 Oct 2021 18:25:40 GMT, Alan Snyder wrote:

> I never load system libraries directly. I load my own libraries (that support JNI entry points) and the system loader loads the necessary system frameworks that they were linked against.
> 
> What's different in this case that motivates loading system libraries directly from Java?

Panama would be the right place to provide the support of loading system libraries directly from Java and enhance the interconnection between JVM and native code. The bug report is an example showing that developers want to load a system library for their native code to use. It's no surprise that, without Panama, they use `System::load/loadLibrary` even though it's designed for JNI, since it works.
So this patch is for compatibility for existing applications to continue to run on Big Sur. ------------- PR: https://git.openjdk.java.net/jdk/pull/6127 From psandoz at openjdk.java.net Wed Oct 27 21:42:29 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 27 Oct 2021 21:42:29 GMT Subject: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v7] In-Reply-To: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> References: <_QQ9ntdJJfzVcAGrbjev0ZM-xNfD4wNATphnXkb-Y00=.bbf46985-8776-4dda-ada5-b15ab50774aa@github.com> Message-ID: > This PR improves the performance of vector operations that accept masks on architectures that support masking in hardware, specifically Intel AVX512 and ARM SVE. > > On architectures that do not support masking in hardware the same technique as before is applied to most operations, specifically composition using blend. > > Masked loads/stores are a special form of masked operation that require additional care to ensure out-of-bounds access throw exceptions. The range checking has not been fully optimized and will require further work. > > No API enhancements were required and only a few additional tests were needed. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: - Merge branch 'master' into JDK-8271515-vector-api - Merge pull request #1 from nsjian/JDK-8271515 Address AArch64 review comments from Nick. - Address review comments from Nick. - Merge branch 'master' into JDK-8271515-vector-api - Resolve review comments. - Merge branch 'master' into JDK-8271515-vector-api - Apply patch from https://github.com/openjdk/panama-vector/pull/152 - Apply patch from https://github.com/openjdk/panama-vector/pull/142 - Apply patch from https://github.com/openjdk/panama-vector/pull/139 - Apply patch from https://github.com/openjdk/panama-vector/pull/151 - ... 
and 2 more: https://git.openjdk.java.net/jdk/compare/9a3e9542...c9a77225 ------------- Changes: https://git.openjdk.java.net/jdk/pull/5873/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5873&range=06 Stats: 21966 lines in 104 files changed: 16193 ins; 2089 del; 3684 mod Patch: https://git.openjdk.java.net/jdk/pull/5873.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5873/head:pull/5873 PR: https://git.openjdk.java.net/jdk/pull/5873 From dholmes at openjdk.java.net Thu Oct 28 02:36:13 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 28 Oct 2021 02:36:13 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 11:53:47 GMT, Aleksey Shipilev wrote: > `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. This node is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends. > > Motivational performance difference on benchmarks from JDK-8276054 on ARM32: > > > Benchmark Mode Cnt Score Error Units > Multiple.plain avgt 3 2.669 ? 0.004 ns/op > Multiple.release avgt 3 16.688 ? 0.057 ns/op > Multiple.storeStore avgt 3 14.021 ? 0.144 ns/op // Better > > MultipleWithLoads.plain avgt 3 4.672 ? 0.053 ns/op > MultipleWithLoads.release avgt 3 16.689 ? 0.044 ns/op > MultipleWithLoads.storeStore avgt 3 14.012 ? 0.010 ns/op // Better > > MultipleWithStores.plain avgt 3 14.687 ? 0.009 ns/op > MultipleWithStores.release avgt 3 45.393 ? 0.192 ns/op > MultipleWithStores.storeStore avgt 3 38.048 ? 0.033 ns/op // Better > > Publishing.plain avgt 3 27.079 ? 0.201 ns/op > Publishing.release avgt 3 27.088 ? 0.241 ns/op > Publishing.storeStore avgt 3 27.009 ? 
0.259 ns/op // Within error, hidden by allocation > > Single.plain avgt 3 2.670 ? 0.002 ns/op > Single.releaseFence avgt 3 6.675 ? 0.001 ns/op > Single.storeStoreFence avgt 3 8.012 ? 0.027 ns/op // Worse, seems to be ARM32 implementation artifact > > > As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations: > > > Benchmark Mode Cnt Score Error Units > > Multiple.plain avgt 3 0.406 ? 0.002 ns/op > Multiple.release avgt 3 0.409 ? 0.018 ns/op > Multiple.storeStore avgt 3 0.406 ? 0.001 ns/op > > MultipleWithLoads.plain avgt 3 4.328 ? 0.006 ns/op > MultipleWithLoads.release avgt 3 4.600 ? 0.014 ns/op > MultipleWithLoads.storeStore avgt 3 4.602 ? 0.006 ns/op > > MultipleWithStores.plain avgt 3 0.812 ? 0.001 ns/op > MultipleWithStores.release avgt 3 0.812 ? 0.002 ns/op > MultipleWithStores.storeStore avgt 3 0.812 ? 0.002 ns/op > > Publishing.plain avgt 3 6.370 ? 0.059 ns/op > Publishing.release avgt 3 6.358 ? 0.436 ns/op > Publishing.storeStore avgt 3 6.367 ? 0.054 ns/op > > Single.plain avgt 3 0.407 ? 0.039 ns/op > Single.releaseFence avgt 3 0.406 ? 0.001 ns/op > Single.storeStoreFence avgt 3 0.406 ? 0.001 ns/op > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` src/java.base/share/classes/jdk/internal/misc/Unsafe.java line 3449: > 3447: public final void storeStoreFence() { > 3448: // Without the special intrinsic, default to a stronger storeFence, > 3449: // which is already intrinsified. Not clear me to me why we need to retain this fallback? 
------------- PR: https://git.openjdk.java.net/jdk/pull/6136 From dholmes at openjdk.java.net Thu Oct 28 02:40:19 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 28 Oct 2021 02:40:19 GMT Subject: RFR: 8255286: Implement ParametersTypeData::print_data_on fully In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 15:37:37 GMT, Aleksey Shipilev wrote: > There is a little FIXME leftover since JDK 9 build warning fixes. This basically implements `ParametersTypeData::print_data_on` like `SpeculativeTrapData::print_data_on` is currently implemented. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` Seems perfectly reasonable. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6143 From shade at openjdk.java.net Thu Oct 28 07:03:19 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 07:03:19 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 02:32:41 GMT, David Holmes wrote: >> `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. This node is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends. >> >> Motivational performance difference on benchmarks from JDK-8276054 on ARM32: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.plain avgt 3 2.669 ? 0.004 ns/op >> Multiple.release avgt 3 16.688 ? 0.057 ns/op >> Multiple.storeStore avgt 3 14.021 ? 0.144 ns/op // Better >> >> MultipleWithLoads.plain avgt 3 4.672 ? 0.053 ns/op >> MultipleWithLoads.release avgt 3 16.689 ? 0.044 ns/op >> MultipleWithLoads.storeStore avgt 3 14.012 ? 
0.010 ns/op // Better >> >> MultipleWithStores.plain avgt 3 14.687 ? 0.009 ns/op >> MultipleWithStores.release avgt 3 45.393 ? 0.192 ns/op >> MultipleWithStores.storeStore avgt 3 38.048 ? 0.033 ns/op // Better >> >> Publishing.plain avgt 3 27.079 ? 0.201 ns/op >> Publishing.release avgt 3 27.088 ? 0.241 ns/op >> Publishing.storeStore avgt 3 27.009 ? 0.259 ns/op // Within error, hidden by allocation >> >> Single.plain avgt 3 2.670 ? 0.002 ns/op >> Single.releaseFence avgt 3 6.675 ? 0.001 ns/op >> Single.storeStoreFence avgt 3 8.012 ? 0.027 ns/op // Worse, seems to be ARM32 implementation artifact >> >> >> As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations: >> >> >> Benchmark Mode Cnt Score Error Units >> >> Multiple.plain avgt 3 0.406 ? 0.002 ns/op >> Multiple.release avgt 3 0.409 ? 0.018 ns/op >> Multiple.storeStore avgt 3 0.406 ? 0.001 ns/op >> >> MultipleWithLoads.plain avgt 3 4.328 ? 0.006 ns/op >> MultipleWithLoads.release avgt 3 4.600 ? 0.014 ns/op >> MultipleWithLoads.storeStore avgt 3 4.602 ? 0.006 ns/op >> >> MultipleWithStores.plain avgt 3 0.812 ? 0.001 ns/op >> MultipleWithStores.release avgt 3 0.812 ? 0.002 ns/op >> MultipleWithStores.storeStore avgt 3 0.812 ? 0.002 ns/op >> >> Publishing.plain avgt 3 6.370 ? 0.059 ns/op >> Publishing.release avgt 3 6.358 ? 0.436 ns/op >> Publishing.storeStore avgt 3 6.367 ? 0.054 ns/op >> >> Single.plain avgt 3 0.407 ? 0.039 ns/op >> Single.releaseFence avgt 3 0.406 ? 0.001 ns/op >> Single.storeStoreFence avgt 3 0.406 ? 0.001 ns/op >> >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` > > src/java.base/share/classes/jdk/internal/misc/Unsafe.java line 3449: > >> 3447: public final void storeStoreFence() { >> 3448: // Without the special intrinsic, default to a stronger storeFence, >> 3449: // which is already intrinsified. > > Not clear me to me why we need to retain this fallback? 
Something should happen when intrinsic is disabled. Other fences have native `Unsafe_{Load|Store|Full}Fence` entry points for this. We can, technically, do the same here, but I see no need. Instead, we can fall back to the already implemented stronger intrinsic. ------------- PR: https://git.openjdk.java.net/jdk/pull/6136 From dholmes at openjdk.java.net Thu Oct 28 07:28:11 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 28 Oct 2021 07:28:11 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 11:53:47 GMT, Aleksey Shipilev wrote: > `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. This node is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends. > > Motivational performance difference on benchmarks from JDK-8276054 on ARM32: > > > Benchmark Mode Cnt Score Error Units > Multiple.plain avgt 3 2.669 ? 0.004 ns/op > Multiple.release avgt 3 16.688 ? 0.057 ns/op > Multiple.storeStore avgt 3 14.021 ? 0.144 ns/op // Better > > MultipleWithLoads.plain avgt 3 4.672 ? 0.053 ns/op > MultipleWithLoads.release avgt 3 16.689 ? 0.044 ns/op > MultipleWithLoads.storeStore avgt 3 14.012 ? 0.010 ns/op // Better > > MultipleWithStores.plain avgt 3 14.687 ? 0.009 ns/op > MultipleWithStores.release avgt 3 45.393 ? 0.192 ns/op > MultipleWithStores.storeStore avgt 3 38.048 ? 0.033 ns/op // Better > > Publishing.plain avgt 3 27.079 ? 0.201 ns/op > Publishing.release avgt 3 27.088 ? 0.241 ns/op > Publishing.storeStore avgt 3 27.009 ? 0.259 ns/op // Within error, hidden by allocation > > Single.plain avgt 3 2.670 ? 0.002 ns/op > Single.releaseFence avgt 3 6.675 ? 
0.001 ns/op > Single.storeStoreFence avgt 3 8.012 ? 0.027 ns/op // Worse, seems to be ARM32 implementation artifact > > > As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations: > > > Benchmark Mode Cnt Score Error Units > > Multiple.plain avgt 3 0.406 ? 0.002 ns/op > Multiple.release avgt 3 0.409 ? 0.018 ns/op > Multiple.storeStore avgt 3 0.406 ? 0.001 ns/op > > MultipleWithLoads.plain avgt 3 4.328 ? 0.006 ns/op > MultipleWithLoads.release avgt 3 4.600 ? 0.014 ns/op > MultipleWithLoads.storeStore avgt 3 4.602 ? 0.006 ns/op > > MultipleWithStores.plain avgt 3 0.812 ? 0.001 ns/op > MultipleWithStores.release avgt 3 0.812 ? 0.002 ns/op > MultipleWithStores.storeStore avgt 3 0.812 ? 0.002 ns/op > > Publishing.plain avgt 3 6.370 ? 0.059 ns/op > Publishing.release avgt 3 6.358 ? 0.436 ns/op > Publishing.storeStore avgt 3 6.367 ? 0.054 ns/op > > Single.plain avgt 3 0.407 ? 0.039 ns/op > Single.releaseFence avgt 3 0.406 ? 0.001 ns/op > Single.storeStoreFence avgt 3 0.406 ? 0.001 ns/op > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` I'm certainly no JIT expert but the pattern for adding the new intrinsic seems consistent with the existing code. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6136 From dholmes at openjdk.java.net Thu Oct 28 07:28:11 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 28 Oct 2021 07:28:11 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:00:24 GMT, Aleksey Shipilev wrote: >> src/java.base/share/classes/jdk/internal/misc/Unsafe.java line 3449: >> >>> 3447: public final void storeStoreFence() { >>> 3448: // Without the special intrinsic, default to a stronger storeFence, >>> 3449: // which is already intrinsified. >> >> Not clear me to me why we need to retain this fallback? 
> > Something should happen when intrinsic is disabled. Other fences have native `Unsafe_{Load|Store|Full}Fence` entry points for this. We can, technically, do the same here, but I see no need. Instead, we can fall back to the already implemented stronger intrinsic. Thanks for clarifying. Now we have all the intrinsics in place we have the choice of delegating either at the Java level or the native level, where previously we had to delegate at the Java level to get the benefit of the storeFence intrinsic. But it is fine as-is. ------------- PR: https://git.openjdk.java.net/jdk/pull/6136 From shade at openjdk.java.net Thu Oct 28 07:48:15 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 07:48:15 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:23:52 GMT, David Holmes wrote: >> Something should happen when intrinsic is disabled. Other fences have native `Unsafe_{Load|Store|Full}Fence` entry points for this. We can, technically, do the same here, but I see no need. Instead, we can fall back to the already implemented stronger intrinsic. > > Thanks for clarifying. Now we have all the intrinsics in place we have the choice of delegating either at the Java level or the native level, where previously we had to delegate at the Java level to get the benefit of the storeFence intrinsic. But it is fine as-is. Now I am actually thinking to drop `Unsafe_{Load|Store}Fence` entry points and delegate `Unsafe.{load|store}Fence()` to `Unsafe.fullFence()` as the fallback, in a similar way. It looks to me calling all the way to `unsafe.cpp` for a single barrier is not that useful. That's something for a separate PR. 
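The delegation described above can be modelled roughly as follows. This is a sketch under stated assumptions, not code from `Unsafe.java` or HotSpot: the class `FenceFallbackModel`, the `Fence` enum, and `resolve` are hypothetical names, and the model only computes which fence would end up executing when some intrinsics are disabled.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the fallback chain; none of these names come from
// jdk.internal.misc.Unsafe or from the HotSpot sources.
final class FenceFallbackModel {
    // RUNTIME_FULL stands for the native full-fence entry point.
    enum Fence { STORE_STORE, STORE, LOAD, FULL, RUNTIME_FULL }

    // Fences whose intrinsic is treated as enabled in this model.
    private final Set<Fence> intrinsics = EnumSet.noneOf(Fence.class);

    FenceFallbackModel(Set<Fence> enabled) {
        intrinsics.addAll(enabled);
    }

    // Returns the fence that actually runs for the requested one.
    Fence resolve(Fence requested) {
        if (requested == Fence.RUNTIME_FULL || intrinsics.contains(requested)) {
            return requested;
        }
        switch (requested) {
            case STORE_STORE:
                // Fallback kept in the PR under review: use the stronger storeFence.
                return resolve(Fence.STORE);
            case STORE:
            case LOAD:
                // Fallback proposed for a separate PR: delegate to fullFence.
                return resolve(Fence.FULL);
            default:
                // fullFence without its intrinsic: call into the runtime.
                return Fence.RUNTIME_FULL;
        }
    }

    public static void main(String[] args) {
        FenceFallbackModel onlyFull = new FenceFallbackModel(EnumSet.of(Fence.FULL));
        System.out.println(onlyFull.resolve(Fence.STORE_STORE));
    }
}
```

Each step only ever moves to a stronger fence, so disabling an intrinsic can cost performance but never weakens the ordering the caller asked for.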
So it would be something like:
 - `storeStore` intrinsifies efficiently, or falls back to `release`
 - `loadLoad` always falls back to `acquire` (there seems to be little sense in intrinsifying this, as all platforms alias it to `acquire`)
 - `acquire` intrinsifies efficiently, or falls back to `full`
 - `release` intrinsifies efficiently, or falls back to `full`
 - `full` intrinsifies efficiently, or falls back to `Unsafe_FullFence`, which calls `OrderAccess::fullFence()`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6136

From shade at openjdk.java.net  Thu Oct 28 07:50:55 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 28 Oct 2021 07:50:55 GMT
Subject: RFR: 8276054: JMH benchmarks for Fences [v3]
In-Reply-To: 
References: 
Message-ID: 

> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too.
>
> Sample output for `make test TEST=micro:vm.fences` on `x86_64`:
>
>
> Benchmark                      Mode  Cnt  Score   Error  Units
> Multiple.acquire               avgt    9  0.408 ± 0.005  ns/op
> Multiple.full                  avgt    9  4.694 ± 0.003  ns/op
> Multiple.plain                 avgt    9  0.407 ± 0.004  ns/op
> Multiple.release               avgt    9  0.407 ± 0.003  ns/op
> Multiple.storeStore            avgt    9  0.409 ± 0.006  ns/op
> MultipleWithLoads.acquire      avgt    9  4.600 ± 0.002  ns/op
> MultipleWithLoads.full         avgt    9  8.322 ± 0.003  ns/op
> MultipleWithLoads.plain        avgt    9  4.328 ± 0.001  ns/op
> MultipleWithLoads.release      avgt    9  4.600 ± 0.002  ns/op
> MultipleWithLoads.storeStore   avgt    9  4.600 ± 0.002  ns/op
> MultipleWithStores.acquire     avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.full        avgt    9  5.291 ± 0.001  ns/op
> MultipleWithStores.plain       avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.release     avgt    9  0.812 ± 0.001  ns/op
> MultipleWithStores.storeStore  avgt    9  0.812 ± 0.001  ns/op
> SafePublishing.plain           avgt    9  6.378 ±
0.011 ns/op > SafePublishing.release avgt 9 6.368 ? 0.010 ns/op > SafePublishing.storeStore avgt 9 6.376 ? 0.013 ns/op > Single.acquireFence avgt 9 0.410 ? 0.009 ns/op > Single.fullFence avgt 9 4.689 ? 0.002 ns/op > Single.plain avgt 9 0.409 ? 0.010 ns/op > Single.releaseFence avgt 9 0.410 ? 0.006 ns/op > Single.storeStoreFence avgt 9 0.410 ? 0.009 ns/op Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: - Consistent benchmark names: drop excess "Fence" - Add loadLoad tests too ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6138/files - new: https://git.openjdk.java.net/jdk/pull/6138/files/c1e7905d..9b9488ce Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6138&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6138&range=01-02 Stats: 94 lines in 4 files changed: 56 ins; 20 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/6138.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6138/head:pull/6138 PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Thu Oct 28 08:58:41 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 08:58:41 GMT Subject: RFR: 8276096: Simplify Unsafe.{load|store}Fence fallbacks by delegating to fullFence Message-ID: `Unsafe.{load|store}Fence` falls back to `unsafe.cpp` for `OrderAccess::{acquire|release}Fence()`. It seems too heavy-handed (useless?) to call to runtime for a single memory barrier. We can simplify the native `Unsafe` interface by falling back to `fullFence` when `{load|store}Fence` intrinsics are not available. This would be similar to what `Unsafe.{loadLoad|storeStore}Fences` do. This is the behavior of these intrinsics now, on x86_64, using benchmarks from JDK-8276054: Benchmark Mode Cnt Score Error Units # Default Single.acquire avgt 3 0.407 ? 0.060 ns/op Single.full avgt 3 4.693 ? 0.005 ns/op Single.loadLoad avgt 3 0.415 ? 
0.095 ns/op Single.plain avgt 3 0.406 ? 0.002 ns/op Single.release avgt 3 0.408 ? 0.047 ns/op Single.storeStore avgt 3 0.408 ? 0.043 ns/op # -XX:DisableIntrinsic=_storeFence Single.acquire avgt 3 0.408 ? 0.016 ns/op Single.full avgt 3 4.694 ? 0.002 ns/op Single.loadLoad avgt 3 0.406 ? 0.002 ns/op Single.plain avgt 3 0.406 ? 0.001 ns/op Single.release avgt 3 4.694 ? 0.003 ns/op <--- upgraded to full Single.storeStore avgt 3 4.690 ? 0.005 ns/op <--- upgraded to full # -XX:DisableIntrinsic=_loadFence Single.acquire avgt 3 4.691 ? 0.001 ns/op <--- upgraded to full Single.full avgt 3 4.693 ? 0.009 ns/op Single.loadLoad avgt 3 4.693 ? 0.013 ns/op <--- upgraded to full Single.plain avgt 3 0.408 ? 0.072 ns/op Single.release avgt 3 0.415 ? 0.016 ns/op Single.storeStore avgt 3 0.416 ? 0.041 ns/op # -XX:DisableIntrinsic=_fullFence Single.acquire avgt 3 0.406 ? 0.014 ns/op Single.full avgt 3 15.836 ? 0.151 ns/op <--- calls runtime Single.loadLoad avgt 3 0.406 ? 0.001 ns/op Single.plain avgt 3 0.426 ? 0.361 ns/op Single.release avgt 3 0.407 ? 0.021 ns/op Single.storeStore avgt 3 0.410 ? 0.061 ns/op # -XX:DisableIntrinsic=_fullFence,_loadFence Single.acquire avgt 3 15.822 ? 0.282 ns/op <--- upgraded, calls runtime Single.full avgt 3 15.851 ? 0.127 ns/op <--- calls runtime Single.loadLoad avgt 3 15.829 ? 0.045 ns/op <--- upgraded, calls runtime Single.plain avgt 3 0.406 ? 0.001 ns/op Single.release avgt 3 0.414 ? 0.156 ns/op Single.storeStore avgt 3 0.422 ? 0.452 ns/op # -XX:DisableIntrinsic=_fullFence,_storeFence Single.acquire avgt 3 0.407 ? 0.016 ns/op Single.full avgt 3 15.347 ? 6.783 ns/op <--- calls runtime Single.loadLoad avgt 3 0.406 ? 0.001 ns/op Single.plain avgt 3 0.406 ? 0.002 ns/op Single.release avgt 3 15.828 ? 0.019 ns/op <--- upgraded, calls runtime Single.storeStore avgt 3 15.834 ? 0.045 ns/op <--- upgraded, calls runtime # -XX:DisableIntrinsic=_fullFence,_loadFence,_storeFence Single.acquire avgt 3 15.838 ? 
0.030 ns/op <--- upgraded, calls runtime Single.full avgt 3 15.854 ? 0.277 ns/op <--- calls runtime Single.loadLoad avgt 3 15.826 ? 0.160 ns/op <--- upgraded, calls runtime Single.plain avgt 3 0.406 ? 0.003 ns/op Single.release avgt 3 15.838 ? 0.019 ns/op <--- upgraded, calls runtime Single.storeStore avgt 3 15.844 ? 0.104 ns/op <--- upgraded, calls runtime Additional testing: - [x] Linux x86_64 fastdebug `tier1` ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/6149/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6149&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276096 Stats: 22 lines in 3 files changed: 6 ins; 11 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6149.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6149/head:pull/6149 PR: https://git.openjdk.java.net/jdk/pull/6149 From shade at openjdk.java.net Thu Oct 28 08:58:48 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 08:58:48 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2] In-Reply-To: References: Message-ID: > `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. `storeStoreFence` is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends. > > Motivational performance difference on benchmarks from JDK-8276054 on ARM32: > > > Benchmark Mode Cnt Score Error Units > Multiple.plain avgt 3 2.669 ? 0.004 ns/op > Multiple.release avgt 3 16.688 ? 0.057 ns/op > Multiple.storeStore avgt 3 14.021 ? 0.144 ns/op // Better > > MultipleWithLoads.plain avgt 3 4.672 ? 0.053 ns/op > MultipleWithLoads.release avgt 3 16.689 ? 
0.044 ns/op > MultipleWithLoads.storeStore avgt 3 14.012 ? 0.010 ns/op // Better > > MultipleWithStores.plain avgt 3 14.687 ? 0.009 ns/op > MultipleWithStores.release avgt 3 45.393 ? 0.192 ns/op > MultipleWithStores.storeStore avgt 3 38.048 ? 0.033 ns/op // Better > > Publishing.plain avgt 3 27.079 ? 0.201 ns/op > Publishing.release avgt 3 27.088 ? 0.241 ns/op > Publishing.storeStore avgt 3 27.009 ? 0.259 ns/op // Within error, hidden by allocation > > Single.plain avgt 3 2.670 ? 0.002 ns/op > Single.releaseFence avgt 3 6.675 ? 0.001 ns/op > Single.storeStoreFence avgt 3 8.012 ? 0.027 ns/op // Worse, seems to be ARM32 implementation artifact > > > As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations: > > > Benchmark Mode Cnt Score Error Units > > Multiple.plain avgt 3 0.406 ? 0.002 ns/op > Multiple.release avgt 3 0.409 ? 0.018 ns/op > Multiple.storeStore avgt 3 0.406 ? 0.001 ns/op > > MultipleWithLoads.plain avgt 3 4.328 ? 0.006 ns/op > MultipleWithLoads.release avgt 3 4.600 ? 0.014 ns/op > MultipleWithLoads.storeStore avgt 3 4.602 ? 0.006 ns/op > > MultipleWithStores.plain avgt 3 0.812 ? 0.001 ns/op > MultipleWithStores.release avgt 3 0.812 ? 0.002 ns/op > MultipleWithStores.storeStore avgt 3 0.812 ? 0.002 ns/op > > Publishing.plain avgt 3 6.370 ? 0.059 ns/op > Publishing.release avgt 3 6.358 ? 0.436 ns/op > Publishing.storeStore avgt 3 6.367 ? 0.054 ns/op > > Single.plain avgt 3 0.407 ? 0.039 ns/op > Single.releaseFence avgt 3 0.406 ? 0.001 ns/op > Single.storeStoreFence avgt 3 0.406 ? 
0.001 ns/op > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Fix the comment to match JDK-8276096 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6136/files - new: https://git.openjdk.java.net/jdk/pull/6136/files/5277aa2a..1f24d71c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6136&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6136&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6136.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6136/head:pull/6136 PR: https://git.openjdk.java.net/jdk/pull/6136 From shade at openjdk.java.net Thu Oct 28 08:58:49 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 08:58:49 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:41:59 GMT, Aleksey Shipilev wrote: >> Thanks for clarifying. Now we have all the intrinsics in place we have the choice of delegating either at the Java level or the native level, where previously we had to delegate at the Java level to get the benefit of the storeFence intrinsic. But it is fine as-is. > > Now I am actually thinking to drop `Unsafe_{Load|Store}Fence` entry points and delegate `Unsafe.{load|store}Fence()` to `Unsafe.fullFence()` as the fallback, in a similar way. It looks to me calling all the way to `unsafe.cpp` for a single barrier is not that useful. That's something for a separate PR. 
> > So it would be something like: > - `storeStore` intrinsifies efficiently, or falls back to `release` > - `loadLoad` always falls back to `acquire` (seems to be little sense to intrinsify this, as all platforms alias it to `acquire`) > - `acquire` intrinsifies efficiently, or falls back to `full` > - `release` intrinsifies efficiently, or falls back to `full` > - `full` intrinsifies efficiently, or falls back to `Unsafe_FullFence`, which calls `OrderAccess::fullFence()`. Doing it here: #6149, JDK-8276096. ------------- PR: https://git.openjdk.java.net/jdk/pull/6136 From thartmann at openjdk.java.net Thu Oct 28 09:09:14 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Thu, 28 Oct 2021 09:09:14 GMT Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 08:58:48 GMT, Aleksey Shipilev wrote: >> `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`. We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. Like explicit `LoadFence`/`StoreFence`, we introduce the special node to differentiate explicit fence and implicit store-store barriers. `storeStoreFence` is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends. >> >> Motivational performance difference on benchmarks from JDK-8276054 on ARM32: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.plain avgt 3 2.669 ? 0.004 ns/op >> Multiple.release avgt 3 16.688 ? 0.057 ns/op >> Multiple.storeStore avgt 3 14.021 ? 0.144 ns/op // Better >> >> MultipleWithLoads.plain avgt 3 4.672 ? 0.053 ns/op >> MultipleWithLoads.release avgt 3 16.689 ? 0.044 ns/op >> MultipleWithLoads.storeStore avgt 3 14.012 ? 0.010 ns/op // Better >> >> MultipleWithStores.plain avgt 3 14.687 ? 0.009 ns/op >> MultipleWithStores.release avgt 3 45.393 ? 0.192 ns/op >> MultipleWithStores.storeStore avgt 3 38.048 ? 
0.033 ns/op // Better >> >> Publishing.plain avgt 3 27.079 ? 0.201 ns/op >> Publishing.release avgt 3 27.088 ? 0.241 ns/op >> Publishing.storeStore avgt 3 27.009 ? 0.259 ns/op // Within error, hidden by allocation >> >> Single.plain avgt 3 2.670 ? 0.002 ns/op >> Single.releaseFence avgt 3 6.675 ? 0.001 ns/op >> Single.storeStoreFence avgt 3 8.012 ? 0.027 ns/op // Worse, seems to be ARM32 implementation artifact >> >> >> As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations: >> >> >> Benchmark Mode Cnt Score Error Units >> >> Multiple.plain avgt 3 0.406 ? 0.002 ns/op >> Multiple.release avgt 3 0.409 ? 0.018 ns/op >> Multiple.storeStore avgt 3 0.406 ? 0.001 ns/op >> >> MultipleWithLoads.plain avgt 3 4.328 ? 0.006 ns/op >> MultipleWithLoads.release avgt 3 4.600 ? 0.014 ns/op >> MultipleWithLoads.storeStore avgt 3 4.602 ? 0.006 ns/op >> >> MultipleWithStores.plain avgt 3 0.812 ? 0.001 ns/op >> MultipleWithStores.release avgt 3 0.812 ? 0.002 ns/op >> MultipleWithStores.storeStore avgt 3 0.812 ? 0.002 ns/op >> >> Publishing.plain avgt 3 6.370 ? 0.059 ns/op >> Publishing.release avgt 3 6.358 ? 0.436 ns/op >> Publishing.storeStore avgt 3 6.367 ? 0.054 ns/op >> >> Single.plain avgt 3 0.407 ? 0.039 ns/op >> Single.releaseFence avgt 3 0.406 ? 0.001 ns/op >> Single.storeStoreFence avgt 3 0.406 ? 0.001 ns/op >> >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix the comment to match JDK-8276096 That looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6136

From duke at openjdk.java.net  Thu Oct 28 09:50:40 2021
From: duke at openjdk.java.net (Fei Gao)
Date: Thu, 28 Oct 2021 09:50:40 GMT
Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v2]
In-Reply-To: 
References: 
Message-ID: 

> for(int i = 0; i < LENGTH; i++) {
>   c[i] = a[i] + 2;
> }
>
> For the case shown above, after superword optimization with SVE,
> without the patch, the vector add operation always has 2 z-reg inputs,
> like:
> mov z16.s, #2
> add z17.s, z17.s, z16.s
>
> Considering that SVE supports basic binary operations with an immediate,
> this pattern could be further optimized to:
> add z16.s, z16.s, #2
>
> To implement it, we added some new match rules and assembler rules in
> the aarch64 backend. We also made some extensions to immediate types
> and functions to remain backward compatible.
>
> With the patch, only these binary integer vector operations, +(add),
> -(sub), &(and), |(orr), and ^(eor) with an immediate, are supported for
> the optimization. Other vector operations are not supported currently.
>
> Tested tier1 and test/hotspot/jtreg/compiler on an SVE-capable AArch64
> CPU, no new failures.
>
> There is no obvious performance uplift but it can help remove one
> redundant mov instruction.

Fei Gao has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains three commits: - Split the original patch and leave the existing logic in Assembler entirely untouched Change-Id: Ia85fdd3a8d14830f51ad659efada53b3b0f5cbdc - Merge branch 'master' of github.com:fg1417/jdk into fg1417-20211026 Change-Id: Ia5cea44c615c9a00495cfc5123b1ebfe73dc3d1a - 8274179: AArch64: Support SVE operations with encodable immediates for(int i = 0; i < LENGTH; i++) { c[i] = a[i] + 2; } For the case showed above, after superword optimization with SVE, without the patch, the vector add operation always has 2 z-reg inputs, like: mov z16.s, #2 add z17.s, z17.s, z16.s Considering sve has supported basic binary operations with immediate, this pattern could be further optimized to: add z16.s, z16.s, #2 To implement it, we added some new match rules and assembler rules in the aarch64 backend. We also made some extensions on immediate types and functions to keep backward compatible. With the patch, only these binary integer vector operations, +(add), -(sub), &(and), |(orr), and ^(eor) with immediate are supported for the optimization. Other vector operations are not supported currently. Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 CPU, no new failure. There is no obvious performance uplift but it can help remove one redundant mov instruction. 
Change-Id: Iaec40e362918118691083fb171cc4dff390b35a2 ------------- Changes: https://git.openjdk.java.net/jdk/pull/6115/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=01 Stats: 1471 lines in 12 files changed: 1328 ins; 43 del; 100 mod Patch: https://git.openjdk.java.net/jdk/pull/6115.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115 PR: https://git.openjdk.java.net/jdk/pull/6115 From duke at openjdk.java.net Thu Oct 28 10:43:33 2021 From: duke at openjdk.java.net (Fei Gao) Date: Thu, 28 Oct 2021 10:43:33 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v3] In-Reply-To: References: Message-ID: <1wJVjyUjZyArCiABWBLy5VCt_rGiE_Y96KdePt6f5Ck=.00fe1bfe-2a9b-4d79-b49a-6ae5bfb22108@github.com> > for(int i = 0; i < LENGTH; i++) { > c[i] = a[i] + 2; > } > > For the case showed above, after superword optimization with SVE, > without the patch, the vector add operation always has 2 z-reg inputs, > like: > mov z16.s, #2 > add z17.s, z17.s, z16.s > > Considering sve has supported basic binary operations with immediate, > this pattern could be further optimized to: > add z16.s, z16.s, #2 > > To implement it, we added some new match rules and assembler rules in > the aarch64 backend. We also made some extensions on immediate types > and functions to keep backward compatible. > > With the patch, only these binary integer vector operations, +(add), > -(sub), &(and), |(orr), and ^(eor) with immediate are supported for > the optimization. Other vector operations are not supported currently. > > Tested tier1 and test/hotspot/jtreg/compiler on SVE featured AArch64 > CPU, no new failure. > > There is no obvious performance uplift but it can help remove one > redundant mov instruction. Fei Gao has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains two new commits since the last revision: - Split the original patch and leave the existing logic in Assembler entirely untouched Change-Id: If8ddcef07b15615d7dd0c3063c44d2b705fac6f7 - Merge branch 'master' of github.com:fg1417/jdk into fg1417-20211026 Change-Id: I52aa66d200b74ac312c5d40283b94854bc1142e6 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6115/files - new: https://git.openjdk.java.net/jdk/pull/6115/files/5ffd5cb1..a7a915af Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=01-02 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6115.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115 PR: https://git.openjdk.java.net/jdk/pull/6115 From duke at openjdk.java.net Thu Oct 28 10:43:33 2021 From: duke at openjdk.java.net (Fei Gao) Date: Thu, 28 Oct 2021 10:43:33 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates In-Reply-To: References: Message-ID: On Tue, 26 Oct 2021 11:37:23 GMT, Andrew Haley wrote: > I'd like you to split this patch into two parts, please. First, please use the new functions such as `Assembler::operand_valid_for_logical_immediate(bool is32, uint64_t imm)` only for SVE, leaving the existing logic in `Assembler` entirely untouched. This will cause some duplication, but that's OK. We can review changes to merge functionality in a separate patch. This will be much easier. Thanks. Done. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/6115

From duke at openjdk.java.net  Thu Oct 28 10:43:34 2021
From: duke at openjdk.java.net (Fei Gao)
Date: Thu, 28 Oct 2021 10:43:34 GMT
Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v3]
In-Reply-To: 
References: 
Message-ID: 

On Tue, 26 Oct 2021 09:38:38 GMT, Andrew Haley wrote:

>> Fei Gao has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains two new commits since the last revision:
>>
>>  - Split the original patch and leave the existing logic in Assembler entirely untouched
>>
>>    Change-Id: If8ddcef07b15615d7dd0c3063c44d2b705fac6f7
>>  - Merge branch 'master' of github.com:fg1417/jdk into fg1417-20211026
>>
>>    Change-Id: I52aa66d200b74ac312c5d40283b94854bc1142e6
>
> src/hotspot/cpu/aarch64/aarch64.ad line 2669:
>
>> 2667:   if (!imm_node->is_Con() ||
>> 2668:       !(imm_node->bottom_type()->isa_int() || imm_node->bottom_type()->isa_long())) {
>> 2669:     return false;
>
> I'd split this up a bit.
>
> Please consider
>
>
>   if (!imm_node->is_Con()) return false;
>
>   const Type* t = imm_node->bottom_type();
>   if (!(t->isa_int() || t->isa_long())) return false;

Done

-------------

PR: https://git.openjdk.java.net/jdk/pull/6115

From rkennke at openjdk.java.net  Thu Oct 28 11:44:45 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 28 Oct 2021 11:44:45 GMT
Subject: RFR: 8275527: Separate/refactor forward pointer access [v2]
In-Reply-To: 
References: 
Message-ID: 

> Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.:
> fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj();
>
> or similar constructs.
>
> Also, in some places the forwardee is accessed more directly, using markWord::decode_pointer() or as HeapWord*, and sometimes is_forwarded() is determined by forwardee() == NULL checks (with crude overrides that prepare non-forwarded headers to return NULL in such checks; see the g1Full* source files).
>
> I propose to extract and refactor forward pointer access in a way that better supports the above pattern. In fact, I originally wrote this with performance enhancement in mind (see the oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis.
>
> Testing:
>  - [x] tier
>  - [x] tier2
>  - [x] hotspot_gc

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Revert unnecessary changes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5955/files
  - new: https://git.openjdk.java.net/jdk/pull/5955/files/ec0a4bad..75fdec35

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=00-01

  Stats: 327 lines in 28 files changed: 60 ins; 171 del; 96 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5955.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5955/head:pull/5955

PR: https://git.openjdk.java.net/jdk/pull/5955

From rkennke at openjdk.java.net  Thu Oct 28 11:47:52 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 28 Oct 2021 11:47:52 GMT
Subject: RFR: 8275527: Separate/refactor forward pointer access [v3]
In-Reply-To: 
References: 
Message-ID: 

> Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header-word. This seems kinda bad, because they are most often used in conjunction, e.g.:
> fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj();
>
> or similar constructs.
> > Also, in some places the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way that better supports the above pattern. In fact, I originally wrote this with performance enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. > > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix Parallel GC mistake ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5955/files - new: https://git.openjdk.java.net/jdk/pull/5955/files/75fdec35..a9f0c845 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/5955.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5955/head:pull/5955 PR: https://git.openjdk.java.net/jdk/pull/5955 From aph at openjdk.java.net Thu Oct 28 11:53:18 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 28 Oct 2021 11:53:18 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v3] In-Reply-To: <1wJVjyUjZyArCiABWBLy5VCt_rGiE_Y96KdePt6f5Ck=.00fe1bfe-2a9b-4d79-b49a-6ae5bfb22108@github.com> References: <1wJVjyUjZyArCiABWBLy5VCt_rGiE_Y96KdePt6f5Ck=.00fe1bfe-2a9b-4d79-b49a-6ae5bfb22108@github.com> Message-ID: <-qQsA7EWkM3X98NNZJNQBFW-YSh6W745k8lvR2WBVXs=.c7033a2b-c9de-4b8b-9d15-e371ba4cb5b4@github.com> On Thu, 28 Oct 2021 10:43:33 GMT, Fei Gao wrote: >> for(int i = 0;
i < LENGTH; i++) { >> c[i] = a[i] + 2; >> } >> >> For the case shown above, after superword optimization with SVE, >> without the patch, the vector add operation always has 2 z-reg inputs, >> like: >> mov z16.s, #2 >> add z17.s, z17.s, z16.s >> >> Since SVE supports basic binary operations with an immediate, >> this pattern could be further optimized to: >> add z16.s, z16.s, #2 >> >> To implement it, we added some new match rules and assembler rules in >> the aarch64 backend. We also made some extensions to immediate types >> and functions to remain backward compatible. >> >> With the patch, only these binary integer vector operations, +(add), >> -(sub), &(and), |(orr), and ^(eor) with an immediate are supported for >> the optimization. Other vector operations are not supported currently. >> >> Tested tier1 and test/hotspot/jtreg/compiler on an SVE-capable AArch64 >> CPU, no new failures. >> >> There is no obvious performance uplift but it can help remove one >> redundant mov instruction. > > Fei Gao has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. Looks fine with those minor changes. src/hotspot/cpu/aarch64/assembler_aarch64.cpp line 86: > 84: > 85: unsigned Assembler::regVariant_to_elemBits(Assembler::SIMD_RegVariant T){ > 86: return 1 << (T + 3); Assert something about `T` here. src/hotspot/cpu/aarch64/assembler_aarch64.cpp line 394: > 392: if (uimm < (UCONST64(1) << nbits)) > 393: return true; > 394: if (uimm < (UCONST64(1) << (2 * nbits)) Assert something about `nbits` here. It has to be less than 32, I think.
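The two assertions requested above could look roughly like the following. This is only a sketch under stated assumptions, not the code that was committed: the `SIMD_RegVariant` enum is a stand-in whose ordering (B, H, S, D, Q) is assumed, `fits_unsigned_or_shifted` is a hypothetical name for the quoted immediate check, the low-bits test on the shifted form is assumed because the quoted context stops before it, and plain `assert` replaces HotSpot's own assertion macros.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for Assembler::SIMD_RegVariant; the ordering is an assumption.
enum SIMD_RegVariant { B = 0, H = 1, S = 2, D = 3, Q = 4 };

// Element width in bits: B -> 8, H -> 16, S -> 32, D -> 64.
unsigned regVariant_to_elemBits(SIMD_RegVariant T) {
  // Guard the shift: only B..D are treated as element sizes in this sketch.
  assert(T >= B && T <= D && "unexpected register variant");
  return 1u << (T + 3);
}

// Hypothetical helper with the shape of the quoted check: true if uimm fits
// in nbits bits, or fits in nbits bits after being shifted left by nbits.
bool fits_unsigned_or_shifted(uint64_t uimm, unsigned nbits) {
  // 2 * nbits is used as a shift count on a 64-bit value, so it must stay
  // below 64; hence nbits < 32.
  assert(nbits > 0 && nbits < 32 && "nbits out of range");
  const uint64_t one = 1;
  if (uimm < (one << nbits)) {
    return true;                                  // fits directly
  }
  if (uimm < (one << (2 * nbits)) &&
      (uimm & ((one << nbits) - 1)) == 0) {
    return true;                                  // fits when shifted
  }
  return false;
}
```

With `nbits == 8`, for instance, 255 and 256 are accepted while 257 is rejected, because 257 needs the shifted form but has a non-zero low byte.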
------------- PR: https://git.openjdk.java.net/jdk/pull/6115 From coleenp at openjdk.java.net Thu Oct 28 12:01:15 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 28 Oct 2021 12:01:15 GMT Subject: RFR: 8275917: Some locks shouldn't allow_vm_block [v2] In-Reply-To: References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: On Wed, 27 Oct 2021 17:53:43 GMT, Coleen Phillimore wrote: >> There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. >> >> Tested with tier1-6. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove incorrect comment. Thanks Patricio and David. ------------- PR: https://git.openjdk.java.net/jdk/pull/6123 From coleenp at openjdk.java.net Thu Oct 28 12:01:16 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 28 Oct 2021 12:01:16 GMT Subject: Integrated: 8275917: Some locks shouldn't allow_vm_block In-Reply-To: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> References: <2cetS3bCD4YjbfSLnCqc6Gm0L0_a9Lok0iIvpOLCiY8=.ea386bfa-c548-4a5e-a3de-ef9d0dc520eb@github.com> Message-ID: On Tue, 26 Oct 2021 19:43:46 GMT, Coleen Phillimore wrote: > There were a few safepoint checking locks that passed allow_vm_block as true, that aren't taken by the VMThread. I also made ClassInitError_lock behave like SystemDictionaryLock and the others in SystemDictionary::do_unloading so that doesn't require allow_vm_block either. See the CR for more details. > > Tested with tier1-6. This pull request has now been integrated. 
Changeset: bec977c7 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/bec977c778a35ea48a45db662f1feaeab79308b2 Stats: 23 lines in 3 files changed: 1 ins; 0 del; 22 mod 8275917: Some locks shouldn't allow_vm_block Reviewed-by: dholmes, pchilanomate ------------- PR: https://git.openjdk.java.net/jdk/pull/6123 From rkennke at openjdk.java.net Thu Oct 28 12:35:37 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 28 Oct 2021 12:35:37 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access [v4] In-Reply-To: References: Message-ID: > Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header word. This seems kinda bad, because they are most often used in conjunction, e.g.: > fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj(); > > or similar constructs. > > Also, in some places the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). > > I propose to extract and refactor forward pointer access in a way that better supports the above pattern. In fact, I originally wrote this with performance enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis.
> > Testing: > - [x] tier > - [x] tier2 > - [x] hotspot_gc Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Move forward impl into markWord and add assert ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5955/files - new: https://git.openjdk.java.net/jdk/pull/5955/files/a9f0c845..9c2ed2ff Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5955&range=02-03 Stats: 6 lines in 2 files changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5955.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5955/head:pull/5955 PR: https://git.openjdk.java.net/jdk/pull/5955 From rkennke at openjdk.java.net Thu Oct 28 12:41:13 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 28 Oct 2021 12:41:13 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access [v4] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 12:35:37 GMT, Roman Kennke wrote: >> Accessing the forward pointer overlay of object headers is currently done in a somewhat crude way. oopDesc::forwardee() and oopDesc::is_forwarded() both load the header word. This seems kinda bad, because they are most often used in conjunction, e.g.: >> fwd = obj->is_forwarded() ? obj->forwardee() : promote_obj(); >> >> or similar constructs. >> >> Also, in some places the forwardee is accessed in a more direct fashion using markWord::decode_pointer(), or as HeapWord*; sometimes is_forwarded() is determined by forwardee() == NULL checks (and crude overrides to prepare non-forwarded headers to return NULL in such checks, see g1Full* source files). >> >> I propose to extract and refactor forward pointer access in a way that better supports the above pattern.
In fact, I originally wrote this with performance enhancement in mind (see oopForwarding.hpp header for more details), but I could not show any actual performance improvements, so let's consider this purely on a cleanliness and separation-of-concerns basis. >> >> Testing: >> - [x] tier >> - [x] tier2 >> - [x] hotspot_gc > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Move forward impl into markWord and add assert I've removed OopForwarding altogether, and only kept the actual refactorings and cleanups. Can you please re-review? ------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From rkennke at openjdk.java.net Thu Oct 28 12:41:14 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 28 Oct 2021 12:41:14 GMT Subject: RFR: 8275527: Separate/refactor forward pointer access [v4] In-Reply-To: References: Message-ID: <9xpY2oFvx_vNWmeextHoveVIarXOJ4IUPtXQeLV6VqY=.02c0abc9-a5de-4309-909c-16aaf129b4bf@github.com> On Thu, 21 Oct 2021 07:30:44 GMT, Stefan Karlsson wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Move forward impl into markWord and add assert > > src/hotspot/share/gc/g1/g1FullGCCompactionPoint.cpp line 112: > >> 110: // since it will be restored by preserved marks. >> 111: object->init_mark(); >> 112: } else { > > Could you explain why this was removed? The G1 Full GC code decoded the fwdptr without checking whether the object is actually forwarded (i.e. by checking the lowest two mark word bits), and then compared the result with NULL. For this to work, it set the mark word of non-forwarded objects to the prototype mark. This is all quite messy and can be avoided by checking is_forwarded() before actually decoding the fwdptr.
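The "check is_forwarded() before decoding" pattern described above can be illustrated with a small model. This is a simplified sketch, not HotSpot's `markWord`: the `MarkWord` and `Obj` types and the `forwardee_or_null` helper are hypothetical, and the encoding (low two bits equal to 0b11 meaning "forwarded") is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical mark word. Assumption: the low two bits are tag
// bits and the value 0b11 means the remaining bits hold a forwarding pointer.
struct MarkWord {
  static constexpr uintptr_t lock_mask    = 0x3;
  static constexpr uintptr_t marked_value = 0x3;

  uintptr_t value;

  bool is_forwarded() const { return (value & lock_mask) == marked_value; }

  // Only meaningful for forwarded marks, so assert instead of returning a
  // pointer decoded from an arbitrary header.
  void* forwardee() const {
    assert(is_forwarded() && "decoding forwardee of a non-forwarded mark");
    return reinterpret_cast<void*>(value & ~lock_mask);
  }

  static MarkWord forwarded_to(void* p) {
    return MarkWord{reinterpret_cast<uintptr_t>(p) | marked_value};
  }
};

struct Obj { MarkWord mark; };

// Load the header once, test the tag bits, and decode only when forwarded.
// Non-forwarded headers never need to be rewritten to make a NULL test work.
void* forwardee_or_null(const Obj& o) {
  const MarkWord m = o.mark;
  return m.is_forwarded() ? m.forwardee() : nullptr;
}
```

Because the tag bits are tested first, a header holding any non-forwarded bit pattern yields `nullptr` without being reset beforehand, which is the cleanup the reply describes.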
------------- PR: https://git.openjdk.java.net/jdk/pull/5955 From redestad at openjdk.java.net Thu Oct 28 13:43:14 2021 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 28 Oct 2021 13:43:14 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: <9ciUOd912omDoYW4rwmofHALX8JlEu03Fl5ijiRQqZo=.271cc0e8-5d2e-4045-8c41-7a6945e6cb97@github.com> On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ± 0.005 ns/op >> Multiple.full avgt 9 4.694 ± 0.003 ns/op >> Multiple.plain avgt 9 0.407 ± 0.004 ns/op >> Multiple.release avgt 9 0.407 ± 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ± 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ± 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ± 0.013 ns/op >> Single.acquireFence avgt 9 0.410 ± 0.009 ns/op >> Single.fullFence avgt 9 4.689 ± 0.002 ns/op >> Single.plain avgt 9 0.409 ± 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ±
0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ± 0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too Thanks for making the package names consistent. New tests and restructuring look good, too. ------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6138 From mchung at openjdk.java.net Thu Oct 28 15:31:29 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Thu, 28 Oct 2021 15:31:29 GMT Subject: Integrated: JDK-8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem In-Reply-To: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> References: <1d0vB2i2qrGaWRxtX_FqnYWrKcgXHaPvShDeSSG9cXg=.f733da58-3eaf-4eda-8c93-f8e81a384326@github.com> Message-ID: On Tue, 26 Oct 2021 22:51:29 GMT, Mandy Chung wrote: > On macOS 11.x, system libraries are loaded from the dynamic linker cache. The libraries are no longer present on the filesystem. `NativeLibraries::loadLibrary` checks for file existence before calling `JVM_LoadLibrary`. That check no longer applies on Big Sur. This proposes that on macOS >= 11, it will skip the file existence check and attempt to load a library from each path in java.library.path and the system library path. This pull request has now been integrated.
Changeset: 309acbf0 Author: Mandy Chung URL: https://git.openjdk.java.net/jdk/commit/309acbf0e86a0d248294503fccc7a936fa0a846e Stats: 203 lines in 10 files changed: 170 ins; 3 del; 30 mod 8275703: System.loadLibrary fails on Big Sur for libraries hidden from filesystem Reviewed-by: dholmes, alanb, mcimadamore ------------- PR: https://git.openjdk.java.net/jdk/pull/6127 From aph at openjdk.java.net Thu Oct 28 15:55:14 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 28 Oct 2021 15:55:14 GMT Subject: Integrated: 8275052: AArch64: Severe AES/GCM slowdown on MacOS for short blocks In-Reply-To: References: Message-ID: On Mon, 11 Oct 2021 13:06:14 GMT, Andrew Haley wrote: > This is more of the fallout from JDK-8273297. > We've noticed that blocks of less than 8 kbytes are encrypted very slowly with AES/GCM. This is because of incorrect flag handling in vm_version_bsd_aarch64.cpp. > > Before this patch: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 95821.232 ± 2447.246 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3653.619 ± 83.432 ns/op > > > After: > > > Benchmark (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > AESGCMBench.decrypt 8191 256 avgt 6 3371.735 ± 70.204 ns/op > AESGCMBench.decrypt 8192 256 avgt 6 3119.928 ± 80.375 ns/op This pull request has now been integrated.
Changeset: cb989cf3 Author: Andrew Haley URL: https://git.openjdk.java.net/jdk/commit/cb989cf3a182ee07fe127b4536e7ff4213f31eaf Stats: 16 lines in 2 files changed: 7 ins; 7 del; 2 mod 8275052: AArch64: Severe AES/GCM slowdown on MacOS for short blocks Reviewed-by: ngasson, adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/5894 From shade at openjdk.java.net Thu Oct 28 17:15:21 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 17:15:21 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ± 0.005 ns/op >> Multiple.full avgt 9 4.694 ± 0.003 ns/op >> Multiple.plain avgt 9 0.407 ± 0.004 ns/op >> Multiple.release avgt 9 0.407 ± 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ± 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ± 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ±
0.013 ns/op >> Single.acquireFence avgt 9 0.410 ± 0.009 ns/op >> Single.fullFence avgt 9 4.689 ± 0.002 ns/op >> Single.plain avgt 9 0.409 ± 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ± 0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ± 0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too Thanks! No more reviews needed for this, or? ------------- PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Thu Oct 28 17:16:21 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 17:16:21 GMT Subject: RFR: 8255286: Implement ParametersTypeData::print_data_on fully In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 15:37:37 GMT, Aleksey Shipilev wrote: > There is a little FIXME left over from the JDK 9 build warning fixes. This basically implements `ParametersTypeData::print_data_on` the way `SpeculativeTrapData::print_data_on` is currently implemented. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/6143 From shade at openjdk.java.net Thu Oct 28 17:16:22 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 17:16:22 GMT Subject: Integrated: 8255286: Implement ParametersTypeData::print_data_on fully In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 15:37:37 GMT, Aleksey Shipilev wrote: > There is a little FIXME left over from the JDK 9 build warning fixes. This basically implements `ParametersTypeData::print_data_on` the way `SpeculativeTrapData::print_data_on` is currently implemented. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` This pull request has now been integrated.
Changeset: 6d8fa8f6 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/6d8fa8f6632a78dc79786cb102ba20f6834ad3f4 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod 8255286: Implement ParametersTypeData::print_data_on fully Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/6143 From psandoz at openjdk.java.net Thu Oct 28 17:36:25 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 28 Oct 2021 17:36:25 GMT Subject: RFR: 8276096: Simplify Unsafe.{load|store}Fence fallbacks by delegating to fullFence In-Reply-To: References: Message-ID: <1XLxQe8x0LZNSMz8I-gJk4SPyaNRVl-dMxPEp0rjIRU=.a088a9d6-83e0-498e-8b90-308a53637f5c@github.com> On Thu, 28 Oct 2021 08:47:31 GMT, Aleksey Shipilev wrote: > `Unsafe.{load|store}Fence` falls back to `unsafe.cpp` for `OrderAccess::{acquire|release}Fence()`. It seems too heavy-handed (useless?) to call to runtime for a single memory barrier. We can simplify the native `Unsafe` interface by falling back to `fullFence` when `{load|store}Fence` intrinsics are not available. This would be similar to what `Unsafe.{loadLoad|storeStore}Fences` do. > > This is the behavior of these intrinsics now, on x86_64, using benchmarks from JDK-8276054: > > > Benchmark Mode Cnt Score Error Units > > # Default > Single.acquire avgt 3 0.407 ± 0.060 ns/op > Single.full avgt 3 4.693 ± 0.005 ns/op > Single.loadLoad avgt 3 0.415 ± 0.095 ns/op > Single.plain avgt 3 0.406 ± 0.002 ns/op > Single.release avgt 3 0.408 ± 0.047 ns/op > Single.storeStore avgt 3 0.408 ± 0.043 ns/op > > # -XX:DisableIntrinsic=_storeFence > Single.acquire avgt 3 0.408 ± 0.016 ns/op > Single.full avgt 3 4.694 ± 0.002 ns/op > Single.loadLoad avgt 3 0.406 ± 0.002 ns/op > Single.plain avgt 3 0.406 ± 0.001 ns/op > Single.release avgt 3 4.694 ± 0.003 ns/op <--- upgraded to full > Single.storeStore avgt 3 4.690 ± 0.005 ns/op <--- upgraded to full > > # -XX:DisableIntrinsic=_loadFence > Single.acquire avgt 3 4.691 ±
0.001 ns/op <--- upgraded to full > Single.full avgt 3 4.693 ± 0.009 ns/op > Single.loadLoad avgt 3 4.693 ± 0.013 ns/op <--- upgraded to full > Single.plain avgt 3 0.408 ± 0.072 ns/op > Single.release avgt 3 0.415 ± 0.016 ns/op > Single.storeStore avgt 3 0.416 ± 0.041 ns/op > > # -XX:DisableIntrinsic=_fullFence > Single.acquire avgt 3 0.406 ± 0.014 ns/op > Single.full avgt 3 15.836 ± 0.151 ns/op <--- calls runtime > Single.loadLoad avgt 3 0.406 ± 0.001 ns/op > Single.plain avgt 3 0.426 ± 0.361 ns/op > Single.release avgt 3 0.407 ± 0.021 ns/op > Single.storeStore avgt 3 0.410 ± 0.061 ns/op > > # -XX:DisableIntrinsic=_fullFence,_loadFence > Single.acquire avgt 3 15.822 ± 0.282 ns/op <--- upgraded, calls runtime > Single.full avgt 3 15.851 ± 0.127 ns/op <--- calls runtime > Single.loadLoad avgt 3 15.829 ± 0.045 ns/op <--- upgraded, calls runtime > Single.plain avgt 3 0.406 ± 0.001 ns/op > Single.release avgt 3 0.414 ± 0.156 ns/op > Single.storeStore avgt 3 0.422 ± 0.452 ns/op > > # -XX:DisableIntrinsic=_fullFence,_storeFence > Single.acquire avgt 3 0.407 ± 0.016 ns/op > Single.full avgt 3 15.347 ± 6.783 ns/op <--- calls runtime > Single.loadLoad avgt 3 0.406 ± 0.001 ns/op > Single.plain avgt 3 0.406 ± 0.002 ns/op > Single.release avgt 3 15.828 ± 0.019 ns/op <--- upgraded, calls runtime > Single.storeStore avgt 3 15.834 ± 0.045 ns/op <--- upgraded, calls runtime > > # -XX:DisableIntrinsic=_fullFence,_loadFence,_storeFence > Single.acquire avgt 3 15.838 ± 0.030 ns/op <--- upgraded, calls runtime > Single.full avgt 3 15.854 ± 0.277 ns/op <--- calls runtime > Single.loadLoad avgt 3 15.826 ± 0.160 ns/op <--- upgraded, calls runtime > Single.plain avgt 3 0.406 ± 0.003 ns/op > Single.release avgt 3 15.838 ± 0.019 ns/op <--- upgraded, calls runtime > Single.storeStore avgt 3 15.844 ± 0.104 ns/op <--- upgraded, calls runtime > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` Marked as reviewed by psandoz (Reviewer).
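The "upgrade to full" fallback measured above can be mirrored in a few lines of standard C++. This is an analogy only, not the JDK's `Unsafe` code: the `Intrinsics` switches and the `Issued` result type are hypothetical and exist just to make the chosen path observable. The substitution is safe because a sequentially consistent fence is at least as strong as an acquire or a release fence.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical switches standing in for "is this intrinsic available?".
struct Intrinsics {
  bool load_fence;
  bool store_fence;
};

// Which fence was issued; returned only so the choice can be observed.
enum class Issued { Acquire, Release, Full };

Issued full_fence() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return Issued::Full;
}

Issued load_fence(const Intrinsics& in) {
  if (in.load_fence) {
    std::atomic_thread_fence(std::memory_order_acquire);
    return Issued::Acquire;
  }
  return full_fence();  // stronger than required, never weaker
}

Issued store_fence(const Intrinsics& in) {
  if (in.store_fence) {
    std::atomic_thread_fence(std::memory_order_release);
    return Issued::Release;
  }
  return full_fence();  // stronger than required, never weaker
}
```

With `store_fence` switched off, `store_fence()` reports `Issued::Full` while `load_fence()` still reports `Issued::Acquire`, matching the shape of the `-XX:DisableIntrinsic=_storeFence` rows in the quoted table.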
------------- PR: https://git.openjdk.java.net/jdk/pull/6149 From redestad at openjdk.java.net Thu Oct 28 17:37:22 2021 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 28 Oct 2021 17:37:22 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ± 0.005 ns/op >> Multiple.full avgt 9 4.694 ± 0.003 ns/op >> Multiple.plain avgt 9 0.407 ± 0.004 ns/op >> Multiple.release avgt 9 0.407 ± 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ± 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ± 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ± 0.013 ns/op >> Single.acquireFence avgt 9 0.410 ± 0.009 ns/op >> Single.fullFence avgt 9 4.689 ± 0.002 ns/op >> Single.plain avgt 9 0.409 ± 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ± 0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ±
0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too I would say you don't need more reviews for this. ------------- PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Thu Oct 28 17:37:23 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 17:37:23 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ± 0.005 ns/op >> Multiple.full avgt 9 4.694 ± 0.003 ns/op >> Multiple.plain avgt 9 0.407 ± 0.004 ns/op >> Multiple.release avgt 9 0.407 ± 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ± 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ± 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ± 0.013 ns/op >> Single.acquireFence avgt 9 0.410 ±
0.009 ns/op >> Single.fullFence avgt 9 4.689 ± 0.002 ns/op >> Single.plain avgt 9 0.409 ± 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ± 0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ± 0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too Okay then! Not sure what the rules for benchmarks are. ------------- PR: https://git.openjdk.java.net/jdk/pull/6138 From shade at openjdk.java.net Thu Oct 28 17:37:23 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 28 Oct 2021 17:37:23 GMT Subject: Integrated: 8276054: JMH benchmarks for Fences In-Reply-To: References: Message-ID: On Wed, 27 Oct 2021 12:11:37 GMT, Aleksey Shipilev wrote: > While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. > > Sample output for `make test TEST=micro:vm.fences` on `x86_64`: > > > Benchmark Mode Cnt Score Error Units > Multiple.acquire avgt 9 0.408 ± 0.005 ns/op > Multiple.full avgt 9 4.694 ± 0.003 ns/op > Multiple.plain avgt 9 0.407 ± 0.004 ns/op > Multiple.release avgt 9 0.407 ± 0.003 ns/op > Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op > MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op > MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op > MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op > MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op > MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op > MultipleWithStores.acquire avgt 9 0.812 ± 0.001 ns/op > MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op > MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op > MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op > MultipleWithStores.storeStore avgt 9 0.812 ±
0.001 ns/op > SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op > SafePublishing.release avgt 9 6.368 ± 0.010 ns/op > SafePublishing.storeStore avgt 9 6.376 ± 0.013 ns/op > Single.acquireFence avgt 9 0.410 ± 0.009 ns/op > Single.fullFence avgt 9 4.689 ± 0.002 ns/op > Single.plain avgt 9 0.409 ± 0.010 ns/op > Single.releaseFence avgt 9 0.410 ± 0.006 ns/op > Single.storeStoreFence avgt 9 0.410 ± 0.009 ns/op This pull request has now been integrated. Changeset: 5a768f75 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/5a768f75c9cb013edbf6c61e79820bd180cad4ba Stats: 451 lines in 5 files changed: 451 ins; 0 del; 0 mod 8276054: JMH benchmarks for Fences Reviewed-by: redestad ------------- PR: https://git.openjdk.java.net/jdk/pull/6138 From redestad at openjdk.java.net Thu Oct 28 17:50:20 2021 From: redestad at openjdk.java.net (Claes Redestad) Date: Thu, 28 Oct 2021 17:50:20 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ± 0.005 ns/op >> Multiple.full avgt 9 4.694 ± 0.003 ns/op >> Multiple.plain avgt 9 0.407 ± 0.004 ns/op >> Multiple.release avgt 9 0.407 ± 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ± 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ± 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ± 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ± 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ± 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ±
0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ± 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ± 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ± 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ± 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ± 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ± 0.013 ns/op >> Single.acquireFence avgt 9 0.410 ± 0.009 ns/op >> Single.fullFence avgt 9 4.689 ± 0.002 ns/op >> Single.plain avgt 9 0.409 ± 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ± 0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ± 0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too I think the general rule in OpenJDK projects is simply one reviewer approval. I have not seen a need to try and add any special rules. I'm also not even sure where special rules - like those enforced by the hotspot maintainers (two reviewers, 24hr grace time) - are written down. ------------- PR: https://git.openjdk.java.net/jdk/pull/6138 From dholmes at openjdk.java.net Thu Oct 28 21:21:16 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 28 Oct 2021 21:21:16 GMT Subject: RFR: 8276054: JMH benchmarks for Fences [v3] In-Reply-To: References: Message-ID: <8gd4yYCRkG2PK8d9qQCZB415GjbB4srxMEWYg_n5HQA=.91b9ac13-5dfa-413e-bca9-220ddc7ee8d2@github.com> On Thu, 28 Oct 2021 07:50:55 GMT, Aleksey Shipilev wrote: >> While working on JDK-8252990, I realized there are no microbenchmarks for Fences. It would be good to add some. While technically we can benchmark `Unsafe` directly, I instead chose to use the public API, `VarHandles`, as this captures the real-world uses too. >> >> Sample output for `make test TEST=micro:vm.fences` on `x86_64`: >> >> >> Benchmark Mode Cnt Score Error Units >> Multiple.acquire avgt 9 0.408 ±
0.005 ns/op >> Multiple.full avgt 9 4.694 ? 0.003 ns/op >> Multiple.plain avgt 9 0.407 ? 0.004 ns/op >> Multiple.release avgt 9 0.407 ? 0.003 ns/op >> Multiple.storeStore avgt 9 0.409 ? 0.006 ns/op >> MultipleWithLoads.acquire avgt 9 4.600 ? 0.002 ns/op >> MultipleWithLoads.full avgt 9 8.322 ? 0.003 ns/op >> MultipleWithLoads.plain avgt 9 4.328 ? 0.001 ns/op >> MultipleWithLoads.release avgt 9 4.600 ? 0.002 ns/op >> MultipleWithLoads.storeStore avgt 9 4.600 ? 0.002 ns/op >> MultipleWithStores.acquire avgt 9 0.812 ? 0.001 ns/op >> MultipleWithStores.full avgt 9 5.291 ? 0.001 ns/op >> MultipleWithStores.plain avgt 9 0.812 ? 0.001 ns/op >> MultipleWithStores.release avgt 9 0.812 ? 0.001 ns/op >> MultipleWithStores.storeStore avgt 9 0.812 ? 0.001 ns/op >> SafePublishing.plain avgt 9 6.378 ? 0.011 ns/op >> SafePublishing.release avgt 9 6.368 ? 0.010 ns/op >> SafePublishing.storeStore avgt 9 6.376 ? 0.013 ns/op >> Single.acquireFence avgt 9 0.410 ? 0.009 ns/op >> Single.fullFence avgt 9 4.689 ? 0.002 ns/op >> Single.plain avgt 9 0.409 ? 0.010 ns/op >> Single.releaseFence avgt 9 0.410 ? 0.006 ns/op >> Single.storeStoreFence avgt 9 0.410 ? 0.009 ns/op > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Consistent benchmark names: drop excess "Fence" > - Add loadLoad tests too The Hotspot rules are documented here: https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change and in the process of migrating to the OpenJDK dev guide. @shipilev Not sure why this was even posted to hotspot-dev for review ??? 
:)

-------------

PR: https://git.openjdk.java.net/jdk/pull/6138

From redestad at openjdk.java.net Thu Oct 28 21:44:18 2021
From: redestad at openjdk.java.net (Claes Redestad)
Date: Thu, 28 Oct 2021 21:44:18 GMT
Subject: RFR: 8276054: JMH benchmarks for Fences [v3]
In-Reply-To: <8gd4yYCRkG2PK8d9qQCZB415GjbB4srxMEWYg_n5HQA=.91b9ac13-5dfa-413e-bca9-220ddc7ee8d2@github.com>
References: <8gd4yYCRkG2PK8d9qQCZB415GjbB4srxMEWYg_n5HQA=.91b9ac13-5dfa-413e-bca9-220ddc7ee8d2@github.com>
Message-ID: <2t57HIOjSiOWnfH7KU-TGiaCWyLvFfFRVX7L8jBOVM4=.ad3d681c-6170-486d-8afd-ab864d27eb88@github.com>

On Thu, 28 Oct 2021 21:17:40 GMT, David Holmes wrote:

>> Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Consistent benchmark names: drop excess "Fence"
>> - Add loadLoad tests too
>
> The Hotspot rules are documented here:
>
> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change
>
> and in the process of migrating to the OpenJDK dev guide.
>
> @shipilev Not sure why this was even posted to hotspot-dev for review ??? :)

@dholmes-ora we've not created a dedicated mailing list for microbenchmarks; rather, we stick to the convention that they are to be reviewed on the list(s) appropriate for the component being tested/stressed. Since these micros use `VarHandle`, maybe they should have been reviewed on both the core-libs and hotspot dev lists.

Furthermore, in this case I've interpreted the rules to mean that even though the tests are VM-centric, they don't touch the HotSpot source code and are thus not subject to the stricter set of rules put in place there.
-------------

PR: https://git.openjdk.java.net/jdk/pull/6138

From jincheng at ca.ibm.com Fri Oct 29 00:25:34 2021
From: jincheng at ca.ibm.com (Cheng Jin)
Date: Thu, 28 Oct 2021 20:25:34 -0400
Subject: Incorrect hehavior on the class name (UTF8) in the constant pool of bytecode
Message-ID:

One or more of the following files ( dumped.class ) violates IBM policy and all attachment(s) have been removed from the message.

**********************************************************************

Hi There,

I created a simple test that loads a class file as follows to see whether a package name in the constant pool is rejected as invalid for a class name. However, it surprised me that it just passed without any exception on Hotspot (e.g. OpenJDK11).

(See attached file: dumped.class)

constant_pool (in dumped.class)
...
3. Utf8
   tag: 1
   length: 16
   bytes: die/verwandlung/   <----- a package name rather than a valid class name
4. Class
   tag: 7
   name_index: 3   <------

import java.io.*;
public class CustomClassLoader extends ClassLoader {

    @Override
    public Class findClass(String fileName) throws ClassNotFoundException {
        byte[] b = loadClassBytes(fileName);
        return defineClass(null, b, 0, b.length);
    }

    private byte[] loadClassBytes(String fileName) {
        InputStream inputStream = getClass().getClassLoader().getResourceAsStream(fileName + ".class");
        ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
        try {
            int nextByte = 0;
            while ((nextByte = inputStream.read()) != -1) {
                byteOutStream.write(nextByte);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return byteOutStream.toByteArray();
    }

    public static void main(String args[]) {
        try {
            CustomClassLoader cl = new CustomClassLoader();
            cl.findClass("dumped");
            System.out.println("DONE.....");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

$ jdk11_hotspot/bin/java CustomClassLoader
DONE.....

According to the VM Spec at 4.2.1 Binary Class and Interface Names

Class and interface names that appear in class file structures are always represented in a fully qualified form known as binary names (JLS §13.1). ...In this internal form, the ASCII periods (.) that normally separate the identifiers which make up the binary name are replaced by ASCII forward slashes (/). The identifiers themselves must be unqualified names (§4.2.2).

For example, the normal binary name of class Thread is java.lang.Thread. In the internal form used in descriptors in the class file format, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string java/lang/Thread.

It means a valid class name should be something like "xxx" or "xxx/yyy/zzz" (where "/" only serves as the separator in between, and "/" shouldn't occur at the end), in which case "xxx/yyy/" is treated as invalid for a class name.

So I am wondering why Hotspot doesn't follow the VM Spec to check the invalid package name in the constant pool.

Thanks and Best Regards
Cheng Jin

From ioi.lam at oracle.com Fri Oct 29 04:05:41 2021
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 28 Oct 2021 21:05:41 -0700
Subject: Incorrect hehavior on the class name (UTF8) in the constant pool of bytecode
In-Reply-To:
References:
Message-ID: <1a1dbb59-09ee-c628-f7b3-a1ea2aee71fc@oracle.com>

Hi Cheng Jin,

Dumped.class couldn't be posted to the mailing list. Could you upload it somewhere, or send a base64-encoded version of it?

$ base64 < dumped.class

Thanks
- Ioi

On 10/28/21 5:25 PM, Cheng Jin wrote:
> One or more of the following files ( dumped.class ) violates IBM policy and all attachment(s) have been removed from the message.
>
> **********************************************************************
>
>
> Hi There,
>
> I created a simple test that loads a class file as follows to see whether a
> package name in the constant pool is rejected as invalid for a class name.
> However, it surprised me that it just passed without any exception on
> Hotspot (e.g. OpenJDK11).
>
> (See attached file: dumped.class)
>
> constant_pool (in dumped.class)
> ...
> 3. Utf8
>    tag: 1
>    length: 16
>    bytes: die/verwandlung/   <----- a package name rather than a valid class name
> 4. Class
>    tag: 7
>    name_index: 3   <------
>
>
> import java.io.*;
> public class CustomClassLoader extends ClassLoader {
>
>     @Override
>     public Class findClass(String fileName) throws ClassNotFoundException {
>         byte[] b = loadClassBytes(fileName);
>         return defineClass(null, b, 0, b.length);
>     }
>
>     private byte[] loadClassBytes(String fileName) {
>         InputStream inputStream = getClass().getClassLoader().getResourceAsStream(fileName + ".class");
>         ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
>         try {
>             int nextByte = 0;
>             while ((nextByte = inputStream.read()) != -1) {
>                 byteOutStream.write(nextByte);
>             }
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>         return byteOutStream.toByteArray();
>     }
>
>     public static void main(String args[]) {
>         try {
>             CustomClassLoader cl = new CustomClassLoader();
>             cl.findClass("dumped");
>             System.out.println("DONE.....");
>         } catch (Exception e) {
>             e.printStackTrace();
>         }
>     }
> }
>
> $ jdk11_hotspot/bin/java CustomClassLoader
> DONE.....
>
>
> According to the VM Spec at 4.2.1 Binary Class and Interface Names
>
> Class and interface names that appear in class file structures are always
> represented in a fully qualified form known as binary names (JLS §13.1).
> ...In this internal form, the ASCII periods (.) that normally separate the
> identifiers which make up the binary name are replaced by ASCII forward
> slashes (/). The identifiers themselves must be unqualified names (§4.2.2).
>
> For example, the normal binary name of class Thread is java.lang.Thread. In
> the internal form used in descriptors in the class file format, a reference
> to the name of class Thread is implemented using a CONSTANT_Utf8_info
> structure representing the string java/lang/Thread.
>
> It means a valid class name should be something like "xxx" or
> "xxx/yyy/zzz" (where "/" only serves as the separator in between, and "/"
> shouldn't occur at the end),
> in which case "xxx/yyy/" is treated as invalid for a class name.
>
>
> So I am wondering why Hotspot doesn't follow the VM Spec to check the
> invalid package name in the constant pool.
>
>
> Thanks and Best Regards
> Cheng Jin

From stuefe at openjdk.java.net Fri Oct 29 04:19:06 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Fri, 29 Oct 2021 04:19:06 GMT
Subject: RFR: JDK-8275320: NMT should perform buffer overrun checks
In-Reply-To:
References:
Message-ID: <7hnaO7f3xC6SGhfcY7_ypJ2f16QtAgom0pQQaxhbfvE=.468d16ad-95a6-420f-ad26-222e0695ebe3@github.com>

On Thu, 14 Oct 2021 15:49:05 GMT, Thomas Stuefe wrote:

> This is part of a number of RFEs I plan in order to improve and simplify C-heap overflow checking in hotspot. For the whole story please refer to https://bugs.openjdk.java.net/browse/JDK-8275301.
>
> This proposal adds NMT buffer overflow checking. As laid out in JDK-8275301:
>
> - it would give us C-heap overflow checking in release builds
> - the additional costs are negligible
> - NMT needs intact headers anyway. Faced with buffer overwrites today, it would maybe crash or maybe account wrongly, but it's a bit of a lottery really. The error reports would also be confusing.
> - it is a preparation for future code removal (the memory guarding done in debug only in os::malloc() and friends, and possibly the guarding done with CheckJNICalls)
>
> Patch notes:
>
> 1) The malloc header is changed such that it contains a 16-bit canary directly preceding the user payload of the allocation.
> > On 64-bit, we don't even need to enlarge the malloc header: we carve some bits out by decreasing the size of the bucket index bit field to 16 bits. The bucket index field is used to store the bucket slot of the malloc site table in NMT detail mode. The malloc site table width is 512 atm, so 65k gives plenty of room for growing the malloc site table should we ever want to. > > On 32-bit, I had to enlarge the header from 8 bytes to 16 bytes. That is because there were not enough bits to spare for a canary. On the upside, 8 bytes were not enough anyway, strictly speaking, to guarantee proper alignment e.g. for 128bit data types on all 32-bit platforms. See e.g. the malloc alignment the glibc uses. > > I also took the freedom of re-arranging the malloc header fields a bit to minimize the difference between 32-bit and 64-bit platforms, and to align each field optimally according to its size. I also switched from bitfields to real types in order to be able to do a sizeof() on them. > > For more details, see the comment in mallocTracker.hpp. > > 2) I added a footer canary trailing the user allocation to catch tail buffer overruns. For simplicity reasons (alignment) and to save some cycles I made it a byte only. That is enough to catch most overrun scenarios. If you think this is too small, I'm open to change it. > > 3) I put a bit of work into error reporting. When NMT detects corruption, it will now print out a hex dump of the corrupted area to tty before asserting. > > 4) I added a bunch of gtests to test various heap overwrite scenarios. I also had to extend the gtest macros a bit because I wanted these tests of course to run in release builds too, but we did not have a death test macro for release builds yet (there are possibilities for code simplification here too, but that's for another RFE). > > (Note that these gtests, to test anything, need to run with NMT switched on. We do this as part of our NMT jtreg-controlled gtests in tier1). 
> > Even though the patch adds more code than it removes, it prepares possible code removal (if we can agree to do that) and the net result will be less complexity, not more. Again, see JDK-8275301 for details. > > -------------- > > Example output a buffer overrun would provide: > > > Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?) > NMT Block at 0x00005600f86136b0, corruption at: 0x00005600f86136c1: > 0x00005600f86136a8: 21 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 > 0x00005600f86136b8: 00 00 00 00 0f 00 1f fa 00 61 00 00 00 00 00 00 > 0x00005600f86136c8: 41 39 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136d8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136e8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f86136f8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613708: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613718: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613728: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x00005600f8613738: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > assert failed: fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)# > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (mallocTracker.cpp:203), pid=10805, tid=10805 > # fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?) > # > > ------- > > Tests: > - manual tests with Linux x64, x86, minimal build > - GHAs all clean > - SAP nightlies ran for 14 days in a row without problems Patch has been active in our systems for 14 days without problems. 
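
The header and footer canary idea from the patch notes can be illustrated with a toy model. This is only a sketch in Java, not the layout used in mallocTracker.hpp: the class name, the canary values and the `[size][canary][payload][footer]` ordering are assumptions made for illustration, and a `ByteBuffer` stands in for raw C-heap memory.

```java
import java.nio.ByteBuffer;

/** Toy model of a guarded block: [size:int][canary:short][payload...][footer:byte]. */
public class CanarySketch {
    // Hypothetical canary values, picked only for this illustration.
    public static final short HEADER_CANARY = (short) 0xE99E;
    public static final byte FOOTER_CANARY = (byte) 0xE8;
    public static final int HEADER_SIZE = Integer.BYTES + Short.BYTES;

    /** Returns a block holding the header, a payload of the given size, and the footer. */
    public static ByteBuffer allocate(int payloadSize) {
        ByteBuffer block = ByteBuffer.allocate(HEADER_SIZE + payloadSize + 1);
        block.putInt(0, payloadSize);
        block.putShort(Integer.BYTES, HEADER_CANARY);         // directly precedes the payload
        block.put(HEADER_SIZE + payloadSize, FOOTER_CANARY);  // directly follows the payload
        return block;
    }

    /** Describes the first broken canary, or returns null if the block looks intact. */
    public static String check(ByteBuffer block) {
        if (block.getShort(Integer.BYTES) != HEADER_CANARY) {
            return "header canary broken";
        }
        int payloadSize = block.getInt(0);
        if (block.get(HEADER_SIZE + payloadSize) != FOOTER_CANARY) {
            return "footer canary broken";
        }
        return null;
    }

    public static void main(String[] args) {
        ByteBuffer block = allocate(16);
        System.out.println(check(block));          // intact block
        block.put(HEADER_SIZE + 16, (byte) 0);     // simulate a one-byte overrun past the payload
        System.out.println(check(block));          // the footer check now reports the damage
    }
}
```

The sketch checks the header canary before trusting the stored size, which mirrors the point in the description that the tracker needs intact headers before it can do anything else with a block.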
-------------

PR: https://git.openjdk.java.net/jdk/pull/5952

From jincheng at ca.ibm.com Fri Oct 29 04:39:02 2021
From: jincheng at ca.ibm.com (Cheng Jin)
Date: Fri, 29 Oct 2021 00:39:02 -0400
Subject: Incorrect hehavior on the class name (UTF8) in the constant pool of bytecode
In-Reply-To: <1a1dbb59-09ee-c628-f7b3-a1ea2aee71fc@oracle.com>
References: <1a1dbb59-09ee-c628-f7b3-a1ea2aee71fc@oracle.com>
Message-ID:

Hi Lam,

Here's the link to the class file on Google Drive.

https://drive.google.com/file/d/1g723UF4FOkYkn8majZUYNQVlTJAQ492W/view?usp=sharing

Thanks and Best Regards
Cheng Jin

From david.holmes at oracle.com Fri Oct 29 06:17:57 2021
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 29 Oct 2021 16:17:57 +1000
Subject: Incorrect hehavior on the class name (UTF8) in the constant pool of bytecode
In-Reply-To:
References:
Message-ID:

Hi,

On 29/10/2021 10:25 am, Cheng Jin wrote:
> Hi There,
>
> I created a simple test that loads a class file as follows to see whether a
> package name in the constant pool is rejected as invalid for a class name.
> However, it surprised me that it just passed without any exception on
> Hotspot (e.g. OpenJDK11).

Interesting. ClassFileParser::verify_unqualified_name certainly has logic that is supposed to catch this case. But we can investigate.

Cheers,
David

> (See attached file: dumped.class)
>
> constant_pool (in dumped.class)
> ...
> 3. Utf8
>    tag: 1
>    length: 16
>    bytes: die/verwandlung/   <----- a package name rather than a valid class name
> 4. Class
>    tag: 7
>    name_index: 3   <------
>
>
> import java.io.*;
> public class CustomClassLoader extends ClassLoader {
>
>     @Override
>     public Class findClass(String fileName) throws ClassNotFoundException {
>         byte[] b = loadClassBytes(fileName);
>         return defineClass(null, b, 0, b.length);
>     }
>
>     private byte[] loadClassBytes(String fileName) {
>         InputStream inputStream = getClass().getClassLoader().getResourceAsStream(fileName + ".class");
>         ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
>         try {
>             int nextByte = 0;
>             while ((nextByte = inputStream.read()) != -1) {
>                 byteOutStream.write(nextByte);
>             }
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>         return byteOutStream.toByteArray();
>     }
>
>     public static void main(String args[]) {
>         try {
>             CustomClassLoader cl = new CustomClassLoader();
>             cl.findClass("dumped");
>             System.out.println("DONE.....");
>         } catch (Exception e) {
>             e.printStackTrace();
>         }
>     }
> }
>
> $ jdk11_hotspot/bin/java CustomClassLoader
> DONE.....
>
>
> According to the VM Spec at 4.2.1 Binary Class and Interface Names
>
> Class and interface names that appear in class file structures are always
> represented in a fully qualified form known as binary names (JLS §13.1).
> ...In this internal form, the ASCII periods (.) that normally separate the
> identifiers which make up the binary name are replaced by ASCII forward
> slashes (/). The identifiers themselves must be unqualified names (§4.2.2).
>
> For example, the normal binary name of class Thread is java.lang.Thread. In
> the internal form used in descriptors in the class file format, a reference
> to the name of class Thread is implemented using a CONSTANT_Utf8_info
> structure representing the string java/lang/Thread.
>
> It means a valid class name should be something like "xxx" or
> "xxx/yyy/zzz" (where "/" only serves as the separator in between, and "/"
> shouldn't occur at the end),
> in which case "xxx/yyy/" is treated as invalid for a class name.
>
>
> So I am wondering why Hotspot doesn't follow the VM Spec to check the
> invalid package name in the constant pool.
>
>
> Thanks and Best Regards
> Cheng Jin
>

From duke at openjdk.java.net Fri Oct 29 09:24:47 2021
From: duke at openjdk.java.net (Fei Gao)
Date: Fri, 29 Oct 2021 09:24:47 GMT
Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v4]
In-Reply-To:
References:
Message-ID: <6RXxK49iDwBKpqVZar9-4B1AO5z6lLagcjLVBHT5sKo=.a274ceff-4c76-4cad-a9e7-f5f7f148ea83@github.com>

> for(int i = 0; i < LENGTH; i++) {
>     c[i] = a[i] + 2;
> }
>
> For the case shown above, after superword optimization with SVE,
> without the patch, the vector add operation always has 2 z-reg inputs,
> like:
>     mov z16.s, #2
>     add z17.s, z17.s, z16.s
>
> Considering SVE supports basic binary operations with an immediate,
> this pattern could be further optimized to:
>     add z16.s, z16.s, #2
>
> To implement it, we added some new match rules and assembler rules in
> the aarch64 backend. We also made some extensions to immediate types
> and functions to keep backward compatibility.
>
> With the patch, only these binary integer vector operations, +(add),
> -(sub), &(and), |(orr), and ^(eor) with immediate are supported for
> the optimization. Other vector operations are not supported currently.
>
> Tested tier1 and test/hotspot/jtreg/compiler on an SVE-featured AArch64
> CPU, no new failures.
>
> There is no obvious performance uplift but it can help remove one
> redundant mov instruction.
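
For reference, the five loop shapes named in the description (add, sub, and, or, xor with a small constant) can be written as the Java kernels below. This is a hypothetical sketch: the class and method names are made up, and whether C2 auto-vectorizes a given loop and selects the immediate encoding depends on the JDK build, the CPU, and whether the constant is encodable.

```java
public class SveImmediateShapes {
    public static final int LENGTH = 1024;

    // Each kernel applies one binary integer operation with a constant operand,
    // matching the operations listed in the description.
    public static void addImm(int[] a, int[] c) { for (int i = 0; i < LENGTH; i++) c[i] = a[i] + 2; }
    public static void subImm(int[] a, int[] c) { for (int i = 0; i < LENGTH; i++) c[i] = a[i] - 2; }
    public static void andImm(int[] a, int[] c) { for (int i = 0; i < LENGTH; i++) c[i] = a[i] & 2; }
    public static void orImm(int[] a, int[] c)  { for (int i = 0; i < LENGTH; i++) c[i] = a[i] | 2; }
    public static void xorImm(int[] a, int[] c) { for (int i = 0; i < LENGTH; i++) c[i] = a[i] ^ 2; }

    public static void main(String[] args) {
        int[] a = new int[LENGTH];
        int[] c = new int[LENGTH];
        for (int i = 0; i < LENGTH; i++) {
            a[i] = i;
        }
        // Repeat the calls so the JIT has a chance to compile the kernels.
        for (int iter = 0; iter < 20_000; iter++) {
            addImm(a, c);
            subImm(a, c);
            andImm(a, c);
            orImm(a, c);
            xorImm(a, c);
        }
        System.out.println(c[LENGTH - 1]);
    }
}
```

On a machine with a disassembler plugin available, running such a class with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly` is one way to inspect which instruction form was actually emitted.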
Fei Gao has updated the pull request incrementally with one additional commit since the last revision: Add some assertion lines for help functions Change-Id: Ic9120902bd8f8a8ead2e3740435a40f35d21757c ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6115/files - new: https://git.openjdk.java.net/jdk/pull/6115/files/a7a915af..05c712d4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=02-03 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6115.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115 PR: https://git.openjdk.java.net/jdk/pull/6115 From duke at openjdk.java.net Fri Oct 29 09:24:52 2021 From: duke at openjdk.java.net (Fei Gao) Date: Fri, 29 Oct 2021 09:24:52 GMT Subject: RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v3] In-Reply-To: <-qQsA7EWkM3X98NNZJNQBFW-YSh6W745k8lvR2WBVXs=.c7033a2b-c9de-4b8b-9d15-e371ba4cb5b4@github.com> References: <1wJVjyUjZyArCiABWBLy5VCt_rGiE_Y96KdePt6f5Ck=.00fe1bfe-2a9b-4d79-b49a-6ae5bfb22108@github.com> <-qQsA7EWkM3X98NNZJNQBFW-YSh6W745k8lvR2WBVXs=.c7033a2b-c9de-4b8b-9d15-e371ba4cb5b4@github.com> Message-ID: On Thu, 28 Oct 2021 11:46:58 GMT, Andrew Haley wrote: >> Fei Gao has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. > > src/hotspot/cpu/aarch64/assembler_aarch64.cpp line 86: > >> 84: >> 85: unsigned Assembler::regVariant_to_elemBits(Assembler::SIMD_RegVariant T){ >> 86: return 1 << (T + 3); > > Assert something about `T` here. Done > src/hotspot/cpu/aarch64/assembler_aarch64.cpp line 394: > >> 392: if (uimm < (UCONST64(1) << nbits)) >> 393: return true; >> 394: if (uimm < (UCONST64(1) << (2 * nbits)) > > Assert something about `nbits` here. It has to be less than 32, I think. 
Done

-------------

PR: https://git.openjdk.java.net/jdk/pull/6115

From aph at openjdk.java.net Fri Oct 29 09:54:14 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 29 Oct 2021 09:54:14 GMT
Subject: RFR: 8276096: Simplify Unsafe.{load|store}Fence fallbacks by delegating to fullFence
In-Reply-To:
References:
Message-ID:

On Thu, 28 Oct 2021 08:47:31 GMT, Aleksey Shipilev wrote:

> `Unsafe.{load|store}Fence` falls back to `unsafe.cpp` for `OrderAccess::{acquire|release}Fence()`. It seems too heavy-handed (useless?) to call to runtime for a single memory barrier. We can simplify the native `Unsafe` interface by falling back to `fullFence` when `{load|store}Fence` intrinsics are not available. This would be similar to what `Unsafe.{loadLoad|storeStore}Fences` do.
>
> This is the behavior of these intrinsics now, on x86_64, using benchmarks from JDK-8276054:
>
>
> Benchmark          Mode  Cnt   Score   Error  Units
>
> # Default
> Single.acquire     avgt    3   0.407 ± 0.060  ns/op
> Single.full        avgt    3   4.693 ± 0.005  ns/op
> Single.loadLoad    avgt    3   0.415 ± 0.095  ns/op
> Single.plain       avgt    3   0.406 ± 0.002  ns/op
> Single.release     avgt    3   0.408 ± 0.047  ns/op
> Single.storeStore  avgt    3   0.408 ± 0.043  ns/op
>
> # -XX:DisableIntrinsic=_storeFence
> Single.acquire     avgt    3   0.408 ± 0.016  ns/op
> Single.full        avgt    3   4.694 ± 0.002  ns/op
> Single.loadLoad    avgt    3   0.406 ± 0.002  ns/op
> Single.plain       avgt    3   0.406 ± 0.001  ns/op
> Single.release     avgt    3   4.694 ± 0.003  ns/op  <--- upgraded to full
> Single.storeStore  avgt    3   4.690 ± 0.005  ns/op  <--- upgraded to full
>
> # -XX:DisableIntrinsic=_loadFence
> Single.acquire     avgt    3   4.691 ± 0.001  ns/op  <--- upgraded to full
> Single.full        avgt    3   4.693 ± 0.009  ns/op
> Single.loadLoad    avgt    3   4.693 ± 0.013  ns/op  <--- upgraded to full
> Single.plain       avgt    3   0.408 ± 0.072  ns/op
> Single.release     avgt    3   0.415 ± 0.016  ns/op
> Single.storeStore  avgt    3   0.416 ± 0.041  ns/op
>
> # -XX:DisableIntrinsic=_fullFence
> Single.acquire     avgt    3   0.406 ± 0.014  ns/op
> Single.full        avgt    3  15.836 ± 0.151  ns/op  <--- calls runtime
> Single.loadLoad    avgt    3   0.406 ± 0.001  ns/op
> Single.plain       avgt    3   0.426 ± 0.361  ns/op
> Single.release     avgt    3   0.407 ± 0.021  ns/op
> Single.storeStore  avgt    3   0.410 ± 0.061  ns/op
>
> # -XX:DisableIntrinsic=_fullFence,_loadFence
> Single.acquire     avgt    3  15.822 ± 0.282  ns/op  <--- upgraded, calls runtime
> Single.full        avgt    3  15.851 ± 0.127  ns/op  <--- calls runtime
> Single.loadLoad    avgt    3  15.829 ± 0.045  ns/op  <--- upgraded, calls runtime
> Single.plain       avgt    3   0.406 ± 0.001  ns/op
> Single.release     avgt    3   0.414 ± 0.156  ns/op
> Single.storeStore  avgt    3   0.422 ± 0.452  ns/op
>
> # -XX:DisableIntrinsic=_fullFence,_storeFence
> Single.acquire     avgt    3   0.407 ± 0.016  ns/op
> Single.full        avgt    3  15.347 ± 6.783  ns/op  <--- calls runtime
> Single.loadLoad    avgt    3   0.406 ± 0.001  ns/op
> Single.plain       avgt    3   0.406 ± 0.002  ns/op
> Single.release     avgt    3  15.828 ± 0.019  ns/op  <--- upgraded, calls runtime
> Single.storeStore  avgt    3  15.834 ± 0.045  ns/op  <--- upgraded, calls runtime
>
> # -XX:DisableIntrinsic=_fullFence,_loadFence,_storeFence
> Single.acquire     avgt    3  15.838 ± 0.030  ns/op  <--- upgraded, calls runtime
> Single.full        avgt    3  15.854 ± 0.277  ns/op  <--- calls runtime
> Single.loadLoad    avgt    3  15.826 ± 0.160  ns/op  <--- upgraded, calls runtime
> Single.plain       avgt    3   0.406 ± 0.003  ns/op
> Single.release     avgt    3  15.838 ± 0.019  ns/op  <--- upgraded, calls runtime
> Single.storeStore  avgt    3  15.844 ± 0.104  ns/op  <--- upgraded, calls runtime
>
>
> Additional testing:
> - [x] Linux x86_64 fastdebug `tier1`

Marked as reviewed by aph (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/6149

From whuang at openjdk.java.net Fri Oct 29 10:34:16 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Fri, 29 Oct 2021 10:34:16 GMT
Subject: RFR: 8252990: Intrinsify Unsafe.storeStoreFence [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 28 Oct 2021 08:58:48 GMT, Aleksey Shipilev wrote:

>> `Unsafe.storeStoreFence` currently delegates to stronger `Unsafe.storeFence`.
We can teach compilers to map this directly to already existing rules that handle `MemBarStoreStore`. As with explicit `LoadFence`/`StoreFence`, we introduce a special node to differentiate explicit fences from implicit store-store barriers. `storeStoreFence` is usually used to simulate safe `final`-field like constructions in special JDK classes, like `ConstantCallSite` and friends.
>>
>> Motivational performance difference on benchmarks from JDK-8276054 on ARM32:
>>
>>
>> Benchmark                      Mode  Cnt   Score   Error  Units
>> Multiple.plain                 avgt    3   2.669 ± 0.004  ns/op
>> Multiple.release               avgt    3  16.688 ± 0.057  ns/op
>> Multiple.storeStore            avgt    3  14.021 ± 0.144  ns/op // Better
>>
>> MultipleWithLoads.plain        avgt    3   4.672 ± 0.053  ns/op
>> MultipleWithLoads.release      avgt    3  16.689 ± 0.044  ns/op
>> MultipleWithLoads.storeStore   avgt    3  14.012 ± 0.010  ns/op // Better
>>
>> MultipleWithStores.plain       avgt    3  14.687 ± 0.009  ns/op
>> MultipleWithStores.release     avgt    3  45.393 ± 0.192  ns/op
>> MultipleWithStores.storeStore  avgt    3  38.048 ± 0.033  ns/op // Better
>>
>> Publishing.plain               avgt    3  27.079 ± 0.201  ns/op
>> Publishing.release             avgt    3  27.088 ± 0.241  ns/op
>> Publishing.storeStore          avgt    3  27.009 ± 0.259  ns/op // Within error, hidden by allocation
>>
>> Single.plain                   avgt    3   2.670 ± 0.002  ns/op
>> Single.releaseFence            avgt    3   6.675 ± 0.001  ns/op
>> Single.storeStoreFence         avgt    3   8.012 ± 0.027  ns/op // Worse, seems to be ARM32 implementation artifact
>>
>>
>> As expected, this does not affect x86_64 at all, because both `release` and `storeStore` are effectively no-ops, only affecting compiler optimizations:
>>
>>
>> Benchmark                      Mode  Cnt  Score   Error  Units
>>
>> Multiple.plain                 avgt    3  0.406 ± 0.002  ns/op
>> Multiple.release               avgt    3  0.409 ± 0.018  ns/op
>> Multiple.storeStore            avgt    3  0.406 ± 0.001  ns/op
>>
>> MultipleWithLoads.plain        avgt    3  4.328 ± 0.006  ns/op
>> MultipleWithLoads.release      avgt    3  4.600 ± 0.014  ns/op
>> MultipleWithLoads.storeStore   avgt    3  4.602 ± 0.006  ns/op
>>
>> MultipleWithStores.plain       avgt    3  0.812 ± 0.001  ns/op
>> MultipleWithStores.release     avgt    3  0.812 ± 0.002  ns/op
>> MultipleWithStores.storeStore  avgt    3  0.812 ± 0.002  ns/op
>>
>> Publishing.plain               avgt    3  6.370 ± 0.059  ns/op
>> Publishing.release             avgt    3  6.358 ± 0.436  ns/op
>> Publishing.storeStore          avgt    3  6.367 ± 0.054  ns/op
>>
>> Single.plain                   avgt    3  0.407 ± 0.039  ns/op
>> Single.releaseFence            avgt    3  0.406 ± 0.001  ns/op
>> Single.storeStoreFence         avgt    3  0.406 ± 0.001  ns/op
>>
>>
>> Additional testing:
>> - [x] Linux x86_64 fastdebug `tier1`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix the comment to match JDK-8276096

LGTM

-------------

Marked as reviewed by whuang (Author).

PR: https://git.openjdk.java.net/jdk/pull/6136

From lkorinth at openjdk.java.net Fri Oct 29 13:29:40 2021
From: lkorinth at openjdk.java.net (Leo Korinth)
Date: Fri, 29 Oct 2021 13:29:40 GMT
Subject: RFR: 8275506: Rename allocated_on_stack to allocated_on_stack_or_embedded [v2]
In-Reply-To:
References:
Message-ID:

> In allocation.hpp, the name allocated_on_stack can be misleading; it is better to rename the function to allocated_on_stack_or_embedded, which will also match the name of the enum as a bonus.
Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: restart failed github tests ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6004/files - new: https://git.openjdk.java.net/jdk/pull/6004/files/e8479fab..d983f09e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6004&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6004&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6004.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6004/head:pull/6004 PR: https://git.openjdk.java.net/jdk/pull/6004 From stuefe at openjdk.java.net Fri Oct 29 14:40:10 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 29 Oct 2021 14:40:10 GMT Subject: RFR: 8275506: Rename allocated_on_stack to allocated_on_stack_or_embedded [v2] In-Reply-To: References: Message-ID: On Fri, 29 Oct 2021 13:29:40 GMT, Leo Korinth wrote: >> In allocation.hpp, the name allocated_on_stack can be misleading, better rename the function to allocated_on_stack_or_embedded and it will match the name of the enum as a bonus. > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > restart failed github tests Looks good and trivial. Cheers, Thomas ------------- Marked as reviewed by stuefe (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6004

From Divino.Cesar at microsoft.com Fri Oct 29 23:50:36 2021
From: Divino.Cesar at microsoft.com (Cesar Soares Lucas)
Date: Fri, 29 Oct 2021 23:50:36 +0000
Subject: [External] : Re: RFC - Improving C2 Escape Analysis
In-Reply-To: <787f8fbb-83e6-0867-1c97-ae2516df114b@oracle.com>
References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> <0f30507c-e0f0-c380-568b-ac441611e116@oracle.com> <787f8fbb-83e6-0867-1c97-ae2516df114b@oracle.com>
Message-ID:

Hi Vladimir and Tobias,

>> Sure, here are four examples of EA and/or scalarization failing due to
>> complicated control/data flow:
>> https://cr.openjdk.java.net/~thartmann/EA_examples

>> There are 2 test files with small methods for different EA cases I used to
>> see how EA works:
>>
>> test/hotspot/jtreg/compiler/escapeAnalysis/Test6726999.java
>> test/hotspot/jtreg/compiler/escapeAnalysis/Test6689060.java

Thank you for the examples, Tobias/Vladimir. They are very helpful.

>> Yes, finding a solution for allocation merges (or NULL) is a pain. I spent
>> some time investigating possible solutions for it but "no cigar". Maybe we
>> do indeed need control flow analysis to resolve this.

By "need control flow analysis", do you mean a flow-sensitive EA algorithm? My first idea to handle these control/data-merge issues was to implement in C2 the same algorithm used by Graal, i.e., the algorithm described in the Stadler et al. PEA paper. Do you think this is reasonable?

>> I am currently looking at iterative EA. Do more EA rounds if we can
>> eliminate more connected allocations. It was proposed by Vladimir Ivanov and
>> I have a working prototype.

Cool! I'm curious, when do you plan to submit a Pull Request for this?
>> There is also a suggestion from the Amazon Java group about "C2 Partial Escape >> Analysis" which needs more discussion: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2021-May/047486.html I'd love to hear from them about their experience with these issues and whether they have any plans to work on this moving forward! I'll ping them on the thread that you linked above. Regards, Cesar ________________________________ From: Vladimir Kozlov Sent: October 27, 2021 10:26 AM To: Tobias Hartmann ; Cesar Soares Lucas ; Ron Pressler Cc: John Rose ; Mark Reinhold ; hotspot-dev at openjdk.java.net ; Brian Stafford ; Martijn Verburg Subject: Re: [External] : Re: RFC - Improving C2 Escape Analysis First, thank you, Cesar, for collecting data about C2 EA shortcomings. I agree with the cases Tobias pointed out as possible starting points for improving EA. Yes, finding a solution for allocation merges (or NULL) is a pain. I spent some time investigating possible solutions for it but "no cigar". Maybe we do indeed need control flow analysis to resolve this.
I looked through JBS and found a few issues that do not require writing a new EA: https://bugs.openjdk.java.net/browse/JDK-7149991 https://bugs.openjdk.java.net/browse/JDK-8059378 https://bugs.openjdk.java.net/browse/JDK-8073358 https://bugs.openjdk.java.net/browse/JDK-8155769 Tobias also has a fix prototype for the next bug, which has not been fixed yet:
https://bugs.openjdk.java.net/browse/JDK-8236493 There are 2 test files with small methods for different EA cases that I used to see how EA works: test/hotspot/jtreg/compiler/escapeAnalysis/Test6726999.java test/hotspot/jtreg/compiler/escapeAnalysis/Test6689060.java You can start by looking at the above RFEs/bug, or run these tests and see why scalarization failed for some cases. Except for the known merge issue: https://bugs.openjdk.java.net/browse/JDK-6853701 I am currently looking at iterative EA. Do more EA rounds if we can eliminate more connected allocations. It was proposed by Vladimir Ivanov and I have a working prototype.
There is also a suggestion from the Amazon Java group about "C2 Partial Escape Analysis" which needs more discussion: https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2021-May/047486.html Thanks, Vladimir K On 10/27/21 3:04 AM, Tobias Hartmann wrote: > Hi Cesar, > > On 27.10.21 08:20, Cesar Soares Lucas wrote: >> Right. I was suspecting this to be the most critical issue indeed. However, I >> didn't know there was a case where "... the object does not escape on any paths >> but control flow is too complicated for EA to prove that." Is this an issue >> tracked in JBS or perhaps you can show me an example where this happens? > > Sure, here are four examples of EA and/or scalarization failing due to complicated control/data > flow: https://cr.openjdk.java.net/~thartmann/EA_examples > > All examples would completely fold with inline types (Valhalla). > > I'm not sure if these issues are tracked in JBS but there's most likely an overlap with some > of the issues you already described.
> > Best regards, > Tobias > From vladimir.kozlov at oracle.com Sat Oct 30 00:27:34 2021 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 29 Oct 2021 17:27:34 -0700 Subject: [External] : Re: RFC - Improving C2 Escape Analysis In-Reply-To: References: <20210930140335.648146897@eggemoggin.niobe.net> <415a6622-a46c-33da-8e39-c8f3068c7df3@oracle.com> <44563450-403B-4A15-95AB-5FB5DCA4ED0B@oracle.com> <81f86a0b-dfb7-0b45-1779-49209a82ae40@oracle.com> <0f30507c-e0f0-c380-568b-ac441611e116@oracle.com> <787f8fbb-83e6-0867-1c97-ae2516df114b@oracle.com> Message-ID: <457a3277-bc96-d481-2a69-4559f25cd52e@oracle.com> On 10/29/21 4:50 PM, Cesar Soares Lucas wrote: > Hi Vladimir and Tobias, > > >> Sure, here are four examples of EA and/or scalarization failing due to > >> complicated control/data flow: > >> https://cr.openjdk.java.net/~thartmann/EA_examples > > >> There are 2 test files with small methods for different EA cases I used to > >> see how EA works: > >> > >> test/hotspot/jtreg/compiler/escapeAnalysis/Test6726999.java > >> test/hotspot/jtreg/compiler/escapeAnalysis/Test6689060.java > > Thank you for the examples, Tobias/Vladimir. These are very helpful. > > >> Yes, finding a solution for allocation merges (or NULL) is a pain. I spent > >> some time investigating possible solutions for it but "no cigar". Maybe we > >> do indeed need control flow analysis to resolve this. > > By "need control flow analysis" do you mean the flow-sensitive EA algorithm? My Yes. To clarify: I investigated solutions within the current flow-insensitive EA. > first idea to handle these control/data-merge issues was to implement in C2 the > same algorithm used by GRAAL, i.e., the algorithm described in the PEA paper by > Stadler et al. Do you think this is reasonable? Yes, I think it would be good to have a prototype if you are already comfortable working with C2 code. I proposed small RFEs just for warmup ;) > > >> I am currently looking at iterative EA.
Do more EA rounds if we can > >> eliminate more connected allocations. It was proposed by Vladimir Ivanov and > >> I have a working prototype. > > Cool! I'm curious, when do you plan to submit a Pull Request for this? I am investigating regressions in some benchmarks. > > >> There is also a suggestion from the Amazon Java group about "C2 Partial Escape > >> Analysis" which needs more discussion: > >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2021-May/047486.html > > I'd love to hear from them about their experience with these issues and whether they > have any plans to work on this moving forward! I'll ping them on the thread > that you linked above. Yes, I would like them to participate too (CCing Paul). They sent the proposal almost 6 months ago and we have not heard any additional information since Vladimir Ivanov replied. Regards, Vladimir K > > > Regards, > Cesar > ------------------------------------------------------------------------------------------------------------------------ > *From:* Vladimir Kozlov > *Sent:* October 27, 2021 10:26 AM > *To:* Tobias Hartmann ; Cesar Soares Lucas ; Ron Pressler > > *Cc:* John Rose ; Mark Reinhold ; hotspot-dev at openjdk.java.net > ; Brian Stafford ; Martijn Verburg > > *Subject:* Re: [External] : Re: RFC - Improving C2 Escape Analysis > First, thank you, Cesar, for collecting data about C2 EA shortcomings. > > I agree with the cases Tobias pointed out as possible starting points for improving EA. > > Yes, finding a solution for allocation merges (or NULL) is a pain. I spent some time investigating possible solutions for > it but "no cigar". Maybe we do indeed need control flow analysis to resolve this.
> > I looked through JBS and found a few issues that do not require writing a new EA: > > https://bugs.openjdk.java.net/browse/JDK-7149991 > > https://bugs.openjdk.java.net/browse/JDK-8059378 > > https://bugs.openjdk.java.net/browse/JDK-8073358 > > https://bugs.openjdk.java.net/browse/JDK-8155769 > > > Tobias also has a fix prototype for the next bug, which has not been fixed yet: >
https://bugs.openjdk.java.net/browse/JDK-8236493 > > > There are 2 test files with small methods for different EA cases that I used to see how EA works: > > test/hotspot/jtreg/compiler/escapeAnalysis/Test6726999.java > test/hotspot/jtreg/compiler/escapeAnalysis/Test6689060.java > > You can start by looking at the above RFEs/bug, or run these tests and see why scalarization failed for some cases. Except for > the known merge issue: > > https://bugs.openjdk.java.net/browse/JDK-6853701 > > > I am currently looking at iterative EA. Do more EA rounds if we can eliminate more connected allocations. It was > proposed by Vladimir Ivanov and I have a working prototype.
> > There is also a suggestion from the Amazon Java group about "C2 Partial Escape Analysis" which needs more discussion: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2021-May/047486.html > > > Thanks, > Vladimir K > > On 10/27/21 3:04 AM, Tobias Hartmann wrote: >> Hi Cesar, >> >> On 27.10.21 08:20, Cesar Soares Lucas wrote: >>> Right. I was suspecting this to be the most critical issue indeed. However, I >>> didn't know there was a case where "... the object does not escape on any paths >>> but control flow is too complicated for EA to prove that." Is this an issue >>> tracked in JBS or perhaps you can show me an example where this happens? >> >> Sure, here are four examples of EA and/or scalarization failing due to complicated control/data >> flow: https://cr.openjdk.java.net/~thartmann/EA_examples > >> >> All examples would completely fold with inline types (Valhalla). >> >> I'm not sure if these issues are tracked in JBS but there's most likely an overlap with some >> of the issues you already described.
>> >> Best regards, >> Tobias >> From zgu at openjdk.java.net Sat Oct 30 20:39:42 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sat, 30 Oct 2021 20:39:42 GMT Subject: RFR: 8275718: Relax memory constraint on exception counter updates In-Reply-To: References: Message-ID: On Mon, 25 Oct 2021 00:22:47 GMT, David Holmes wrote: > What does this achieve? These are not "hot" counters so performance is of zero concern. The conservative memory ordering doesn't guarantee cross-thread visibility but it does seem to me there is less likelihood of error reporting seeing a stale value from another thread if we have "conservative" memory ordering (which traditionally should have been a full bi-directional fence for atomic r-m-w operations). > > David Hi David, I agree that they are not "hot" counters, but I disagree that "conservative" memory ordering provides any benefit in preventing other threads from reading stale values. We had an internal discussion on this topic, and Aleksey pointed out: "All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable". This guarantee holds even for relaxed atomic loads/stores. This is a very basic guarantee. In short, a value updated via an atomic r-m-w operation should be visible to other threads, as guaranteed by the coherence protocol. ------------- PR: https://git.openjdk.java.net/jdk/pull/6065 From aph at openjdk.java.net Sun Oct 31 11:56:11 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Sun, 31 Oct 2021 11:56:11 GMT Subject: RFR: 8275718: Relax memory constraint on exception counter updates In-Reply-To: References: Message-ID: On Sat, 30 Oct 2021 14:04:33 GMT, Zhengyu Gu wrote: > We had an internal discussion on this topic, and Aleksey pointed out: "All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable". This guarantee holds even for relaxed atomic loads/stores. This is a very basic guarantee.
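For readers following the memory-ordering argument, here is a minimal sketch using standard C++ `std::atomic` rather than HotSpot's own `Atomic` API; the counter name, function name, and thread counts are made up for illustration. It shows only the per-variable guarantee quoted above: relaxed read-modify-write increments on a single atomic variable are never lost. It says nothing about how soon another running thread observes a new value, which is the point under discussion in this thread.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for an exception counter; not a HotSpot symbol.
static std::atomic<std::uint64_t> exception_count{0};

// Starts num_threads workers that each bump the counter per_thread times
// using relaxed ordering, then returns the final value.
std::uint64_t count_with_relaxed_increments(int num_threads, int per_thread) {
  exception_count.store(0, std::memory_order_relaxed);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([per_thread] {
      for (int i = 0; i < per_thread; ++i) {
        // Relaxed r-m-w: atomic on this variable, no ordering of other memory.
        exception_count.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  // join() synchronizes with each worker's completion, so the load below
  // is guaranteed to see every increment.
  for (auto& w : workers) w.join();
  return exception_count.load(std::memory_order_relaxed);
}
```

Calling `count_with_relaxed_increments(4, 10000)` yields exactly 40000. The final read is reliable here because of the `join()`, not because of the ordering chosen for the increments.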
I think that's true for most processors as a consequence of multi-copy atomicity, but we support Power, which is not multi-copy atomic: stores can become visible to one group of threads before they become visible to all threads. ------------- PR: https://git.openjdk.java.net/jdk/pull/6065 From ddong at openjdk.java.net Sun Oct 31 15:15:28 2021 From: ddong at openjdk.java.net (Denghui Dong) Date: Sun, 31 Oct 2021 15:15:28 GMT Subject: RFR: 8276209: Some call sites doesn't pass the parameter 'size' to SharedRuntime::dtrace_object_alloc(_base) Message-ID: Hi, Could I have a review of this fix that corrects the oop size value of dtrace_object_alloc(_base)? JDK-8039904 added a new parameter 'size' to SharedRuntime::dtrace_object_alloc and dtrace_object_alloc_base, but didn't modify the call sites (interpreter/c1/c2). To make this fix as simple as possible, I overloaded dtrace_object_alloc_base rather than dtrace_object_alloc. Thanks, Denghui ------------- Commit messages: - 8276209: Some call sites doesn't pass the parameter 'size' to SharedRuntime::dtrace_object_alloc(_base) Changes: https://git.openjdk.java.net/jdk/pull/6181/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6181&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276209 Stats: 11 lines in 4 files changed: 6 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6181.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6181/head:pull/6181 PR: https://git.openjdk.java.net/jdk/pull/6181 From david.holmes at oracle.com Sun Oct 31 21:58:31 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 1 Nov 2021 07:58:31 +1000 Subject: RFR: 8275718: Relax memory constraint on exception counter updates In-Reply-To: References: Message-ID: <51c3fb5b-3440-5c40-10de-04da76b18bb4@oracle.com> On 31/10/2021 6:39 am, Zhengyu Gu wrote: > On Mon, 25 Oct 2021 00:22:47 GMT, David Holmes wrote: > >> What does this achieve?
These are not "hot" counters so performance is of zero concern. The conservative memory ordering doesn't guarantee cross-thread visibility but it does seem to me there is less likelihood of error reporting seeing a stale value from another thread if we have "conservative" memory ordering (which traditionally should have been a full bi-directional fence for atomic r-m-w operations). >> >> David > > Hi David, > > I agree that they are not "hot" counters, but I disagree that "conservative" memory ordering provides any benefit in preventing other threads from reading stale values. > > We had an internal discussion on this topic, and Aleksey pointed out: > "All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable". This guarantee holds even for relaxed atomic loads/stores. This is a very basic guarantee. > > In short, a value updated via an atomic r-m-w operation should be visible to other threads, as guaranteed by the coherence protocol. I don't know where this guarantee is coming from. Two r-m-w atomic ops must have some guarantee via coherence for the atomic op to actually work. And an implementation could make any atomic r-m-w operation ensure immediate global visibility. But you cannot assume this is guaranteed for all hardware. Even for a given platform this would need to be a specified guarantee in the architecture manual, not just something deduced/inferred by reasoning. Cheers, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/6065 > From ddong at openjdk.java.net Sun Oct 31 22:56:44 2021 From: ddong at openjdk.java.net (Denghui Dong) Date: Sun, 31 Oct 2021 22:56:44 GMT Subject: RFR: 8276209: Some call sites doesn't pass the parameter 'size' to SharedRuntime::dtrace_object_alloc(_base) [v2] In-Reply-To: References: Message-ID: > Hi, > > Could I have a review of this fix that corrects the oop size value of dtrace_object_alloc(_base)?
> > JDK-8039904 added a new parameter 'size' to SharedRuntime::dtrace_object_alloc and dtrace_object_alloc_base, but didn't modify the call sites (interpreter/c1/c2). > > To make this fix as simple as possible, I overloaded dtrace_object_alloc_base rather than dtrace_object_alloc. > > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: fix build problem ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6181/files - new: https://git.openjdk.java.net/jdk/pull/6181/files/a5ecbff4..8d597ebc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6181&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6181&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6181.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6181/head:pull/6181 PR: https://git.openjdk.java.net/jdk/pull/6181
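The overloading approach described in the 8276209 review can be sketched roughly as follows. This is a self-contained illustration, not the HotSpot code: `FakeOop`, `probe_object_alloc`, and `last_reported_size` are hypothetical names, and the real `SharedRuntime::dtrace_object_alloc_base` signatures may differ. The idea assumed here is that call sites which do not know the object size keep calling the shorter form, which looks the size up and forwards to the form that takes it explicitly.

```cpp
#include <cstddef>

// Hypothetical stand-in for an object header; not HotSpot's oopDesc.
struct FakeOop {
  std::size_t size_in_words;
  std::size_t size() const { return size_in_words; }
};

// Records the size the probe last reported, so the behavior can be inspected.
static std::size_t last_reported_size = 0;

// Form with an explicit size: used by callers that already know it.
int probe_object_alloc(const FakeOop* o, std::size_t size) {
  (void)o;  // a real probe would also report the object's class
  last_reported_size = size;
  return 0;
}

// Overload without a size: existing call sites stay unchanged, and the
// size is derived from the object before forwarding.
int probe_object_alloc(const FakeOop* o) {
  return probe_object_alloc(o, o->size());
}
```

With this shape, a caller that passes only the object still reports a correct size, while a caller that has the size at hand can pass it directly and skip the lookup.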