From shade at openjdk.java.net Mon Jan 3 10:40:46 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 3 Jan 2022 10:40:46 GMT Subject: RFR: 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause [v4] In-Reply-To: References: Message-ID: <2FnsXL-8-NB38S8GbKLbn4ANIhSSAaoQNT8b6qvbYc8=.a66bf138-7d75-4b33-9915-e7704fd72ff3@github.com> > Our support engineers asked this: > >> I see these G1Concurrent safepoints in JDK17: >> [0.064s][info][safepoint ] Safepoint "G1Concurrent", Time since last: 1666947 ns, Reaching > safepoint: 79150 ns, At safepoint: 349999 ns, Total: 429149 ns >> I've always thought that "concurrent" and "safepoint" are basically antonyms. >> What is a G1Concurrent safepoint? How can a concurrent event require a safepoint? > > I agree that's confusing. This patch splits the VM_G1Concurrent into two exactly named VMOp-s, so that we get: > > > [6.527s][info][gc ] GC(7) Pause Remark 64M->64M(224M) 218.847ms > [6.527s][info][safepoint] Safepoint "G1PauseRemark", Time since last: 17493991 ns, Reaching safepoint: 506830 ns, At safepoint: 218950374 ns, Total: 219457204 ns > [6.536s][info][gc ] GC(7) Pause Cleanup 71M->71M(224M) 0.177ms > [6.536s][info][safepoint] Safepoint "G1PauseCleanup", Time since last: 8250157 ns, Reaching safepoint: 884967 ns, At safepoint: 223964 ns, Total: 1108931 ns > [6.537s][info][gc ] GC(7) Concurrent Mark Cycle 247.051ms > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: - Merge branch 'master' into JDK-8278146-g1-concurrent-vmop - Use override - Merge branch 'master' into JDK-8278146-g1-concurrent-vmop - Review Thomas - Merge branch 'master' into JDK-8278146-g1-concurrent-vmop - Whitespace and touchups - Basic implementation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6677/files - new: https://git.openjdk.java.net/jdk/pull/6677/files/e26df883..0f8d89af Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6677&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6677&range=02-03 Stats: 6463 lines in 268 files changed: 4729 ins; 921 del; 813 mod Patch: https://git.openjdk.java.net/jdk/pull/6677.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6677/head:pull/6677 PR: https://git.openjdk.java.net/jdk/pull/6677 From shade at openjdk.java.net Mon Jan 3 10:40:52 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 3 Jan 2022 10:40:52 GMT Subject: RFR: 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause [v3] In-Reply-To: <877OBDhOaXQDtM-0PcvoLsRlrsWSX50DLoA5-fzbvpM=.7e3c025b-8aae-41e2-b589-7ac26a42d4cc@github.com> References: <877OBDhOaXQDtM-0PcvoLsRlrsWSX50DLoA5-fzbvpM=.7e3c025b-8aae-41e2-b589-7ac26a42d4cc@github.com> Message-ID: On Mon, 20 Dec 2021 11:29:58 GMT, Aleksey Shipilev wrote: >> Our support engineers asked this: >> >>> I see these G1Concurrent safepoints in JDK17: >>> [0.064s][info][safepoint ] Safepoint "G1Concurrent", Time since last: 1666947 ns, Reaching >> safepoint: 79150 ns, At safepoint: 349999 ns, Total: 429149 ns >>> I've always thought that "concurrent" and "safepoint" are basically antonyms. >>> What is a G1Concurrent safepoint? How can a concurrent event require a safepoint? >> >> I agree that's confusing. 
This patch splits the VM_G1Concurrent into two exactly named VMOp-s, so that we get: >> >> >> [6.527s][info][gc ] GC(7) Pause Remark 64M->64M(224M) 218.847ms >> [6.527s][info][safepoint] Safepoint "G1PauseRemark", Time since last: 17493991 ns, Reaching safepoint: 506830 ns, At safepoint: 218950374 ns, Total: 219457204 ns >> [6.536s][info][gc ] GC(7) Pause Cleanup 71M->71M(224M) 0.177ms >> [6.536s][info][safepoint] Safepoint "G1PauseCleanup", Time since last: 8250157 ns, Reaching safepoint: 884967 ns, At safepoint: 223964 ns, Total: 1108931 ns >> [6.537s][info][gc ] GC(7) Concurrent Mark Cycle 247.051ms >> >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Use override > - Merge branch 'master' into JDK-8278146-g1-concurrent-vmop > - Review Thomas > - Merge branch 'master' into JDK-8278146-g1-concurrent-vmop > - Whitespace and touchups > - Basic implementation Thanks! I am retesting for new master, and then integrating. ------------- PR: https://git.openjdk.java.net/jdk/pull/6677 From shade at openjdk.java.net Mon Jan 3 14:44:18 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 3 Jan 2022 14:44:18 GMT Subject: Integrated: 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause In-Reply-To: References: Message-ID: On Thu, 2 Dec 2021 18:22:56 GMT, Aleksey Shipilev wrote: > Our support engineers asked this: > >> I see these G1Concurrent safepoints in JDK17: >> [0.064s][info][safepoint ] Safepoint "G1Concurrent", Time since last: 1666947 ns, Reaching > safepoint: 79150 ns, At safepoint: 349999 ns, Total: 429149 ns >> I've always thought that "concurrent" and "safepoint" are basically antonyms. >> What is a G1Concurrent safepoint? 
How can a concurrent event require a safepoint? > > I agree that's confusing. This patch splits the VM_G1Concurrent into two exactly named VMOp-s, so that we get: > > > [6.527s][info][gc ] GC(7) Pause Remark 64M->64M(224M) 218.847ms > [6.527s][info][safepoint] Safepoint "G1PauseRemark", Time since last: 17493991 ns, Reaching safepoint: 506830 ns, At safepoint: 218950374 ns, Total: 219457204 ns > [6.536s][info][gc ] GC(7) Pause Cleanup 71M->71M(224M) 0.177ms > [6.536s][info][safepoint] Safepoint "G1PauseCleanup", Time since last: 8250157 ns, Reaching safepoint: 884967 ns, At safepoint: 223964 ns, Total: 1108931 ns > [6.537s][info][gc ] GC(7) Concurrent Mark Cycle 247.051ms > > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` This pull request has now been integrated. Changeset: 3a1fca3a Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/3a1fca3adf3111a966cb62d926b95acc89b7fe97 Stats: 67 lines in 4 files changed: 29 ins; 24 del; 14 mod 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause Reviewed-by: tschatzl, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/6677 From ayang at openjdk.java.net Mon Jan 3 20:37:18 2022 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 3 Jan 2022 20:37:18 GMT Subject: RFR: 8278282: G1: Log basic statistics for evacuation failure [v3] In-Reply-To: References: Message-ID: On Wed, 22 Dec 2021 02:33:37 GMT, Hamlin Li wrote: >> The original PR is #6763, which should be retired because we have decided to adjust part of the optimization solution for evacuation failure (see #6627 for details), so the log will be adjusted accordingly. >> With this patch, the basic log output related to evacuation failure looks like this:
>> >> >> [13.126s][debug][gc,phases] GC(0) Restore Retained Regions (ms): Min: 0.0, Avg: 197.4, Max: 1579.1, Diff: 1579.1, Sum: 1579.1, Workers: 8 >> [13.126s][debug][gc,phases] GC(0) Evacuation Failure Regions: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 1, Workers: 1 > > Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'master' into log-evac-failure-region-num > - Fix test > - Initial commit Just a minor comment. src/hotspot/share/gc/g1/g1GCPhaseTimes.cpp line 135: > 133: _gc_par_phases[MergePSS]->create_thread_work_items("LAB Undo Waste", MergePSSLABUndoWasteBytes); > 134: > 135: _gc_par_phases[RestoreRetainedRegions]->create_thread_work_items("Evacuation Failure Regions:", RestoreRetainedRegionsNum); Based on the surrounding naming conventions, `RestoreRetainedRegionsNum` should probably be sth like `RestoreRetainedRegionsNumEvacFailRegions` or just the suffix `NumEvacFailRegions`. ------------- Marked as reviewed by ayang (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6860 From duke at openjdk.java.net Tue Jan 4 04:26:41 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 4 Jan 2022 04:26:41 GMT Subject: RFR: 8279143: Undefined behaviours in globalDefinitions.hpp [v3] In-Reply-To: References: Message-ID: > Hi, > > This patch replaces undefined behaviours in globalDefinitions.hpp by proper well-defined ones. > > Thank you very much. 
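For readers following JDK-8279143: the review further down this thread quotes the patch's `memcpy(&to, &from, sizeof(T))` cast. The general technique can be sketched as below. This is only an illustration under stated assumptions: `sketch_bit_cast` is a hypothetical name, not the function in globalDefinitions.hpp, and the `static_assert` conditions are chosen for the sketch rather than copied from the patch.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not HotSpot code): reinterpret the object
// representation of `from` as a value of type To without reading through
// a pointer of the wrong type, which would be undefined behaviour.
template <typename To, typename From>
To sketch_bit_cast(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
  static_assert(std::is_trivially_default_constructible<To>::value,
                "To must be trivially default constructible");
  To to;                                // deliberately uninitialized, fully overwritten below
  std::memcpy(&to, &from, sizeof(To));  // copies the bytes, well-defined for these types
  return to;
}
```

On a platform where `float` is an IEEE-754 binary32 value, `sketch_bit_cast<std::uint32_t>(1.0f)` yields `0x3F800000`. Whether a given compiler turns the `memcpy` into a plain register move is exactly the code-quality question raised in the review.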
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: update copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6930/files - new: https://git.openjdk.java.net/jdk/pull/6930/files/c84bc84f..88c4305c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6930&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6930&range=01-02 Stats: 6 lines in 4 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/6930.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6930/head:pull/6930 PR: https://git.openjdk.java.net/jdk/pull/6930 From iklam at openjdk.java.net Tue Jan 4 04:51:41 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 4 Jan 2022 04:51:41 GMT Subject: RFR: 8278602: CDS dynamic dump may access unloaded classes [v5] In-Reply-To: References: Message-ID: <4Q44rCoM_z7otN5LWOPReJ7K6h8oQ7dpJD1bL0_Rvpo=.1f075217-51ef-4779-a1c7-c05b8ebff411@github.com> > Cause of crash: > > When dumping a CDS archive, while iterating over entries of the `SystemDictionaryShared::_dumptime_table`, we do not check whether the classes are already unloaded. In the crash, we are trying to call `InstanceKlass::signer()` but the class has already been unloaded. > > Fix: > > Override the template function `DumpTimeSharedClassTable::iterate` to ensure iteration safety. Do not iterate over a class if its `class_loader_data` is no longer alive. > > The assert in `DumpTimeSharedClassTable::IterationHelper` found another existing bug -- we were calling `SystemDictionaryShared::is_dumptime_table_empty()` without holding the `DumpTimeTable_lock`. I delayed the call until we have grabbed the lock. > > Testing: > > I have attached a test case into the bug report. Without the fix, it would reproduce the same crash in less than a minute. With the fix, the crash is no longer reproducible. 
> > Unfortunately, the test case requires a ZGC patch (thanks to @stefank) that adds delays to increase the likelihood of seeing unloaded classes inside the `_dumptime_table`. Therefore, I cannot integrate the test as a jtreg test. I'll mark the bug as **noreg-hard** Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug - @calvinccheung comments -- removed unused code - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug - added test case - @coleenp and @stefank review comments - cleaned up code - add #if INCLUDE_CDS - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug - using k->is_loader_alive() is enough - Added DumpTimeSharedClassTable::iterate() to make sure every iteration goes through EligibleClassIterationHelper - ... and 1 more: https://git.openjdk.java.net/jdk/compare/2a59ebbb...41e0b8ed ------------- Changes: https://git.openjdk.java.net/jdk/pull/6859/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6859&range=04 Stats: 269 lines in 8 files changed: 261 ins; 4 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/6859.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6859/head:pull/6859 PR: https://git.openjdk.java.net/jdk/pull/6859 From iklam at openjdk.java.net Tue Jan 4 04:56:19 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 4 Jan 2022 04:56:19 GMT Subject: RFR: 8278602: CDS dynamic dump may access unloaded classes [v5] In-Reply-To: References: Message-ID: On Thu, 16 Dec 2021 08:30:21 GMT, Stefan Karlsson wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 11 commits: >> >> - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug >> - @calvinccheung comments -- removed unused code >> - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug >> - added test case >> - @coleenp and @stefank review comments >> - cleaned up code >> - add #if INCLUDE_CDS >> - Merge branch 'master' into 8278602-cds-zgc-class-unload-bug >> - using k->is_loader_alive() is enough >> - Added DumpTimeSharedClassTable::iterate() to make sure every iteration goes through EligibleClassIterationHelper >> - ... and 1 more: https://git.openjdk.java.net/jdk/compare/2a59ebbb...41e0b8ed > > I've reviewed the interaction of the klasses in the _dumptime_table with the new is_loader_alive() check. I don't know the rest of the CDS code well enough to judge whether the other changes are correct. I spotted something that looks weird: Thanks @stefank @calvinccheung @coleenp for the review. Latest version passed Mach5 CI tiers 1-4. ------------- PR: https://git.openjdk.java.net/jdk/pull/6859 From iklam at openjdk.java.net Tue Jan 4 04:56:20 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 4 Jan 2022 04:56:20 GMT Subject: Integrated: 8278602: CDS dynamic dump may access unloaded classes In-Reply-To: References: Message-ID: On Thu, 16 Dec 2021 03:46:10 GMT, Ioi Lam wrote: > Cause of crash: > > When dumping a CDS archive, while iterating over entries of the `SystemDictionaryShared::_dumptime_table`, we do not check whether the classes are already unloaded. In the crash, we are trying to call `InstanceKlass::signer()` but the class has already been unloaded. > > Fix: > > Override the template function `DumpTimeSharedClassTable::iterate` to ensure iteration safety. Do not iterate over a class if its `class_loader_data` is no longer alive.
> > The assert in `DumpTimeSharedClassTable::IterationHelper` found another existing bug -- we were calling `SystemDictionaryShared::is_dumptime_table_empty()` without holding the `DumpTimeTable_lock`. I delayed the call until we have grabbed the lock. > > Testing: > > I have attached a test case to the bug report. Without the fix, it would reproduce the same crash in less than a minute. With the fix, the crash is no longer reproducible. > > Unfortunately, the test case requires a ZGC patch (thanks to @stefank) that adds delays to increase the likelihood of seeing unloaded classes inside the `_dumptime_table`. Therefore, I cannot integrate the test as a jtreg test. I'll mark the bug as **noreg-hard** This pull request has now been integrated. Changeset: 09cf5f19 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/09cf5f19d76b17790ffb899aad247f821a27d46b Stats: 269 lines in 8 files changed: 261 ins; 4 del; 4 mod 8278602: CDS dynamic dump may access unloaded classes Reviewed-by: coleenp, ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/6859 From iwalulya at openjdk.java.net Tue Jan 4 05:03:15 2022 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 4 Jan 2022 05:03:15 GMT Subject: RFR: 8278282: G1: Log basic statistics for evacuation failure [v3] In-Reply-To: References: Message-ID: On Wed, 22 Dec 2021 02:33:37 GMT, Hamlin Li wrote: >> The original PR is #6763, which should be retired because we have decided to adjust part of the optimization solution for evacuation failure (see #6627 for details), so the log will be adjusted accordingly. >> With this patch, the basic log output related to evacuation failure looks like this:
>> >> >> [13.126s][debug][gc,phases] GC(0) Restore Retained Regions (ms): Min: 0.0, Avg: 197.4, Max: 1579.1, Diff: 1579.1, Sum: 1579.1, Workers: 8 >> [13.126s][debug][gc,phases] GC(0) Evacuation Failure Regions: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 1, Workers: 1 > > Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge branch 'master' into log-evac-failure-region-num > - Fix test > - Initial commit Marked as reviewed by iwalulya (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6860 From duke at openjdk.java.net Tue Jan 4 08:37:55 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 4 Jan 2022 08:37:55 GMT Subject: RFR: 8279143: Undefined behaviours in globalDefinitions.hpp [v4] In-Reply-To: References: Message-ID: <4Aa44MvoFLOsvLI2sNGwnosaW8wvlotamHdjw8FKwL4=.1d9f2ccb-454a-40b2-8232-4311d41c0831@github.com> > Hi, > > This patch replaces undefined behaviours in globalDefinitions.hpp by proper well-defined ones. > > Thank you very much. Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into undefinedBehaviour - update copyright - typo - clean - Merge branch 'master' into undefinedBehaviour - Merge branch 'master' of github.com:MeryKitty/jdk into undefinedBehaviour - implementation limits - const reference - words not need to be initialized - undefined behaviour in globalDefinitions.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6930/files - new: https://git.openjdk.java.net/jdk/pull/6930/files/88c4305c..e8186468 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6930&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6930&range=02-03 Stats: 7295 lines in 307 files changed: 5407 ins; 983 del; 905 mod Patch: https://git.openjdk.java.net/jdk/pull/6930.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6930/head:pull/6930 PR: https://git.openjdk.java.net/jdk/pull/6930 From mli at openjdk.java.net Tue Jan 4 11:59:22 2022 From: mli at openjdk.java.net (Hamlin Li) Date: Tue, 4 Jan 2022 11:59:22 GMT Subject: RFR: 8278282: G1: Log basic statistics for evacuation failure [v3] In-Reply-To: References: Message-ID: On Mon, 3 Jan 2022 20:34:03 GMT, Albert Mingkun Yang wrote: >> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Merge branch 'master' into log-evac-failure-region-num >> - Fix test >> - Initial commit > > src/hotspot/share/gc/g1/g1GCPhaseTimes.cpp line 135: > >> 133: _gc_par_phases[MergePSS]->create_thread_work_items("LAB Undo Waste", MergePSSLABUndoWasteBytes); >> 134: >> 135: _gc_par_phases[RestoreRetainedRegions]->create_thread_work_items("Evacuation Failure Regions:", RestoreRetainedRegionsNum); > > Based on the surrounding naming conventions, `RestoreRetainedRegionsNum` should probably be sth like `RestoreRetainedRegionsNumEvacFailRegions` or just the suffix `NumEvacFailRegions`. 
Thanks Albert, I see your point. But the surrounding naming convention does not seem that strict, and there is no uniform convention here. To me, `RestoreRetainedRegionsNum` is easier to read and easy to relate to `RestoreRetainedRegions`, so I prefer to keep it as `RestoreRetainedRegionsNum`. ------------- PR: https://git.openjdk.java.net/jdk/pull/6860 From mli at openjdk.java.net Tue Jan 4 11:59:21 2022 From: mli at openjdk.java.net (Hamlin Li) Date: Tue, 4 Jan 2022 11:59:21 GMT Subject: RFR: 8278282: G1: Log basic statistics for evacuation failure [v3] In-Reply-To: References: Message-ID: On Fri, 17 Dec 2021 12:04:38 GMT, Thomas Schatzl wrote: >> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Merge branch 'master' into log-evac-failure-region-num >> - Fix test >> - Initial commit > > Lgtm apart from the omission of that log message check. > > I assume that it's just not worth adding log messages for the sub phases (sorting, ...) of the evacuation failure handling since we decided on the other option using the prev bitmap already. That's fine. > In any case we can always improve these messages. Thanks @tschatzl @albertnetymk @walulyai for your reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/6860 From mli at openjdk.java.net Tue Jan 4 11:59:23 2022 From: mli at openjdk.java.net (Hamlin Li) Date: Tue, 4 Jan 2022 11:59:23 GMT Subject: Integrated: 8278282: G1: Log basic statistics for evacuation failure In-Reply-To: References: Message-ID: On Thu, 16 Dec 2021 09:29:09 GMT, Hamlin Li wrote: > The original PR is #6763, which should be retired because we have decided to adjust part of the optimization solution for evacuation failure (see #6627 for details), so the log will be adjusted accordingly. > With this patch, the basic log output related to evacuation failure looks like this:
> > > [13.126s][debug][gc,phases] GC(0) Restore Retained Regions (ms): Min: 0.0, Avg: 197.4, Max: 1579.1, Diff: 1579.1, Sum: 1579.1, Workers: 8 > [13.126s][debug][gc,phases] GC(0) Evacuation Failure Regions: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 1, Workers: 1 This pull request has now been integrated. Changeset: 93c7d90c Author: Hamlin Li URL: https://git.openjdk.java.net/jdk/commit/93c7d90c55034ba8dbcd612366c891ad08c9c54e Stats: 23 lines in 6 files changed: 16 ins; 0 del; 7 mod 8278282: G1: Log basic statistics for evacuation failure Reviewed-by: tschatzl, ayang, iwalulya ------------- PR: https://git.openjdk.java.net/jdk/pull/6860 From jwilhelm at openjdk.java.net Tue Jan 4 18:46:48 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Tue, 4 Jan 2022 18:46:48 GMT Subject: RFR: Merge jdk18 Message-ID: Forwardport JDK 18 -> JDK 19 ------------- Commit messages: - Merge remote-tracking branch 'jdk18/master' into Merge_jdk18 - 8275830: C2: Receiver downcast is missing when inlining through method handle linkers - 8265317: [vector] assert(payload->is_object()) failed: expected 'object' value for scalar-replaced boxed vector but got: NULL - 8279379: GHA: Print tests that are in error - 8278966: two microbenchmarks tests fail "assert(!jvms->method()->has_exception_handlers()) failed: no exception handler expected" after JDK-8275638 - 8278824: Uneven work distribution when scanning heap roots in G1 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=6959&range=00.0 - jdk18: https://webrevs.openjdk.java.net/?repo=jdk&pr=6959&range=00.1 Changes: https://git.openjdk.java.net/jdk/pull/6959/files Stats: 161 lines in 9 files changed: 125 ins; 11 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/6959.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6959/head:pull/6959 PR: https://git.openjdk.java.net/jdk/pull/6959 From hseigel at openjdk.java.net Tue 
Jan 4 19:31:36 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 4 Jan 2022 19:31:36 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability Message-ID: Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. A sample warning is: .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] 63 | FILE* fp = fopen(TestLogFileName, "r"); | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes, or, in the case of open(), require adding an "open(const char *path, int oflag)" version of open() to os.hpp and os.cpp. This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. 
Thanks, Harold ------------- Commit messages: - 8214976: Warn about uses of functions replaced for portability Changes: https://git.openjdk.java.net/jdk/pull/6961/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6961&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8214976 Stats: 208 lines in 46 files changed: 62 ins; 0 del; 146 mod Patch: https://git.openjdk.java.net/jdk/pull/6961.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6961/head:pull/6961 PR: https://git.openjdk.java.net/jdk/pull/6961 From jwilhelm at openjdk.java.net Tue Jan 4 19:35:28 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Tue, 4 Jan 2022 19:35:28 GMT Subject: Integrated: Merge jdk18 In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 18:37:53 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 18 -> JDK 19 This pull request has now been integrated. Changeset: 191f7307 Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/191f7307bb2f2e2ce93480b4fc5fbbef216ff7cd Stats: 161 lines in 9 files changed: 125 ins; 11 del; 25 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/6959 From kbarrett at openjdk.java.net Wed Jan 5 01:21:21 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 5 Jan 2022 01:21:21 GMT Subject: RFR: 8279143: Undefined behaviours in globalDefinitions.hpp [v4] In-Reply-To: <4Aa44MvoFLOsvLI2sNGwnosaW8wvlotamHdjw8FKwL4=.1d9f2ccb-454a-40b2-8232-4311d41c0831@github.com> References: <4Aa44MvoFLOsvLI2sNGwnosaW8wvlotamHdjw8FKwL4=.1d9f2ccb-454a-40b2-8232-4311d41c0831@github.com> Message-ID: <0UxP6wtOa77IpnfiL70N8Q4lwCX4KWI2GCXUS43NLYg=.aa1e3f25-d831-4364-b1a1-424b0629802c@github.com> On Tue, 4 Jan 2022 08:37:55 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch replaces undefined behaviours in globalDefinitions.hpp by proper well-defined ones. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Merge branch 'master' into undefinedBehaviour > - update copyright > - typo > - clean > - Merge branch 'master' into undefinedBehaviour > - Merge branch 'master' of github.com:MeryKitty/jdk into undefinedBehaviour > - implementation limits > - const reference > - words not need to be initialized > - undefined behaviour in globalDefinitions.hpp Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/globalDefinitions.hpp line 617: > 615: std::is_trivially_copy_assignable(), "implementation limits"); > 616: T to; > 617: memcpy(&to, &from, sizeof(T)); During the review of JDK-8145096 it was found that some compilers produce wretched code for these kinds of memcpy uses, even at fairly high optimization levels. (I don't know if we still care about those compilers. Unfortunately I don't remember which ones they were, other than gcc/clang/VS all being good.) While using the so-called "union trick" is technically undefined behavior, it is a technique that is known to be widely and well supported and produces good code, at least for the cases where it is being used in HotSpot. In some cases, such as gcc (and I think Visual Studio, though I can't find a reference right now), this behavior is documented. Rather than adding a partial bit_cast (or moving it from elsewhere), we should be using our existing PrimitiveConversions::cast (metaprogramming/primitiveConversions.hpp). That has the small difficulty of a circular include dependency with globalDefinitions.hpp. That can be fixed by moving the various jfoo_cast functions elsewhere (either to primitiveConversions.hpp or to a new file; I might prefer the latter (along with the Translate specializations in primitiveConversions), moving these relatively infrequently used utilities to their own dedicated location).
That also reduces the content of globalDefinitions.hpp, which IMO is too much of a random dumping ground. ------------- PR: https://git.openjdk.java.net/jdk/pull/6930 From duke at openjdk.java.net Wed Jan 5 04:04:03 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 04:04:03 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Update popcount long test to use IR framework ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/67f2a71b..e84c6bdb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=02-03 Stats: 22 lines in 2 files changed: 6 ins; 13 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Wed Jan 5 04:04:04 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 04:04:04 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v2] In-Reply-To: References: Message-ID: On Tue, 21 Dec 2021 05:10:16 GMT, Jatin Bhateja wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Add JMH micro benchmark to measure performance > > 
test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 65: > >> 63: } >> 64: >> 65: public void vectorizeBitCount() { > > We can add check based on new IR framework here. Updated the test to use IR framework...please check... ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From kbarrett at openjdk.java.net Wed Jan 5 04:12:16 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 5 Jan 2022 04:12:16 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 19:23:24 GMT, Harold Seigel wrote: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. > > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. > > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Changes requested by kbarrett (Reviewer). 
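The mechanism described in the quoted JDK-8214976 summary (mark a raw function so that direct calls produce a diagnostic, then permit it only inside the wrapper) can be sketched as below. This is a simplified illustration, not HotSpot's compilerWarnings.hpp: it substitutes the standard `[[deprecated]]` attribute for GCC's `warning` attribute so that it builds on more compilers, and every `SKETCH_*` macro, `raw_length`, and `wrapper::length` is a hypothetical name invented for the example.

```cpp
// Hypothetical macros, loosely modeled on the names quoted in this thread
// (FORBID_C_FUNCTION, PRAGMA_DIAG_PUSH/POP, PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION).
// Assumption: [[deprecated]] stands in for GCC's attribute "warning".
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_DIAG_PUSH _Pragma("GCC diagnostic push")
#define SKETCH_DIAG_POP  _Pragma("GCC diagnostic pop")
#define SKETCH_PERMIT_FORBIDDEN \
  _Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
#else
// Other compilers: no suppression in this sketch, calls merely warn.
#define SKETCH_DIAG_PUSH
#define SKETCH_DIAG_POP
#define SKETCH_PERMIT_FORBIDDEN
#endif

#define SKETCH_FORBID(signature, alternative) \
  [[deprecated(alternative)]] signature

// Stand-in for a raw C library function that callers should avoid.
SKETCH_FORBID(int raw_length(const char* s), "use wrapper::length");

int raw_length(const char* s) {
  int n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

namespace wrapper {
// The suppression covers only the wrapper's implementation, not the file.
SKETCH_DIAG_PUSH
SKETCH_PERMIT_FORBIDDEN
int length(const char* s) { return raw_length(s); }
SKETCH_DIAG_POP
}  // namespace wrapper
```

Any call to `raw_length` outside the push/pop region is reported with the "use wrapper::length" hint, while callers of `wrapper::length` compile cleanly. Keeping the permit between push and pop corresponds to the narrow scoping requested in the review comments that follow.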
src/hotspot/os/aix/os_aix.cpp line 111: > 109: #include > 110: > 111: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(close); // prevents compiler warnings for all functions Why suppress the warning for this file? I think there are only 3 calls to `close`, so it seems like they could just be fixed. src/hotspot/os/bsd/os_bsd.cpp line 110: > 108: #endif > 109: > 110: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(close); // prevents compiler warnings for all functions Again here, why suppress the warning rather than fix the small number of calls in this file. src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp line 37: > 35: #define PROC_SELF_MOUNTINFO "/proc/self/mountinfo" > 36: > 37: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(fopen); The intended usage for this pragma is to scope it narrowly, using `PRAGMA_DIAG_PUSH/POP`. src/hotspot/os/linux/os_linux.cpp line 153: > 151: }; > 152: > 153: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(closedir); // prevents compiler warning for all functions Why suppress this warning rather than fix the one call. Note the `closedir` in `os::dir_is_empty` is in the `os` "namespace" and hence is already calling the `os` wrapper. It could be explicitly qualified anyway, if that seems confusing. src/hotspot/os/posix/os_posix.cpp line 93: > 91: #define guarantee_with_errno(cond, msg) check_with_errno(guarantee, cond, msg) > 92: > 93: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(closedir); // prevents compiler warnings for all functions This should be scoped over the one use, in the implementation of `os::closedir`, rather than applied to the whole file. src/hotspot/os/windows/os_windows.cpp line 105: > 103: #include > 104: > 105: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(close); // prevents compiler warnings for all functions The scope of this should be limited to the implementation of the `os::close` function. src/hotspot/share/compiler/compilerEvent.cpp line 102: > 100: > 101: index = phase_names->length(); > 102: phase_names->append(use_strdup ? 
os::strdup(phase_name) : phase_name); I'm not sure this change to use `os::strdup` is currently correct. I'm not sure where the copied string gets freed, but that might be using `::free` rather than `os::free`. `os::strdup` uses NMT when enabled, and then using `::free` will leak the tracking data. src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 511: > 509: JVMCI_ERROR_OK("stub should have a name"); > 510: } > 511: char* name = os::strdup(jvmci_env()->as_utf8_string(stubName)); Another `os::strdup` that I'm not sure is correct because I'm not sure where corresponding `free` might be, and whether it is `::free` or `os::free`. src/hotspot/share/runtime/os.cpp line 93: > 91: #endif > 92: > 93: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(fopen); // prevents compiler warnings for all functions Scope should be limited to the implementation of `os::fopen`. src/hotspot/share/utilities/compilerWarnings.hpp line 87: > 85: PRAGMA_DISABLE_GCC_WARNING("-Wattribute-warning") > 86: > 87: FORBID_C_FUNCTION(void abort(void), "use os::abort"); It would be better to put all of these after all the `#endif`, so that if we add macro implementations for other platforms (like windows), these will be covered by the additional platforms. src/hotspot/share/utilities/compilerWarnings.hpp line 112: > 110: > 111: #else > 112: I think, but have not tested it, that this facility can be implemented for Visual Studio using `__declspec(deprecated)` and suppressing warning C4996. Of course, doing that may trigger a bunch of warnings in Windows-specific files, so it might be best to do that as a followup change. test/hotspot/gtest/logging/test_logDecorators.cpp line 29: > 27: #include "unittest.hpp" > 28: > 29: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(strdup); Why suppress this warning? Why not just fix the couple of calls, and remember to also fix the corresponding calls to `::free` to instead call `os::free`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From ioi.lam at oracle.com Wed Jan 5 07:38:36 2022 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 4 Jan 2022 23:38:36 -0800 Subject: [Ping] RFR: 8275731: CDS archived enums objects are recreated at runtime In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> Message-ID: <211a7559-e8e8-29a8-0df0-6fd94fbe1c2d@oracle.com> Still looking for reviewers .... Thanks - Ioi On 12/1/21 1:02 PM, Ioi Lam wrote: > **Background:** > > In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this: > > > public enum Day { SUNDAY, MONDAY ... } > > > to > > > public class Day extends java.lang.Enum { > public static final Day SUNDAY = new Day("SUNDAY"); > public static final Day MONDAY = new Day("MONDAY"); ... > } > > > With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731) > > **Fix:** > > During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time. > > This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized. 
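The uniqueness requirement described above can be illustrated without CDS at all. The following is only a sketch under stated assumptions: `FakeDay` is a hypothetical hand-written stand-in for the class javac generates (real enum constants cannot be constructed twice from user code), and the two `FakeDay` objects merely imitate one constant created at dump time and one created again at run time.

```java
// Illustrative sketch only; FakeDay and the helper names are hypothetical, not JDK or CDS code.
final class EnumIdentitySketch {

    enum Day { SUNDAY, MONDAY }

    // Stand-in for the javac-generated class, so that a second "initializer run" can be imitated.
    static final class FakeDay {
        final String name;

        FakeDay(String name) {
            this.name = name;
        }
    }

    // Real enum constants are singletons, so a lookup by name returns the very same object.
    static boolean realEnumIsUnique() {
        return Day.valueOf("SUNDAY") == Day.SUNDAY;
    }

    // Two objects with the same name but different identity, as if the constant were created twice.
    static boolean duplicatedConstantBreaksIdentity() {
        FakeDay fromDumpTime = new FakeDay("SUNDAY");
        FakeDay fromRunTime = new FakeDay("SUNDAY");
        return fromDumpTime != fromRunTime && fromDumpTime.name.equals(fromRunTime.name);
    }

    public static void main(String[] args) {
        System.out.println("real enum constant is unique: " + realEnumIsUnique());
        System.out.println("duplicated stand-in fails ==: " + duplicatedConstantBreaksIdentity());
    }
}
```

Code that compares constants with `==`, or uses them in a `switch` or an `EnumMap`, relies on the first property and would misbehave under the second.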
> > **Verification:** > > To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt > > **Testing:** > > Passed Oracle CI tiers 1-4. Will run tier 5 as well. > > ------------- > > Commit messages: > - 8275731: CDS archived enums objects are recreated at runtime > > Changes: https://git.openjdk.java.net/jdk/pull/6653/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8275731 > Stats: 829 lines in 16 files changed: 787 ins; 2 del; 40 mod > Patch: https://git.openjdk.java.net/jdk/pull/6653.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653 > > PR: https://git.openjdk.java.net/jdk/pull/6653 From ddong at openjdk.java.net Wed Jan 5 07:50:15 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 5 Jan 2022 07:50:15 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash In-Reply-To: References: Message-ID: On Thu, 23 Dec 2021 11:04:22 GMT, Andrew Haley wrote: > Hmm, it's a tricky one. Your solution might be the best that can be done at present, but it doesn't make me feel very comfortable. I think I need to have a look at it later, probably in the new year. Please feel free to remind me then. Hi @theRealAph, Happy new year :-) Please give some comments on this patch or suggestions when you have time. 
Thanks Denghui ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From jbhateja at openjdk.java.net Wed Jan 5 09:07:16 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 5 Jan 2022 09:07:16 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 04:04:03 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update popcount long test to use IR framework Please also update the copyright headers of the modified files. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From jbhateja at openjdk.java.net Wed Jan 5 09:11:14 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 5 Jan 2022 09:11:14 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 03:59:58 GMT, Vamsi Parasa wrote: >> test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 65: >> >>> 63: } >>> 64: >>> 65: public void vectorizeBitCount() { >> >> We can add check based on new IR framework here. > > Updated the test to use IR framework...please check... Kindly add @requires vm.cpu.features ~= ".*avx512dq.*" to the test tags, since the test case may fail on other targets. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Wed Jan 5 10:37:33 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Wed, 5 Jan 2022 10:37:33 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 Message-ID: Hi, Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. Thank you very much. ------------- Commit messages: - unsigned comparison enhancement Changes: https://git.openjdk.java.net/jdk/pull/6966/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279282 Stats: 171 lines in 6 files changed: 38 ins; 99 del; 34 mod Patch: https://git.openjdk.java.net/jdk/pull/6966.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6966/head:pull/6966 PR: https://git.openjdk.java.net/jdk/pull/6966 From dholmes at openjdk.java.net Wed Jan 5 12:34:10 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 5 Jan 2022 12:34:10 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 19:23:24 GMT, Harold Seigel wrote: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. 
> > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. > > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Hi Harold, As I wrote in the bug report: > Just be mindful that this really only applies when shared code calls these functions. In os specific code it may be necessary/desirable to call the platform specific functions rather than the generic OS:: version. We do not need to go through the os:: portability layer when already in os-specific code. The only time we need to use the os:: layer in that case is when the os:: layer adds additional semantics (like NMT) that we want. ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From kim.barrett at oracle.com Wed Jan 5 13:45:11 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 5 Jan 2022 13:45:11 +0000 Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: > On Jan 5, 2022, at 7:34 AM, David Holmes wrote: > As I wrote in the bug report: > >> Just be mindful that this really only applies when shared code calls these functions. 
In os specific code it may be necessary/desirable to call the platform specific functions rather than the generic OS:: version. > > We do not need to go through the os:: portability layer when already in os-specific code. The only time we need to use the os:: layer in that case is when the os:: layer adds additional semantics (like NMT) that we want. Calls that bypass the os:: portability layer require suppressing the warning for bypassing the portability layer. I think in very nearly all such cases it's better to stick with the portability layer, the obvious contrary case being the implementation of the portability function in terms of the underlying "native" function. From hseigel at openjdk.java.net Wed Jan 5 15:32:15 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 5 Jan 2022 15:32:15 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 19:23:24 GMT, Harold Seigel wrote: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. > > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. 
> > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Hi Kim, David, Should I remove FORBID_C_FUNCTION(char *strdup(const char*), from this change and revert the strdup changes? A future bug fix would handle FORBID_C_FUNCTION for strdup(), malloc() and free() together in one change? Thanks, Harold ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From hseigel at openjdk.java.net Wed Jan 5 16:16:17 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 5 Jan 2022 16:16:17 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 19:23:24 GMT, Harold Seigel wrote: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. > > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. 
> > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Hi Kim, I don't think that FORBIDDEN_C_FUNCTION should be enforced for files os_aix.cpp, os_bsd.cpp, os_posix.cpp, and os_windows.cpp. These are platform specific files that intentionally call multiple native functions. For example, os_linux.cpp calls close_dir() fopen(), close(), readdir(), write(), lseek(), etc. Since PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION ignores its argument and disables the warning for all native calls, I think it's reasonable to just specify PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION once at the beginning of the above files and not clutter them with many PRAGMA_DIAG_PUSH/POP's. Thanks, Harold ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From kim.barrett at oracle.com Wed Jan 5 16:16:25 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 5 Jan 2022 16:16:25 +0000 Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: > On Jan 5, 2022, at 10:32 AM, Harold Seigel wrote: > Should I remove FORBID_C_FUNCTION(char *strdup(const char*), from this change and revert the strdup changes? A future bug fix would handle FORBID_C_FUNCTION for strdup(), malloc() and free() together in one change? That seems like a good idea to me. From shade at openjdk.java.net Wed Jan 5 16:45:30 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 5 Jan 2022 16:45:30 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs Message-ID: SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125. 
Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. Additional testing: - [x] Linux x86_64 fastdebug `hotspot:tier1` ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/6970/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6970&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279526 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6970.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6970/head:pull/6970 PR: https://git.openjdk.java.net/jdk/pull/6970 From kim.barrett at oracle.com Wed Jan 5 16:58:29 2022 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 5 Jan 2022 16:58:29 +0000 Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: <9770C08D-E9A8-4025-9D4D-CB1A5A5445D4@oracle.com> > On Jan 5, 2022, at 11:16 AM, Harold Seigel wrote: > I don't think that FORBIDDEN_C_FUNCTION should be enforced for files os_aix.cpp, os_bsd.cpp, os_posix.cpp, and os_windows.cpp. These are platform specific files that intentionally call multiple native functions. For example, os_linux.cpp calls close_dir() fopen(), close(), readdir(), write(), lseek(), etc. I disagree. I think there are or will be functions we want to just forbid outright, and others we want to forbid outside the implementation of the os:: wrapper because we consistently want some added feature of the portability wrapper (such as asserts or common argument manipulation or whatever). 
> Since PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION ignores its argument and disables the warning for all native calls, I think it's reasonable to just specify PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION once at the beginning of the above files and not clutter them with many PRAGMA_DIAG_PUSH/POP's. I disagree. That behavior of the gcc version of the pragma is an artifact of its current implementation, and not the desired behavior. I couldn't find a way to limit the disabling to the identifier in the argument with gcc. That doesn't mean that won't be possible for other platforms, or for a future version of gcc, or for someone more clever than me. Even with the limited gcc implementation the argument documents intent. The purpose of the pragma is to permit the os:: portability wrapper for a function to be implemented in terms of the corresponding native function. If it weren't for that requirement I wouldn't want the pragma to exist at all. From zgu at openjdk.java.net Wed Jan 5 17:41:15 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 5 Jan 2022 17:41:15 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: <-yIB90iNeyq92SHjzfKmFVS0MGWm7DokDZqOobJEFOI=.447ad463-cdd9-4242-9c0d-73505f9dccb0@github.com> On Wed, 5 Jan 2022 16:38:22 GMT, Aleksey Shipilev wrote: > SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125. Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. > > Additional testing: > - [x] Linux x86_64 fastdebug `hotspot:tier1` Looks good. ------------- Marked as reviewed by zgu (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6970 From sviswanathan at openjdk.java.net Wed Jan 5 17:47:16 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 5 Jan 2022 17:47:16 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 04:04:03 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update popcount long test to use IR framework src/hotspot/cpu/x86/x86.ad line 1416: > 1414: return false; > 1415: } > 1416: break; This case could be combined with case Op_PopCountVI and duplication removed. The check is the same for both. test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 57: > 55: @Test // needs to be run in (fast) debug mode > 56: @Warmup(10000) > 57: @IR(counts = {"PopCountVL", "9"}) //9 PopCountVL nodes are generated for a long[] of LEN=1024 Could this be a failOn check instead of counts check? The number of PopCountVL nodes is dependent on loop unrolling which keeps changing with loop optimizations. 
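For readers who have not opened the test, the kind of kernel being discussed looks roughly like the loop below. This is a minimal sketch and not the jtreg test from the PR: the class, method, and array names are hypothetical, and the code only checks scalar results against a naive reference. Whether C2 actually turns the loop into `PopCountVL` nodes, and how many of them appear after unrolling, depends on CPU features and JIT decisions that this sketch does not observe.

```java
// Illustrative sketch only; names are hypothetical and this is not the jtreg test under review.
final class BitCountLoopSketch {

    // Element-wise population count; a simple counted loop of the kind an auto-vectorizer targets.
    static void bitCountAll(long[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Long.bitCount(src[i]);
        }
    }

    // Reference implementation: clear the lowest set bit until the value is zero.
    static int naiveBitCount(long v) {
        int n = 0;
        while (v != 0) {
            v &= v - 1;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long[] src = new long[1024];
        int[] dst = new int[src.length];
        java.util.Random random = new java.util.Random(42);
        for (int i = 0; i < src.length; i++) {
            src[i] = random.nextLong();
        }
        bitCountAll(src, dst);
        for (int i = 0; i < src.length; i++) {
            if (dst[i] != naiveBitCount(src[i])) {
                throw new AssertionError("mismatch at index " + i);
            }
        }
        System.out.println("all " + src.length + " results match the reference");
    }
}
```

A correctness check like this stays valid however the loop is unrolled, which is the property the review comment is asking the IR rule to have as well.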
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From coleenp at openjdk.java.net Wed Jan 5 18:06:11 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 5 Jan 2022 18:06:11 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 16:38:22 GMT, Aleksey Shipilev wrote: > SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125. Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. > > Additional testing: > - [x] Linux x86_64 fastdebug `hotspot:tier1` Thanks for fixing this. A trivial change. please check it in! ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6970 From psandoz at openjdk.java.net Wed Jan 5 18:22:20 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 5 Jan 2022 18:22:20 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 10:30:46 GMT, Quan Anh Mai wrote: > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. 
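The identity quoted above can be checked at the Java level. What follows is a minimal sketch, not code from the PR: the class and method names are made up for illustration, and only `Integer.compare`, `Integer.compareUnsigned`, and their `Long` counterparts are real JDK methods. Signs are compared rather than exact return values, since the `compare` contract only fixes the sign.

```java
// Illustrative sketch only; class and method names are hypothetical, not from the PR.
final class UnsignedCompareSketch {

    // XOR with MIN_VALUE flips the sign bit, which maps unsigned order onto signed order.
    static int compareUnsignedViaXor(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    static int compareUnsignedViaXor(long x, long y) {
        return Long.compare(x ^ Long.MIN_VALUE, y ^ Long.MIN_VALUE);
    }

    // True if the XOR formulation agrees in sign with the JDK methods for every sampled pair.
    static boolean agreesWithJdk() {
        int[] ints = {0, 1, -1, 42, -42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        long[] longs = {0L, 1L, -1L, 42L, -42L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (int x : ints) {
            for (int y : ints) {
                if (Integer.signum(compareUnsignedViaXor(x, y))
                        != Integer.signum(Integer.compareUnsigned(x, y))) {
                    return false;
                }
            }
        }
        for (long x : longs) {
            for (long y : longs) {
                if (Integer.signum(compareUnsignedViaXor(x, y))
                        != Integer.signum(Long.compareUnsigned(x, y))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("XOR identity agrees with the JDK: " + agreesWithJdk());
    }
}
```

The same trick carries over to vector lanes: flipping the top bit of each element lets a signed compare instruction answer an unsigned question, which is presumably why it can replace the zero-extension approach described in the message.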
src/hotspot/cpu/x86/x86.ad line 7332: > 7330: Matcher::vector_length_in_bytes(n->in(1)->in(1)) <= 32 && // src1 > 7331: is_integral_type(Matcher::vector_element_basic_type(n->in(1)->in(1))) && > 7332: (n->in(2)->get_int() == BoolTest::eq || It's tempting to add a method to check the third bit of a `BoolTest` value, which controls the sense of the result e.g. `eq(0)` and `ne(4)`, rather than three separate checks e.g. `is_negated` perhaps. That in turn may result in clearer naming of the methods rather than using `_pri` and `_sec`, and the logic percolates down into `vpcmpCCW` via the `ComparisonPredicate` value and the use of the tmp register. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From eastig at amazon.co.uk Wed Jan 5 19:16:33 2022 From: eastig at amazon.co.uk (Astigeevich, Evgeny) Date: Wed, 5 Jan 2022 19:16:33 +0000 Subject: RFC: improving NMethod code locality in CodeCache Message-ID: <8814F24A-85D4-4A6D-96C9-EF8AABE20CE8@amazon.com> Hi Boris, Thank you for the comments. > You say [1] that branch prediction hardware can become overloaded in the case of 15K compiled methods. > In your numbers, I see the maximum is 7K methods ~ 50MB (on Renaissance benchmark). This 15K is not the threshold. It is application dependent. I found that the DaCapo eclipse benchmark got ~6.0% improvement on Graviton2 when the tiered compilation was off and the code cache was limited to 64M. The eclipse benchmark has ~5K C2 methods. These numbers show the total number of C2 methods created during the benchmark runs. However, the most important metrics are the number of the hottest C2 methods and the corresponding memory map. The metrics change whilst an application runs. The sparse memory map causes branch prediction issues. > Also on aws-graviton-getting-started link [2] we see that the recommended CodeCacheSize value is 64M - more than that makes a performance impact. 
> These cases may be also different by the contents of the code cache: I guess it's tiered compilation in benchmarks and non-tiered C2 in [2]. The advice is to turn off the tiered compilation and to limit the code cache to 64M at the same time. It is based on: 1. All compiled methods will be C2 compiled. The sizes of non-tiered C2 methods are smaller than the sizes of tiered C2 methods. One thing I noticed is that tiered C2 methods have more code. I'll check what else differs. 2. By limiting the code cache to 64M, we force Sweeper to remove cold methods. This results in the set of the hottest methods being compact. For our application we saw a reduction of code cache usage from 130M to 37M. > - What is the typical CodeCache size for real-world applications? Is it common for CodeCache to reach hundreds of megabytes? Can it be simulated with benchmarks? Besides the data from our application, I don't have such data. I'll think about how to collect it. With the tiered compilation on, the default code cache size is 240M: 116M for non-profiled methods, 116M for profiled methods, 8M for non-nmethods. This is for x86_64 and arm64. With the tiered compilation off, the default code cache size is 48M. I have not seen much research on code cache usage by real-world applications on x86. I think we can use some DaCapo and Renaissance benchmarks. SpecJbb needs to be checked as well. > I am not sure that branch predictors are often limited to a certain amount of memory, which is much less than the possible size of the code. They are. See: "The BTB in contemporary Intel chips" https://xania.org/201602/bpu-part-three "Branch predictor: How many "if"s are too many? Including x86 and M1 benchmarks!" https://blog.cloudflare.com/branch-predictor/ > There are now 3 generations of AWS Graviton HW. Do you observe the same branch prediction and code cache size effects on all three? I have no data for Graviton 1 and Graviton 3. I have plans to try Graviton 3 as soon as I get access to it. 
Graviton 1 based instances (A1) are not as widely used as Graviton 2 instances. Anyway, it might be interesting to get data from Graviton 1. Graviton 1 is based on Cortex-A72, which differs very much from Neoverse N1. Among arm64 implementations, Apple M1 is a good candidate to check. > - What does maximum CodeCache limit mean, is this distance from the first method to the last? This is the maximum amount of memory reserved for CodeCache. The memory is split into several heaps if the tiered compilation is on. It is not the distance between methods. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/code/codeCache.cpp#L1485 for details. > Will it help if C2 put the metadata and things to the next page after the instructions page? I mean it is worth putting them not too far from each other. What does "page" mean in this context? OS page? Or CodeCache page? > I believe it makes sense to work with Sweeper so that it removes cold methods actively from the CodeCache (see the Hotness Code picture on Page 65, [3]). Currently Sweeper relies on nmethod allocation and state change events. These events trigger updates to the temperature of nmethods. If no such events happen, the temperature is not updated. In order to remove cold methods more actively, we need a sampling mechanism. See ideas in https://bugs.openjdk.java.net/browse/JDK-8279184. > In general a GC-like approach can be applied to the CodeCache to make it clean, small and hot. IMHO, this can get arbitrarily complex, up to a full-blown, generational CodeCache GC. Thanks, Evgeny From: Boris Ulasevich Date: Thursday, 23 December 2021 at 15:59 To: "Astigeevich, Evgeny" , "hotspot-dev at openjdk.java.net" Subject: RE: RFC: improving NMethod code locality in CodeCache Hi Evgeny, Thank you for sharing the data. It is very detailed and well structured. It is indeed interesting that the code itself takes ~1/2 of the volume and sometimes even less. 
So judging from the numbers, we can (theoretically) double the code density. I agree that it is worth doing. You say [1] that branch prediction hardware can become overloaded in the case of 15K compiled methods. In your numbers, I see the maximum is 7K methods ~ 50MB (on Renaissance benchmark). This is quite a load, yes. Also on aws-graviton-getting-started link [2] we see that the recommended CodeCacheSize value is 64M - more than that makes a performance impact. These cases may be also different by the contents of the code cache: I guess it's tiered compilation in benchmarks and non-tiered C2 in [2]. My questions are - What is the typical CodeCache size for real-world applications? Is it common for CodeCache to reach hundreds of megabytes? Can it be simulated with benchmarks? - I am not sure that branch predictors are often limited to a certain amount of memory, which is much less than the possible size of the code. There are now 3 generations of AWS Graviton HW. Do you observe the same branch prediction and code cache size effects on all three? - What does maximum CodeCache limit mean, is this distance from the first method to the last? Will it help if C2 put the metadata and things to the next page after the instructions page? I mean it is worth putting them not too far from each other. Besides the code density issue in case of a limited CodeCache size (either a small amount of memory or a limitation of the branch predictor), I believe it makes sense to work with Sweeper so that it removes cold methods actively from the CodeCache (see the Hotness Code picture on Page 65, [3]). After the virtual machine warms up, the compiler threads are idle anyway. In general a GC-like approach can be applied to the CodeCache to make it clean, small and hot. 
thanks, Boris [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-November/056198.html [2] https://github.com/aws/aws-graviton-getting-started/blob/main/java.md [3] http://cr.openjdk.java.net/~thartmann/papers/2014-Code_Cache_Optimizations-thesis.pdf From shade at openjdk.java.net Wed Jan 5 19:49:17 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 5 Jan 2022 19:49:17 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 16:38:22 GMT, Aleksey Shipilev wrote: > SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125. Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. > > Additional testing: > - [x] Linux x86_64 fastdebug `hotspot:tier1` Thanks for quick reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/6970 From shade at openjdk.java.net Wed Jan 5 19:49:17 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 5 Jan 2022 19:49:17 GMT Subject: Integrated: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 16:38:22 GMT, Aleksey Shipilev wrote: > SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125.
Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. > > Additional testing: > - [x] Linux x86_64 fastdebug `hotspot:tier1` This pull request has now been integrated. Changeset: 523300e7 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/523300e7968b28ade4bbfe004030227a224ab2dc Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs Reviewed-by: zgu, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/6970 From hseigel at openjdk.java.net Wed Jan 5 20:41:07 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 5 Jan 2022 20:41:07 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2] In-Reply-To: References: Message-ID: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. > > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. 
> > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: revert strdup() changes and address some of Kim's comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6961/files - new: https://git.openjdk.java.net/jdk/pull/6961/files/12182a15..f7d8a387 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6961&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6961&range=00-01 Stats: 44 lines in 9 files changed: 14 ins; 15 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/6961.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6961/head:pull/6961 PR: https://git.openjdk.java.net/jdk/pull/6961 From hseigel at openjdk.java.net Wed Jan 5 20:41:10 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 5 Jan 2022 20:41:10 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 03:21:54 GMT, Kim Barrett wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> revert strdup() changes and address some of Kim's comments > > src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp line 37: > >> 35: #define PROC_SELF_MOUNTINFO "/proc/self/mountinfo" >> 36: >> 37: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(fopen); > > The intended usage for this pragma is to scope it narrowly, using `PRAGMA_DIAG_PUSH/POP`. Fixed. 
> src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 511: > >> 509: JVMCI_ERROR_OK("stub should have a name"); >> 510: } >> 511: char* name = os::strdup(jvmci_env()->as_utf8_string(stubName)); > > Another `os::strdup` that I'm not sure is correct because I'm not sure where corresponding `free` might be, and whether it is `::free` or `os::free`. Fixed by reverting strdup() changes. > src/hotspot/share/runtime/os.cpp line 93: > >> 91: #endif >> 92: >> 93: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(fopen); // prevents compiler warnings for all functions > > Scope should be limited to the implementation of `os::fopen`. Fixed > src/hotspot/share/utilities/compilerWarnings.hpp line 87: > >> 85: PRAGMA_DISABLE_GCC_WARNING("-Wattribute-warning") >> 86: >> 87: FORBID_C_FUNCTION(void abort(void), "use os::abort"); > > It would be better to put all of these after all the `#endif`, so that if we add macro implementations for other platforms (like windows), these will be covered by the additional platforms. Done. > src/hotspot/share/utilities/compilerWarnings.hpp line 112: > >> 110: >> 111: #else >> 112: > > I think, but have not tested it, that this facility can be implemented for Visual Studio using `__declspec(deprecated)` and suppressing warning C4996. Of course, doing that may trigger a bunch of warnings in Windows-specific files, so it might be best to do that as a followup change. Addressing this as a follow up change sounds good. > test/hotspot/gtest/logging/test_logDecorators.cpp line 29: > >> 27: #include "unittest.hpp" >> 28: >> 29: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(strdup); > > Why suppress this warning? Why not just fix the couple of calls, and remember to also fix the corresponding calls to `::free` to instead call `os::free`. Fixed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From kvn at openjdk.java.net Wed Jan 5 21:22:15 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 5 Jan 2022 21:22:15 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 10:30:46 GMT, Quan Anh Mai wrote: > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. Do we have tests which verify correctness of unsigned compare? ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From psandoz at openjdk.java.net Wed Jan 5 21:38:19 2022 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 5 Jan 2022 21:38:19 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 21:18:40 GMT, Vladimir Kozlov wrote: > Do we have tests which verify correctness of unsigned compare? Yes, we test against the cross product of [cases]( https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/Int256VectorTests.java#L984) for all comparison operations. See use of `intCompareOpProvider` as a data provider. (As for all other operations the test assumes the C2 compiler is triggered, reasonable for these cases. However, I think we should completely rewrite the tests using the IR framework. It's a non-trivial amount of work but i think possible, especially with similar source code generation techniques we already use and the ability to query at runtime what IR nodes are supported.) 
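
The identity quoted in the PR description, `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`, can be checked in plain Java for scalar values. This is a small sketch, not the vector test discussed above; the class and method names (`UnsignedCompareIdentity`, `compareViaSignFlip`) are made up for illustration.

```java
public class UnsignedCompareIdentity {
    /** Signed comparison after flipping the sign bit of both int operands. */
    static int compareViaSignFlip(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    /** Signed comparison after flipping the sign bit of both long operands. */
    static int compareViaSignFlip(long x, long y) {
        return Long.compare(x ^ Long.MIN_VALUE, y ^ Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, -42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                // The sign-flipped signed compare must agree with the library's unsigned compare.
                if (Integer.compareUnsigned(x, y) != compareViaSignFlip(x, y)) {
                    throw new AssertionError("mismatch for " + x + ", " + y);
                }
                if (Long.compareUnsigned(x, y) != compareViaSignFlip((long) x, (long) y)) {
                    throw new AssertionError("long mismatch for " + x + ", " + y);
                }
            }
        }
        System.out.println("identity holds for all sample pairs");
    }
}
```

Flipping the sign bit maps the unsigned range onto the signed range while preserving order, which is why a signed compare instruction can stand in for an unsigned one.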
------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Wed Jan 5 23:24:55 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 23:24:55 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v5] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Update year in copyright notice. Add avx512dq check for vectorization test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/e84c6bdb..1892c37d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=03-04 Stats: 16 lines in 11 files changed: 1 ins; 4 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Wed Jan 5 23:24:56 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 23:24:56 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 09:08:08 GMT, Jatin Bhateja wrote: >> Updated the test to use IR framework...please check... > > Kindly add @requires vm.cpu.features ~= ".*avx512dq.*" in tag since test case may fail on other targets. Added the @requires vm.cpu.features ~= ".avx512dq." check...thanks for pointing that out! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Wed Jan 5 23:29:11 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 23:29:11 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 09:04:16 GMT, Jatin Bhateja wrote: > Please also update copywrite headers of modified files. Updated the year to 2022 in copyright headers... ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Wed Jan 5 23:29:12 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Wed, 5 Jan 2022 23:29:12 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: <4DnsrDdAusPD58aqjhObk1Uc1QJ025TB-wEx_Xw6BLw=.8a6939a5-4b16-4c3d-90eb-4800e234194d@github.com> On Wed, 5 Jan 2022 17:15:34 GMT, Sandhya Viswanathan wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update popcount long test to use IR framework > > src/hotspot/cpu/x86/x86.ad line 1416: > >> 1414: return false; >> 1415: } >> 1416: break; > > This case could be combined with case Op_PopCountVI and duplication removed. The check is the same for both. Updated code as per your suggestion to avoid duplication in the latest commit... ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 01:47:04 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 01:47:04 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). 
This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Update IR framework test to check for non-zero count of PopCountVL ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/1892c37d..8cd298b6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 01:47:05 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 01:47:05 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4] In-Reply-To: References: Message-ID: <6afh8zRPM7tPcUoOpp8teSGZVjDsNlvXUOODU5obdmc=.1b292020-822a-4c8d-9fda-657b31437eb6@github.com> On Wed, 5 Jan 2022 17:39:52 GMT, Sandhya Viswanathan wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update popcount long test to use IR framework > > test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 57: > >> 55: @Test // needs to be run in (fast) debug mode >> 56: @Warmup(10000) >> 57: @IR(counts = {"PopCountVL", "9"}) //9 PopCountVL nodes are generated for a long[] of LEN=1024 > > Could this be a failOn check instead of counts check? The number of PopCountVL nodes is dependent on loop unrolling which keeps changing with loop optimizations. Looks like we can use the regex (">= ") which checks for atleast one PopCountVL node. Please see the updated code... 
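
For readers unfamiliar with the kind of code this change targets, the shape of loop in question can be sketched as follows. This is an illustrative sketch only: the class and method names are made up, it deliberately leaves out the IR framework annotations quoted above, and whether C2 actually emits a vectorized `PopCountVL` for it depends on the JDK build and on CPU features such as the avx512dq requirement mentioned earlier in the thread.

```java
public class PopCountLongLoop {
    /**
     * Stores Long.bitCount(input[i]) into output[i] for every element.
     * A simple counted loop over arrays like this is a candidate for auto-vectorization.
     */
    static void bitCounts(long[] input, int[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = Long.bitCount(input[i]);
        }
    }

    public static void main(String[] args) {
        final int len = 1024; // same array length as the LEN=1024 mentioned in the review
        long[] input = new long[len];
        int[] output = new int[len];
        for (int i = 0; i < len; i++) {
            input[i] = i * 0x9E3779B97F4A7C15L; // arbitrary bit patterns
        }
        // Repeat the call so that the JIT has a chance to compile the loop.
        for (int iter = 0; iter < 10_000; iter++) {
            bitCounts(input, output);
        }
        System.out.println("bitCount of input[1] = " + output[1]);
    }
}
```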
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 03:50:57 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 6 Jan 2022 03:50:57 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v2] In-Reply-To: References: Message-ID: > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: rename ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6966/files - new: https://git.openjdk.java.net/jdk/pull/6966/files/04e02615..a0444928 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6966.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6966/head:pull/6966 PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Thu Jan 6 03:54:17 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 6 Jan 2022 03:54:17 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v2] In-Reply-To: References: Message-ID: <-v5mZ0anZXUm9HWtJZYyds8sicW_KKCd_W67rmQTQfY=.2ccdc20e-43b6-4cd8-bae8-4bc8c0c16637@github.com> On Wed, 5 Jan 2022 18:18:49 GMT, Paul Sandoz wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> rename > > src/hotspot/cpu/x86/x86.ad line 7332: > >> 7330: Matcher::vector_length_in_bytes(n->in(1)->in(1)) <= 32 
&& // src1 >> 7331: is_integral_type(Matcher::vector_element_basic_type(n->in(1)->in(1))) && >> 7332: (n->in(2)->get_int() == BoolTest::eq || > > It's tempting to add a method to check the third bit of a `BoolTest` value, which controls the sense of the result e.g. `eq(0)` and `ne(4)`, rather than three separate checks e.g. `is_negated` perhaps. That it turn may result in more clearer naming of the methods rather than using `_pri` and `_sec`, and the logic percolates down into `vpcmpCCW` via the `ComparisonPredicate` value and the use of the tmp register. Thanks a lot for the suggestion. The situation is quite unique in this case and I can't find anywhere else the need for the check of the third bit. So I just renamed the nodes here to be more descriptive. What do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From fmatte at openjdk.java.net Thu Jan 6 07:47:32 2022 From: fmatte at openjdk.java.net (Fairoz Matte) Date: Thu, 6 Jan 2022 07:47:32 GMT Subject: RFR: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC cause Message-ID: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC cause ------------- Commit messages: - 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC cause Changes: https://git.openjdk.java.net/jdk/pull/6978/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6978&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279333 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/6978.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6978/head:pull/6978 PR: https://git.openjdk.java.net/jdk/pull/6978 From kbarrett at openjdk.java.net Thu Jan 6 10:16:16 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 6 Jan 2022 10:16:16 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2] In-Reply-To: References: Message-ID: 
<_VpnZnTRWB1qtqHuMjJ4zoiMHuLRWvr_yPx6qCITvKc=.dd1b9e98-fbd9-4b5e-b3e2-8748143fb71d@github.com> On Wed, 5 Jan 2022 20:41:07 GMT, Harold Seigel wrote: >> Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. >> >> A sample warning is: >> >> .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] >> 63 | FILE* fp = fopen(TestLogFileName, "r"); >> | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ >> >> >> Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. >> >> Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. >> >> This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > revert strdup() changes and address some of Kim's comments Changes requested by kbarrett (Reviewer). 
-------------

PR: https://git.openjdk.java.net/jdk/pull/6961

From kbarrett at openjdk.java.net Thu Jan 6 10:16:16 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 6 Jan 2022 10:16:16 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Jan 2022 20:36:13 GMT, Harold Seigel wrote:

>> src/hotspot/share/utilities/compilerWarnings.hpp line 87:
>>
>>> 85: PRAGMA_DISABLE_GCC_WARNING("-Wattribute-warning")
>>> 86:
>>> 87: FORBID_C_FUNCTION(void abort(void), "use os::abort");
>>
>> It would be better to put all of these after all the `#endif`, so that if we add macro implementations for other platforms (like windows), these will be covered by the additional platforms.
>
> Done.

"after all the `#endif`" wasn't quite what I meant. Still needs to be inside the include guard. And I forgot there are platform-specific compilerWarnings files; the gcc macro definitions should be in compilerWarnings_gcc.hpp rather than conditionally defined here. This file should contain something like

#ifndef FORBID_C_FUNCTION
#define FORBID_C_FUNCTION(signature, alternative)
#endif

#ifndef PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION
#define PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(name)
#endif

-------------

PR: https://git.openjdk.java.net/jdk/pull/6961

From kbarrett at openjdk.java.net Thu Jan 6 10:16:17 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 6 Jan 2022 10:16:17 GMT
Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 6 Jan 2022 09:57:54 GMT, Kim Barrett wrote:

>> Done.
>
> "after all the `#endif`" wasn't quite what I meant. Still needs to be inside the include guard. And I forgot there are platform-specific compilerWarnings files; the gcc macro definitions should be in compilerWarnings_gcc.hpp rather than conditionally defined here.
This file should contain something like > > #ifndef FORBID_C_FUNCTION > #define FORBID_C_FUNCTION(signature, alternative) > #endif > > #ifndef PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION > #define PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(name) > #endif Some of these functions are posix while some are portable (C StdLib). Indeed, some of the os:: functions exist to provide a shared API with Windows where none exists natively. So I think compilerWarnings.hpp should have forbid-decls for only portable names, and there should be another group (in another file? not sure how to set it up though. for now just conditional in compilerWarnings.hpp?) for the posix functions that aren't present in Windows. The additional #includes should be separated accordingly, as needed. But need to be careful about . There's a problem with that header on XLC. See comment in globalDefinitions_xlc.hpp. Ugh! ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From dholmes at openjdk.java.net Thu Jan 6 11:29:12 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 6 Jan 2022 11:29:12 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 20:41:07 GMT, Harold Seigel wrote: >> Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. 
>> >> A sample warning is: >> >> .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] >> 63 | FILE* fp = fopen(TestLogFileName, "r"); >> | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ >> >> >> Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. >> >> Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. >> >> This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > revert strdup() changes and address some of Kim's comments I'm having very mixed feelings about this one. I'd like to see the warnings for shared code using potentially OS-specific API's, but I don't like seeing OS-specific code having to call os::* functions unnecessarily - but there seems no neat way to achieve this. We can argue it is harmless for the OS-specific code to call the os::* functions, but it looks really odd to see code like: os::close(fd); ::unlink(fd); why is `close()` an os api where `unlink()` is not? There's no obvious answer. Examining that question closer both os_windows.cpp and os_posix.cpp just simply call `::close` - so `os::close` could actually be defined in os.cpp the same as `os::read` - but then taking that one step further, why do close and read need to be part of the os API at all? Maybe there was a reason when Solaris was still supported - I haven't checked. 
But should we re-examine the os API and actually drop any functions that are either not OS-specific, or which don't need some VM-specific custom handling? And if so, in which order should we do that simplification and the current PR? David ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From dholmes at openjdk.java.net Thu Jan 6 11:43:20 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 6 Jan 2022 11:43:20 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 16:38:22 GMT, Aleksey Shipilev wrote: > SonarCloud reports that `Universe::is_out_of_memory_error_class_metaspace` is not used after JDK-8278125. Indeed, that patch [seems to introduce](https://github.com/openjdk/jdk/commit/ad1dc9c2ae5463363aff20072a3f2ca4ea23acd2?diff=unified#diff-997cf62de09eb9ba3ba9a8fc1d48666b913b4ece76a4f37559a985282788d913L466-R466) a typo in `Exceptions::count_out_of_memory_exceptions`. > > Additional testing: > - [x] Linux x86_64 fastdebug `hotspot:tier1` Good catch! (I missed it :( ). Obviously no test coverage though! ------------- PR: https://git.openjdk.java.net/jdk/pull/6970 From shade at openjdk.java.net Thu Jan 6 11:53:17 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 6 Jan 2022 11:53:17 GMT Subject: RFR: 8279526: Exceptions::count_out_of_memory_exceptions miscounts class metaspace OOMEs In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 11:40:02 GMT, David Holmes wrote: > Good catch! (I missed it :( ). Obviously no test coverage though! Honestly, it is genuinely hard to spot. I first thought Sonar is showing a false positive to me here. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/6970

From aph at openjdk.java.net Thu Jan 6 12:19:18 2022
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 6 Jan 2022 12:19:18 GMT
Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash
In-Reply-To:
References:
Message-ID:

On Mon, 29 Nov 2021 17:40:43 GMT, Denghui Dong wrote:

> Hi,
>
> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing.
>
> The following steps can quick reproduce the problem:
>
> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc)
>
> index 39e99bdd5ed..4fc768e94aa 100644
> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
>        __ store_klass_gap(r0, zr); // zero klass gap for compressed oops
>        __ store_klass(r0, r4); // store klass last
>
> +/**
>      {
>        SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
>        // Trigger dtrace event for fastpath
> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
>        __ pop(atos); // restore the return value
>
>      }
> +*/
>      __ b(done);
>    }
>
> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
> index 19530b7c57c..15b0509da4c 100644
> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
>      Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
>      __ store_klass(rax, rcx, tmp_store_klass); // klass
>
> +/**
>      {
>        SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
>        // Trigger dtrace event for fastpath
> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
>             CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax);
>        __ pop(atos);
>      }
> +*/
>
>      __ jmp(done);
>    }
> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
> index a5de65ea5ab..60b4bd3bcc8 100644
> --- a/src/hotspot/share/runtime/sharedRuntime.cpp
> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
>   * 6254741. Once that is fixed we can remove the dummy return value.
>   */
>  int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
> +  *(int*)0 = 1;
>    return dtrace_object_alloc(Thread::current(), o, o->size());
>  }
>
>
> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`
>
> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect.
>
> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong.
>
> After some investigation, I found that this problem is related to the layout of the stack.
>
> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame).
>
>
>     push   %rbp
>     mov    %rsp,%rbp
>
>                _ _ _ _ _ _
>               |           |
>               |           |                 |
>               |_ _ _ _ _ _|                 |
>               |           |                 |
>    caller     |           | <- caller sp    |
>        _ _ _  |_ _ _ _ _ _|                 | expand
>               |           |                 |
>               | ret addr  |                 | direction
>    callee     |_ _ _ _ _ _|                 |
>               |           |                 V
>               | caller fp | <- fp
>               |_ _ _ _ _ _|
>
>
>
> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).
>
> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>
>
>     stp    x29, x30, [sp, #-N]!
>     mov    x29, sp
>
>                _ _ _ _ _ _
>               |           |
>               |           |                 |
>               |_ _ _ _ _ _|                 |
>               |           |                 |
>    caller     |           | <- caller sp    |
>        _ _ _  |_ _ _ _ _ _|   -             | expand
>                               |             |
>                . . . . .      |             | direction
>                _ _ _ _ _ _    |             |
>               |           |   | N           |
>               | ret addr  |   |             |
>    callee     |_ _ _ _ _ _|   |             |
>               |           |   -             V
>               | caller fp | <- fp
>               |_ _ _ _ _ _|
>
>
>
> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.
>
> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>
> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.
>
> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>
> Any input is appreciated.
>
> Thanks,
> Denghui

I've had a good look at this - in fact spent all morning on it - and this is the wrong fix. For example, it breaks the `pfl()` function in the test case. `pfl()` isn't called from anywhere in the JDK, but it is one of our essential debugging tools.

If you're interested in pursuing this further I could explain what else to try, but I don't have any time to spend on this myself. Sorry.
-------------

PR: https://git.openjdk.java.net/jdk/pull/6597

From fmatte at openjdk.java.net  Thu Jan  6 14:11:20 2022
From: fmatte at openjdk.java.net (Fairoz Matte)
Date: Thu, 6 Jan 2022 14:11:20 GMT
Subject: Withdrawn: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC cause
In-Reply-To:
References:
Message-ID:

On Thu, 6 Jan 2022 07:40:50 GMT, Fairoz Matte wrote:

> Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC cause

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6978

From duke at openjdk.java.net  Thu Jan  6 14:15:38 2022
From: duke at openjdk.java.net (Tobias Holenstein)
Date: Thu, 6 Jan 2022 14:15:38 GMT
Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2]
In-Reply-To:
References:
Message-ID:

> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build.
>
> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent.
>
> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag). Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well.
>
> Checked that tests are not affected. Checked on Aurora that performance is not affected.
Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: Cleanup output of TraceDeoptimization ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6746/files - new: https://git.openjdk.java.net/jdk/pull/6746/files/eac86b9c..0406fb65 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=00-01 Stats: 92 lines in 3 files changed: 48 ins; 30 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/6746.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6746/head:pull/6746 PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Thu Jan 6 14:22:18 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Thu, 6 Jan 2022 14:22:18 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build In-Reply-To: References: Message-ID: On Fri, 17 Dec 2021 20:51:44 GMT, Tom Rodriguez wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag),. Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected. > > We can convert JDK-8278329 into a RFE if we need to. Creating yet another bug just complicates things. For debugging output the distinction between a bug and an RFE is pretty small anyway. > > The first thing I notice that in a release build we get `[CodeBlob]` in this output which isn't very helpful. 
> > DEOPT PACKING thread 0x00007fae8000dc00 Compiled frame (sp=0x000070000dd06ee0 unextended sp=0x000070000dd06ee0, fp=0x000000000000003a, real_fp=0x000070000dd06f10, pc=0x0000000116950388) > [CodeBlob] > Virtual frames (innermost first): > > In fastdebug we get output like: > > nmethod 2351 1042 4 jdk.internal.misc.Unsafe::allocateUninitializedArray (55 bytes) > > so I think the code is using a print function that doesn't exist in product. That said I don't think that line of output is helpful since it reiterates the information in the trap or packing messages, so I'd be inclined to delete it. > A full pack/unpack sequence looks like this: > > Uncommon trap bci=2 pc=0x00000001169ba620, relative_pc=0x00000000000005c0, method=scala.collection.mutable.HashTable$class.elemEquals(Lscala/collection/mutable/HashTable;Ljava/lang/Object;Ljava/lang/Object;)Z, debug_id=0 > Uncommon trap occurred in scala.collection.mutable.HashTable$class::findEntry compiler=c2 compile_id=1418 (@0x00000001169ba620) thread=7171 reason=unstable_if action=reinterpret unloaded_class_index=-1 debug_id=0 > DEOPT PACKING thread 0x00007fae8000dc00 Compiled frame (sp=0x000070000dd06230 unextended sp=0x000070000dd06230, fp=0x00000006016a06e0, real_fp=0x000070000dd06280, pc=0x00000001169ba620) > [CodeBlob] > Virtual frames (innermost first): > 0 - (0x00007fae90293010) - if_acmpne @ bci 2 > 1 - (0x00007fae90294328) - invokestatic @ bci 3 > 2 - (0x00007fae90295640) - invokeinterface @ bci 35 > Created vframeArray 0x00007fadf1041800 > > DEOPT UNPACKING thread 0x00007fae8000dc00 vframeArray 0x00007fadf1041800 mode 2 > > {method} {0x0000000133ac41a8} 'findEntry' '(Lscala/collection/mutable/HashTable;Ljava/lang/Object;)Lscala/collection/mutable/HashEntry;' in 'scala/collection/mutable/HashTable$class' - invokeinterface @ bci 35 sp = 0x000070000dd061f0 > {method} {0x0000000133824bf8} 'elemEquals' '(Ljava/lang/Object;Ljava/lang/Object;)Z' in 'scala/collection/mutable/HashMap' - invokestatic @ bci 3 sp = 
0x000070000dd06180 > {method} {0x0000000133ac4a60} 'elemEquals' '(Lscala/collection/mutable/HashTable;Ljava/lang/Object;Ljava/lang/Object;)Z' in 'scala/collection/mutable/HashTable$class' - if_acmpne @ bci 2 sp = 0x000070000dd06118 > > The `{method}` lines correspond to the vframes in the `PACKING` step so it would be nice if they were printed in a similar way, without the extra blank line in between. We should also use a different printing function so they are printed in a more natural way, like class.name(parameters) without the '{method}` part. So I'd recommended moving the `DEOPT UNPACKING` printing into `vframeArray::unpack_to_stack and try to make the output look similar between the two. The unpacking step just add information about the sp used in the recreated interpreter frame. Maybe something like this: > > DEOPT UNPACKING thread 0x00007fae8000dc00 vframeArray 0x00007fadf1041800 mode 2 > Virtual frames (innermost first): > 0 - {0x0000000133ac41a8} scala/collection/mutable/HashTable$class.findEntry(Lscala/collection/mutable/HashTable;Ljava/lang/Object;)Lscala/collection/mutable/HashEntry; - invokeinterface @ bci 35 sp = > 1 - {0x0000000133824bf8} scala/collection/mutable/HashMap.elemEquals(Ljava/lang/Object;Ljava/lang/Object;)Z - invokestatic @ bci 3 sp = > 2 - {0x0000000133ac4a60} scala/collection/mutable/HashTable$class.elemEquals(Lscala/collection/mutable/HashTable;Ljava/lang/Object;Ljava/lang/Object;)Z - if_acmpne @ bci 2 sp = > ``` and update the vframe printing to include similar information about the actual method? > There's also the issue of 2 `Uncommon trap` messages for every trap that show slightly different information. A single message would be clearer but maybe there's some good reason for the double printing that I'm missing. > I can prepare a changeset with my suggestions if it's unclear what I'm asking for. 
> I'm fine with the current state of PrintDeoptimizationDetails being non-product, but I'm surprised no one has finally deleted the `Verbose` and `WizardMode` flags. Those are some ancient artifacts that should probably be purged. I have cleaned up the output of `TraceDeoptimization` as suggested by @tkrodriguez - The printing of `Uncommon trap` are now merged into one line - `DEOPT UNPACKING` and `DEOPT PACKING` are now printed in a similar and more structured way. A `UNCOMMON TRAP` followed by the corresponding `DEOPT PACKING` and `DEOPT UNPACKING` now looks like this in `release` build: UNCOMMON TRAP method=java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object; bci=56 pc=0x000000011d4482a8, relative_pc=0x0000000000000be8, debug_id=0 compiler=c2 compile_id=388 (@0x000000011d4482a8) thread=53259 reason=bimorphic_or_optimized_type_check action=maybe_recompile unloaded_class_index=-1 debug_id=0 DEOPT PACKING thread=0x00007fe4ad009200 vframeArray=0x00007fe4a901c600 Compiled frame (sp=0x000070000ba30c20 unextended sp=0x000070000ba30c20, fp=0x000000070f6f9d70, real_fp=0x000070000ba30cb0, pc=0x000000011d4482a8) Virtual frames (innermost/newest first): VFrame 0 (0x00007fe4a88f2e10) - java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object; - invokevirtual @ bci=56 VFrame 1 (0x00007fe4a88f4128) - java.util.HashMap.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; - invokevirtual @ bci=9 DEOPT UNPACKING thread=0x00007fe4ad009200 vframeArray=0x00007fe4a901c600 mode=2 Virtual frames (outermost/oldest first): VFrame 1 (0x00007fe4a901db58) - java.util.HashMap.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; - invokevirtual @ bci=9 sp=0x000070000ba30bf0 VFrame 0 (0x00007fe4a901db00) - java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object; - invokevirtual @ bci=56 sp=0x000070000ba30b60 and like this in `slow-debug`build: UNCOMMON TRAP 
method=java.lang.StringLatin1.indexOf([BII)I bci=13 pc=0x0000000118ad50c4, relative_pc=0x00000000000001e4, debug_id=0 compiler=c2 compile_id=155 (@0x0000000118ad50c4) thread=5635 reason=range_check action=reinterpret unloaded_class_index=-1 debug_id=0 DEOPT PACKING thread=0x00007ff88a008a20 vframeArray=0x00007ff88888f420 Compiled frame (sp=0x000070000117a310 unextended sp=0x000070000117a310, fp=0x0000000000000000, real_fp=0x000070000117a340, pc=0x0000000118ad50c4) nmethod 1257 155 4 java.lang.String::indexOf (29 bytes) Virtual frames (innermost/newest first): VFrame 0 (0x00007ff88bfb3038) - java.lang.StringLatin1.indexOf([BII)I - ifge @ bci=13 VFrame 1 (0x00007ff88bfb43a0) - java.lang.String.indexOf(II)I - invokestatic @ bci=13 DEOPT UNPACKING thread=0x00007ff88a008a20 vframeArray=0x00007ff88888f420 mode=2 Virtual frames (outermost/oldest first): VFrame 1 (0x00007ff888890988) - java.lang.String.indexOf(II)I - invokestatic @ bci=13 sp=0x000070000117a2c8 VFrame 0 (0x00007ff888890928) - java.lang.StringLatin1.indexOf([BII)I - ifge @ bci=13 sp=0x000070000117a268 @tkrodriguez does that look like what you had in mind? 
------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From fmatte at openjdk.java.net Thu Jan 6 14:24:35 2022 From: fmatte at openjdk.java.net (Fairoz Matte) Date: Thu, 6 Jan 2022 14:24:35 GMT Subject: [jdk18] Integrated: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause Message-ID: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause ------------- Commit messages: - 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause Changes: https://git.openjdk.java.net/jdk18/pull/86/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=86&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279333 Stats: 6 lines in 2 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk18/pull/86.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/86/head:pull/86 PR: https://git.openjdk.java.net/jdk18/pull/86 From egahlin at openjdk.java.net Thu Jan 6 14:24:36 2022 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Thu, 6 Jan 2022 14:24:36 GMT Subject: [jdk18] Integrated: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 14:11:45 GMT, Fairoz Matte wrote: > 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause Marked as reviewed by egahlin (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk18/pull/86 From fmatte at openjdk.java.net Thu Jan 6 14:24:38 2022 From: fmatte at openjdk.java.net (Fairoz Matte) Date: Thu, 6 Jan 2022 14:24:38 GMT Subject: [jdk18] Integrated: 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 14:11:45 GMT, Fairoz Matte wrote: > 8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause This pull request has now been integrated. 
Changeset: 7c792f27
Author:    Fairoz Matte
Committer: Erik Gahlin
URL:       https://git.openjdk.java.net/jdk18/commit/7c792f27a8f6ccf87922cc5f2768946e55e33816
Stats:     6 lines in 2 files changed: 2 ins; 0 del; 4 mod

8279333: Some JFR tests do not accept 'GCLocker Initiated GC' as a valid GC Cause

Reviewed-by: egahlin

-------------

PR: https://git.openjdk.java.net/jdk18/pull/86

From psandoz at openjdk.java.net  Thu Jan  6 17:40:18 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 6 Jan 2022 17:40:18 GMT
Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v2]
In-Reply-To: <-v5mZ0anZXUm9HWtJZYyds8sicW_KKCd_W67rmQTQfY=.2ccdc20e-43b6-4cd8-bae8-4bc8c0c16637@github.com>
References: <-v5mZ0anZXUm9HWtJZYyds8sicW_KKCd_W67rmQTQfY=.2ccdc20e-43b6-4cd8-bae8-4bc8c0c16637@github.com>
Message-ID:

On Thu, 6 Jan 2022 03:51:21 GMT, Quan Anh Mai wrote:

>> src/hotspot/cpu/x86/x86.ad line 7332:
>>
>>> 7330:             Matcher::vector_length_in_bytes(n->in(1)->in(1)) <= 32 && // src1
>>> 7331:             is_integral_type(Matcher::vector_element_basic_type(n->in(1)->in(1))) &&
>>> 7332:             (n->in(2)->get_int() == BoolTest::eq ||
>>
>> It's tempting to add a method to check the third bit of a `BoolTest` value, which controls the sense of the result e.g. `eq(0)` and `ne(4)`, rather than three separate checks e.g. `is_negated` perhaps. That in turn may result in clearer naming of the methods rather than using `_pri` and `_sec`, and the logic percolates down into `vpcmpCCW` via the `ComparisonPredicate` value and the use of the tmp register.
>
> Thanks a lot for the suggestion. The situation is quite unique in this case and I can't find anywhere else the need for the check of the third bit. So I just renamed the nodes here to be more descriptive. What do you think?

That's reasonable.
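For readers unfamiliar with the "third bit" remark, a minimal sketch of the idea follows. Only `eq(0)` and `ne(4)` are quoted in the thread; the remaining values of the stand-in `Mask` enum and the helper `negate` are assumptions made for illustration, and `is_negated` is the name floated in the review, not an existing method.

```cpp
#include <cassert>

// Hypothetical stand-in for a BoolTest-style mask. eq = 0 and ne = 4 come
// from the thread; the other values are assumed for this example.
enum Mask { eq = 0, gt = 1, lt = 3, ne = 4, le = 5, ge = 7 };

// Bit 2 (value 4) flips the sense of the test: eq <-> ne, gt <-> le, lt <-> ge.
constexpr bool is_negated(Mask m) { return (m & 4) != 0; }

// Toggling that bit yields the opposite test.
constexpr Mask negate(Mask m) { return static_cast<Mask>(m ^ 4); }
```

Under these assumed values a single bit test replaces a chain of comparisons against each negated condition, which is the simplification the review comment hints at.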
------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From never at openjdk.java.net Thu Jan 6 17:58:20 2022 From: never at openjdk.java.net (Tom Rodriguez) Date: Thu, 6 Jan 2022 17:58:20 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 14:15:38 GMT, Tobias Holenstein wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag),. Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected. > > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup output of TraceDeoptimization Marked as reviewed by never (Reviewer). Yes that output looks great to me. Thank you for taking the time to do this. ------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From jbhateja at openjdk.java.net Thu Jan 6 18:35:17 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Thu, 6 Jan 2022 18:35:17 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 01:47:04 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. 
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update IR framework test to check for non-zero count of PopCountVL

Thanks for the updates.

-------------

Marked as reviewed by jbhateja (Committer).

PR: https://git.openjdk.java.net/jdk/pull/6857

From sviswanathan at openjdk.java.net  Thu Jan  6 18:35:18 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 6 Jan 2022 18:35:18 GMT
Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6]
In-Reply-To:
References:
Message-ID:

On Thu, 6 Jan 2022 01:47:04 GMT, Vamsi Parasa wrote:

>> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update IR framework test to check for non-zero count of PopCountVL

Marked as reviewed by sviswanathan (Reviewer).

@vnkozlov Could you please review this and run it through your testing?

-------------

PR: https://git.openjdk.java.net/jdk/pull/6857

From sviswanathan at openjdk.java.net  Thu Jan  6 18:35:18 2022
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Thu, 6 Jan 2022 18:35:18 GMT
Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Jan 2022 23:24:09 GMT, Vamsi Parasa wrote:

>> Please also update copyright headers of modified files.
>
>> Please also update copyright headers of modified files.
>
> Updated the year to 2022 in copyright headers...

@vamsi-parasa The patch looks good to me. You will need another review.
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From kvn at openjdk.java.net Thu Jan 6 19:21:16 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 6 Jan 2022 19:21:16 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6] In-Reply-To: References: Message-ID: <6JTZlbliDa16Xotez1-Qe8y5uFDiZ04Xfhie4p7rqhE=.96f98052-a887-4dad-a43d-7f00e5056f2c@github.com> On Thu, 6 Jan 2022 01:47:04 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update IR framework test to check for non-zero count of PopCountVL Please update branch. Latest changes #6893 touched same files. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 19:40:22 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 19:40:22 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 01:47:04 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. 
> > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update IR framework test to check for non-zero count of PopCountVL Thank you Jatin and Sandhya for the review! Thank you Vladimir for looking into the patch. Will update the branch and let you know... ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 19:57:01 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 19:57:01 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong - Update IR framework test to check for non-zero count of PopCountVL - Update year in copyright notice. 
Add avx512dq check for vectorization test - Update popcount long test to use IR framework - Use generic vector node names - Add JMH micro benchmark to measure performance - 8278868:Add x86 vectorization support for Long.bitCount() ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/8cd298b6..d8b3cedd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=05-06 Stats: 12922 lines in 484 files changed: 9523 ins; 1562 del; 1837 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 20:06:22 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 20:06:22 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v6] In-Reply-To: <6JTZlbliDa16Xotez1-Qe8y5uFDiZ04Xfhie4p7rqhE=.96f98052-a887-4dad-a43d-7f00e5056f2c@github.com> References: <6JTZlbliDa16Xotez1-Qe8y5uFDiZ04Xfhie4p7rqhE=.96f98052-a887-4dad-a43d-7f00e5056f2c@github.com> Message-ID: On Thu, 6 Jan 2022 19:18:20 GMT, Vladimir Kozlov wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update IR framework test to check for non-zero count of PopCountVL > > Please update branch. Latest changes #6893 touched same files. Hi Vladimir (@vnkozlov) ...updated the branch. Please check... 
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From kvn at openjdk.java.net Thu Jan 6 20:20:23 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 6 Jan 2022 20:20:23 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 19:57:01 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong > - Update IR framework test to check for non-zero count of PopCountVL > - Update year in copyright notice. Add avx512dq check for vectorization test > - Update popcount long test to use IR framework > - Use generic vector node names > - Add JMH micro benchmark to measure performance > - 8278868:Add x86 vectorization support for Long.bitCount() The update is good. I submitted testing. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From coleenp at openjdk.java.net Thu Jan 6 20:54:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 6 Jan 2022 20:54:26 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class Message-ID: Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. 
There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). ------------- Commit messages: - Commonify linux/bsd versions of os_cpu copy functions into cpu copy functions. - Removed unused Copy_ functions. Changes: https://git.openjdk.java.net/jdk/pull/6984/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6984&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8142362 Stats: 1627 lines in 11 files changed: 451 ins; 1162 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/6984.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6984/head:pull/6984 PR: https://git.openjdk.java.net/jdk/pull/6984 From kvn at openjdk.java.net Thu Jan 6 20:59:22 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 6 Jan 2022 20:59:22 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: References: Message-ID: <_Rd2wR8yAzz-6mW2YPJhW_oJg04Z1w32hWSxbA4Af8g=.f1591d94-11bd-408d-b8cc-ed85c87cf9db@github.com> On Thu, 6 Jan 2022 19:57:01 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. 
> > Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong > - Update IR framework test to check for non-zero count of PopCountVL > - Update year in copyright notice. Add avx512dq check for vectorization test > - Update popcount long test to use IR framework > - Use generic vector node names > - Add JMH micro benchmark to measure performance > - 8278868:Add x86 vectorization support for Long.bitCount() Build error: workspace/open/src/hotspot/share/opto/superword.cpp:2556:38: error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context] opc == Op_PopCountI || Op_PopCountL) { And a lot of testing errors with missing `VectorCastI2X` and `VectorCastL2X` (compiler/codegen/TestLongDoubleVect.java, compiler/codegen/TestIntFloatVect.java): # Internal Error (/workspace/open/src/hotspot/share/opto/vectornode.cpp:573), pid=11893, tid=11909 # fatal error: Missed vector creation for 'VectorCastI2X' # # Problematic frame: # V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 Current CompileTask: C2: 674 58 % b compiler.codegen.TestIntFloatVect::test_conv_i2f @ 2 (22 bytes) Stack: [0x00007f0fa59fa000,0x00007f0fa5afb000], sp=0x00007f0fa5af4e70, free space=1003k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 V [libjvm.so+0x189e082] SuperWord::output()+0xb82 V [libjvm.so+0x18a40e0] SuperWord::transform_loop(IdealLoopTree*, bool)+0x400 V [libjvm.so+0x13a2284] PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0xff4 V [libjvm.so+0xa9a0fa] PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x28a V 
[libjvm.so+0xa963df] Compile::Optimize()+0x102f V [libjvm.so+0xa9863e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159e ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 21:16:16 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 21:16:16 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 19:57:01 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong > - Update IR framework test to check for non-zero count of PopCountVL > - Update year in copyright notice. Add avx512dq check for vectorization test > - Update popcount long test to use IR framework > - Use generic vector node names > - Add JMH micro benchmark to measure performance > - 8278868:Add x86 vectorization support for Long.bitCount() Sorry, closed this issue accidentally... 
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Thu Jan 6 21:16:19 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Thu, 6 Jan 2022 21:16:19 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: <_Rd2wR8yAzz-6mW2YPJhW_oJg04Z1w32hWSxbA4Af8g=.f1591d94-11bd-408d-b8cc-ed85c87cf9db@github.com> References: <_Rd2wR8yAzz-6mW2YPJhW_oJg04Z1w32hWSxbA4Af8g=.f1591d94-11bd-408d-b8cc-ed85c87cf9db@github.com> Message-ID: On Thu, 6 Jan 2022 20:56:36 GMT, Vladimir Kozlov wrote: >> Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong >> - Update IR framework test to check for non-zero count of PopCountVL >> - Update year in copyright notice. 
Add avx512dq check for vectorization test >> - Update popcount long test to use IR framework >> - Use generic vector node names >> - Add JMH micro benchmark to measure performance >> - 8278868:Add x86 vectorization support for Long.bitCount() > > Build error: > > workspace/open/src/hotspot/share/opto/superword.cpp:2556:38: error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context] > opc == Op_PopCountI || Op_PopCountL) { > > > And a lot of testing errors with missing `VectorCastI2X` and `VectorCastL2X` (compiler/codegen/TestLongDoubleVect.java, compiler/codegen/TestIntFloatVect.java): > > # Internal Error (/workspace/open/src/hotspot/share/opto/vectornode.cpp:573), pid=11893, tid=11909 > # fatal error: Missed vector creation for 'VectorCastI2X' > # > # Problematic frame: > # V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 > > Current CompileTask: > C2: 674 58 % b compiler.codegen.TestIntFloatVect::test_conv_i2f @ 2 (22 bytes) > > Stack: [0x00007f0fa59fa000,0x00007f0fa5afb000], sp=0x00007f0fa5af4e70, free space=1003k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 > V [libjvm.so+0x189e082] SuperWord::output()+0xb82 > V [libjvm.so+0x18a40e0] SuperWord::transform_loop(IdealLoopTree*, bool)+0x400 > V [libjvm.so+0x13a2284] PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0xff4 > V [libjvm.so+0xa9a0fa] PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x28a > V [libjvm.so+0xa963df] Compile::Optimize()+0x102f > V [libjvm.so+0xa9863e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159e Hi Vladimir (@vnkozlov), tried to replicate the build errors on my IceLake machine but they did not occur for both release and debug builds. Both builds completed successfully... 
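For reference, the `-Wint-in-bool-context` diagnostic quoted above points at a condition of the shape `opc == Op_PopCountI || Op_PopCountL`. The sketch below shows why that shape is always true. The `Opcode` enum and its values are hypothetical stand-ins, and the second function is only the presumed intent, since the thread does not show the eventual fix.

```cpp
#include <cassert>

// Hypothetical opcode values for illustration only; the real Op_* constants
// are generated by the HotSpot build.
enum Opcode { Op_Other = 0, Op_PopCountI = 1, Op_PopCountL = 2 };

// What `opc == Op_PopCountI || Op_PopCountL` evaluates: the right operand is
// a non-zero enum constant converted to bool, so the result is always true.
// The conversion is written out explicitly to keep this sketch warning-free.
inline bool matches_as_written(int opc) {
  return opc == Op_PopCountI || static_cast<bool>(Op_PopCountL);
}

// Presumed intent (an assumption; the thread does not show the fix).
inline bool matches_as_presumably_intended(int opc) {
  return opc == Op_PopCountI || opc == Op_PopCountL;
}
```

With these stand-in values, `matches_as_written(Op_Other)` is true while `matches_as_presumably_intended(Op_Other)` is false, so the condition as written accepts every opcode.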
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From coleenp at openjdk.java.net Thu Jan 6 22:25:02 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 6 Jan 2022 22:25:02 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v2] In-Reply-To: References: Message-ID: > Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). > I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. > > There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. > > Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix Windows aarch64 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6984/files - new: https://git.openjdk.java.net/jdk/pull/6984/files/ced521f9..fe6984a5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6984&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6984&range=00-01 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6984.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6984/head:pull/6984 PR: https://git.openjdk.java.net/jdk/pull/6984 From kvn at openjdk.java.net Thu Jan 6 22:38:14 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 6 Jan 2022 22:38:14 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7] In-Reply-To: <_Rd2wR8yAzz-6mW2YPJhW_oJg04Z1w32hWSxbA4Af8g=.f1591d94-11bd-408d-b8cc-ed85c87cf9db@github.com> References: <_Rd2wR8yAzz-6mW2YPJhW_oJg04Z1w32hWSxbA4Af8g=.f1591d94-11bd-408d-b8cc-ed85c87cf9db@github.com> Message-ID: On Thu, 6 Jan 2022 20:56:36 GMT, Vladimir Kozlov wrote: >> Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong >> - Update IR framework test to check for non-zero count of PopCountVL >> - Update year in copyright notice. 
Add avx512dq check for vectorization test >> - Update popcount long test to use IR framework >> - Use generic vector node names >> - Add JMH micro benchmark to measure performance >> - 8278868:Add x86 vectorization support for Long.bitCount() > > Build error: > > workspace/open/src/hotspot/share/opto/superword.cpp:2556:38: error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context] > opc == Op_PopCountI || Op_PopCountL) { > > > And a lot of testing errors with missing `VectorCastI2X` and `VectorCastL2X` (compiler/codegen/TestLongDoubleVect.java, compiler/codegen/TestIntFloatVect.java): > > # Internal Error (/workspace/open/src/hotspot/share/opto/vectornode.cpp:573), pid=11893, tid=11909 > # fatal error: Missed vector creation for 'VectorCastI2X' > # > # Problematic frame: > # V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 > > Current CompileTask: > C2: 674 58 % b compiler.codegen.TestIntFloatVect::test_conv_i2f @ 2 (22 bytes) > > Stack: [0x00007f0fa59fa000,0x00007f0fa5afb000], sp=0x00007f0fa5af4e70, free space=1003k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63 > V [libjvm.so+0x189e082] SuperWord::output()+0xb82 > V [libjvm.so+0x18a40e0] SuperWord::transform_loop(IdealLoopTree*, bool)+0x400 > V [libjvm.so+0x13a2284] PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0xff4 > V [libjvm.so+0xa9a0fa] PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x28a > V [libjvm.so+0xa963df] Compile::Optimize()+0x102f > V [libjvm.so+0xa9863e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159e > Hi Vladimir (@vnkozlov), tried to replicate the build errors on my IceLake machine but they did not occur for both release and debug builds. Both builds completed successfully... 
The build failure is on MacOSX x86. If you look at the code, it really is a bug: 'opc ==' is missing. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From iklam at openjdk.java.net Fri Jan 7 00:08:37 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 7 Jan 2022 00:08:37 GMT Subject: [jdk18] RFR: 8278020: ~13% variation in Renaissance-Scrabble Message-ID: 8278020: ~13% variation in Renaissance-Scrabble ------------- Commit messages: - Backport 4ba980ba439f94a6b5015e64382a6c308476d63f Changes: https://git.openjdk.java.net/jdk18/pull/87/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=87&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8278020 Stats: 6 lines in 1 file changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk18/pull/87.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/87/head:pull/87 PR: https://git.openjdk.java.net/jdk18/pull/87 From kvn at openjdk.java.net Fri Jan 7 00:32:17 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 7 Jan 2022 00:32:17 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v2] In-Reply-To: References: Message-ID: <_BHcp_ijZx6f8G3XfTxpyaHO9fLv6epchPI02oKVA24=.e8d789be-c2b5-4fff-b9a8-61ada8165b23@github.com> On Thu, 6 Jan 2022 22:25:02 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). >> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. >> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear.
>> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows aarch64 Good cleanup. You need to change copyright year to 2022. ------------- PR: https://git.openjdk.java.net/jdk/pull/6984 From duke at openjdk.java.net Fri Jan 7 00:36:49 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 7 Jan 2022 00:36:49 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. 
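As a rough picture of the kind of loop this change targets, here is a scalar C++ analogue. It is only a sketch under stated assumptions: the patch itself is about Java's Long.bitCount() in C2, and the names `bit_count` and `popcount_each` are made up for illustration. The per-element population count in the loop body is the operation that a vector instruction such as vpopcntq can apply to several 64-bit lanes at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable SWAR population count for one 64-bit value.
// (Hypothetical helper, not taken from the patch.)
inline int bit_count(std::uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);                            // 2-bit sums
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);  // 4-bit sums
  x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;                            // 8-bit sums
  x = x + (x >> 8);
  x = x + (x >> 16);
  x = x + (x >> 32);
  return static_cast<int>(x & 0x7f);                                     // result is 0..64
}

// The loop shape of interest: one independent popcount per element,
// with no dependency between iterations.
inline std::vector<int> popcount_each(const std::vector<std::uint64_t>& in) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = bit_count(in[i]);
  }
  return out;
}
```

Because each iteration is independent, a vectorizer is free to process several elements per instruction when the hardware offers a vector popcount.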
Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: Fix error with opc == Op_PopCountI ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/d8b3cedd..b3717ad6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Fri Jan 7 00:45:15 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Fri, 7 Jan 2022 00:45:15 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 00:36:49 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Fix error with opc == Op_PopCountI Fixed the 'opc == ' error. Thanks for identifying it! (gcc on Linux should have caught it) Will try to replicate the VectorCastI2X and VectorCastL2X errors in compiler/codegen/TestLongDoubleVect.java, compiler/codegen/TestIntFloatVect.java... 
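
For readers following the thread, the `-Wint-in-bool-context` failure comes down to the pattern sketched below. This is a minimal standalone illustration, not HotSpot code: the enum values and the helper names `is_popcount_buggy` and `is_popcount_fixed` are invented here, and the buggy form is written with an explicit `!= 0` so the sketch itself compiles cleanly even with warnings treated as errors.

```cpp
#include <cassert>

// Hypothetical stand-ins for C2 opcodes; the real values are defined in HotSpot.
enum Opcode { Op_AddI = 1, Op_PopCountI = 2, Op_PopCountL = 3 };

// Buggy form: equivalent to `opc == Op_PopCountI || Op_PopCountL`.
// The right-hand operand is a nonzero enum constant, so the whole
// expression is true for every opcode.
inline bool is_popcount_buggy(int opc) {
  return opc == Op_PopCountI || (Op_PopCountL != 0);
}

// Fixed form: each alternative is compared against `opc`.
inline bool is_popcount_fixed(int opc) {
  return opc == Op_PopCountI || opc == Op_PopCountL;
}
```

With these definitions, `is_popcount_buggy(Op_AddI)` is true while `is_popcount_fixed(Op_AddI)` is false, which is why an unrelated opcode could slip through the original check.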
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From ddong at openjdk.java.net Fri Jan 7 01:46:16 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 7 Jan 2022 01:46:16 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash In-Reply-To: References: Message-ID: <-WnZRNSYnVZg8lNSl4kXSEg4np9iGPPkk3qG0ij9_DA=.3510020e-d5cd-49bc-97c8-9156c6f9ee36@github.com> On Thu, 6 Jan 2022 12:16:09 GMT, Andrew Haley wrote: > I've had a good look at this - in fact spent all morning on it - and this is the wrong fix. > For example, it breaks the `pfl()` function in the test case. `pfl()` isn't called from anywhere in the JDK, but it is one of our essential debugging tools. If you're interested in pursuing this further I could explain what else to try, but I don't have any time to spend on this myself. Sorry. Thanks for the comment. It would be nice if you could give me some other way that helps fix the problem. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From kvn at openjdk.java.net Fri Jan 7 02:17:15 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 7 Jan 2022 02:17:15 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 00:36:49 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Fix error with opc == Op_PopCountI Tests failed on aarch64 systems and avx2 x86. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From iklam at openjdk.java.net Fri Jan 7 05:34:18 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 7 Jan 2022 05:34:18 GMT Subject: [jdk18] RFR: 8278020: ~13% variation in Renaissance-Scrabble In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 23:57:06 GMT, Ioi Lam wrote: > 8278020: ~13% variation in Renaissance-Scrabble Passed mach5 tiers1/2 ------------- PR: https://git.openjdk.java.net/jdk18/pull/87 From iklam at openjdk.java.net Fri Jan 7 05:34:19 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 7 Jan 2022 05:34:19 GMT Subject: [jdk18] Integrated: 8278020: ~13% variation in Renaissance-Scrabble In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 23:57:06 GMT, Ioi Lam wrote: > 8278020: ~13% variation in Renaissance-Scrabble This pull request has now been integrated. Changeset: 967ef0c4 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk18/commit/967ef0c48252957f9bec42965fe02414fd2c77cb Stats: 6 lines in 1 file changed: 3 ins; 0 del; 3 mod 8278020: ~13% variation in Renaissance-Scrabble Backport-of: 4ba980ba439f94a6b5015e64382a6c308476d63f ------------- PR: https://git.openjdk.java.net/jdk18/pull/87 From thartmann at openjdk.java.net Fri Jan 7 07:38:11 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 7 Jan 2022 07:38:11 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 14:15:38 GMT, Tobias Holenstein wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag),. 
Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected. > > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup output of TraceDeoptimization Great work, looks good to me. Just found some minor style issues. src/hotspot/share/runtime/deoptimization.cpp line 1510: > 1508: ResourceMark rm; > 1509: stringStream st; > 1510: //st.print_cr("DEOPT PACKING thread " INTPTR_FORMAT " ", p2i(thread)); Should be removed. src/hotspot/share/runtime/deoptimization.cpp line 1954: > 1952: #if INCLUDE_JVMCI > 1953: , debug_id > 1954: #endif You can use `JVMCI_ONLY` here as well. Same in line 1840. src/hotspot/share/runtime/vframeArray.cpp line 586: > 584: stringStream st; > 585: st.print_cr("DEOPT UNPACKING thread=" INTPTR_FORMAT " vframeArray=" INTPTR_FORMAT " mode=%d", > 586: p2i(current), p2i(this), exec_mode); Indentation of line 586 is off-by-one. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Fri Jan 7 10:56:36 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Fri, 7 Jan 2022 10:56:36 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v3] In-Reply-To: References: Message-ID: > After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. > > Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. > > The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag),. Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. > > Checked that tests are not affected. 
Checked on Aurora that performance is not affected. Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: minor style issues ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6746/files - new: https://git.openjdk.java.net/jdk/pull/6746/files/0406fb65..90dc623e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6746.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6746/head:pull/6746 PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Fri Jan 7 10:56:39 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Fri, 7 Jan 2022 10:56:39 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 17:54:33 GMT, Tom Rodriguez wrote: >> Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup output of TraceDeoptimization > > Yes that output looks great to me. Thank you for taking the time to do this. @tkrodriguez , @dougxc , @vnkozlov and @TobiHartmann Thanks for the inputs and the reviews! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Fri Jan 7 10:56:43 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Fri, 7 Jan 2022 10:56:43 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 07:28:23 GMT, Tobias Hartmann wrote: >> Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup output of TraceDeoptimization > > src/hotspot/share/runtime/deoptimization.cpp line 1954: > >> 1952: #if INCLUDE_JVMCI >> 1953: , debug_id >> 1954: #endif > > You can use `JVMCI_ONLY` here as well. Same in line 1840. Unfortunately, because of the comma and the way JVMCI_ONLY is defined, this does not work. I will leave it as it is ------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From thartmann at openjdk.java.net Fri Jan 7 13:03:16 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 7 Jan 2022 13:03:16 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 10:56:36 GMT, Tobias Holenstein wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag),. Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected. > > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > minor style issues Thanks for changing. 
Looks good! ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6746 From hseigel at openjdk.java.net Fri Jan 7 13:36:43 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 7 Jan 2022 13:36:43 GMT Subject: RFR: 8218857: Confusing overloads for os::open Message-ID: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> Please review this small change to resolve overload confusion by renaming "FILE* os::open(int fd, const char* mode)" to "FILE* os::fdopen(int fd, const char* mode)" in os.hpp. The change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8218857: Confusing overloads for os::open Changes: https://git.openjdk.java.net/jdk/pull/6988/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6988&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8218857 Stats: 12 lines in 6 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/6988.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6988/head:pull/6988 PR: https://git.openjdk.java.net/jdk/pull/6988 From aph at openjdk.java.net Fri Jan 7 14:31:20 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 7 Jan 2022 14:31:20 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash In-Reply-To: <-WnZRNSYnVZg8lNSl4kXSEg4np9iGPPkk3qG0ij9_DA=.3510020e-d5cd-49bc-97c8-9156c6f9ee36@github.com> References: <-WnZRNSYnVZg8lNSl4kXSEg4np9iGPPkk3qG0ij9_DA=.3510020e-d5cd-49bc-97c8-9156c6f9ee36@github.com> Message-ID: On Fri, 7 Jan 2022 01:43:30 GMT, Denghui Dong wrote: > > I've had a good look at this - in fact spent all morning on it - and this is the wrong fix. > > For example, it breaks the `pfl()` function in the test case. `pfl()` isn't called from anywhere in the JDK, but it is one of our essential debugging tools. 
If you're interested in pursuing this further I could explain what else to try, but I don't have any time to spend on this myself. Sorry. > > Thanks for the comment. It would be nice if you could give me some other way that helps fix the problem. OK. The following changes cause `dtrace_object_alloc()` to call `pfl()`. This should print the entire stack. (You can also clone https://github.com/theRealAph/jdk , branch `pull/6597` for the same code.) With your patch included and `PreserveFramePointer` enabled, `pfl()` crashes. So it seems like your patch fixes one thing, but breaks other uses of stack walking.

diff --git a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
index 661fad89e47..3fa80da73f7 100644
--- a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
@@ -237,7 +237,9 @@ void C1_MacroAssembler::initialize_object(Register obj, Register klass, Register
   if (CURRENT_ENV->dtrace_alloc_probes()) {
     assert(obj == r0, "must be");
+    set_last_Java_frame(sp, rfp, (address)pc(), rscratch1);
     far_call(RuntimeAddress(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id)));
+    reset_last_Java_frame(true);
   }
   verify_oop(obj);
@@ -270,7 +272,9 @@ void C1_MacroAssembler::allocate_array(Register obj, Register len, Register t1,
   if (CURRENT_ENV->dtrace_alloc_probes()) {
     assert(obj == r0, "must be");
+    set_last_Java_frame(sp, rfp, (address)pc(), rscratch1);
     far_call(RuntimeAddress(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id)));
+    reset_last_Java_frame(true);
   }
   verify_oop(obj);
diff --git a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
index 005f739f0aa..b1da03398cf 100644
--- a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
@@ -1091,7 +1091,9 @@ OopMapSet* Runtime1::generate_code_for(StubID id, StubAssembler* sasm) {
         StubFrame f(sasm, "dtrace_object_alloc", dont_gc_arguments);
         save_live_registers(sasm);
+        __ set_last_Java_frame(sp, rfp, (address)(__ pc()), rscratch1);
         __ call_VM_leaf(CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), c_rarg0);
+        __ reset_last_Java_frame(true);
         restore_live_registers(sasm);
       }
diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
index a5de65ea5ab..5e09a1de120 100644
--- a/src/hotspot/share/runtime/sharedRuntime.cpp
+++ b/src/hotspot/share/runtime/sharedRuntime.cpp
@@ -996,12 +996,16 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
   return 0;
 }
+extern "C" void pfl();
+
 /**
  * This function ought to be a void function, but cannot be because
  * it gets turned into a tail-call on sparc, which runs into dtrace bug
  * 6254741. Once that is fixed we can remove the dummy return value.
  */
 int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
+  pfl();
+  *(int*)0 = 1;
   return dtrace_object_alloc(Thread::current(), o, o->size());
 }

------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From coleenp at openjdk.java.net Fri Jan 7 14:48:14 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 7 Jan 2022 14:48:14 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v2] In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 22:25:02 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). >> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code.
>> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. >> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows aarch64 Thanks Vladimir, oops, I need to update my copyright script. ------------- PR: https://git.openjdk.java.net/jdk/pull/6984 From coleenp at openjdk.java.net Fri Jan 7 15:12:49 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 7 Jan 2022 15:12:49 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v3] In-Reply-To: References: Message-ID: > Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). > I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. > > There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. > > Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix copyrights ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6984/files - new: https://git.openjdk.java.net/jdk/pull/6984/files/fe6984a5..84907051 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6984&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6984&range=01-02 Stats: 11 lines in 11 files changed: 0 ins; 0 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/6984.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6984/head:pull/6984 PR: https://git.openjdk.java.net/jdk/pull/6984 From thartmann at openjdk.java.net Fri Jan 7 17:03:59 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Fri, 7 Jan 2022 17:03:59 GMT Subject: RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! Message-ID: Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1239-L1247 If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1213-L1230 I propose to instead check if adapters have been created. This is an old bug that was just recently triggered by an unrelated change. 
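
The failure mode described above can be pictured with a small toy model. Everything in this sketch is an assumption made for illustration: `ToyMethod`, its two flags, and the three functions are invented names and do not mirror the actual `Method` class or the real linking code. It only shows why an early-exit check keyed on the interpreter entry can skip adapter creation after a failed first attempt, while a check keyed on the adapter itself retries.

```cpp
#include <cassert>

// Toy model only; names are hypothetical, not HotSpot's.
struct ToyMethod {
  bool interpreter_entry_set = false;  // stands in for "_i2i_entry is set"
  bool adapter_created = false;        // stands in for "adapter exists"
};

// Simulates adapter creation, which can fail when the code cache is full.
inline bool try_make_adapter(ToyMethod& m, bool code_cache_full) {
  if (code_cache_full) return false;
  m.adapter_created = true;
  return true;
}

// Early exit keyed on the interpreter entry (the problematic variant):
// after one failed attempt the entry is already set, so a retry returns
// early and never creates the adapter.
inline bool link_checking_entry(ToyMethod& m, bool code_cache_full) {
  if (m.interpreter_entry_set) return true;
  m.interpreter_entry_set = true;  // set before adapters exist
  return try_make_adapter(m, code_cache_full);
}

// Early exit keyed on the adapter (the direction proposed in the message):
// a retry after a failed attempt goes on to create the adapter.
inline bool link_checking_adapter(ToyMethod& m, bool code_cache_full) {
  if (m.adapter_created) return true;
  m.interpreter_entry_set = true;
  return try_make_adapter(m, code_cache_full);
}
```

In the toy model, a failed first call followed by a second call with free code cache leaves `adapter_created` false under the first variant and true under the second.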
Thanks, Tobias ------------- Commit messages: - Test fix #2 - Fixed test issue - 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! Changes: https://git.openjdk.java.net/jdk/pull/6990/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6990&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279356 Stats: 52 lines in 2 files changed: 43 ins; 2 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/6990.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6990/head:pull/6990 PR: https://git.openjdk.java.net/jdk/pull/6990 From kvn at openjdk.java.net Fri Jan 7 17:32:30 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 7 Jan 2022 17:32:30 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 15:12:49 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). >> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. >> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. >> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyrights Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6984 From hseigel at openjdk.java.net Fri Jan 7 20:08:49 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 7 Jan 2022 20:08:49 GMT Subject: RFR: 8183227: read/write APIs in class os shall return ssize_t Message-ID: Please review this small fix that changes the return type of os::write() to ssize_t. No changes were needed for os::read() because its return type is ssize_t. This fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows and Mach5 tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8183227: read/write APIs in class os shall return ssize_t Changes: https://git.openjdk.java.net/jdk/pull/6992/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6992&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8183227 Stats: 39 lines in 10 files changed: 3 ins; 4 del; 32 mod Patch: https://git.openjdk.java.net/jdk/pull/6992.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6992/head:pull/6992 PR: https://git.openjdk.java.net/jdk/pull/6992 From kbarrett at openjdk.java.net Fri Jan 7 20:25:29 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 7 Jan 2022 20:25:29 GMT Subject: RFR: 8218857: Confusing overloads for os::open In-Reply-To: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> References: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> Message-ID: On Fri, 7 Jan 2022 13:29:38 GMT, Harold Seigel wrote: > Please review this small change to resolve overload confusion by renaming "FILE* os::open(int fd, const char* mode)" to "FILE* os::fdopen(int fd, const char* mode)" in os.hpp. The change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6988 From fparain at openjdk.java.net Fri Jan 7 20:50:27 2022 From: fparain at openjdk.java.net (Frederic Parain) Date: Fri, 7 Jan 2022 20:50:27 GMT Subject: RFR: 8183227: read/write APIs in class os shall return ssize_t In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 20:00:15 GMT, Harold Seigel wrote: > Please review this small fix that changes the return type of os::write() to ssize_t. No changes were needed for os::read() because its return type is ssize_t. This fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows and Mach5 tiers 3-5 on Linux x64. > Thanks, Harold Looks good to me. Fred ------------- Marked as reviewed by fparain (Committer). PR: https://git.openjdk.java.net/jdk/pull/6992 From kvn at openjdk.java.net Fri Jan 7 21:41:23 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 7 Jan 2022 21:41:23 GMT Subject: RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 16:57:03 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. 
However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Seems reasonable. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6990 From david.holmes at oracle.com Sat Jan 8 00:56:48 2022 From: david.holmes at oracle.com (David Holmes) Date: Sat, 8 Jan 2022 10:56:48 +1000 Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v2] In-Reply-To: References: Message-ID: <5f13a676-3f81-299a-7b20-35bdf2704a65@oracle.com> Hi Tobias, On 7/01/2022 8:56 pm, Tobias Holenstein wrote: > On Fri, 7 Jan 2022 07:28:23 GMT, Tobias Hartmann wrote: > >>> Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Cleanup output of TraceDeoptimization >> >> src/hotspot/share/runtime/deoptimization.cpp line 1954: >> >>> 1952: #if INCLUDE_JVMCI >>> 1953: , debug_id >>> 1954: #endif >> >> You can use `JVMCI_ONLY` here as well. Same in line 1840. > > Unfortunately, because of the comma and the way JVMCI_ONLY is defined, this does not work. I will leave it as it is We have the COMMA macro to solve that problem e.g. 
./share/runtime/threadSMR.inline.hpp: DEBUG_ONLY(COMMA _list(list)) Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Sun Jan 9 01:48:04 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 9 Jan 2022 01:48:04 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: References: Message-ID: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: use movddup for 128-bit vectors ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6966/files - new: https://git.openjdk.java.net/jdk/pull/6966/files/a0444928..59d1fa35 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=01-02 Stats: 30 lines in 5 files changed: 27 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6966.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6966/head:pull/6966 PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Mon Jan 10 06:05:54 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Mon, 10 Jan 2022 06:05:54 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). 
Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: aditional checks for the test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6857/files - new: https://git.openjdk.java.net/jdk/pull/6857/files/b3717ad6..d2c0099f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6857&range=07-08 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6857.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6857/head:pull/6857 PR: https://git.openjdk.java.net/jdk/pull/6857 From duke at openjdk.java.net Mon Jan 10 06:09:31 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Mon, 10 Jan 2022 06:09:31 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 02:13:49 GMT, Vladimir Kozlov wrote: > Tests failed on aarch64 systems and avx2 x86. Could you please let me know if the failing test is test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java ? Added additional checks (shown below) to make sure it runs on an x86 machine that has AVX3. * @requires vm.compiler2.enabled * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" This test exits gracefully on a Skylake machine which doesn't have AVX3. 
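As an illustration only (this is not the content of TestPopCountVectorLong.java, and the class and method names below are made up for the sketch), a loop of the shape this change is meant to vectorize might look like the following. Each iteration reads one `long` and writes one `int`, so there is no cross-iteration dependence. Whether C2 actually emits a packed popcount for it depends on the CPU features discussed above.

```java
import java.util.Arrays;

public class LongBitCountLoop {
    // Candidate loop for auto-vectorization: one long in, one int out per iteration.
    static void bitCounts(long[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Long.bitCount(in[i]);
        }
    }

    // Independent scalar reference: clear the lowest set bit until none remain.
    static int referenceBitCount(long v) {
        int n = 0;
        while (v != 0) {
            v &= v - 1;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long[] in = {0L, 1L, -1L, Long.MIN_VALUE, 0x00FF00FF00FF00FFL};
        int[] out = new int[in.length];
        bitCounts(in, out);
        // Compare every element against the reference implementation.
        for (int i = 0; i < in.length; i++) {
            if (out[i] != referenceBitCount(in[i])) {
                throw new AssertionError("mismatch at index " + i);
            }
        }
        System.out.println(Arrays.toString(out));
    }
}
```

Checking against an independent scalar reference keeps the sketch valid whether or not the loop ends up vectorized on a given machine.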
------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From pli at openjdk.java.net Mon Jan 10 06:20:01 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Mon, 10 Jan 2022 06:20:01 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: References: Message-ID: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> > ### Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ### Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. 
Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > with LoadVectorMasked/StoreVectorMasked nodes that take that mask input. This > IR form is exactly the same as the one used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > **I) C2 crashes with segmentation fault in strip-mined loops** > > The previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip-mined loops, the post loop is not the sibling of the main loop > in the ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixes a SIGSEGV caused by a NULL pointer when locating > the post loop from a strip-mined main loop. > > **II) Incorrect result issues with post loop vectorization** > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > share a common cause: the code assumes the post loop can be vectorized if > vectorization of the corresponding main loop succeeds. But in many > cases this assumption is wrong. Details are below. > > - **[Issue-1] Incorrect vectorization for partially vectorizable loops** > > This issue can be reproduced by the loop below, where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword works well even if some of the operations in the > loop body are not vectorizable, since those parts can simply be unrolled.
> But for post loops, we don't create vectors by combining scalar IR nodes > generated from loop unrolling. Instead, we replace scalars with vectors > for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of case, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to the post loop and compared with the post loop's > pack count. Vectorization of the post loop bails out if it creates > more vector packs than the main loop. > > - **[Issue-2] Incorrect result in loops with growing-down vectors** > > This issue appears with growing-down vectors, that is, vectors that grow > toward smaller memory addresses as the loop iterates. It can be reproduced by > the counting-up loop below, which has a negative scale value in the array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > The cause of this issue is that for a growing-down vector, the generated vector > mask value has reversed vector-lane order, so it masks the wrong vector > lanes. Note that if a negative scale value appears in a counting-down loop, > the vector grows up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing the scale value of each memory access in the > loop with the loop stride value. > > - **[Issue-3] Incorrect result in manually unrolled loops** > > This issue can be reproduced by the manually unrolled loop below. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset.
Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop by replacing > scalars with vectors because that creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > - **[Issue-4] Incorrect result in loops with mixed vector element sizes** > > This issue was found after we enabled post loop vectorization for AArch64. > It's reproducible with multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on the lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we only support > loops with a single vector element size, i.e., "int + float" vectors in > a single loop are ok but "int + double" vectors in a single loop are not > vectorizable. This fix also enables subword vector support to make all > primitive type array operations vectorizable. > > - **[Issue-5] Incorrect result in loops with potential data dependence** > > This issue can be reproduced by the corner case below on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, the two stores in the loop have a data dependence if the OFFSET > value is smaller than the vector length. So we cannot vectorize > by replacing scalars with vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits a vector fill with different values in lanes > into several smaller-sized fills. In this patch, we add an additional data > dependence check for this kind of case. The check is also done with the > help of the SWPointer class.
In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop have the same array index expression. > > ### Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issues in any of the tests. We noticed that the existing cases are not enough > because some of the above issues are not caught by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. The first > time it runs in the interpreter, and the second time it is force-compiled by > C2. Then the two returned results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantage of > the jtreg WhiteBox API, which enables running test methods at specific > compilation levels. Each test class inside is also jtreg-based. It just > needs to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ### Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups.
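To make the data dependence in Issue-5 concrete, the following sketch simulates the two execution orders in plain Java. It is only a model under stated assumptions: `lanes` stands in for the vector length, `active` plays the role of the vector mask, and the class and method names are invented for illustration. It does not reproduce what C2 or SWPointer actually do.

```java
import java.util.Arrays;

public class StoreDependenceDemo {
    // Scalar semantics of the Issue-5 loop: stores happen in iteration order.
    static int[] scalar(int n, int offset, int x, int y) {
        int[] a = new int[n + offset];
        for (int i = 0; i < n; i++) {
            a[i] = x;
            a[i + offset] = y;
        }
        return a;
    }

    // Simulated scalar-to-vector replacement: within each block of up to
    // `lanes` iterations, all stores of x are done first, then all stores of y.
    static int[] packed(int n, int offset, int x, int y, int lanes) {
        int[] a = new int[n + offset];
        for (int i = 0; i < n; i += lanes) {
            int active = Math.min(lanes, n - i); // number of unmasked lanes
            for (int l = 0; l < active; l++) a[i + l] = x;
            for (int l = 0; l < active; l++) a[i + l + offset] = y;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 8, lanes = 4, x = 1, y = 2;
        // offset < lanes: the two orders disagree.
        System.out.println(Arrays.toString(scalar(n, 1, x, y)));
        System.out.println(Arrays.toString(packed(n, 1, x, y, lanes)));
        // offset >= lanes: the two orders agree.
        System.out.println(Arrays.equals(scalar(n, 4, x, y), packed(n, 4, x, y, lanes)));
    }
}
```

With an offset of 1 and 4 lanes, the scalar order leaves `x` in every slot except the last, while the packed order overwrites most slots with `y`. With an offset of 4 the two results match, which agrees with the description that the dependence only matters when OFFSET is smaller than the vector length.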
> > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Update copyright year and rename a function Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb - Merge branch 'master' into postloop Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 - Fix issues in newly added test framework Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 - Merge branch 'master' into postloop Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 - 8183390: Fix and re-enable post loop vectorization
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6828/files - new: https://git.openjdk.java.net/jdk/pull/6828/files/85ce597d..56575886 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6828&range=01-02 Stats: 17336 lines in 604 files changed: 11436 ins; 3718 del; 2182 mod Patch: https://git.openjdk.java.net/jdk/pull/6828.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6828/head:pull/6828 PR: https://git.openjdk.java.net/jdk/pull/6828 From duke at openjdk.java.net Mon Jan 10 06:36:24 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Mon, 10 Jan 2022 06:36:24 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 06:05:54 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization.
> > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > aditional checks for the test src/hotspot/cpu/x86/x86.ad line 8602: > 8600: int vlen_enc = vector_length_encoding(this, $src); > 8601: __ vpopcntq($dst$$XMMRegister, $src$$XMMRegister, vlen_enc); > 8602: __ evpmovqd($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc); Hi, Should this cast be introduced at the middle-end instead? Popcount is a lane-wise operation and forcing the node to do a shape-changing operation seems not so reasonable. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From thartmann at openjdk.java.net Mon Jan 10 07:11:19 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 07:11:19 GMT Subject: RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 16:57:03 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. 
> > Thanks, > Tobias Thanks, Vladimir! ------------- PR: https://git.openjdk.java.net/jdk/pull/6990 From thartmann at openjdk.java.net Mon Jan 10 07:44:30 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 07:44:30 GMT Subject: RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 16:57:03 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Just noticed that I accidentally opened this PR for JDK 19 but the fix should go into 18. Will re-open a PR in the JDK 18u fork. ------------- PR: https://git.openjdk.java.net/jdk/pull/6990 From thartmann at openjdk.java.net Mon Jan 10 07:44:30 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 07:44:30 GMT Subject: Withdrawn: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! 
In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 16:57:03 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk/blob/4243f4c998344e77dccd4d5605e56e869bc8af89/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/6990 From thartmann at openjdk.java.net Mon Jan 10 07:55:52 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 07:55:52 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! Message-ID: Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. 
We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 I propose to instead check if adapters have been created. This is an old bug that was just recently triggered by an unrelated change. Thanks, Tobias ------------- Commit messages: - 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! Changes: https://git.openjdk.java.net/jdk18/pull/88/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=88&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279356 Stats: 52 lines in 2 files changed: 43 ins; 2 del; 7 mod Patch: https://git.openjdk.java.net/jdk18/pull/88.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/88/head:pull/88 PR: https://git.openjdk.java.net/jdk18/pull/88 From chagedorn at openjdk.java.net Mon Jan 10 08:25:26 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 10 Jan 2022 08:25:26 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. 
We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias That looks reasonable! ------------- Marked as reviewed by chagedorn (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/88 From thartmann at openjdk.java.net Mon Jan 10 08:35:28 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 08:35:28 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. 
> > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Thanks, Christian! ------------- PR: https://git.openjdk.java.net/jdk18/pull/88 From rehn at openjdk.java.net Mon Jan 10 08:46:30 2022 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 10 Jan 2022 08:46:30 GMT Subject: RFR: 8218857: Confusing overloads for os::open In-Reply-To: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> References: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> Message-ID: On Fri, 7 Jan 2022 13:29:38 GMT, Harold Seigel wrote: > Please review this small change to resolve overload confusion by renaming "FILE* os::open(int fd, const char* mode)" to "FILE* os::fdopen(int fd, const char* mode)" in os.hpp. The change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Thanks ------------- Marked as reviewed by rehn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6988 From rehn at openjdk.java.net Mon Jan 10 08:48:27 2022 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 10 Jan 2022 08:48:27 GMT Subject: RFR: 8183227: read/write APIs in class os shall return ssize_t In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 20:00:15 GMT, Harold Seigel wrote: > Please review this small fix that changes the return type of os::write() to ssize_t. No changes were needed for os::read() because its return type is ssize_t. This fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows and Mach5 tiers 3-5 on Linux x64. > Thanks, Harold Thanks ------------- Marked as reviewed by rehn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6992 From duke at openjdk.java.net Mon Jan 10 08:48:33 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Mon, 10 Jan 2022 08:48:33 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v4] In-Reply-To: References: Message-ID: > After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. > > Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. > > The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag). Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. > > Checked that tests are not affected. Checked on Aurora that performance is not affected. Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: changed INCLUDE_JVMCI to JVMCI_ONLY ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6746/files - new: https://git.openjdk.java.net/jdk/pull/6746/files/90dc623e..fe555ff7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6746&range=02-03 Stats: 10 lines in 1 file changed: 0 ins; 5 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6746.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6746/head:pull/6746 PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Mon Jan 10 08:48:33 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Mon, 10 Jan 2022 08:48:33 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 10:56:36 GMT, Tobias Holenstein wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was
not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag). Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected. > > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > minor style issues > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_ > > Hi Tobias, > > On 7/01/2022 8:56 pm, Tobias Holenstein wrote: > > > On Fri, 7 Jan 2022 07:28:23 GMT, Tobias Hartmann wrote: > > > > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > > > Cleanup output of TraceDeoptimization > > > > > > > > > src/hotspot/share/runtime/deoptimization.cpp line 1954: > > > > 1952: #if INCLUDE_JVMCI > > > > 1953: , debug_id > > > > 1954: #endif > > > > > > > > > You can use `JVMCI_ONLY` here as well. Same in line 1840. > > > > > > Unfortunately, because of the comma and the way JVMCI_ONLY is defined, this does not work. I will leave it as it is > > We have the COMMA macro to solve that problem e.g. > > ./share/runtime/threadSMR.inline.hpp: DEBUG_ONLY(COMMA _list(list)) > > Cheers, David Ok, I didn't know that. I changed it now to JVMCI_ONLY(COMMA ...) Thanks David!
- Tobias ------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From pli at openjdk.java.net Mon Jan 10 09:52:31 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Mon, 10 Jan 2022 09:52:31 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop into a 1-iteration vectorized loop with a vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance over the years, multiple bugs >> have accumulated in it. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in the C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> The previous implementation hard-codes the x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions.
To fix incorrect codegen >> and add vector mask support for more platforms, we instead add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> with LoadVectorMasked/StoreVectorMasked nodes that take that mask input. This >> IR form is exactly the same as the one used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> The previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip-mined loops, the post loop is not the sibling of the main loop >> in the ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixes a SIGSEGV issue caused by a NULL pointer when locating >> the post loop from a strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause: the code assumes the post loop can be vectorized if >> the vectorization in the corresponding main loop is successful. But in many >> cases this assumption is wrong. Details are below. >> >> - **[Issue-1] Incorrect vectorization for partially vectorizable loops** >> >> This issue can be reproduced by the loop below, where only some operations in >> the loop body are vectorizable.
>> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword works well even if some of the operations in >> the loop body are not vectorizable, since those parts are only unrolled. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we do scalar-to-vector >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of case, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to the post loop and compared with the post loop >> pack count. Vectorization bails out in the post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> toward smaller memory addresses as the loop iterates. It can be reproduced by >> the counting-up loop below, with a negative scale value in the array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> The cause of this issue is that for a growing-down vector, the generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if a negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with the loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by the manually unrolled loop below.
>> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue was found after we enabled post loop vectorization for AArch64. >> It's reproducible with multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on the lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop are OK but "int + double" vectors in a single loop are not >> vectorizable. This fix also enables subword vector support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by the corner case below on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> by replacing scalars with vectors.
But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits a vector fill with different values in lanes >> into several smaller-sized fills. In this patch, we add an additional data >> dependence check for this kind of case. The check is also done with the >> help of the SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop have the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> the experimental VM option "PostLoopMultiversioning" turned on. We found no >> issues in any test. We notice that those existing cases are not enough >> because some of the above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. The first >> time it runs in the interpreter and the second time it is force-compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantage of >> the jtreg WhiteBox API, which enables running test methods at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> needs to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:."
and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization
Can any C2 compiler expert help review this? I updated copyright year to 2022 and renamed a function in latest commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From shade at openjdk.java.net Mon Jan 10 10:39:44 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 10:39:44 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted Message-ID: Got the SIGILLs on some machines when testing JDK-8279621.
That patch started using shorter `vpxor` versions on the `UseAVX = 1` path, which tried to use the `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at the very least asserted in the assembler code. Also fixed the implicit `bool` -> `int` conversion to `vector_len`. Additional testing: - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk/pull/7005/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279668 Stats: 8 lines in 1 file changed: 6 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7005.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7005/head:pull/7005 PR: https://git.openjdk.java.net/jdk/pull/7005 From thartmann at openjdk.java.net Mon Jan 10 10:53:32 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Mon, 10 Jan 2022 10:53:32 GMT Subject: RFR: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build [v4] In-Reply-To: References: Message-ID: <4MroDjqCv0QnxjAxoMJtq45SJ23yO30XDVcvMf4aXho=.f3f2e58f-4bf4-4a75-8be1-ca7de4627336@github.com> On Mon, 10 Jan 2022 08:48:33 GMT, Tobias Holenstein wrote: >> After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. >> >> Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. >> >> The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag). Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. >> >> Checked that tests are not affected. Checked on Aurora that performance is not affected.
> > Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision: > > changed INCLUDE_JVMCI to JVMCI_ONLY Marked as reviewed by thartmann (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Mon Jan 10 10:53:33 2022 From: duke at openjdk.java.net (Tobias Holenstein) Date: Mon, 10 Jan 2022 10:53:33 GMT Subject: Integrated: JDK-8278329: some TraceDeoptimization code not included in PRODUCT build In-Reply-To: References: Message-ID: On Tue, 7 Dec 2021 14:46:05 GMT, Tobias Holenstein wrote: > After "JDK-8154011: Make `TraceDeoptimization` a diagnostic flag" some code was not included in the PRODUCT build. > > Removed all the #ifndef PRODUCT guards around `TraceDeoptimization` checks and made sure to be consistent. > > The DEOPT PACKING messages were controlled by `PrintDeoptimizationDetails` (develop flag), but DEOPT UNPACKING is controlled by `TraceDeoptimization` (product flag). Therefore changed DEOPT PACKING messages to be controlled by `TraceDeoptimization` as well. > > Checked that tests are not affected. Checked on Aurora that performance is not affected. This pull request has now been integrated. Changeset: 1f101b04 Author: Tobias Holenstein Committer: Tobias Hartmann URL: https://git.openjdk.java.net/jdk/commit/1f101b04f4d7c166cc0a830383e4e08025df5c74 Stats: 187 lines in 4 files changed: 79 ins; 76 del; 32 mod 8278329: some TraceDeoptimization code not included in PRODUCT build Reviewed-by: dnsimon, kvn, never, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/6746 From duke at openjdk.java.net Mon Jan 10 11:02:29 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Mon, 10 Jan 2022 11:02:29 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 10:31:30 GMT, Aleksey Shipilev wrote: > Got the SIGILLs on some machines when testing JDK-8279621.
That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. > > Also fixed the implicit `bool` -> `int` conversion to `vector_len`. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` Hi, I think the assert should be inserted in the assembler instead. Also, I would prefer `AVX_256bit` instead of `1`. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Mon Jan 10 11:57:10 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 11:57:10 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v2] In-Reply-To: References: Message-ID: > Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. > > Also fixed the implicit `bool` -> `int` conversion to `vector_len`. 
> > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Use AVX_256bit literal ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7005/files - new: https://git.openjdk.java.net/jdk/pull/7005/files/3012705b..e52bdf0a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7005.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7005/head:pull/7005 PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Mon Jan 10 11:57:11 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 11:57:11 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 10:58:53 GMT, Quan Anh Mai wrote: > Hi, I think the assert should be inserted in the assembler instead. I think `Assembler::vpxor` actually does it right already, and the problem is only in "shortcut" macro-method for AVX2. So the assert is where it should be. > Also, I would prefer `AVX_256bit` instead of `1`. Thanks. Good idea, fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From duke at openjdk.java.net Mon Jan 10 12:04:25 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Mon, 10 Jan 2022 12:04:25 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 11:57:10 GMT, Aleksey Shipilev wrote: >> Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. 
This should be at very least asserted in assembler code. >> >> Also fixed the implicit `bool` -> `int` conversion to `vector_len`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Use AVX_256bit literal What I mean is an assert similar to: https://github.com/openjdk/jdk/blob/e52bdf0a10d05aee93783d8d4e3a3b349cf4c174/src/hotspot/cpu/x86/assembler_x86.cpp#L2032 Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Mon Jan 10 13:18:05 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 13:18:05 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v3] In-Reply-To: References: Message-ID: > Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. > > Also fixed the implicit `bool` -> `int` conversion to `vector_len`. 
> > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Add more asserts ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7005/files - new: https://git.openjdk.java.net/jdk/pull/7005/files/e52bdf0a..194bfa09 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=01-02 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7005.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7005/head:pull/7005 PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Mon Jan 10 13:18:07 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 13:18:07 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 12:01:17 GMT, Quan Anh Mai wrote: > What I mean is an assert similar to: > https://github.com/openjdk/jdk/blob/e52bdf0a10d05aee93783d8d4e3a3b349cf4c174/src/hotspot/cpu/x86/assembler_x86.cpp#L2032 Ah, I see! Added in new commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From hseigel at openjdk.java.net Mon Jan 10 13:22:33 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 13:22:33 GMT Subject: Integrated: 8183227: read/write APIs in class os shall return ssize_t In-Reply-To: References: Message-ID: <59YtgLKESgk2V6W10QAAb2_hoQFB1RNgY61n9f4ztQc=.d5406f39-2ad1-4b15-b590-3c739b27b4a1@github.com> On Fri, 7 Jan 2022 20:00:15 GMT, Harold Seigel wrote: > Please review this small fix that changes the return type of os::write() to ssize_t. No changes were needed for os::read() because its return type is ssize_t. This fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows and Mach5 tiers 3-5 on Linux x64. 
> Thanks, Harold This pull request has now been integrated. Changeset: 4ff67205 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/4ff6720573f9b55eb397d1aac9b398228faf2ceb Stats: 38 lines in 10 files changed: 3 ins; 4 del; 31 mod 8183227: read/write APIs in class os shall return ssize_t Reviewed-by: fparain, rehn ------------- PR: https://git.openjdk.java.net/jdk/pull/6992 From hseigel at openjdk.java.net Mon Jan 10 13:22:33 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 13:22:33 GMT Subject: RFR: 8183227: read/write APIs in class os shall return ssize_t In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 20:00:15 GMT, Harold Seigel wrote: > Please review this small fix that changes the return type of os::write() to ssize_t. No changes were needed for os::read() because its return type is ssize_t. This fix was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows and Mach5 tiers 3-5 on Linux x64. > Thanks, Harold Thanks Fred and Robbin for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/6992 From jiefu at openjdk.java.net Mon Jan 10 13:34:38 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 10 Jan 2022 13:34:38 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v3] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 13:18:05 GMT, Aleksey Shipilev wrote: >> Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. >> >> Also fixed the implicit `bool` -> `int` conversion to `vector_len`. 
>> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Add more asserts src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1493: > 1491: // Simple version for AVX2 256bit vectors > 1492: void vpxor(XMMRegister dst, XMMRegister src) { > 1493: assert(UseAVX >= 2, "Only with AVX2"); I would suggest the assert msg as "Only with AVX2 or above". We should also update the copyright year. ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From jbhateja at openjdk.java.net Mon Jan 10 13:49:32 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Mon, 10 Jan 2022 13:49:32 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. 
The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we instead add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same as those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop.
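The masked post loop described under "Changes in this patch" can be pictured with a scalar emulation: a mask marks the lanes that still have work, and a single masked load/store pass covers all leftover iterations. This is only a sketch under stated assumptions. `kLanes`, `make_mask`, and `masked_post_loop` are hypothetical names, the lane count is arbitrary, and nothing here reflects how C2's VectorMaskGenNode or the masked load/store nodes are actually implemented.

```cpp
#include <array>
#include <cstddef>

// Hypothetical vector length, in int lanes, for this illustration.
constexpr std::size_t kLanes = 8;

// Stand-in for a mask-generating step: lane l is active iff l < remaining.
std::array<bool, kLanes> make_mask(std::size_t remaining) {
  std::array<bool, kLanes> mask{};
  for (std::size_t l = 0; l < kLanes; ++l) {
    mask[l] = l < remaining;
  }
  return mask;
}

// One masked "vector" iteration that replaces the scalar post loop.
// Assumes limit - start <= kLanes, i.e. the main loop left at most one
// full vector of work behind.
void masked_post_loop(const int* a, const int* b, int* res,
                      std::size_t start, std::size_t limit) {
  const std::array<bool, kLanes> mask = make_mask(limit - start);
  for (std::size_t l = 0; l < kLanes; ++l) {
    if (mask[l]) {  // emulates a masked load followed by a masked store
      res[start + l] = a[start + l] * b[start + l];
    }
  }
}
```

Inactive lanes neither read nor write memory, which is what lets one masked pass stand in for the remaining scalar iterations without touching elements past the loop limit.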
>> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. 
>> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. 
For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop have the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test".
The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's force-compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantage of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> needs to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. 
To fix incorrect codegen > and add vector mask support for more platforms, we instead add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same as those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable.
> > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. 
> > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. 
It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop have the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's force-compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantage of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > needs to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside.
We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. I have collected perf data for micro benchmarks on AVX512, will share results shortly. ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From hseigel at openjdk.java.net Mon Jan 10 13:58:31 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 13:58:31 GMT Subject: RFR: 8218857: Confusing overloads for os::open In-Reply-To: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> References: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> Message-ID: On Fri, 7 Jan 2022 13:29:38 GMT, Harold Seigel wrote: > Please review this small change to resolve overload confusion by renaming "FILE* os::open(int fd, const char* mode)" to "FILE* os::fdopen(int fd, const char* mode)" in os.hpp. The change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Thanks Kim and Robbin for the reviews. 
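To make the rename discussed above concrete, here is a small POSIX-only sketch of the two operations once they carry distinct names. The `demo_os` namespace and `write_via_fdopen` helper are hypothetical, and the path-based `open(const char*, int, int)` signature is an assumption about what the other overload looked like. Only the `FILE* fdopen(int fd, const char* mode)` shape is taken from the message.

```cpp
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for HotSpot's os class; not the real declarations.
namespace demo_os {
  // Path-based open that returns a file descriptor (assumed shape).
  inline int open(const char* path, int oflag, int mode) {
    return ::open(path, oflag, mode);
  }

  // After the rename: wraps an existing descriptor in a FILE*.
  // Under the old name this was a second, unrelated "open" overload.
  inline FILE* fdopen(int fd, const char* mode) {
    return ::fdopen(fd, mode);
  }
}

// Writes msg to path using both calls; returns true on success.
inline bool write_via_fdopen(const char* path, const char* msg) {
  int fd = demo_os::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0) {
    return false;
  }
  FILE* f = demo_os::fdopen(fd, "w");
  if (f == nullptr) {
    ::close(fd);  // the descriptor is still ours if wrapping failed
    return false;
  }
  const bool wrote = fputs(msg, f) >= 0;
  return (fclose(f) == 0) && wrote;  // fclose also closes fd
}
```

With separate names, a reader can tell at the call site whether a descriptor or a stream comes back, without resolving the overload from the argument types.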
------------- PR: https://git.openjdk.java.net/jdk/pull/6988 From coleenp at openjdk.java.net Mon Jan 10 13:59:32 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 10 Jan 2022 13:59:32 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 15:12:49 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). >> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. >> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. >> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyrights Thanks Vladimir! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6984 From hseigel at openjdk.java.net Mon Jan 10 14:01:31 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 14:01:31 GMT Subject: Integrated: 8218857: Confusing overloads for os::open In-Reply-To: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> References: <6o68V8XKdZcazOZUKcJLtZ6N7HZ2VWRuXopw0NpAb1Y=.90799217-bb62-4199-9fa3-588c51ba658a@github.com> Message-ID: On Fri, 7 Jan 2022 13:29:38 GMT, Harold Seigel wrote: > Please review this small change to resolve overload confusion by renaming "FILE* os::open(int fd, const char* mode)" to "FILE* os::fdopen(int fd, const char* mode)" in os.hpp. The change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. Changeset: 11d88ce8 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/11d88ce82efd72d3d63f7c7271c285cd21b01217 Stats: 10 lines in 6 files changed: 0 ins; 0 del; 10 mod 8218857: Confusing overloads for os::open Reviewed-by: kbarrett, rehn ------------- PR: https://git.openjdk.java.net/jdk/pull/6988 From hseigel at openjdk.java.net Mon Jan 10 14:27:28 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 14:27:28 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 15:12:49 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). 
>> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. >> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. >> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyrights Looks good! Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6984 From coleenp at openjdk.java.net Mon Jan 10 14:40:35 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 10 Jan 2022 14:40:35 GMT Subject: RFR: 8142362: Lots of code duplication in Copy class [v3] In-Reply-To: References: Message-ID: On Fri, 7 Jan 2022 15:12:49 GMT, Coleen Phillimore wrote: >> Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). >> I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. >> >> There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. >> >> Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix copyrights Thanks Harold! 
------------- PR: https://git.openjdk.java.net/jdk/pull/6984 From coleenp at openjdk.java.net Mon Jan 10 14:40:35 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 10 Jan 2022 14:40:35 GMT Subject: Integrated: 8142362: Lots of code duplication in Copy class In-Reply-To: References: Message-ID: On Thu, 6 Jan 2022 20:45:39 GMT, Coleen Phillimore wrote: > Removed an unused assembly function on one platform (_Copy_conjoint_bytes), and consolidated the linux and bsd x86, and linux and bsd aarch64 copy code that was duplicated. There's unfortunately now an #ifndef _WINDOWS in copy_x86.hpp and copy_aarch64.hpp, and I couldn't combine the duplicate copy_.S files because Windows doesn't have this file (and couldn't convince the build system to ignore the .S file for windows). > I didn't think it was worth adding an os_cpu/posix_x86 and os_cpu/posix_aarch64 directory for this small bit of code. > > There could be more consolidation but the platform differences are subtle. This change just moves around code without poking this bear. > > Tested with tier1 on Oracle platforms, build on linux-x86-open,linux-s390x-open,linux-arm32-debug,linux-ppc64le-debug and Zero (zero in progress). This pull request has now been integrated. 
Changeset: 76477f8c Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/76477f8cdbc012f7ff0670ad57067ebf304612a0 Stats: 1628 lines in 11 files changed: 451 ins; 1162 del; 15 mod 8142362: Lots of code duplication in Copy class Reviewed-by: kvn, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/6984 From jwilhelm at openjdk.java.net Mon Jan 10 17:10:40 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Mon, 10 Jan 2022 17:10:40 GMT Subject: Integrated: Merge jdk18 Message-ID: <5hMMnbP2laUfEjb0svduDuzGgdUJckDDcCxaj-7QZg0=.1fd54c99-b81e-41e1-b3d7-bbdbedf161c9@github.com> Forwardport JDK 18 -> JDK 19 ------------- Commit messages: - Merge - 8274679: Remove unnecessary conversion to String in security code in java.base - 8142362: Lots of code duplication in Copy class - 8218857: Confusing overloads for os::open - 8183227: read/write APIs in class os shall return ssize_t - 8279300: [arm32] SIGILL when running GetObjectSizeIntrinsicsTest - 8278329: some TraceDeoptimization code not included in PRODUCT build - 8279523: Parallel: Remove unnecessary PSScavenge::_to_space_top_before_gc - 8279522: Serial: Remove unused Generation::clear_remembered_set - 8279528: Unused TypeEnter.diag after JDK-8205187 - ... and 163 more: https://git.openjdk.java.net/jdk/compare/40df5df9...6ff1c607 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.java.net/jdk/pull/7017/files Stats: 27994 lines in 750 files changed: 20678 ins; 5315 del; 2001 mod Patch: https://git.openjdk.java.net/jdk/pull/7017.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7017/head:pull/7017 PR: https://git.openjdk.java.net/jdk/pull/7017 From jwilhelm at openjdk.java.net Mon Jan 10 17:10:41 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Mon, 10 Jan 2022 17:10:41 GMT Subject: Integrated: Merge jdk18 In-Reply-To: <5hMMnbP2laUfEjb0svduDuzGgdUJckDDcCxaj-7QZg0=.1fd54c99-b81e-41e1-b3d7-bbdbedf161c9@github.com> References: <5hMMnbP2laUfEjb0svduDuzGgdUJckDDcCxaj-7QZg0=.1fd54c99-b81e-41e1-b3d7-bbdbedf161c9@github.com> Message-ID: On Mon, 10 Jan 2022 17:00:05 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 18 -> JDK 19 This pull request has now been integrated. Changeset: d9b1bb58 Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/d9b1bb58600c03cee43387864d1530d4dd5f1422 Stats: 615 lines in 24 files changed: 478 ins; 77 del; 60 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/7017 From duke at openjdk.java.net Mon Jan 10 18:03:46 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Mon, 10 Jan 2022 18:03:46 GMT Subject: RFR: 8277748: Obsolete the MinInliningThreshold flag in JDK 19 Message-ID: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> Changed `MinInliningThreshold` from `Deprecated` (for JDK 18) to `Obsolete` (for JDK 19). Checked that tests are not affected.
------------- Commit messages: - 8277748: Obsolete the MinInliningThreshold flag in JDK 19 Changes: https://git.openjdk.java.net/jdk/pull/6986/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6986&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8277748 Stats: 15 lines in 3 files changed: 1 ins; 14 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6986.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6986/head:pull/6986 PR: https://git.openjdk.java.net/jdk/pull/6986 From duke at openjdk.java.net Mon Jan 10 18:03:47 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Mon, 10 Jan 2022 18:03:47 GMT Subject: RFR: 8277748: Obsolete the MinInliningThreshold flag in JDK 19 In-Reply-To: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> References: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> Message-ID: On Fri, 7 Jan 2022 12:41:26 GMT, Emanuel Peter wrote: > Changed `MinInliningThreshold` from `Deprecated` (for JDK 18) to `Obsolete` (for JDK 19). > > Checked that tests are not affected. \covered ------------- PR: https://git.openjdk.java.net/jdk/pull/6986 From shade at openjdk.java.net Mon Jan 10 18:25:11 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 18:25:11 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v4] In-Reply-To: References: Message-ID: > Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. > > Also fixed the implicit `bool` -> `int` conversion to `vector_len`. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Review comments - Merge branch 'master' into JDK-8279668-vpxor-avx2 - Add more asserts - Use AVX_256bit literal - Fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7005/files - new: https://git.openjdk.java.net/jdk/pull/7005/files/194bfa09..483e8e65 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7005&range=02-03 Stats: 6610 lines in 172 files changed: 4268 ins; 1779 del; 563 mod Patch: https://git.openjdk.java.net/jdk/pull/7005.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7005/head:pull/7005 PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Mon Jan 10 18:25:15 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 10 Jan 2022 18:25:15 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v3] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 13:31:42 GMT, Jie Fu wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Add more asserts > > src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1493: > >> 1491: // Simple version for AVX2 256bit vectors >> 1492: void vpxor(XMMRegister dst, XMMRegister src) { >> 1493: assert(UseAVX >= 2, "Only with AVX2"); > > I would suggest the assert msg as "Only with AVX2 or above". > We should also update the copyright year. Rephrased the assert message, updated copyright date. Thanks! 
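The kind of guard discussed in this thread can be sketched as follows. This is a hypothetical model, not the HotSpot assembler: `VectorLen`, `min_avx_level_for_vpxor`, `can_emit_vpxor` and `emit_vpxor` are made-up names, the enum values are assumptions, and `use_avx` only mirrors the meaning of `-XX:UseAVX=<n>` (with level 3 standing in for AVX-512).

```cpp
#include <cassert>

// Hypothetical vector-length tags; names echo the "AVX_256bit literal"
// mentioned in the commit list, the numeric values are assumptions.
enum VectorLen { AVX_128bit = 0, AVX_256bit = 1, AVX_512bit = 2 };

// Minimum UseAVX level assumed necessary to encode an integer vpxor of
// the given length: 128-bit needs AVX, 256-bit needs AVX2, and 512-bit
// is modelled as level 3.
constexpr int min_avx_level_for_vpxor(VectorLen len) {
  return len == AVX_128bit ? 1 : (len == AVX_256bit ? 2 : 3);
}

constexpr bool can_emit_vpxor(int use_avx, VectorLen len) {
  return use_avx >= min_avx_level_for_vpxor(len);
}

// Sketch of an emitter that fails an assertion in debug builds instead
// of silently producing an instruction the CPU may not support.
inline void emit_vpxor(int use_avx, VectorLen len) {
  assert(can_emit_vpxor(use_avx, len) && "Only with AVX2 or above");
  // ... instruction encoding would go here ...
}
```

Taking the length as an enum rather than an `int` also keeps a stray `bool` argument from being accepted as a vector length, which is the sort of implicit conversion the patch description says it removed.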
------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From kvn at openjdk.java.net Mon Jan 10 20:12:24 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 10 Jan 2022 20:12:24 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 06:05:39 GMT, Vamsi Parasa wrote: > > Tests failed on aarch64 systems and avx2 x86. > > Could you please let me know if the failing test is test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java ? Added additional checks (shown below) to make sure it runs on an x86 machine that has AVX3. > > * @requires vm.compiler2.enabled > * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" > > This test exits gracefully on a Skylake machine which doesn't have AVX3. As I posted in my comment, the following tests failed on AArch64 and on x86 with AVX2 only (AMD): compiler/codegen/TestIntFloatVect.java compiler/codegen/TestLongDoubleVect.java ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From rkennke at openjdk.java.net Mon Jan 10 20:16:44 2022 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 10 Jan 2022 20:16:44 GMT Subject: RFR: 8279534: Consolidate and remove oopDesc::klass_gap methods Message-ID: After JDK-8278568, these methods are unused: inline int klass_gap() const; inline void set_klass_gap(int z); Except Zero which uses set_klass_gap(int), but we agreed elsewhere (#5585) that we don't want to access partly initialized oops as such. We should use the HeapWord* initialization variants in Zero, too. Note: we could take that even further and replace the initialization in Zero with an ObjAllocator::initialize() call, but that would also have to remove the storestore fence, and possibly adapt ObjAllocator to avoid clearing in already-zeroed TLABs, all of which would have wider consequences and would be a matter for a separate PR.
Testing: - [x] Build (for klass_gap methods removal) - [ ] GHA for Zero stuff ------------- Commit messages: - Reinstate storestore fence - Use HeapWord* based oop initializers in Zero - 8279534: Unused oopDesc::klass_gap methods after JDK-8278568 Changes: https://git.openjdk.java.net/jdk/pull/7008/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7008&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279534 Stats: 18 lines in 3 files changed: 3 ins; 13 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7008.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7008/head:pull/7008 PR: https://git.openjdk.java.net/jdk/pull/7008 From kvn at openjdk.java.net Mon Jan 10 20:17:29 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 10 Jan 2022 20:17:29 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 06:05:54 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > aditional checks for the test `TestPopCountVectorLong.java` was not run on these systems because it has `@requires vm.cpu.features ~= ".*avx512dq.*"` And I did not test other tiers because tier1 had failures. ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From kvn at openjdk.java.net Mon Jan 10 20:20:34 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 10 Jan 2022 20:20:34 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist!
In-Reply-To: References: Message-ID: <4N5sWWZtqXAYTCoOUPxLoitpPBZb1p392Cq1Z2nLJo0=.0f001508-8ed5-40c3-8865-b88dd2783fde@github.com> On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/88 From duke at openjdk.java.net Mon Jan 10 20:23:29 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Mon, 10 Jan 2022 20:23:29 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v8] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 20:08:52 GMT, Vladimir Kozlov wrote: > > > Tests failed on aarch64 systems and avx2 x86. > > > > > > Could you please let me know if the failing test is test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java ? Added additional checks (shown below) to make sure it runs on an x86 machine that has AVX3. 
> > > > * @requires vm.compiler2.enabled > > * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" > > > > This test exits gracefully on a Skylake machine which doesn't have AVX3. > > As I posted in my comment, the following tests failed on AArch64 and on x86 with AVX2 only (AMD): > > ``` > compiler/codegen/TestIntFloatVect.java > compiler/codegen/TestLongDoubleVect.java > ``` Thank you Vladimir! Will work on fixing the compiler/codegen/{TestIntFloatVect.java, TestLongDoubleVect.java}. This patch is not supposed to affect those tests but I will investigate why they're failing and update you... ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From kvn at openjdk.java.net Mon Jan 10 20:39:27 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 10 Jan 2022 20:39:27 GMT Subject: RFR: 8277748: Obsolete the MinInliningThreshold flag in JDK 19 In-Reply-To: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> References: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> Message-ID: <2eOyNCXzi1D3wsqpk8w8KDtnJiORFm4-bEP3y-djGec=.7e322010-5281-499f-abba-c10ea7cf4052@github.com> On Fri, 7 Jan 2022 12:41:26 GMT, Emanuel Peter wrote: > Changed `MinInliningThreshold` from `Deprecated` (for JDK 18) to `Obsolete` (for JDK 19). > > Checked that tests are not affected. Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6986 From hseigel at openjdk.java.net Mon Jan 10 21:10:41 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 10 Jan 2022 21:10:41 GMT Subject: RFR: 8238161: use os::fopen in HS code where possible Message-ID: Please review this change to hotspot to call os::fopen() instead of fopen() so that the 'close-on-exec' flag gets set for these opened files. This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64.
Additionally, the changes were built on Linux PPC. Thanks, Harold ------------- Commit messages: - 8238161: use os::fopen in HS code where possible Changes: https://git.openjdk.java.net/jdk/pull/7022/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7022&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8238161 Stats: 59 lines in 20 files changed: 1 ins; 0 del; 58 mod Patch: https://git.openjdk.java.net/jdk/pull/7022.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7022/head:pull/7022 PR: https://git.openjdk.java.net/jdk/pull/7022 From coleenp at openjdk.java.net Mon Jan 10 22:55:38 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 10 Jan 2022 22:55:38 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: <4cguacKdKwMnPvAB5vJ73y5-mFMAApesgPiPq-492CM=.017d301b-88f6-4400-9295-947dddc345d3@github.com> On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. 
> > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Looks good. You should update the copyrights to 2022 though. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/88 From kvn at openjdk.java.net Mon Jan 10 23:17:38 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 10 Jan 2022 23:17:38 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v4] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 18:25:11 GMT, Aleksey Shipilev wrote: >> Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. >> >> Also fixed the implicit `bool` -> `int` conversion to `vector_len`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Review comments > - Merge branch 'master' into JDK-8279668-vpxor-avx2 > - Add more asserts > - Use AVX_256bit literal > - Fix Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7005 From jiefu at openjdk.java.net Mon Jan 10 23:17:38 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 10 Jan 2022 23:17:38 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v4] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 18:25:11 GMT, Aleksey Shipilev wrote: >> Got the SIGILLs on some machines when testing JDK-8279621. 
That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. >> >> Also fixed the implicit `bool` -> `int` conversion to `vector_len`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Review comments > - Merge branch 'master' into JDK-8279668-vpxor-avx2 > - Add more asserts > - Use AVX_256bit literal > - Fix LGTM Thanks for your update. ------------- Marked as reviewed by jiefu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7005 From duke at openjdk.java.net Mon Jan 10 23:51:30 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Mon, 10 Jan 2022 23:51:30 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 20:14:40 GMT, Vladimir Kozlov wrote: >> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> aditional checks for the test > > `TestPopCountVectorLong.java` was not run on these systems because it has `@requires vm.cpu.features ~= ".*avx512dq.*"` > And I did not test other tiers because tier1 had failures. Hi Vladimir (@vnkozlov), could you please check if you incorporated the fix for the 'opc == ' bug? The fix for that bug was already pushed last week. Without the bug fix, I was able to reproduce the failure of compiler/codegen/TestIntFloatVect.java on an AVX2 x86 machine.
After applying the fix (which was pushed last week), both the tests compiler/codegen/{TestIntFloatVect.java, TestLongDoubleVect.java} are passing on AVX2(x86) ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From coleenp at openjdk.java.net Tue Jan 11 01:53:49 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 01:53:49 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long Message-ID: Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). ------------- Commit messages: - remove commented out code - 8248404: AArch64: Remove uses of long and unsigned long Changes: https://git.openjdk.java.net/jdk/pull/7023/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8248404 Stats: 42 lines in 5 files changed: 9 ins; 20 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 04:07:51 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 04:07:51 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Cast Address operand to int ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/6640b24a..f1478e57 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From kvn at openjdk.java.net Tue Jan 11 04:47:26 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 11 Jan 2022 04:47:26 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 06:05:54 GMT, Vamsi Parasa wrote: >> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. > > Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > aditional checks for the test Latest version passed tier1-3 testing. Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6857 From thartmann at openjdk.java.net Tue Jan 11 06:54:21 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 11 Jan 2022 06:54:21 GMT Subject: RFR: 8277748: Obsolete the MinInliningThreshold flag in JDK 19 In-Reply-To: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> References: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> Message-ID: On Fri, 7 Jan 2022 12:41:26 GMT, Emanuel Peter wrote: > Changed `MinInliningThreshold` from `Deprecated` (for JDK 18) to `Obsolete` (for JDK 19). > > Checked that tests are not affected. Looks good to me too. ------------- Marked as reviewed by thartmann (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6986 From duke at openjdk.java.net Tue Jan 11 07:01:24 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 11 Jan 2022 07:01:24 GMT Subject: RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 04:44:14 GMT, Vladimir Kozlov wrote: > Latest version passed tier1-3 testing. Good. Thank you Vladimir! ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From thartmann at openjdk.java.net Tue Jan 11 07:02:55 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 11 Jan 2022 07:02:55 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist!
[v2] In-Reply-To: References: Message-ID: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. 
> > Thanks, > Tobias Tobias Hartmann has updated the pull request incrementally with one additional commit since the last revision: Updated copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk18/pull/88/files - new: https://git.openjdk.java.net/jdk18/pull/88/files/e0ba3cfb..5c6e13d9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk18&pr=88&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk18&pr=88&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk18/pull/88.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/88/head:pull/88 PR: https://git.openjdk.java.net/jdk18/pull/88 From thartmann at openjdk.java.net Tue Jan 11 07:02:56 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 11 Jan 2022 07:02:56 GMT Subject: [jdk18] RFR: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: <3RiJlpnar4WPfGzIg8aLwaXp1Qe7WwqLropK2rYQ_gQ=.52c87fd1-e3aa-4d81-80ca-8c2dd2002a1d@github.com> On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. 
However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias Vladimir, Coleen, thanks for the reviews. I updated the copyright. ------------- PR: https://git.openjdk.java.net/jdk18/pull/88 From thartmann at openjdk.java.net Tue Jan 11 07:02:57 2022 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Tue, 11 Jan 2022 07:02:57 GMT Subject: [jdk18] Integrated: 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 07:46:24 GMT, Tobias Hartmann wrote: > Adapter creation during method linking may fail due to a lack of code cache space which leads to a `VirtualMachineError` being thrown and thus a bail out from linking the holder klass: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1239-L1247 > > If the `VirtualMachineError` is handled/ignored by the application, we may later attempt to link the same klass and therefore also the same method again. We then incorrectly bail out from adapter creation because the `_i2i_entry` is set, assuming that this can only happen if adapters have already been created. However, that is not guaranteed because the interpreter entry is set right **before** adapters are created: > https://github.com/openjdk/jdk18/blob/d65c665839c0a564c422ef685f2673fac37315d7/src/hotspot/share/oops/method.cpp#L1213-L1230 > > I propose to instead check if adapters have been created. > > This is an old bug that was just recently triggered by an unrelated change. > > Thanks, > Tobias This pull request has now been integrated. 
Changeset: 6d7db4b0 Author: Tobias Hartmann URL: https://git.openjdk.java.net/jdk18/commit/6d7db4b0b3e9172645cef12c36fbeb41a6d38d83 Stats: 53 lines in 2 files changed: 43 ins; 2 del; 8 mod 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! Reviewed-by: chagedorn, kvn, coleenp ------------- PR: https://git.openjdk.java.net/jdk18/pull/88 From mbaesken at openjdk.java.net Tue Jan 11 08:18:22 2022 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Tue, 11 Jan 2022 08:18:22 GMT Subject: RFR: 8238161: use os::fopen in HS code where possible In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 21:02:54 GMT, Harold Seigel wrote: > Please review this change to hotspot to call os::fopen() instead of fopen() so that the 'close-on-exec' flag gets set for these opened files. This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Additionally, the changes were built on Linux PPC. > > Thanks, Harold Hi Harold, this looks good to me. Thanks, Matthias ------------- Marked as reviewed by mbaesken (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7022 From duke at openjdk.java.net Tue Jan 11 08:36:27 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 11 Jan 2022 08:36:27 GMT Subject: Integrated: 8277748: Obsolete the MinInliningThreshold flag in JDK 19 In-Reply-To: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> References: <_TiZMCmpu_DVduVg5Hcpz6yr7UxXjHi3smZH9WClBOY=.1aa19985-0fe1-46a1-b113-975216f49893@github.com> Message-ID: <2BUHe9vwvDDqdErrPwOH9CDNhzYDGeSCRwekRsXIBk0=.417d05ed-f649-4629-86c8-15f560bb7c92@github.com> On Fri, 7 Jan 2022 12:41:26 GMT, Emanuel Peter wrote: > Changed `MinInliningThreshold` from `Depricated` (for JDK 18) to `Obsolete` (for JDK 19). > > Checked that tests are not affected. This pull request has now been integrated. 
Changeset: bf7bcaac Author: Emanuel Peter Committer: Tobias Hartmann URL: https://git.openjdk.java.net/jdk/commit/bf7bcaacaab12dbba1c2fb010487ed9196cb2fa5 Stats: 15 lines in 3 files changed: 1 ins; 14 del; 0 mod 8277748: Obsolete the MinInliningThreshold flag in JDK 19 Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.java.net/jdk/pull/6986 From stefank at openjdk.java.net Tue Jan 11 08:51:28 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 11 Jan 2022 08:51:28 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> On Tue, 11 Jan 2022 04:07:51 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Cast Address operand to int Changes requested by stefank (Reviewer). src/hotspot/cpu/aarch64/macroAssembler_aarch64_log.cpp line 300: > 298: fmovs(tmp3, vtmp5); // int intB0 = AS_INT_BITS(B); > 299: mov(tmp5, 0x3FE0); > 300: uint64_t mask = 0xffffe00000000000UL; I think this should be using ULL to support LLP64. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From shade at openjdk.java.net Tue Jan 11 10:31:33 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 11 Jan 2022 10:31:33 GMT Subject: RFR: 8279668: x86: AVX2 versions of vpxor should be asserted [v4] In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 18:25:11 GMT, Aleksey Shipilev wrote: >> Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. 
>> >> Also fixed the implicit `bool` -> `int` conversion to `vector_len`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Review comments > - Merge branch 'master' into JDK-8279668-vpxor-avx2 > - Add more asserts > - Use AVX_256bit literal > - Fix Thank you for reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From shade at openjdk.java.net Tue Jan 11 10:31:33 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 11 Jan 2022 10:31:33 GMT Subject: Integrated: 8279668: x86: AVX2 versions of vpxor should be asserted In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 10:31:30 GMT, Aleksey Shipilev wrote: > Got the SIGILLs on some machines when testing JDK-8279621. That patch started using shorter `vpxor` versions on `UseAVX = 1` path, which tried to use `VEX.256`-encoded `vpxor` instruction that is only available on AVX2. This should be at very least asserted in assembler code. > > Also fixed the implicit `bool` -> `int` conversion to `vector_len`. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` with `-XX:UseAVX=1` This pull request has now been integrated. 
Changeset: 2bbeae3f Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/2bbeae3f056243a224b0bda021f16cdcbee3b3d6 Stats: 15 lines in 2 files changed: 12 ins; 0 del; 3 mod 8279668: x86: AVX2 versions of vpxor should be asserted Reviewed-by: kvn, jiefu ------------- PR: https://git.openjdk.java.net/jdk/pull/7005 From aph at openjdk.java.net Tue Jan 11 10:49:29 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 10:49:29 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 04:07:51 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Cast Address operand to int src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 499: > 497: #ifdef __APPLE__ > 498: // macosx wants all the overloads > 499: inline void mov(Register dst, intptr_t imm32) { mov_immediate64(dst, imm32); } Suggestion: inline void mov(Register dst, intptr_t imm64) { mov_immediate64(dst, imm64); } src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 501: > 499: inline void mov(Register dst, intptr_t imm32) { mov_immediate64(dst, imm32); } > 500: #endif > 501: inline void mov(Register dst, int64_t imm32) { mov_immediate64(dst, imm32); } Suggestion: inline void mov(Register dst, int64_t imm64) { mov_immediate64(dst, imm64); } src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 502: > 500: #endif > 501: inline void mov(Register dst, int64_t imm32) { mov_immediate64(dst, imm32); } > 502: inline void mov(Register dst, uint64_t imm32) { mov_immediate64(dst, imm32); } Suggestion: inline void mov(Register dst, uint64_t imm64) { mov_immediate64(dst, imm64); } ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 10:58:23 2022 
From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 10:58:23 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 04:07:51 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Cast Address operand to int src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > 72: // Capture prev stack pointer (stack arguments base) > 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR > 74: __ str(rscratch1, Address(sp, (int)layout.stack_args)); // x86 casts to int also Suggestion: __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 10:58:23 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 10:58:23 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 10:52:48 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Cast Address operand to int > > src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > >> 72: // Capture prev stack pointer (stack arguments base) >> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >> 74: __ str(rscratch1, Address(sp, (int)layout.stack_args)); // x86 casts to int also > > Suggestion: > > __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also ... because this is UB if it doesn't fit. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From shade at openjdk.java.net Tue Jan 11 11:10:32 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 11 Jan 2022 11:10:32 GMT Subject: RFR: 8279534: Consolidate and remove oopDesc::klass_gap methods In-Reply-To: References: Message-ID: <22xHj7nWpeA2KedI3m5fLxrJMJ-mxJLy9n3b_-F8Huo=.2bfe798b-664f-430a-9b26-70930a9a09da@github.com> On Mon, 10 Jan 2022 13:40:29 GMT, Roman Kennke wrote: > After JDK-8278568, these methods are unused: > inline int klass_gap() const; > inline void set_klass_gap(int z); > > Except Zero which uses set_klass_gap(int), but we agreed elsewhere (#5585) that we don't want to access partly initialized oops as such. We should use the HeapWord* initialization variants in Zero, too. > > Note: we could take that even further and replace the initialization in Zero with ObjAllocator::initialize() call, but that would also have to remove the storestore fence, and possibly adopt ObjAllocator to avoid clearing in already-zeroed TLABs, all of which would have wider consequences and would be a matter for separate PR. > > Testing: > - [x] Build (for klass_gap methods removal) > - [ ] GHA for Zero stuff Zero changes look fine to me. A good smoke test for Zero is `make bootcycle-images`, and it passes on Linux x86_64 for me with this patch. The rest looks good too. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7008 From coleenp at openjdk.java.net Tue Jan 11 13:51:58 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 13:51:58 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v3] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix casts and types and fix unintended change to Address constructor. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/f1478e57..dace0869 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=01-02 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Tue Jan 11 15:28:28 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 11 Jan 2022 15:28:28 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> References: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> Message-ID: <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> On Tue, 11 Jan 2022 08:45:55 GMT, Stefan Karlsson wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Cast Address operand to int > > src/hotspot/cpu/aarch64/macroAssembler_aarch64_log.cpp line 300: > >> 298: fmovs(tmp3, vtmp5); // int intB0 = AS_INT_BITS(B); >> 299: mov(tmp5, 0x3FE0); >> 300: uint64_t mask = 0xffffe00000000000UL; > > I think this should be using ULL to support LLP64. An integer literal takes the first candidate type that fits, so for this 64-bit value U, UL and ULL would all produce the same type.
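The literal-suffix point above can be checked with a small sketch. It assumes a platform where unsigned long long is 64 bits wide; the mask_* and check_masks_agree names are made up for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// 0xffffe00000000000 needs all 64 bits, so each suffix below ends up with a
// 64-bit unsigned type: the literal takes the first candidate type that fits.
static_assert(sizeof(0xffffe00000000000U) == 8, "U falls through to a 64-bit type");
static_assert(sizeof(0xffffe00000000000UL) == 8, "UL is, or falls through to, 64 bits");
static_assert(sizeof(0xffffe00000000000ULL) == 8, "assumes a 64-bit unsigned long long");

// ULL is the only spelling whose type has the same name on LP64 and LLP64.
static_assert(std::is_same<decltype(0xffffe00000000000ULL), unsigned long long>::value,
              "ULL literals are unsigned long long");

// Hypothetical helpers so the three spellings can be compared at run time.
inline uint64_t mask_u()   { return 0xffffe00000000000U; }
inline uint64_t mask_ul()  { return 0xffffe00000000000UL; }
inline uint64_t mask_ull() { return 0xffffe00000000000ULL; }

inline void check_masks_agree() {
  assert(mask_u() == mask_ul());
  assert(mask_ul() == mask_ull());
}
```

The value is the same whichever suffix is used, so choosing ULL is about stating the intended type rather than changing behaviour.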
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:04:00 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:04:00 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add ULL ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/dace0869..e06bf9bd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:04:04 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:04:04 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: <5-yDKHk1UCCHMIp9NDBuYn1fdQTJj3gG0UPEjjb5sB4=.f6a56d44-aaec-4ea3-88bc-2a3a38b47a7e@github.com> On Tue, 11 Jan 2022 10:46:03 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Cast Address operand to int > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 499: > >> 497: #ifdef __APPLE__ >> 498: // macosx wants all the overloads >> 499: inline void mov(Register dst, intptr_t imm32) { mov_immediate64(dst, imm32); } > > Suggestion: > > inline void mov(Register dst, intptr_t imm64) { mov_immediate64(dst, imm64); } fixed all these. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From stefank at openjdk.java.net Tue Jan 11 16:04:06 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 11 Jan 2022 16:04:06 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> References: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> Message-ID: On Tue, 11 Jan 2022 15:25:31 GMT, Quan Anh Mai wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64_log.cpp line 300: >> >>> 298: fmovs(tmp3, vtmp5); // int intB0 = AS_INT_BITS(B); >>> 299: mov(tmp5, 0x3FE0); >>> 300: uint64_t mask = 0xffffe00000000000UL; >> >> I think this should be using ULL to support LLP64. > > Integral literal takes the first type that fits so for this 64-bit all suffices would produce the same type. Thanks, I wasn't sure what would happen if you used the "wrong" suffix. I think it still makes sense to use the correct prefix for the intended target type. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:04:07 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:04:07 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> Message-ID: On Tue, 11 Jan 2022 15:56:58 GMT, Stefan Karlsson wrote: >> Integral literal takes the first type that fits so for this 64-bit all suffices would produce the same type. > > Thanks, I wasn't sure what would happen if you used the "wrong" suffix. 
I think it still makes sense to use the correct prefix for the intended target type. I changed it to ULL. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:04:08 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:04:08 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 10:54:52 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: >> >>> 72: // Capture prev stack pointer (stack arguments base) >>> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >>> 74: __ str(rscratch1, Address(sp, (int)layout.stack_args)); // x86 casts to int also >> >> Suggestion: >> >> __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also > > ... because this is UB if it doesn't fit. Yes, I changed this and wasn't really sure how to fix it better. I'd love if Address had a unit64_t argument but all the callers that just passed 0 wouldn't compile. The code is like that in the x86 version. So I thought that'd be safe. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 16:11:24 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 16:11:24 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 16:00:21 GMT, Coleen Phillimore wrote: >> ... because this is UB if it doesn't fit. > > Yes, I changed this and wasn't really sure how to fix it better. I'd love if Address had a unit64_t argument but all the callers that just passed 0 wouldn't compile. The code is like that in the x86 version. So I thought that'd be safe. Well, I'm not convinced the x86 version is safe, really. It probably is, but the checked_cast costs nothing in production. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 16:11:25 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 16:11:25 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 16:06:33 GMT, Andrew Haley wrote: >> Yes, I changed this and wasn't really sure how to fix it better. I'd love it if Address had a uint64_t argument but all the callers that just passed 0 wouldn't compile. The code is like that in the x86 version. So I thought that'd be safe. > > Well, I'm not convinced the x86 version is safe, really. It probably is, but the checked_cast costs nothing in production. Actually, thinking some more, I can see no guarantee that this `Address` is valid either. So I'm going to change my mind. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 16:20:24 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 16:20:24 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 16:04:00 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR).
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add ULL src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > 72: // Capture prev stack pointer (stack arguments base) > 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR > 74: __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also Suggestion: __ Address slot = __ legitimize_address(Address(sp, checked_cast(layout.stack_args)), wordSize, rscratch2); __ str(rscratch1, slot); // x86 casts to int also I think this is a real bug: the range of a stack arg from SP can exceed that of the maximum offset of a STR instruction! Wherever there's a dubious cast there's probably a bug... ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 16:25:31 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 16:25:31 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: Message-ID: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> On Tue, 11 Jan 2022 16:17:15 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add ULL > > src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > >> 72: // Capture prev stack pointer (stack arguments base) >> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >> 74: __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also > > Suggestion: > > __ Address slot = __ legitimize_address(Address(sp, checked_cast(layout.stack_args)), wordSize, rscratch2); > __ str(rscratch1, slot); // x86 casts to int also > > I think this is a real bug: the range of a stack arg from SP can exceed that of the maximum offset of a STR instruction! 
> Wherever there's a dubious cast there's probably a bug... All of this may seem tedious and pedantic, but we have had failures in production caused by stack pointer offsets exceeding the 12-bit range of a STR instruction. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:55:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:55:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 16:48:42 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add ULL > > src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 418: > >> 416: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >> 417: Address(Register r, unsigned long long o) >> 418: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } > > This change looks wrong. For example, a call with `Address(Register, ptrdiff_t)` actually calls `Address(Register, int)`, silently truncating the 64-bit signed type to 32 bits. Will it silently truncate? I was getting errors with the uint64_t overloads: ambiguous calls with address or Register. Maybe that's equally wrong.
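The truncation concern can be reproduced in isolation with a rough sketch. DemoAddress, offset_after_int_overload, fits_in_int and checked_to_int are hypothetical names used only here; this is neither the HotSpot Address class nor its checked_cast, and it assumes a compiler where an out-of-range conversion to int wraps modulo 2^32 (guaranteed from C++20, implementation-defined before).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an Address-like class whose only signed
// constructor takes int; this is not the HotSpot class.
struct DemoAddress {
  int64_t offset;
  explicit DemoAddress(int o) : offset(o) {}
};

// The 64-bit argument is implicitly converted to int at the call site, so the
// upper 32 bits are gone before the constructor ever sees the value.
inline int64_t offset_after_int_overload(int64_t o) {
  DemoAddress a(o);
  return a.offset;
}

// Hypothetical range check in the spirit of a checked cast.
inline bool fits_in_int(int64_t v) {
  return v >= std::numeric_limits<int>::min() &&
         v <= std::numeric_limits<int>::max();
}

// Narrow only after asserting that the value is representable (debug builds).
inline int checked_to_int(int64_t v) {
  assert(fits_in_int(v) && "offset does not fit in int");
  return static_cast<int>(v);
}
```

The call compiles without a diagnostic at default warning levels, which is why a wider signed overload or an explicit range check is the safer choice for offsets.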
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:55:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:55:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 16:52:01 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 418: >> >>> 416: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >>> 417: Address(Register r, unsigned long long o) >>> 418: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >> >> This change looks wrong. For example, a call with `Address(Register, ptrdiff_t)` actually calls `Address(Register, int)`, silently truncating the 64-bit signed type to 32 bits. > > Will it silently truncate? I was getting errors with the unit64_t overloads, I was getting ambiguous calls with address or Register. Maybe that's equally wrong. Getting closer to giving up ... ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 16:55:26 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 16:55:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: Message-ID: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> On Tue, 11 Jan 2022 16:04:00 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add ULL src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 418: > 416: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } > 417: Address(Register r, unsigned long long o) > 418: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } This change looks wrong. For example, a call with `Address(Register, ptrdiff_t)` actually calls `Address(Register, int)`, silently truncating the 64-bit signed type to 32 bits. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:55:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:55:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> Message-ID: On Tue, 11 Jan 2022 16:22:06 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: >> >>> 72: // Capture prev stack pointer (stack arguments base) >>> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >>> 74: __ str(rscratch1, Address(sp, checked_cast(layout.stack_args))); // x86 casts to int also >> >> Suggestion: >> >> __ Address slot = __ legitimize_address(Address(sp, checked_cast(layout.stack_args)), wordSize, rscratch2); >> __ str(rscratch1, slot); // x86 casts to int also >> >> I think this is a real bug: the range of a stack arg from SP can exceed that of the maximum offset of a STR instruction! >> Wherever there's a dubious cast there's probably a bug... > > All of this may seem tedious and pedantic, but we have had failures in production caused by stack pointer offsets exceeding the 12-bit range of a STR instruction. 
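For reference, the offset limit mentioned above can be written down as a range check. This sketch covers only the two immediate forms of a 64-bit store as I understand the A64 encoding (STR with an unsigned 12-bit immediate scaled by 8, STUR with a signed 9-bit unscaled immediate). The function names are hypothetical and this is not what `legitimize_address` does internally.

```cpp
#include <cassert>
#include <cstdint>

// STR Xt, [Xn|SP, #imm]: unsigned 12-bit immediate scaled by the 8-byte
// access size, so byte offsets 0, 8, ..., 32760.
inline bool fits_scaled_uimm12(int64_t byte_offset) {
  return byte_offset >= 0 && (byte_offset % 8) == 0 && (byte_offset / 8) < 4096;
}

// STUR Xt, [Xn|SP, #simm]: signed 9-bit unscaled immediate, -256..255.
inline bool fits_unscaled_simm9(int64_t byte_offset) {
  return byte_offset >= -256 && byte_offset <= 255;
}

// True when a single store instruction can encode the offset directly.
inline bool encodable_as_single_store(int64_t byte_offset) {
  return fits_scaled_uimm12(byte_offset) || fits_unscaled_simm9(byte_offset);
}

// Debug-build guard a caller could place before emitting the plain store.
inline void assert_single_store(int64_t byte_offset) {
  assert(encodable_as_single_store(byte_offset) && "offset needs a scratch register");
}
```

An offset outside both ranges has to be materialised through a scratch register first, which is the situation the suggested `legitimize_address` call is meant to handle.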
Yes, I agree and didn't really know how to fix it so it would compile (if it actually compiles now) and not be UB. I'm close to giving up! Thanks for the code change. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 16:55:27 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 16:55:27 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> Message-ID: On Tue, 11 Jan 2022 16:47:13 GMT, Coleen Phillimore wrote: >> All of this may seem tedious and pedantic, but we have had failures in production caused by stack pointer offsets exceeding the 12-bit range of a STR instruction. > > Yes, I agree and didn't really know how to fix it so it would compile (if it actually compiles now) and not be UB. I'm close to giving up! Thanks for the code change. It does seem pedantic which makes it really difficult, but it's important to get this right. These kinds of bugs are horrible to debug. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:01 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:01 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with three additional commits since the last revision: - Fix Address overload. 
- Take out int cast in universalUpcallHandler - Try some more overloads ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/e06bf9bd..4edba83b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=03-04 Stats: 6 lines in 2 files changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:04 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:04 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 16:52:52 GMT, Coleen Phillimore wrote: >> Will it silently truncate? I was getting errors with the unit64_t overloads, I was getting ambiguous calls with address or Register. Maybe that's equally wrong. > > Getting closer to giving up ... It seems to silently truncate if the only signed type is int. I just put a breakpoint on `LIR_Assembler::as_Address` and watched it happen. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:05 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:05 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 16:59:23 GMT, Andrew Haley wrote: >> Getting closer to giving up ... > > It seems to silently truncate if the only signed type is int. I just put a breakpoint on `LIR_Assembler::as_Address` and watched it happen. 
There are too many overloads. I don't know why the long long overloads didn't cause ambiguity, but if any conversion is needed, the call is ambiguous. Address's _offset field is an int64_t, so assigning a uint64_t to it seems wrong. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:06 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:06 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:02:09 GMT, Coleen Phillimore wrote: >> It seems to silently truncate if the only signed type is int. I just put a breakpoint on `LIR_Assembler::as_Address` and watched it happen. > > There are too many overloads. I don't know why the long long overloads didn't cause ambiguity, but if any conversion is needed, the call is ambiguous. > Address's _offset field is an int64_t, so assigning a uint64_t to it seems wrong. I added a ptrdiff_t and a uint64_t overload. The linux build is happy about that. I'm going to push so the GHA tries the rest. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:06 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:06 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:04:56 GMT, Coleen Phillimore wrote: >> There are too many overloads. I don't know why the long long overloads didn't cause ambiguity, but if any conversion is needed, the call is ambiguous. >> Address's _offset field is an int64_t, so assigning a uint64_t to it seems wrong. > > I added a ptrdiff_t and a uint64_t overload.
The linux build is happy about that. I'm going to push so the GHA tries the rest. I just got a clean build with overloads for `int`, `int64_t`, and `uint64_t` offsets. That seems to be the minimum. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:07 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:07 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:05:15 GMT, Andrew Haley wrote: >> I added a ptrdiff_t and a uint64_t overload. The linux build is happy about that. I'm going to push so the GHA tries the rest. > > I just got a clean build with overloads for `int`, `int64_t`, and `uint64_t` offsets. That seems to be the minimum. What platforms did you get this to compile on? I only have a Linux machine, and the macosx compiler seems to follow different rules, as does windows, which I only have through GHA. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:08 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:08 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:14:04 GMT, Coleen Phillimore wrote: >> I just got a clean build with overloads for `int`, `int64_t`, and `uint64_t` offsets. That seems to be the minimum. > > What platforms did you get this to compile on? I only have a Linux machine, and the macosx compiler seems to follow different rules, as does windows, which I only have through GHA. Sounds good.
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Tue Jan 11 17:21:09 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 11 Jan 2022 17:21:09 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:14:36 GMT, Andrew Haley wrote: >> What platforms did you get this to compile on? I only have a Linux machine, and the macosx compiler seems to follow different rules, as does windows, which I only have through GHA. > > Sounds good. Integral types are implicitly convertible to each other, so I believe a single overload for `int64_t` would be sufficient in this case? ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:10 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:10 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v5] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:15:59 GMT, Quan Anh Mai wrote: >> Sounds good. > > Integral types are implicitly convertible to each other, so I believe a single overload for `int64_t` would be sufficient in this case? On Linux. The MacOS debugger makes me pull my hair out.
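The silent truncation discussed in this thread can be reproduced outside HotSpot. The following is only a minimal sketch, not code from the PR: `IntOnlyAddress` and `WideAddress` are hypothetical stand-ins for the `Address` class, and the sketch assumes a common two's-complement compiler where narrowing a 64-bit value to `int` keeps the low 32 bits. It passes `int64_t` rather than `ptrdiff_t` on purpose, since the thread reports that `ptrdiff_t` resolves differently on macOS.

```cpp
#include <cstdint>

// Hypothetical stand-in: the only offset constructor takes int, so a
// 64-bit argument is implicitly narrowed before it is stored.
struct IntOnlyAddress {
  int64_t offset;
  IntOnlyAddress(int o) : offset(o) {}
};

// Hypothetical stand-in with the three overloads mentioned in the thread.
// An int64_t or uint64_t argument now finds an exact match.
struct WideAddress {
  int64_t offset;
  WideAddress(int o)      : offset(o) {}
  WideAddress(int64_t o)  : offset(o) {}
  WideAddress(uint64_t o) : offset(static_cast<int64_t>(o)) {}
};

// Helpers that return the offset each class actually stores.
int64_t offset_via_int_only(int64_t o) { return IntOnlyAddress(o).offset; }
int64_t offset_via_wide(int64_t o)     { return WideAddress(o).offset; }
```

With an argument such as 2^32 + 8, the first helper would be expected to lose the upper bits without any default compiler diagnostic, while the second keeps the full value.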
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From kbarrett at openjdk.java.net Tue Jan 11 17:21:11 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 11 Jan 2022 17:21:11 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> Message-ID: On Tue, 11 Jan 2022 15:58:31 GMT, Coleen Phillimore wrote: >> Thanks, I wasn't sure what would happen if you used the "wrong" suffix. I think it still makes sense to use the correct prefix for the intended target type. > > I changed it to ULL. Thanks. We have `CONST64` and `UCONST64` for these situations (in globalDefinitions.hpp). ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:13 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:13 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 16:08:05 GMT, Andrew Haley wrote: >> Well, I'm not convinced the x86 version is safe, really. It probably is, but the checked_cast costs nothing in production. > > Actually, thinking some more, I can see no guarantee that this `Address` is valid either. So I'm going to change my mind. I'm going to remove the cast or cast it to uint64_t in universalUpcallHandler. thanks for bearing with me! 
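The concern raised earlier in the thread, that a stack offset can exceed what a STR instruction can encode, can be illustrated with a small range check. This is a simplified sketch under stated assumptions and not HotSpot's actual logic: `fits_scaled_uimm12` is a hypothetical helper that models only the unsigned, scaled 12-bit immediate form of an AArch64 load/store and ignores the other addressing forms that `legitimize_address` would also consider.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if `offset` can be encoded as the
// unsigned 12-bit immediate of a load/store that accesses
// `size_in_bytes` bytes (1, 2, 4, or 8). The immediate is scaled by the
// access size, so the offset must be non-negative, a multiple of the
// size, and smaller than 4096 * size.
bool fits_scaled_uimm12(int64_t offset, int64_t size_in_bytes) {
  if (offset < 0) return false;
  if (offset % size_in_bytes != 0) return false;
  return offset / size_in_bytes < 4096;
}
```

For an 8-byte store this form tops out at 32760, which is why an unchecked offset taken from a stack layout may need to be materialized through a scratch register instead.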
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:14 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:14 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> Message-ID: <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> On Tue, 11 Jan 2022 16:49:30 GMT, Coleen Phillimore wrote: >> Yes, I agree and didn't really know how to fix it so it would compile (if it actually compiles now) and not be UB. I'm close to giving up! Thanks for the code change. > > It does seem pedantic which makes it really difficult, but it's important to get this right. These kinds of bugs are horrible to debug. OK, so if we have the `Address(reg_offset)` overloads for `int`, `int64_t`, and `uint64_t` we're good, and by calling the right overload we get rid of the need to cast the `layout.stack_args` offset to `int` here. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Tue Jan 11 17:21:14 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 11 Jan 2022 17:21:14 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> Message-ID: On Tue, 11 Jan 2022 17:08:43 GMT, Andrew Haley wrote: >> It does seem pedantic which makes it really difficult, but it's important to get this right. These kinds of bugs are horrible to debug. 
> > OK, so if we have the `Address(reg_offset)` overloads for `int`, `int64_t`, and `uint64_t` we're good, and by calling the right overload we get rid of the need to cast the `layout.stack_args` offset to `int` here. This is what I've got: Address(Register r) : _base(r), _index(noreg), _offset(0), _mode(base_plus_offset), _target(0) { } Address(Register r, int o) : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } Address(Register r, int64_t o) : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } Address(Register r, uint64_t o) : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } Address(Register r, ByteSize disp) : Address(r, in_bytes(disp)) { } ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:21:15 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:21:15 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> Message-ID: On Tue, 11 Jan 2022 17:10:25 GMT, Andrew Haley wrote: >> OK, so if we have the `Address(reg_offset)` overloads for `int`, `int64_t`, and `uint64_t` we're good, and by calling the right overload we get rid of the need to cast the `layout.stack_args` offset to `int` here. 
> > This is what I've got: > > > Address(Register r) > : _base(r), _index(noreg), _offset(0), _mode(base_plus_offset), _target(0) { } > Address(Register r, int o) > : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } > Address(Register r, int64_t o) > : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } > Address(Register r, uint64_t o) > : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } > Address(Register r, ByteSize disp) > : Address(r, in_bytes(disp)) { } I added ptrdiff_t but that's the same as int64_t. I'll fix it. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:40:09 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:40:09 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v6] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Use UCONST64. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/4edba83b..0877accb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:40:10 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:40:10 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v6] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:16:09 GMT, Andrew Haley wrote: >> Integral types are implicitly convertible to each other, I believe in this case a single overload for `int64_t` would be sufficient? > > On Linux. The MacOS debugger makes me pull my hair out. I think I tried int64_t instead of plain 'int' and got errors for simple Address(reg, 0). Now I don't remember ... ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 11 17:40:12 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 17:40:12 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v2] In-Reply-To: References: <1jhilky48u8gRyNmagzpoOJ0WIrf5djH5oDotN0W1AM=.49233b11-e581-4fc3-abac-91ccb627a5e0@github.com> <2iPRFR5o4XWKLnlIq7GgYw9vpc0614EenUxU8VyWxV0=.441eff1c-bef0-4572-b5eb-9427b37fc4cb@github.com> Message-ID: On Tue, 11 Jan 2022 17:17:40 GMT, Kim Barrett wrote: >> I changed it to ULL. Thanks. > > We have `CONST64` and `UCONST64` for these situations (in globalDefinitions.hpp). Forgot about those. thanks. 
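As a rough illustration of why a 64-bit literal macro helps here, the sketch below defines stand-in macros that paste a suffix onto the literal. The names `SKETCH_CONST64` and `SKETCH_UCONST64` are hypothetical, and the real `CONST64`/`UCONST64` definitions in globalDefinitions.hpp may differ per toolchain.

```cpp
#include <cstdint>

// Hypothetical stand-ins for 64-bit literal macros: they append a
// suffix so the literal is at least 64 bits wide on every platform,
// without spelling out `long` or `unsigned long` in the source.
#define SKETCH_CONST64(x)  (x ## LL)
#define SKETCH_UCONST64(x) (x ## ULL)

static_assert(sizeof(SKETCH_CONST64(1)) >= 8, "signed literal is 64-bit");
static_assert(sizeof(SKETCH_UCONST64(1)) >= 8, "unsigned literal is 64-bit");

// Builds a single-bit mask. A plain `1 << bit` would be computed in
// int and is undefined for bit >= 32; the suffixed literal avoids that.
inline uint64_t bit_mask(unsigned bit) {
  return SKETCH_UCONST64(1) << bit;
}
```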
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Tue Jan 11 17:48:31 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Tue, 11 Jan 2022 17:48:31 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v6] In-Reply-To: References: <4Hnm5R1GgXerEoRDrImUH0R_m4XFenIOzad0NrmPPY0=.285c6c40-2910-41aa-8f3b-1cd88b62039e@github.com> Message-ID: On Tue, 11 Jan 2022 17:34:32 GMT, Coleen Phillimore wrote: >> On Linux. The MacOS debugger makes me pull my hair out. > > I think I tried int64_t instead of plain 'int' and got errors for simple Address(reg, 0). Now I don't remember ... Oh right I now realize that integral promotion only promotes things to `int`. So it get ambiguous cause both `int -> int64_t` and `int -> ByteSize` are integral conversion. Ouch. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Tue Jan 11 18:51:33 2022 From: duke at openjdk.java.net (Vamsi Parasa) Date: Tue, 11 Jan 2022 18:51:33 GMT Subject: Integrated: 8278868: Add x86 vectorization support for Long.bitCount() In-Reply-To: References: Message-ID: On Wed, 15 Dec 2021 23:51:19 GMT, Vamsi Parasa wrote: > Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization. This pull request has now been integrated. 
Changeset: c4518e25 Author: Vamsi Parasa Committer: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/c4518e257c1680a6cdb80b7e177d01700ea2c54e Stats: 237 lines in 11 files changed: 231 ins; 0 del; 6 mod 8278868: Add x86 vectorization support for Long.bitCount() Reviewed-by: jbhateja, sviswanathan, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/6857 From coleenp at openjdk.java.net Tue Jan 11 21:25:22 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 21:25:22 GMT Subject: RFR: 8238161: use os::fopen in HS code where possible In-Reply-To: References: Message-ID: <_34aBMfVRM803-Hud8zCk2wmO0OMB8YqTFdwJkly_JI=.2ca0bc93-f443-4505-91ac-80cb9c84cb79@github.com> On Mon, 10 Jan 2022 21:02:54 GMT, Harold Seigel wrote: > Please review this change to hotspot to call os::fopen() instead of fopen() so that the 'close-on-exec' flag gets set for these opened files. This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Additionally, the changes were built on Linux PPC. > > Thanks, Harold Looks good to me. I assume that you are going to add the pragma forbid function for fopen after this in 8214976 ? ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7022 From hseigel at openjdk.java.net Tue Jan 11 21:32:27 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 11 Jan 2022 21:32:27 GMT Subject: RFR: 8238161: use os::fopen in HS code where possible In-Reply-To: <_34aBMfVRM803-Hud8zCk2wmO0OMB8YqTFdwJkly_JI=.2ca0bc93-f443-4505-91ac-80cb9c84cb79@github.com> References: <_34aBMfVRM803-Hud8zCk2wmO0OMB8YqTFdwJkly_JI=.2ca0bc93-f443-4505-91ac-80cb9c84cb79@github.com> Message-ID: On Tue, 11 Jan 2022 21:22:11 GMT, Coleen Phillimore wrote: >> I assume that you are going to add the pragma forbid function for fopen after this in 8214976 ? 
Yes, ------------- PR: https://git.openjdk.java.net/jdk/pull/7022 From coleenp at openjdk.java.net Tue Jan 11 22:38:02 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 11 Jan 2022 22:38:02 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v7] In-Reply-To: References: Message-ID: <47b7PwHi_RPmmcGM-Qo500ApZ2889W8-RTCzpxJOyjw=.6b4eddfb-6f60-48f0-8fa1-e6069d2e7416@github.com> > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add some overloads to Address to keep macosx happy ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/0877accb..1c49feda Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=05-06 Stats: 11 lines in 2 files changed: 7 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From phedlin at openjdk.java.net Wed Jan 12 09:26:26 2022 From: phedlin at openjdk.java.net (Patric Hedlin) Date: Wed, 12 Jan 2022 09:26:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v7] In-Reply-To: <47b7PwHi_RPmmcGM-Qo500ApZ2889W8-RTCzpxJOyjw=.6b4eddfb-6f60-48f0-8fa1-e6069d2e7416@github.com> References: <47b7PwHi_RPmmcGM-Qo500ApZ2889W8-RTCzpxJOyjw=.6b4eddfb-6f60-48f0-8fa1-e6069d2e7416@github.com> Message-ID: On Tue, 11 Jan 2022 22:38:02 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add some overloads to Address to keep macosx happy src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 419: > 417: #endif > 418: Address(Register r, uint64_t o) > 419: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } The above could be written (from an overlapping CS, with additional initialisation) Suggestion: Address(Register r) : Address(r, 0) {} Address(Register r, int32_t o) : Address(r, int64_t(o)) {} Address(Register r, int64_t o) : _mode(base_plus_offset), _base(r), _index(noreg), _offset(o), _extend(lsl(0)), _target(nullptr) {} Address(Register r, uint32_t o) : Address(r, uint64_t(o)) {} Address(Register r, uint64_t o) : _mode(base_plus_offset), _base(r), _index(noreg), _offset(o), _extend(lsl(0)), _target(nullptr) {} ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From hseigel at openjdk.java.net Wed Jan 12 13:14:43 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 13:14:43 GMT Subject: RFR: 8238161: use os::fopen in HS code where possible In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 21:02:54 GMT, Harold Seigel wrote: > Please review this change to hotspot to call os::fopen() instead of fopen() so that the 'close-on-exec' flag gets set for these opened files. This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Additionally, the changes were built on Linux PPC. > > Thanks, Harold Thanks Matthias and Coleen for reviewing this! 
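For readers unfamiliar with the 'close-on-exec' flag mentioned in this review, the sketch below shows one way such a wrapper could look on a POSIX system. It is an assumption-laden illustration, not the actual `os::fopen` implementation: `fopen_cloexec` is a hypothetical name, and the two-step approach leaves a short window between opening the file and setting the flag.

```cpp
#include <stdio.h>
#include <fcntl.h>

// Hypothetical wrapper: opens a file with fopen() and then marks the
// underlying descriptor close-on-exec, so a child process created with
// exec() does not inherit it. Returns nullptr on any failure.
FILE* fopen_cloexec(const char* path, const char* mode) {
  FILE* f = fopen(path, mode);
  if (f == nullptr) return nullptr;

  int fd = fileno(f);
  int flags = fcntl(fd, F_GETFD);
  if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
    fclose(f);  // do not leak the stream if the flag cannot be set
    return nullptr;
  }
  return f;
}
```

Routing every open through a single wrapper like this is what makes it practical to later forbid direct calls to the C library function.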
------------- PR: https://git.openjdk.java.net/jdk/pull/7022 From hseigel at openjdk.java.net Wed Jan 12 13:14:43 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 13:14:43 GMT Subject: Integrated: 8238161: use os::fopen in HS code where possible In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 21:02:54 GMT, Harold Seigel wrote: > Please review this change to hotspot to call os::fopen() instead of fopen() so that the 'close-on-exec' flag gets set for these opened files. This change was tested by running Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Additionally, the changes were built on Linux PPC. > > Thanks, Harold This pull request has now been integrated. Changeset: f54ce844 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/f54ce84474c2ced340c92564814fa5c221415944 Stats: 59 lines in 20 files changed: 1 ins; 0 del; 58 mod 8238161: use os::fopen in HS code where possible Reviewed-by: mbaesken, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/7022 From coleenp at openjdk.java.net Wed Jan 12 13:51:42 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 13:51:42 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> Message-ID: On Tue, 11 Jan 2022 17:12:06 GMT, Coleen Phillimore wrote: >> This is what I've got: >> >> >> Address(Register r) >> : _base(r), _index(noreg), _offset(0), _mode(base_plus_offset), _target(0) { } >> Address(Register r, int o) >> : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >> Address(Register r, int64_t o) >> : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >> Address(Register r, uint64_t o) >> : _base(r), 
_index(noreg), _offset(o), _mode(base_plus_offset), _target(0) { } >> Address(Register r, ByteSize disp) >> : Address(r, in_bytes(disp)) { } > > I added ptrdiff_t but that's the same as int64_t. I'll fix it. I need ptrdiff_t additionally to get this to compile on our macosx-aarch64 platform. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From mdoerr at openjdk.java.net Wed Jan 12 14:10:53 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 12 Jan 2022 14:10:53 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks Message-ID: Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. ------------- Commit messages: - 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks Changes: https://git.openjdk.java.net/jdk18/pull/96/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=96&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279924 Stats: 100 lines in 2 files changed: 96 ins; 2 del; 2 mod Patch: https://git.openjdk.java.net/jdk18/pull/96.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/96/head:pull/96 PR: https://git.openjdk.java.net/jdk18/pull/96 From aph at openjdk.java.net Wed Jan 12 14:35:28 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 12 Jan 2022 14:35:28 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v4] In-Reply-To: References: <5Ovc1He6IoRJmzPBuR7o4lFBr_j6TAEgAm_kpm3RNq8=.87cef1a7-7de5-4b1d-a478-fa7927fe62e4@github.com> <9Fpfe6busg8pXOQRanyPtqe9Dj3gjlSEXA9b0ypeksw=.7555d279-989f-48bc-9281-324a53c7a406@github.com> Message-ID: On Wed, 12 Jan 2022 13:48:37 GMT, Coleen Phillimore wrote: >> I added ptrdiff_t but that's the same as int64_t. I'll fix it. > > I need ptrdiff_t additionally to get this to compile on our macosx-aarch64 platform. That sounds fine. 
I'd add it unconditionally for all platforms. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Wed Jan 12 15:14:10 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 15:14:10 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Much better solution from Kim - templates! ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/1c49feda..a85fd4f3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=06-07 Stats: 23 lines in 2 files changed: 2 ins; 13 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From mdoerr at openjdk.java.net Wed Jan 12 15:18:00 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 12 Jan 2022 15:18:00 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v2] In-Reply-To: References: Message-ID: > Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Update Copyright years. 
------------- Changes: - all: https://git.openjdk.java.net/jdk18/pull/96/files - new: https://git.openjdk.java.net/jdk18/pull/96/files/fb53fda5..36c0cfcc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk18&pr=96&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk18&pr=96&range=00-01 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk18/pull/96.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/96/head:pull/96 PR: https://git.openjdk.java.net/jdk18/pull/96 From aph at openjdk.java.net Wed Jan 12 15:43:31 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 12 Jan 2022 15:43:31 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: Message-ID: <7udOYix-Tp2DKsqy5gh0b9w7BIFOkh-WtOaTmvayCVo=.6f3c0945-3d21-42aa-acbd-a3e7bda2b57e@github.com> On Wed, 12 Jan 2022 15:14:10 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Much better solution from Kim - templates! src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 412: > 410: Address(Register r, T o) > 411: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) {} > 412: Hum, interesting. Looks good: all integer types should get sign- or zero-extended as appropriate, I think. I am going to check tho'. 
:-) ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Wed Jan 12 16:00:34 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 16:00:34 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <7udOYix-Tp2DKsqy5gh0b9w7BIFOkh-WtOaTmvayCVo=.6f3c0945-3d21-42aa-acbd-a3e7bda2b57e@github.com> References: <7udOYix-Tp2DKsqy5gh0b9w7BIFOkh-WtOaTmvayCVo=.6f3c0945-3d21-42aa-acbd-a3e7bda2b57e@github.com> Message-ID: <2K1PK1zRozjzE2mta3IwHBcljV5tlU5NhiTZijralEI=.7fac2fc0-d1dd-4f10-aa7b-973ee62674ef@github.com> On Wed, 12 Jan 2022 15:39:28 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Much better solution from Kim - templates! > > src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 412: > >> 410: Address(Register r, T o) >> 411: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) {} >> 412: > > Hum, interesting. Looks good: all integer types should get sign- or zero-extended as appropriate, I think. I am going to check tho'. :-) Great, thanks for checking! ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From hseigel at openjdk.java.net Wed Jan 12 16:36:49 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 16:36:49 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v3] In-Reply-To: References: Message-ID: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. 
> > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. > > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold Harold Seigel has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: - more stuff - revert strdup() changes and address some of Kim's comments - 8214976: Warn about uses of functions replaced for portability - rebase attempt 2 - rebase ------------- Changes: https://git.openjdk.java.net/jdk/pull/6961/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6961&range=02 Stats: 158 lines in 26 files changed: 73 ins; 1 del; 84 mod Patch: https://git.openjdk.java.net/jdk/pull/6961.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6961/head:pull/6961 PR: https://git.openjdk.java.net/jdk/pull/6961 From hseigel at openjdk.java.net Wed Jan 12 16:36:53 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 16:36:53 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jan 2022 20:41:07 GMT, Harold Seigel wrote: >> Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. >> >> A sample warning is: >> >> .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] >> 63 | FILE* fp = fopen(TestLogFileName, "r"); >> | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ >> >> >> Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp. 
>> >> Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. >> >> This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > revert strdup() changes and address some of Kim's comments This pull request is being withdrawn due to repo corruption. ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From hseigel at openjdk.java.net Wed Jan 12 16:36:54 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 16:36:54 GMT Subject: Withdrawn: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Tue, 4 Jan 2022 19:23:24 GMT, Harold Seigel wrote: > Please review this change for JDK-8214976. This change adds attribute warnings to header file compilerWarnings.hpp so that compilation warnings get issued when certain system functions are called directly, instead of hotspot's os:: versions of the functions. Many additional files were changed because of compilation warnings resulting from the compilerWarnings.hpp changes. > > A sample warning is: > > .../open/test/hotspot/gtest/logging/test_log.cpp:63:19: error: call to 'fopen' declared with attribute warning: use os::fopen [-Werror=attribute-warning] > 63 | FILE* fp = fopen(TestLogFileName, "r"); > | ~~~~~^~~~~~~~~~~~~~~~~~~~~~ > > > Note that changing src/hotspot/os/linux/gc/z/zMountPoint_linux.cpp to call os:: functions requires adding "#include "runtime/os.hpp" and caused test gc/z/TestAllocateHeapAt.java to fail. So, for now, I just added PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION to zMountPoint_linux.cpp. There's a similar issue with gtest/logging/test_logDecorators.cpp.
> > Attribute warnings for additional functions, such as malloc(), were not included in this change because they require lots of source code changes. > > This change was tested by running mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64. Also, builds were done on Linux-zero, Linux-s390, and Linux-ppc. > > Thanks, Harold This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/6961 From duke at openjdk.java.net Wed Jan 12 17:11:37 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Wed, 12 Jan 2022 17:11:37 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v11] In-Reply-To: References: Message-ID: On Tue, 14 Dec 2021 09:40:03 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. 
>> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Change UseROPProtection to UseBranchProtection > > Change-Id: I31c5e1bb5c285262f262459c13057a46221682f1 > CustomizedGitHooks: yes Coming back to this work for 2021.... I've just done a complete run of all the jtreg tests on this patch with both PAC on and PAC off on a PAC enabled machine. With PAC off I saw no regressions. With PAC on I saw ~250 regressions - mostly Shenandoah tests, but with a few ZGC and serviceability tests too.
Is it worth holding this patch up for those fixes? In addition the CSR still needs a review: https://bugs.openjdk.java.net/browse/JDK-8277543 ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From adinn at openjdk.java.net Wed Jan 12 17:21:30 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Wed, 12 Jan 2022 17:21:30 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v11] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 17:08:34 GMT, Alan Hayward wrote: > Is it worth holding this patch up for those fixes? Probably. It depends on exactly what is failing. Can you provide more info? ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From aph at openjdk.java.net Wed Jan 12 18:18:28 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 12 Jan 2022 18:18:28 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <2K1PK1zRozjzE2mta3IwHBcljV5tlU5NhiTZijralEI=.7fac2fc0-d1dd-4f10-aa7b-973ee62674ef@github.com> References: <7udOYix-Tp2DKsqy5gh0b9w7BIFOkh-WtOaTmvayCVo=.6f3c0945-3d21-42aa-acbd-a3e7bda2b57e@github.com> <2K1PK1zRozjzE2mta3IwHBcljV5tlU5NhiTZijralEI=.7fac2fc0-d1dd-4f10-aa7b-973ee62674ef@github.com> Message-ID: On Wed, 12 Jan 2022 15:57:06 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 412: >> >>> 410: Address(Register r, T o) >>> 411: : _base(r), _index(noreg), _offset(o), _mode(base_plus_offset), _target(0) {} >>> 412: >> >> Hum, interesting. Looks good: all integer types should get sign- or zero-extended as appropriate, I think. I am going to check tho'. :-) > > Great, thanks for checking! Seems to be fine: does the sign extension right, doesn't do any bogus sign extension. Ship it! 
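For readers following the thread, here is a minimal standalone sketch of what "sign- or zero-extended as appropriate" means when a constructor is templated over any integral offset type. This is not the HotSpot code: `OffsetHolder` and `widen` are hypothetical names, and plain `std::enable_if` is an assumption made so the sketch compiles on its own.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a constructor templated on the offset type.
// It accepts any integral T and stores the value as a signed 64-bit offset.
struct OffsetHolder {
  int64_t offset;

  template <typename T,
            typename = typename std::enable_if<std::is_integral<T>::value>::type>
  constexpr explicit OffsetHolder(T o) : offset(static_cast<int64_t>(o)) {}
};

// Hypothetical helper: widen any integral value to 64 bits via OffsetHolder.
template <typename T>
constexpr int64_t widen(T o) { return OffsetHolder(o).offset; }

// Signed sources are sign-extended: the value -1 stays -1.
static_assert(widen(int32_t(-1)) == -1, "signed values are sign-extended");

// Unsigned sources are zero-extended: all-ones in 32 bits stays 2^32 - 1
// and does not turn into -1.
static_assert(widen(uint32_t(0xFFFFFFFFu)) == 4294967295LL,
              "unsigned values are zero-extended");
```

Because the conversion happens from the caller's own type, there is no intermediate narrowing step at which an unsigned value could pick up a bogus sign extension.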
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Wed Jan 12 18:30:35 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 18:30:35 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: <7udOYix-Tp2DKsqy5gh0b9w7BIFOkh-WtOaTmvayCVo=.6f3c0945-3d21-42aa-acbd-a3e7bda2b57e@github.com> <2K1PK1zRozjzE2mta3IwHBcljV5tlU5NhiTZijralEI=.7fac2fc0-d1dd-4f10-aa7b-973ee62674ef@github.com> Message-ID: On Wed, 12 Jan 2022 18:14:38 GMT, Andrew Haley wrote: >> Great, thanks for checking! > > Seems to be fine: does the sign extension right, doesn't do any bogus sign extension. Ship it! Thanks Andrew. Now I need another reviewer and you to approve. Thanks for all your help. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From kbarrett at openjdk.java.net Wed Jan 12 19:02:34 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 12 Jan 2022 19:02:34 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: Message-ID: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> On Wed, 12 Jan 2022 15:14:10 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Much better solution from Kim - templates! Changes requested by kbarrett (Reviewer). src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 498: > 496: > 497: template::value)> > 498: inline void mov(Register dst, T o) { mov_immediate64(dst, (uint64_t)o); } I'm not sure why this isn't just `inline void mov(Register dst, uint64_t o) { mov_immediate64(dst, o); }` Unlike the `Address` case, I don't see any ambiguous implicit conversions here. 
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1174: > 1172: assert(r->is_valid(), "bad oop arg"); > 1173: if (r->is_stack()) { > 1174: __ ldr(temp_reg, Address(sp, (uint64_t)r->reg2stack() * VMRegImpl::stack_slot_size)); Why is the cast being added here? Is it because the multiply can overflow an int, and this is really fix for a bug that is distinct from the cleanup in this PR? There are a couple more like this later in this file. src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 972: > 970: * Method entry for static native methods: > 971: * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) > 972: * int java.util.zip.CRC32.updateByteBuffer(int crc, jlong buf, int off, int len) Why not `int64_t` instead of `jlong`? Similarly in a couple later places. src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 997: > 995: // Calculate address of start element > 996: if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { > 997: __ ldr(buf, Address(esp, 2*wordSize)); // jlong buf Maybe just drop the type in the comment? Similarly in a couple later places. src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > 72: // Capture prev stack pointer (stack arguments base) > 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR > 74: Address slot = __ legitimize_address(Address(sp, layout.stack_args), wordSize, rscratch2); Introduction of legitimize_address here seems unrelated to this PR. What's this about? ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Wed Jan 12 21:20:33 2022 From: duke at openjdk.java.net (Tyler Steele) Date: Wed, 12 Jan 2022 21:20:33 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 15:18:00 GMT, Martin Doerr wrote: >> Implement frame::is_interpreted_frame_valid like on other platforms. 
Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Update Copyright years. I just ran a build & the tier1 tests on Linux/s390x. With the exception of UnsafeCopyMemory, which I have seen failing elsewhere, all tests are passing. I am still getting to know the project and the specifics of stack frames for the ppc64 and s390x architectures, so I'll defer reviewing to someone with more experience. That said, these checks look reasonable from my perspective. ------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From coleenp at openjdk.java.net Wed Jan 12 21:42:34 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 21:42:34 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> Message-ID: <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> On Wed, 12 Jan 2022 18:47:29 GMT, Kim Barrett wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Much better solution from Kim - templates! > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 498: > >> 496: >> 497: template::value)> >> 498: inline void mov(Register dst, T o) { mov_immediate64(dst, (uint64_t)o); } > > I'm not sure why this isn't just > `inline void mov(Register dst, uint64_t o) { mov_immediate64(dst, o); }` > Unlike the `Address` case, I don't see any ambiguous implicit conversions here. Because the macosx compiler complains of ambiguity for the mov also if this is the only mov. 
> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1174: > >> 1172: assert(r->is_valid(), "bad oop arg"); >> 1173: if (r->is_stack()) { >> 1174: __ ldr(temp_reg, Address(sp, (uint64_t)r->reg2stack() * VMRegImpl::stack_slot_size)); > > Why is the cast being added here? Is it because the multiply can overflow an int, and this is really fix for a bug that is distinct from the cleanup in this PR? There are a couple more like this later in this file. This was here to match one of the Address constructors that I had. With the template, it's no longer needed. > src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 972: > >> 970: * Method entry for static native methods: >> 971: * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) >> 972: * int java.util.zip.CRC32.updateByteBuffer(int crc, jlong buf, int off, int len) > > Why not `int64_t` instead of `jlong`? Similarly in a couple later places. This comment refers to Java code which more appropriately uses jlong. Actually, the Java code uses long. I'll revert this change. > src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 997: > >> 995: // Calculate address of start element >> 996: if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { >> 997: __ ldr(buf, Address(esp, 2*wordSize)); // jlong buf > > Maybe just drop the type in the comment? Similarly in a couple later places. Ok. That'll give me less 'long' matches. > src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: > >> 72: // Capture prev stack pointer (stack arguments base) >> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >> 74: Address slot = __ legitimize_address(Address(sp, layout.stack_args), wordSize, rscratch2); > > Introduction of legitimize_address here seems unrelated to this PR. What's this about? This is a bug that Andrew pointed out. I could file a different CR for this and take it out. It seemed a minor thing to include with this change though. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Wed Jan 12 21:42:34 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 21:42:34 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Wed, 12 Jan 2022 21:35:29 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 997: >> >>> 995: // Calculate address of start element >>> 996: if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { >>> 997: __ ldr(buf, Address(esp, 2*wordSize)); // jlong buf >> >> Maybe just drop the type in the comment? Similarly in a couple later places. > > Ok. That'll give me less 'long' matches. No, it's useful because it describes the parameter that you're loading. I'm leaving this jlong. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From hseigel at openjdk.java.net Wed Jan 12 22:02:49 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 12 Jan 2022 22:02:49 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's Message-ID: Please review this small change to call os:: API's. The changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change.
Thanks, Harold ------------- Commit messages: - 8279936: Change shared code to use os:: system API's Changes: https://git.openjdk.java.net/jdk/pull/7055/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7055&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279936 Stats: 25 lines in 9 files changed: 1 ins; 0 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/7055.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7055/head:pull/7055 PR: https://git.openjdk.java.net/jdk/pull/7055 From coleenp at openjdk.java.net Wed Jan 12 23:17:58 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 12 Jan 2022 23:17:58 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v9] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Some code review changes. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/a85fd4f3..e65df6a2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=07-08 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From jwilhelm at openjdk.java.net Wed Jan 12 23:39:04 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Wed, 12 Jan 2022 23:39:04 GMT Subject: RFR: Merge jdk18 Message-ID: Forwardport JDK 18 -> JDK 19 ------------- Commit messages: - Merge remote-tracking branch 'jdk18/master' into Merge_jdk18 - 8206181: ExceptionInInitializerError: improve handling of exceptions in user-provided taglets - 8279695: [TESTBUG] modify compiler/loopopts/TestSkeletonPredicateNegation.java to run on C1 also - 8279356: Method linking fails with guarantee(mh->adapter() != NULL) failed: Adapter blob must already exist! 
- 8278489: Preserve result in native wrapper with +UseHeavyMonitors - 8278267: ARM32: several vector test failures for ASHR The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=7057&range=00.0 - jdk18: https://webrevs.openjdk.java.net/?repo=jdk&pr=7057&range=00.1 Changes: https://git.openjdk.java.net/jdk/pull/7057/files Stats: 907 lines in 16 files changed: 811 ins; 21 del; 75 mod Patch: https://git.openjdk.java.net/jdk/pull/7057.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7057/head:pull/7057 PR: https://git.openjdk.java.net/jdk/pull/7057 From jwilhelm at openjdk.java.net Thu Jan 13 01:10:29 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 13 Jan 2022 01:10:29 GMT Subject: Integrated: Merge jdk18 In-Reply-To: References: Message-ID: <38zJOHU9dDOsz0As8TzSdMeOgHo2owiO-kk59_McW-s=.af44c757-3e59-4b39-bb76-81f82f63eb97@github.com> On Wed, 12 Jan 2022 23:32:00 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 18 -> JDK 19 This pull request has now been integrated. Changeset: 67e3d51d Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/67e3d51d68e7319bd6d5b01233b664e6ee6b17ec Stats: 907 lines in 16 files changed: 811 ins; 21 del; 75 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/7057 From rrich at openjdk.java.net Thu Jan 13 09:04:35 2022 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 13 Jan 2022 09:04:35 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 15:18:00 GMT, Martin Doerr wrote: >> Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Update Copyright years. Looks good. Cheers, Richard. ------------- Marked as reviewed by rrich (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/96 From mdoerr at openjdk.java.net Thu Jan 13 09:09:27 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 13 Jan 2022 09:09:27 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 15:18:00 GMT, Martin Doerr wrote: >> Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Update Copyright years. Thanks a lot for reviewing. I just noticed that `interpreter_frame_initial_sp_offset` on other platforms is negative. So, my replacement for it uses the wrong sign. I need to fix that and retest. ------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From mdoerr at openjdk.java.net Thu Jan 13 09:18:07 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 13 Jan 2022 09:18:07 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v3] In-Reply-To: References: Message-ID: > Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Fix check if frame is large enough for header plus intepreter state.
------------- Changes: - all: https://git.openjdk.java.net/jdk18/pull/96/files - new: https://git.openjdk.java.net/jdk18/pull/96/files/36c0cfcc..7805e7c3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk18&pr=96&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk18&pr=96&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk18/pull/96.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/96/head:pull/96 PR: https://git.openjdk.java.net/jdk18/pull/96 From rrich at openjdk.java.net Thu Jan 13 09:23:38 2022 From: rrich at openjdk.java.net (Richard Reingruber) Date: Thu, 13 Jan 2022 09:23:38 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v2] In-Reply-To: References: Message-ID: <8z1JTNJv5vN8PIc2kZ8wQVB_LcIBIAYLmH9PLpXOHbk=.f9a01ae8-6de6-4b4c-8a32-fcf4f126d7f4@github.com> On Thu, 13 Jan 2022 09:06:03 GMT, Martin Doerr wrote: > Thanks a lot for reviewing. I just noticed that `interpreter_frame_initial_sp_offset` on other platforms is negative. So, my replacement for it uses the wrong sign. I need to fix that and retest. Indeed. Good catch! ------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From aph at openjdk.java.net Thu Jan 13 10:24:33 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 13 Jan 2022 10:24:33 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v9] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 23:17:58 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Some code review changes. Marked as reviewed by aph (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Thu Jan 13 10:24:33 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 13 Jan 2022 10:24:33 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Wed, 12 Jan 2022 21:39:05 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/universalUpcallHandler_aarch64.cpp line 74: >> >>> 72: // Capture prev stack pointer (stack arguments base) >>> 73: __ add(rscratch1, rfp, 16); // Skip saved FP and LR >>> 74: Address slot = __ legitimize_address(Address(sp, layout.stack_args), wordSize, rscratch2); >> >> Introduction of legitimize_address here seems unrelated to this PR. What's this about? > > This is a bug that Andrew pointed out. I could file a different CR for this and take it out. It seemed a minor thing to include with this change though. > Introduction of legitimize_address here seems unrelated to this PR. What's this about? SP-relative loads and stores have only a 12-bit offset. This is a bug that was discovered during review. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From duke at openjdk.java.net Thu Jan 13 11:56:32 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Thu, 13 Jan 2022 11:56:32 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v11] In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 17:18:16 GMT, Andrew Dinn wrote: >Probably. It depends on exactly what is failing. Can you provide more info? Looks like they are all segfaults, which is exactly what I'd expect if a frame was missing a sign or auth. 
I'm fairly confident it's the same handful of issues over and over. I'll get onto debugging them a little more. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From adinn at openjdk.java.net Thu Jan 13 12:51:34 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Thu, 13 Jan 2022 12:51:34 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v11] In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 11:53:47 GMT, Alan Hayward wrote: > Looks like they are all segfaults, which is exactly what I'd expect if a frame was missing a sign or auth. I'm fairly confident it's the same handful of issues over and over. I'll get onto debugging them a little more. I was hoping you were going to say that ... :-) Let's see how it looks in the debugger. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From coleenp at openjdk.java.net Thu Jan 13 13:44:35 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 13 Jan 2022 13:44:35 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v9] In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 10:18:49 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Some code review changes. > > Marked as reviewed by aph (Reviewer). Thanks @theRealAph for your help and the review. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From jwilhelm at openjdk.java.net Thu Jan 13 21:06:07 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 13 Jan 2022 21:06:07 GMT Subject: RFR: Merge jdk18 Message-ID: Forwardport JDK 18 -> JDK 19 ------------- Commit messages: - Merge remote-tracking branch 'jdk18/master' into Merge_jdk18 - 8279833: Loop optimization issue in String.encodeUTF8_UTF16 - 8279370: jdk.jpackage/share/native/applauncher/JvmLauncher.cpp fails to build with GCC 6.3.0 - 8274007: [REDO] VM Exit does not abort concurrent mark - 8279837: C2: assert(is_Loop()) failed: invalid node class: Region The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=7068&range=00.0 - jdk18: https://webrevs.openjdk.java.net/?repo=jdk&pr=7068&range=00.1 Changes: https://git.openjdk.java.net/jdk/pull/7068/files Stats: 124 lines in 6 files changed: 117 ins; 1 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/7068.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7068/head:pull/7068 PR: https://git.openjdk.java.net/jdk/pull/7068 From mbaesken at openjdk.java.net Fri Jan 14 07:37:32 2022 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Fri, 14 Jan 2022 07:37:32 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v3] In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 09:18:07 GMT, Martin Doerr wrote: >> Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix check if frame is large enough for header plus intepreter state. Marked as reviewed by mbaesken (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From jbhateja at openjdk.java.net Fri Jan 14 12:11:24 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Fri, 14 Jan 2022 12:11:24 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 09:49:20 GMT, Pengfei Li wrote: >> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Update copyright year and rename a function >> >> Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb >> - Merge branch 'master' into postloop >> >> Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 >> - Fix issues in newly added test framework >> >> Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 >> - Merge branch 'master' into postloop >> >> Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 >> - 8183390: Fix and re-enable post loop vectorization >> >> ** Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ** Changes in this patch >> >> This patch reworks post loop vectorization. 
The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after JDK-8211251 which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> - 1) C2 crashes with segmentation fault in strip-mined loops >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> - 2) Incorrect result issues with post loop vectorization >> >> We have also fixed five incorrect vectorization issues. 
Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> [Issue-1] Incorrect vectorization for partial vectorizable loops >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> [Issue-2] Incorrect result in loops with growing-down vectors >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. 
With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> [Issue-3] Incorrect result in manually unrolled loops >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> [Issue-4] Incorrect result in loops with mixed vector element sizes >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. 
>> >> [Issue-5] Incorrect result in loops with potential data dependence >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ** Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. 
>> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantage of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> needs to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ** Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Can any C2 compiler expert help review this? I updated copyright year to 2022 and renamed a function in latest commit. Hi @pfustc, apologies for the late response on this. Following is the performance data of the JMH micro (included with the report) operating over vectors of various primitive types, with and without the optimization. http://cr.openjdk.java.net/~jbhateja/post_loop_multiversioning/perf_post_loop_multiversioning_CLX.xlsx Observations: - Data shows a reduction in cycles, dynamic instruction count, and branches with the optimization. - The addition of the tail loop iteration has an impact on JIT code size; this may affect other optimizations like procedure inlining.
- Scores are better for sub-word types (byte and short) since they have relatively long tail. - There is high run to run variation in throughput. Best Regards, Jatin ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From ayang at openjdk.java.net Fri Jan 14 14:01:39 2022 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Fri, 14 Jan 2022 14:01:39 GMT Subject: RFR: 8280018: Remove obsolete VM_GenCollectFullConcurrent Message-ID: Trivial change of removing dead code. Test: build ------------- Commit messages: - trivial Changes: https://git.openjdk.java.net/jdk/pull/7084/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7084&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280018 Stats: 4 lines in 3 files changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7084.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7084/head:pull/7084 PR: https://git.openjdk.java.net/jdk/pull/7084 From mdoerr at openjdk.java.net Fri Jan 14 14:13:32 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 14 Jan 2022 14:13:32 GMT Subject: [jdk18] RFR: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks [v3] In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 09:18:07 GMT, Martin Doerr wrote: >> Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Fix check if frame is large enough for header plus intepreter state. Thanks for reviewing my update! 
------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From mdoerr at openjdk.java.net Fri Jan 14 14:17:37 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Fri, 14 Jan 2022 14:17:37 GMT Subject: [jdk18] Integrated: 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 14:02:09 GMT, Martin Doerr wrote: > Implement frame::is_interpreted_frame_valid like on other platforms. Only interpreter_frame_initial_sp_offset needed to be replaced because it doesn't exist on these platforms. This pull request has now been integrated. Changeset: c809d34f Author: Martin Doerr URL: https://git.openjdk.java.net/jdk18/commit/c809d34f9ec0d8e9f77adc73ee772ce90efbe58d Stats: 104 lines in 2 files changed: 96 ins; 2 del; 6 mod 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks Reviewed-by: rrich, mbaesken ------------- PR: https://git.openjdk.java.net/jdk18/pull/96 From coleenp at openjdk.java.net Fri Jan 14 14:20:12 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 14 Jan 2022 14:20:12 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v10] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/e65df6a2..232e1d63 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=08-09 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Fri Jan 14 14:20:14 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 14 Jan 2022 14:20:14 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Wed, 12 Jan 2022 21:36:56 GMT, Coleen Phillimore wrote: >> Ok. That'll give me less 'long' matches. > > No, it's useful because it describes the parameter that you're loading. I'm leaving this jlong. Or I'll revert this whole file because these are comment changes that refer to the Java code. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From ddong at openjdk.java.net Fri Jan 14 16:15:06 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 14 Jan 2022 16:15:06 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: > Hi, > > I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. 
> > The following steps can quick reproduce the problem: > > 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) > > index 39e99bdd5ed..4fc768e94aa 100644 > --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { > __ store_klass_gap(r0, zr); // zero klass gap for compressed oops > __ store_klass(r0, r4); // store klass last > > +/** > { > SkipIfEqual skip(_masm, &DTraceAllocProbes, false); > // Trigger dtrace event for fastpath > @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { > __ pop(atos); // restore the return value > > } > +*/ > __ b(done); > } > > diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp > index 19530b7c57c..15b0509da4c 100644 > --- a/src/hotspot/cpu/x86/templateTable_x86.cpp > +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp > @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { > Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); > __ store_klass(rax, rcx, tmp_store_klass); // klass > > +/** > { > SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); > // Trigger dtrace event for fastpath > @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { > CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); > __ pop(atos); > } > +*/ > > __ jmp(done); > } > diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp > index a5de65ea5ab..60b4bd3bcc8 100644 > --- a/src/hotspot/share/runtime/sharedRuntime.cpp > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp > @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { > * 6254741. Once that is fixed we can remove the dummy return value. 
> */ > int SharedRuntime::dtrace_object_alloc(oopDesc* o) { > + *(int*)0 = 1; > return dtrace_object_alloc(Thread::current(), o, o->size()); > } > > > 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` > > On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. > > In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. > > After some investigation, I found that this problem is related to the layout of the stack. > > On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). > > > push %rbp > mov %rsp,%rbp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| | expand > | | | > | ret addr | | direction > callee |_ _ _ _ _ _| | > | | V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). > > When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. > > > stp x29, x30, [sp, #-N]! > mov x29, sp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| - | expand > | | > . . . . . 
| | direction > _ _ _ _ _ _ | | > | | | N | > | ret addr | | | > callee |_ _ _ _ _ _| | | > | | - V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. > > Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. > > Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. > Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. > > This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. > > Any input is appreciated. > > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: fix pfl() crash problem and rename from_thread to from_anchor ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6597/files - new: https://git.openjdk.java.net/jdk/pull/6597/files/06682c7b..0996bbe7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=00-01 Stats: 9 lines in 4 files changed: 1 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/6597.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6597/head:pull/6597 PR: https://git.openjdk.java.net/jdk/pull/6597 From smonteith at openjdk.java.net Fri Jan 14 17:12:29 2022 From: smonteith at openjdk.java.net (Stuart Monteith) Date: Fri, 14 Jan 2022 17:12:29 GMT Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and should be removed [v3] In-Reply-To: References: Message-ID: <5rlxneliEyNgfaNXC5YrFXSHiyW1cWDXNvx-4s6Dudg=.63f72983-162c-479e-bb3e-1d2e19c4535a@github.com> On Fri, 17 Dec 2021 12:03:37 GMT, Bhavana-Kilambi wrote: >> The 
product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. >> But as it's not used anywhere, removing this option from the JDK source. > > Bhavana-Kilambi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8239927: Product variable PrefetchFieldsAhead is unused and should be removed I understand this might be a trivial fix, but reviews would be appreciated. ------------- PR: https://git.openjdk.java.net/jdk/pull/6783 From kbarrett at openjdk.java.net Fri Jan 14 21:29:52 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 14 Jan 2022 21:29:52 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v10] In-Reply-To: References: Message-ID: On Fri, 14 Jan 2022 14:20:12 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7023 From kbarrett at openjdk.java.net Fri Jan 14 21:29:59 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 14 Jan 2022 21:29:59 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Fri, 14 Jan 2022 14:14:03 GMT, Coleen Phillimore wrote: >> No, it's useful because it describes the parameter that you're loading. 
I'm leaving this jlong. > > Or I'll revert this whole file because these are comment changes that refer to the Java code. Now that you've explained it, I'd be okay with `jlong` here. But I'm also okay with leaving it as `long`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From kbarrett at openjdk.java.net Fri Jan 14 21:30:02 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 14 Jan 2022 21:30:02 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Thu, 13 Jan 2022 10:21:41 GMT, Andrew Haley wrote: >> This is a bug that Andrew pointed out. I could file a different CR for this and take it out. It seemed a minor thing to include with this change though. > >> Introduction of legitimize_address here seems unrelated to this PR. What's this about? > > SP-relative loads and stores have only a 12-bit offset. This is a bug that was discovered during review. Thanks for the explanation. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From kbarrett at openjdk.java.net Fri Jan 14 21:29:56 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 14 Jan 2022 21:29:56 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: On Wed, 12 Jan 2022 21:34:35 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 972: >> >>> 970: * Method entry for static native methods: >>> 971: * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) >>> 972: * int java.util.zip.CRC32.updateByteBuffer(int crc, jlong buf, int off, int len) >> >> Why not `int64_t` instead of `jlong`? Similarly in a couple later places. > > This comment refers to Java code which more appropriately uses jlong. Actually, the Java code uses long. I'll revert this change. Yeah, this is in a comment referring to a Java method signature, so I agree `long` is correct here. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From sviswanathan at openjdk.java.net Fri Jan 14 23:31:28 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 14 Jan 2022 23:31:28 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > use movddup for 128-bit vectors @merykitty Very good work. I have only one comment above. src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3254: > 3252: vpcmpCC(dst, nds, src, eq_cond_enc, width, vector_len); > 3253: vallones(xtmp, vector_len); > 3254: vpxor(dst, xtmp, dst, vector_len); This would add extra overhead of doing vallones every time versus what we had before. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From jbhateja at openjdk.java.net Sat Jan 15 02:28:07 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sat, 15 Jan 2022 02:28:07 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API Message-ID: Summary of changes: - Intrinsify Math.round(float) and Math.round(double) APIs. - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. - Test creation using new IR testing framework. 
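For readers who want a picture of the loop shape that the summary above refers to, here is a minimal sketch, assuming a plain array-to-array rounding loop. The names `RoundLoopSketch` and `roundAll` are hypothetical and are not taken from the patch or its JMH benchmark; the sample values only illustrate the documented scalar behaviour of `Math.round(double)` (ties round toward positive infinity, NaN maps to 0, out-of-range values saturate).

```java
public class RoundLoopSketch {
    // Hypothetical kernel: every iteration is independent and the body is a
    // single scalar Math.round call, which is the sort of loop an
    // auto-vectorizer could rewrite into vector operations.
    static void roundAll(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.round(in[i]);
        }
    }

    public static void main(String[] args) {
        // Edge cases from the Math.round Javadoc: ties, NaN, saturation.
        double[] in = {2.5, -2.5, Double.NaN, 1e300};
        long[] out = new long[in.length];
        roundAll(in, out);
        for (int i = 0; i < in.length; i++) {
            System.out.println(in[i] + " -> " + out[i]);
        }
    }
}
```

Whatever code the compiler generates for such a loop has to reproduce these scalar results exactly, including the tie, NaN, and saturation cases.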
Following are the performance numbers of a JMH micro included with the patch.

Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)

Benchmark | ARRAYLEN | Baseline AVX2 (ops/ms) | WithOpt AVX2 (ops/ms) | Gain (opt/baseline) | Baseline AVX3 (ops/ms) | WithOpt AVX3 (ops/ms) | Gain (opt/baseline)
-- | -- | -- | -- | -- | -- | -- | --
FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887
FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697
FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829
FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907

Kindly review and share your feedback. Best Regards, Jatin ------------- Commit messages: - 8279508: Auto-vectorize Math.round API Changes: https://git.openjdk.java.net/jdk/pull/7094/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279508 Stats: 409 lines in 22 files changed: 342 ins; 1 del; 66 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From ddong at openjdk.java.net Sat Jan 15 15:03:29 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Sat, 15 Jan 2022 15:03:29 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash In-Reply-To: References: <-WnZRNSYnVZg8lNSl4kXSEg4np9iGPPkk3qG0ij9_DA=.3510020e-d5cd-49bc-97c8-9156c6f9ee36@github.com> Message-ID: On Fri, 7 Jan 2022 14:28:05 GMT, Andrew Haley wrote: > > > I've had a good look at this - in fact spent all morning on it - and this is the wrong fix. > > > For example, it breaks the `pfl()` function in the test case.
`pfl()` isn't called from anywhere in the JDK, but it is one of our essential debugging tools. If you're interested in pursuing this further I could explain what else to try, but I don't have any time to spend on this myself. Sorry. > > > > > > Thanks for the comment. It would be nice if you could give me some other way that helps fix the problem. > > OK. The following changes cause `dtrace_object_alloc()` to call `pfl()`. This should print the entire stack. (You can also clone https://github.com/theRealAph/jdk , branch `pull/6597` for the same code. With your patch included and `PreserveFramePointer` enabled, `pfl()` crashes. So it seems like your patch fixes one thing, but breaks other uses of stack walking. > > ``` > diff --git a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp > index 661fad89e47..3fa80da73f7 100644 > --- a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp > @@ -237,7 +237,9 @@ void C1_MacroAssembler::initialize_object(Register obj, Register klass, Register > > if (CURRENT_ENV->dtrace_alloc_probes()) { > assert(obj == r0, "must be"); > + set_last_Java_frame(sp, rfp, (address)pc(), rscratch1); > far_call(RuntimeAddress(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id))); > + reset_last_Java_frame(true); > } > > verify_oop(obj); > @@ -270,7 +272,9 @@ void C1_MacroAssembler::allocate_array(Register obj, Register len, Register t1, > > if (CURRENT_ENV->dtrace_alloc_probes()) { > assert(obj == r0, "must be"); > + set_last_Java_frame(sp, rfp, (address)pc(), rscratch1); > far_call(RuntimeAddress(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id))); > + reset_last_Java_frame(true); > } > > verify_oop(obj); > diff --git a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp > index 005f739f0aa..b1da03398cf 100644 > --- a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp > +++ 
b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp > @@ -1091,7 +1091,9 @@ OopMapSet* Runtime1::generate_code_for(StubID id, StubAssembler* sasm) { > StubFrame f(sasm, "dtrace_object_alloc", dont_gc_arguments); > save_live_registers(sasm); > > + __ set_last_Java_frame(sp, rfp, (address)(__ pc()), rscratch1); > __ call_VM_leaf(CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), c_rarg0); > + __ reset_last_Java_frame(true); > > restore_live_registers(sasm); > } > diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp > index a5de65ea5ab..5e09a1de120 100644 > --- a/src/hotspot/share/runtime/sharedRuntime.cpp > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp > @@ -996,12 +996,16 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { > return 0; > } > > +extern "C" void pfl(); > + > /** > * This function ought to be a void function, but cannot be because > * it gets turned into a tail-call on sparc, which runs into dtrace bug > * 6254741. Once that is fixed we can remove the dummy return value. > */ > int SharedRuntime::dtrace_object_alloc(oopDesc* o) { > + pfl(); > + *(int*)0 = 1; > return dtrace_object_alloc(Thread::current(), o, o->size()); > } > > ``` Thanks. `frame::sender_for_entry_frame` also uses anchor to build the sender's frame, I fixed. Also, I change the name '_from_thread' to '_from_anchor', I think the latter is more suitable. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From duke at openjdk.java.net Sun Jan 16 02:26:23 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 16 Jan 2022 02:26:23 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API In-Reply-To: References: Message-ID: <3jys7tQ93ojPrWjoGb8Z04Ed-pUiimIj1V2pSCvdLoo=.590b55fd-39af-459c-8add-37bb6cd34f6f@github.com> On Sat, 15 Jan 2022 02:21:38 GMT, Jatin Bhateja wrote: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. 
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. > - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > ? | ? | BASELINE AVX2 | WithOpt AVX2 | Gain (opt/baseline) | Baseline AVX3 | Withopt AVX3 | Gain (opt/baseline) > -- | -- | -- | -- | -- | -- | -- | -- > Benchmark | ARRAYLEN | Score (ops/ms) | Score (ops/ms) | ? | Score (ops/ms) | Score (ops/ms) | ? > FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887 > FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697 > FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829 > FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907 > > Kindly review and share your feedback. > > Best Regards, > Jatin Hi, did we have tests for the scalar intrinsification already? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From jbhateja at openjdk.java.net Sun Jan 16 04:01:26 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Sun, 16 Jan 2022 04:01:26 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API In-Reply-To: <3jys7tQ93ojPrWjoGb8Z04Ed-pUiimIj1V2pSCvdLoo=.590b55fd-39af-459c-8add-37bb6cd34f6f@github.com> References: <3jys7tQ93ojPrWjoGb8Z04Ed-pUiimIj1V2pSCvdLoo=.590b55fd-39af-459c-8add-37bb6cd34f6f@github.com> Message-ID: <025PTTFCmBytUGceCiL86BhLqWIczWJvEQTF5pusmF8=.9ec9f398-83b2-41a2-8988-47aa466a9871@github.com> On Sun, 16 Jan 2022 02:23:15 GMT, Quan Anh Mai wrote: > Hi, did we have tests for the scalar intrinsification already? Thanks. Verification is done against scalar rounding operation. 
https://github.com/openjdk/jdk/pull/7094/files#diff-88b1bad16d68808e6c1224fff7773104924bfdabcb23958c2a3e4e6b06844701R369 Thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From duke at openjdk.java.net Sun Jan 16 08:07:38 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 16 Jan 2022 08:07:38 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Fri, 14 Jan 2022 23:23:20 GMT, Sandhya Viswanathan wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> use movddup for 128-bit vectors > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3254: > >> 3252: vpcmpCC(dst, nds, src, eq_cond_enc, width, vector_len); >> 3253: vallones(xtmp, vector_len); >> 3254: vpxor(dst, xtmp, dst, vector_len); > > This would add extra overhead of doing vallones every time versus what we had before. Thanks a lot for the review. uiCA shows that both result in 3 uops being executed if the all ones is reachable. If the external address is not reachable however, an extra `mov` instruction would need to be emitted in the current approach, leading to 1 extra uop. Trying `_mm256_xor_si256(src, _mm256_set1_epi32(-1))` it seems that both gcc and clang use `vcmpeqd`. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Sun Jan 16 08:13:25 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Sun, 16 Jan 2022 08:13:25 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API In-Reply-To: References: Message-ID: On Sat, 15 Jan 2022 02:21:38 GMT, Jatin Bhateja wrote: > Summary of changes: > - Intrinsify Math.round(float) and Math.round(double) APIs. > - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. 
> - Test creation using new IR testing framework. > > Following are the performance number of a JMH micro included with the patch > > Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) > > ? | ? | BASELINE AVX2 | WithOpt AVX2 | Gain (opt/baseline) | Baseline AVX3 | Withopt AVX3 | Gain (opt/baseline) > -- | -- | -- | -- | -- | -- | -- | -- > Benchmark | ARRAYLEN | Score (ops/ms) | Score (ops/ms) | ? | Score (ops/ms) | Score (ops/ms) | ? > FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887 > FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697 > FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829 > FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907 > > Kindly review and share your feedback. > > Best Regards, > Jatin Ah understood, you can also disable loop unrolling so that vectorisation does not take place. Regards. ------------- PR: https://git.openjdk.java.net/jdk/pull/7094 From njian at openjdk.java.net Mon Jan 17 02:33:29 2022 From: njian at openjdk.java.net (Ningsheng Jian) Date: Mon, 17 Jan 2022 02:33:29 GMT Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and should be removed [v3] In-Reply-To: References: Message-ID: On Fri, 17 Dec 2021 12:03:37 GMT, Bhavana-Kilambi wrote: >> The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. >> But as it's not used anywhere, removing this option from the JDK source. > > Bhavana-Kilambi has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision: > > 8239927: Product variable PrefetchFieldsAhead is unused and should be removed Marked as reviewed by njian (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6783 From dholmes at openjdk.java.net Mon Jan 17 05:17:26 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 17 Jan 2022 05:17:26 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v10] In-Reply-To: References: Message-ID: <6jcmY2VLTea4bEyt5zQRGF4o-FkX_GG0HgrTX9NVByw=.41dac3ec-ba34-4bba-acb5-b00ce4671888@github.com> On Fri, 14 Jan 2022 14:20:12 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp Just a drive-by comment below. Cheers, David src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 2: > 1: /* > 2: * Copyright (c) 2003, 2022, Oracle and/or its affiliates. All rights reserved. No other changes to this file, so this needs reverting. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From dholmes at openjdk.java.net Mon Jan 17 07:03:25 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 17 Jan 2022 07:03:25 GMT Subject: RFR: 8279534: Consolidate and remove oopDesc::klass_gap methods In-Reply-To: References: Message-ID: <-7JXXZWqJlZSVB2QDbeu675eHWRMLmgeSkxfIFa2Dy4=.6edf1d37-bcdd-4dfe-9626-1d6ff4f0d15e@github.com> On Mon, 10 Jan 2022 13:40:29 GMT, Roman Kennke wrote: > After JDK-8278568, these methods are unused: > inline int klass_gap() const; > inline void set_klass_gap(int z); > > Except Zero which uses set_klass_gap(int), but we agreed elsewhere (#5585) that we don't want to access partly initialized oops as such. 
We should use the HeapWord* initialization variants in Zero, too. > > Note: we could take that even further and replace the initialization in Zero with ObjAllocator::initialize() call, but that would also have to remove the storestore fence, and possibly adopt ObjAllocator to avoid clearing in already-zeroed TLABs, all of which would have wider consequences and would be a matter for separate PR. > > Testing: > - [x] Build (for klass_gap methods removal) > - [ ] GHA for Zero stuff Seems quite reasonable. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7008 From kbarrett at openjdk.java.net Mon Jan 17 08:30:52 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 17 Jan 2022 08:30:52 GMT Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty Message-ID: Please review this improvement to NonblockingQueue::try_pop. The old code returned an indication that the queue was empty in some cases where that wasn't true. In particular, contending try_pop operations could result in some incorrectly indicating empty. The change fixes that and improves the interaction between contending try_pops. Testing: mach5 tier1 Lots of testing of this change in conjunction with others as part of investigating and fixing JDK-8273383. 
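One way to picture the difference between "the queue is empty" and "this attempt lost a race" is the sketch below. It is a simplified Treiber-style stack, not the actual NonblockingQueue, and the bool-plus-out-parameter shape of `try_pop` is an assumption made for illustration rather than something taken from the patch. All names (`Node`, `StackSketch`, `demo_sum`) are hypothetical, and the sketch deliberately ignores ABA and node reclamation.

```cpp
#include <atomic>

// Hypothetical intrusive node type for the sketch.
struct Node {
  int   value;
  Node* next;
};

class StackSketch {
  std::atomic<Node*> _head{nullptr};

 public:
  void push(Node* n) {
    Node* old = _head.load(std::memory_order_relaxed);
    do {
      n->next = old;  // compare_exchange_weak refreshes 'old' on failure
    } while (!_head.compare_exchange_weak(old, n,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // Returns true if the attempt completed: *out is the popped node, or
  // nullptr if the stack was observed to be empty.
  // Returns false if another thread changed the head first. The caller
  // should retry instead of concluding that the stack is empty.
  bool try_pop(Node** out) {
    Node* head = _head.load(std::memory_order_acquire);
    if (head == nullptr) {
      *out = nullptr;
      return true;   // genuinely empty at the time of the load
    }
    // NOTE: reading head->next is only safe here because the sketch
    // never frees nodes; real code must deal with reclamation and ABA.
    if (_head.compare_exchange_strong(head, head->next,
                                      std::memory_order_acquire,
                                      std::memory_order_relaxed)) {
      *out = head;
      return true;   // popped a node
    }
    *out = nullptr;
    return false;    // lost a race, which is not the same as empty
  }

  // Convenience wrapper: retry until an attempt completes.
  Node* pop() {
    Node* result = nullptr;
    while (!try_pop(&result)) {
      // another thread won the race; try again
    }
    return result;
  }
};

// Single-threaded demonstration: push two nodes, pop until empty.
int demo_sum() {
  Node a{1, nullptr};
  Node b{2, nullptr};
  StackSketch s;
  s.push(&a);
  s.push(&b);
  int sum = 0;
  for (Node* n = s.pop(); n != nullptr; n = s.pop()) {
    sum += n->value;
  }
  return sum;
}
```

The point of the two-valued result is that a caller that sees `false` knows only that it should retry, while `true` with a null node is the only outcome that means "empty".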
------------- Commit messages: - improve try_pop Changes: https://git.openjdk.java.net/jdk18/pull/106/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=106&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279294 Stats: 39 lines in 1 file changed: 17 ins; 3 del; 19 mod Patch: https://git.openjdk.java.net/jdk18/pull/106.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/106/head:pull/106 PR: https://git.openjdk.java.net/jdk18/pull/106 From tschatzl at openjdk.java.net Mon Jan 17 09:05:25 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 17 Jan 2022 09:05:25 GMT Subject: RFR: 8280018: Remove obsolete VM_GenCollectFullConcurrent In-Reply-To: References: Message-ID: <-GKUgxSbU8R2XXPxfBkSZi8exWSjH8qgPnayvqiv97w=.84d79722-8347-47d6-bf80-1da83f31d33e@github.com> On Fri, 14 Jan 2022 13:55:02 GMT, Albert Mingkun Yang wrote: > Trivial change of removing dead code. > > Test: build Lgtm and trivial to me. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7084 From yyang at openjdk.java.net Mon Jan 17 09:38:40 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 17 Jan 2022 09:38:40 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes Message-ID: Add VM.classes to print details of all classes, output looks like: 1. jcmd VM.classes KlassAddr Size State Flags LoaderName ClassName 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 ... 2. 
jcmd VM.classes verbose KlassAddr Size State Flags LoaderName ClassName 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} - instance size: 2 - klass size: 62 - access: final synchronized - state: inited - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' - super: 'java/lang/Object' - sub: - arrays: NULL - methods: Array(0x00007f620841f210) - method ordering: Array(0x0000000800a7e5a8) - default_methods: Array(0x0000000000000000) - local interfaces: Array(0x00000008005af748) - trans. interfaces: Array(0x00000008005af748) - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder - source file: 'LambdaForm$MH' - class annotations: Array(0x0000000000000000) - class type annotations: Array(0x0000000000000000) - field annotations: Array(0x0000000000000000) - field type annotations: Array(0x0000000000000000) - inner classes: Array(0x00000008005af6d8) - nest members: Array(0x00000008005af6d8) - permitted subclasses: Array(0x00000008005af6d8) - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' - vtable length 5 (start addr: 0x0000000800c0b5b8) - itable length 2 (start addr: 0x0000000800c0b5e0) - ---- static fields (1 words): - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 - ---- non-static fields (0 words): - non-static oop maps: 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} - instance size: 2 - klass size: 62 - access: final synchronized - state: inited - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' - super: 'java/lang/Object' - sub: - arrays: NULL - methods: Array(0x00007f620841ea68) - method 
ordering: Array(0x0000000800a7e5a8) - default_methods: Array(0x0000000000000000) - local interfaces: Array(0x00000008005af748) - trans. interfaces: Array(0x00000008005af748) - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder - source file: 'LambdaForm$DMH' - class annotations: Array(0x0000000000000000) - class type annotations: Array(0x0000000000000000) - field annotations: Array(0x0000000000000000) - field type annotations: Array(0x0000000000000000) - inner classes: Array(0x00000008005af6d8) - nest members: Array(0x00000008005af6d8) - permitted subclasses: Array(0x00000008005af6d8) - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' - vtable length 5 (start addr: 0x0000000800c0b1b8) - itable length 2 (start addr: 0x0000000800c0b1e0) - ---- static fields (1 words): - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 - ---- non-static fields (0 words): ... ------------- Commit messages: - 8275775 Add VM.classes to print details of all classes Changes: https://git.openjdk.java.net/jdk/pull/7105/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8275775 Stats: 176 lines in 6 files changed: 174 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Mon Jan 17 10:16:32 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 17 Jan 2022 10:16:32 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 21:56:18 GMT, Harold Seigel wrote: > Please review this small change to call os:: API's. 
> The changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change.
>
> Thanks, Harold

Hi Harold,

This seems fine except for one change - see below.

I still think we need to discuss whether os::close and os::read should even exist but that is a separate matter.

Thanks,
David

test/hotspot/gtest/gtestMain.cpp line 281:

> 279:     if ((ret = init_jvm(argc, argv, is_vmassert_test, &jvm)) != 0) {
> 280:       fprintf(stderr, "ERROR: JNI_CreateJavaVM failed: %d\n", ret);
> 281:       os::abort();

No, this should be the native abort function. We failed to create the VM so we don't/can't use any VM service.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7055

From ayang at openjdk.java.net  Mon Jan 17 13:21:30 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Mon, 17 Jan 2022 13:21:30 GMT
Subject: RFR: 8280018: Remove obsolete VM_GenCollectFullConcurrent
In-Reply-To:
References:
Message-ID:

On Fri, 14 Jan 2022 13:55:02 GMT, Albert Mingkun Yang wrote:

> Trivial change of removing dead code.
>
> Test: build

Thanks for the review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7084

From ayang at openjdk.java.net  Mon Jan 17 13:21:30 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Mon, 17 Jan 2022 13:21:30 GMT
Subject: Integrated: 8280018: Remove obsolete VM_GenCollectFullConcurrent
In-Reply-To:
References:
Message-ID:

On Fri, 14 Jan 2022 13:55:02 GMT, Albert Mingkun Yang wrote:

> Trivial change of removing dead code.
>
> Test: build

This pull request has now been integrated.
Changeset: 3edcb132 Author: Albert Mingkun Yang URL: https://git.openjdk.java.net/jdk/commit/3edcb13272c7d1a587e17fc16be523b3d73053ac Stats: 4 lines in 3 files changed: 0 ins; 4 del; 0 mod 8280018: Remove obsolete VM_GenCollectFullConcurrent Reviewed-by: tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/7084 From coleenp at openjdk.java.net Mon Jan 17 15:49:33 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 15:49:33 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v10] In-Reply-To: <6jcmY2VLTea4bEyt5zQRGF4o-FkX_GG0HgrTX9NVByw=.41dac3ec-ba34-4bba-acb5-b00ce4671888@github.com> References: <6jcmY2VLTea4bEyt5zQRGF4o-FkX_GG0HgrTX9NVByw=.41dac3ec-ba34-4bba-acb5-b00ce4671888@github.com> Message-ID: On Mon, 17 Jan 2022 05:06:54 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp > > src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2003, 2022, Oracle and/or its affiliates. All rights reserved. > > No other changes to this file, so this needs reverting. Oh, ok, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 15:56:00 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 15:56:00 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v11] In-Reply-To: References: Message-ID: <9eaxjtoRGtCTVz8CN0AKqAfaBNKlUy_J6w5F2TQvbC8=.81d12a93-038e-43ae-94ec-6bf054cdff5b@github.com> > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Reset templateInterpreterGenerator_aarch64.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/232e1d63..d37b1e63 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 16:32:07 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 16:32:07 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v8] In-Reply-To: <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> References: <9_R1GWG8goQxC2V9LsMBox5l71FKoBK71kb5tkdpX2Y=.2e30a38b-1196-4cfa-83d4-7f085f0ebde5@github.com> <6hr8ENDQ7zlY3LNHrQx4WZ-oI6dVNtyeq4rg4bV7nmc=.062547eb-fdd8-4373-b026-82a0e9c2b8d1@github.com> Message-ID: <3sg-Ckg2UlJXSeVMxsMzJjn59M30VsMBLoIrv1PUL00=.41c4d477-d306-430c-9659-2ed96d1c59f4@github.com> On Wed, 12 Jan 2022 21:33:28 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1174: >> >>> 1172: assert(r->is_valid(), "bad oop arg"); >>> 1173: if (r->is_stack()) { >>> 1174: __ ldr(temp_reg, Address(sp, (uint64_t)r->reg2stack() * VMRegImpl::stack_slot_size)); >> >> Why is the cast being added here? Is it because the multiply can overflow an int, and this is really fix for a bug that is distinct from the cleanup in this PR? There are a couple more like this later in this file. > > This was here to match one of the Address constructors that I had. With the template, it's no longer needed. I think I didn't revert this. 
Retesting. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 16:32:04 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 16:32:04 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v12] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert sharedRuntime_aarch64.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/d37b1e63..dd0e0db1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=10-11 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 17:46:58 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 17:46:58 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v13] In-Reply-To: References: Message-ID: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Revert macroAssembler_aarch64.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7023/files - new: https://git.openjdk.java.net/jdk/pull/7023/files/dd0e0db1..e076c4ca Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7023&range=11-12 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7023.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7023/head:pull/7023 PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 17:46:59 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 17:46:59 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v12] In-Reply-To: References: Message-ID: On Mon, 17 Jan 2022 16:32:04 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert sharedRuntime_aarch64.cpp There were a couple of files changed that still had casts for the intermediate changes before templates. I reverted them and reran tier1 testing. I'll push after GHA completes. 
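As an illustration of why a templated constructor can make such casts unnecessary, here is a minimal sketch. `AddressSketch` is a hypothetical class, not the actual AArch64 `Address`, and the use of `std::enable_if` is an assumption about how such a template might be constrained. A single constructor template restricted to integral types accepts `int`, `long`, `size_t` and so on, so call sites no longer need a cast just to select one particular overload.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a base-plus-offset address operand.
class AddressSketch {
  int     _base;    // register number (illustrative only)
  int64_t _offset;  // byte offset, stored in a fixed-width type

 public:
  // One constructor for every integral offset type, instead of separate
  // overloads for long, unsigned long, int64_t, ...
  template <typename T,
            typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
  AddressSketch(int base, T offset)
      : _base(base), _offset(static_cast<int64_t>(offset)) {}

  int     base()   const { return _base; }
  int64_t offset() const { return _offset; }
};

// The same call shape works for different integral argument types.
int64_t offset_from_int() {
  int slot = 3;
  int slot_size = 8;
  return AddressSketch(31, slot * slot_size).offset();
}

int64_t offset_from_size_t() {
  size_t bytes = 24;
  return AddressSketch(31, bytes).offset();
}

int64_t offset_from_long() {
  long bytes = -16;
  return AddressSketch(31, bytes).offset();
}
```

Note that the template does not change the type in which the argument expression itself is evaluated, so it says nothing about whether a multiply such as `slot * slot_size` could overflow.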
------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From aph at openjdk.java.net Mon Jan 17 18:15:21 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 17 Jan 2022 18:15:21 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: On Fri, 14 Jan 2022 16:15:06 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); 
>> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). >> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. 
>> Hence, we cannot use `fp + 2` to calculate the proper `caller sp` (although it is still implemented this way).
>>
>> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>>
>>
>> stp x29, x30, [sp, #-N]!
>> mov x29, sp
>>
>>         _ _ _ _ _ _
>>        |           |
>>        |           |                        |
>>        |_ _ _ _ _ _|                        |
>>        |           |                        |
>> caller |           | <- caller sp           |
>> _ _ _  |_ _ _ _ _ _| -                      | expand
>>                      |                      |
>>         . . . . .    |                      | direction
>>         _ _ _ _ _ _  |                      |
>>        |           | | N                    |
>>        | ret addr  | |                      |
>> callee |_ _ _ _ _ _| |                      |
>>        |           | -                      V
>>        | caller fp | <- fp
>>        |_ _ _ _ _ _|
>>
>>
>>
>> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at present.
>>
>> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>>
>> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
>> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.
>>
>> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>>
>> Any input is appreciated.
>>
>> Thanks,
>> Denghui
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix pfl() crash problem and rename from_thread to from_anchor

Great! This works much better.

I've been digging deeper to try to understand the issue, in order not to break anything: the unwinder is an important and delicate piece of code.

The code at the very root of this problem is, as far as I can see, this:

    frame os::get_sender_for_C_frame(frame* fr) {
      return frame(fr->link(), fr->link(), fr->sender_pc());
    }

in which the sender's SP is set to be the same as the sender's FP, which is almost certainly wrong.
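To make "a chain of frame pointers" concrete, here is a small sketch over a simulated stack. It is not HotSpot code. It assumes the AArch64 frame-record convention in which the word at `fp` holds the caller's `fp` and the following word holds the return address. The function names and the index-based stack model are made up for illustration. The walk recovers return addresses by following frame records only and never computes a sender SP.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack: slot indices play the role of addresses.
//   stack[fp]     = caller's fp (0 terminates the chain)
//   stack[fp + 1] = return pc into the caller
// Returns the return pcs found by following the frame records.
std::vector<uint64_t> walk_fp_chain(const std::vector<uint64_t>& stack,
                                    size_t fp,
                                    size_t max_frames = 64) {
  std::vector<uint64_t> pcs;
  while (fp != 0 && fp + 1 < stack.size() && pcs.size() < max_frames) {
    pcs.push_back(stack[fp + 1]);
    size_t next = static_cast<size_t>(stack[fp]);
    // The stack grows down, so a caller's record must sit at a higher
    // index. Stop at the terminator (0) or at anything that looks corrupt.
    if (next <= fp) {
      break;
    }
    fp = next;
  }
  return pcs;
}

// Three frame records: innermost at slot 2, then 6, then 11 (outermost).
std::vector<uint64_t> demo_backtrace() {
  std::vector<uint64_t> stack(16, 0);
  stack[2]  = 6;   stack[3]  = 0x1000;
  stack[6]  = 11;  stack[7]  = 0x2000;
  stack[11] = 0;   stack[12] = 0x3000;
  return walk_fp_chain(stack, 2);
}
```

Nothing in this walk says where each caller's frame ends, which is why a frame pointer chain alone does not give a trustworthy sender SP.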
As you have observed, we can't determine the sender's SP when native code is called from Java code, because all we have is a chain of frame pointers. However, if we know that a frame was created from the frame anchor, we can trust both the SP and the FP in that frame. When we're merely printing a backtrace we don't need stack pointer at all, so chasing frame pointers is fine. I'm grateful for your patience. This work is very much worth doing, but we need to be very careful. I'll dig a little more tomorrow. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From aph at openjdk.java.net Mon Jan 17 18:21:32 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 17 Jan 2022 18:21:32 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v13] In-Reply-To: References: Message-ID: <-Mg_mS-_02wdEzs8hlTzeVqaEHBXRXS0QnWr1Yrs0Wk=.054c818a-2e3d-43be-b162-65ac52e7183a@github.com> On Mon, 17 Jan 2022 17:46:58 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert macroAssembler_aarch64.cpp Even better! I still approve. ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Mon Jan 17 19:25:26 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 17 Jan 2022 19:25:26 GMT Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3] In-Reply-To: References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> Message-ID: <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com> On Sat, 11 Dec 2021 01:55:50 GMT, Ioi Lam wrote: >> **Background:** >> >> In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. 
Javac compiles an enum declaration like this: >> >> >> public enum Day { SUNDAY, MONDAY ... } >> >> >> to >> >> >> public class Day extends java.lang.Enum { >> public static final SUNDAY = new Day("SUNDAY"); >> public static final MONDAY = new Day("MONDAY"); ... >> } >> >> >> With CDS archived heap objects, `Day::` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects references one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731) >> >> **Fix:** >> >> During -Xshare:dump, if we discovered that an Enum constant of type X is archived, we archive all constants of type X. At Runtime, type X will skip the normal execution of `X::`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time. >> >> This is safe as we know that `X::` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized. >> >> **Verification:** >> >> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool are they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt >> >> **Testing:** >> >> Passed Oracle CI tiers 1-4. WIll run tier 5 as well. > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision:
>
> - Merge branch 'master' into 8275731-heapshared-enum
> - added exclusions needed by "java -Xshare:dump -ea -esa"
> - Comments from @calvinccheung off-line
> - 8275731: CDS archived enums objects are recreated at runtime

I don't really know this code well enough to do a good code review. I had some comments though.

src/hotspot/share/cds/cdsHeapVerifier.cpp line 165:

> 163:
> 164: ResourceMark rm;
> 165: for (JavaFieldStream fs(ik); !fs.done(); fs.next()) {

Can this call instead
void InstanceKlass::do_local_static_fields(void f(fieldDescriptor*, Handle, TRAPS), Handle mirror, TRAPS) {
and have these next few lines in the function?

src/hotspot/share/cds/cdsHeapVerifier.cpp line 254:

> 252: InstanceKlass* ik = InstanceKlass::cast(k);
> 253: for (JavaFieldStream fs(ik); !fs.done(); fs.next()) {
> 254: if (!fs.access_flags().is_static()) {

Same here. It only saves a couple of lines, but then you can have the function outside this large function.

src/hotspot/share/cds/cdsHeapVerifier.hpp line 52:

> 50: mtClassShared,
> 51: HeapShared::oop_hash> _table;
> 52:

Is this only used inside cdsHeapVerifier? If so, it should be in the .cpp file. There's also an ArchiveableStaticFieldInfo. Not sure how they are related.

src/hotspot/share/cds/heapShared.cpp line 433:

> 431: oop mirror = k->java_mirror();
> 432: int i = 0;
> 433: for (JavaFieldStream fs(k); !fs.done(); fs.next()) {

This seems like it should also use InstanceKlass::do_local_static_fields.

src/hotspot/share/cds/heapShared.cpp line 482:

> 480: copy_open_objects(open_regions);
> 481:
> 482: CDSHeapVerifier::verify();

Should all this be DEBUG_ONLY?

src/hotspot/share/cds/heapShared.hpp line 236:

> 234: oop _referrer;
> 235: oop _obj;
> 236: CachedOopInfo() :_subgraph_info(), _referrer(), _obj() {}

Should these be initialized to nullptr? Does this do that?
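On the last question: in standard C++, a mem-initializer with empty parentheses value-initializes the member, and value-initializing a pointer yields a null pointer. The sketch below shows only that language rule. The `Demo*` names are hypothetical stand-ins, and it assumes the members are raw pointers; whether HotSpot's `oop` is a raw pointer in every build configuration is not established in this thread.

```cpp
// Hypothetical stand-ins for the types named in the review; modelled here
// as plain pointers, which is an assumption of this sketch.
struct DemoSubgraphInfo;
typedef DemoSubgraphInfo* demo_info_ptr;
typedef void* demo_oop;

struct DemoCachedOopInfo {
  demo_info_ptr _subgraph_info;
  demo_oop _referrer;
  demo_oop _obj;

  // Empty parentheses value-initialize each member. For a pointer that
  // means zero-initialization, so all three members start out null.
  DemoCachedOopInfo() : _subgraph_info(), _referrer(), _obj() {}
};
```

Writing `_referrer(nullptr)` would have the same effect for raw pointers and states the intent more directly.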
-------------

PR: https://git.openjdk.java.net/jdk/pull/6653

From dholmes at openjdk.java.net Mon Jan 17 22:45:35 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 17 Jan 2022 22:45:35 GMT
Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty
In-Reply-To:
References:
Message-ID:

On Mon, 17 Jan 2022 08:23:37 GMT, Kim Barrett wrote:

> Please review this improvement to NonblockingQueue::try_pop. The old code
> returned an indication that the queue was empty in some cases where that
> wasn't true. In particular, contending try_pop operations could result in
> some incorrectly indicating empty. The change fixes that and improves the
> interaction between contending try_pops.
>
> Testing:
> mach5 tier1
>
> Lots of testing of this change in conjunction with others as part of
> investigating and fixing JDK-8273383.

Hi Kim,

I haven't looked at this implementation in detail before, but what you are describing as a bug seems to be a known "property" of the implementation:

// A queue may temporarily appear to be empty even though elements have been
// added and not removed. For example, after running the following program,
// the value of r may be NULL.
//
// thread1: q.push(a); r = q.pop();
// thread2: q.push(b);
//
// This can occur if the push of b started before the push of a, but didn't
// complete until after the pop.

??

David

-------------

PR: https://git.openjdk.java.net/jdk18/pull/106

From jwilhelm at openjdk.java.net Tue Jan 18 01:13:35 2022
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Tue, 18 Jan 2022 01:13:35 GMT
Subject: Integrated: Merge jdk18
In-Reply-To:
References:
Message-ID: <2VZJKBdrdAX6ea6GEMABo4fmQA1ECrnfDSG-GVxyFNE=.0e2472af-e217-4b55-a1a1-093265dcec3f@github.com>

On Thu, 13 Jan 2022 20:55:49 GMT, Jesper Wilhelmsson wrote:

> Forwardport JDK 18 -> JDK 19

This pull request has now been integrated.
Changeset: 37143c09
Author: Jesper Wilhelmsson
URL: https://git.openjdk.java.net/jdk/commit/37143c09ab56ff07767ab3ac392234e36ee82358
Stats: 124 lines in 6 files changed: 117 ins; 1 del; 6 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/7068

From jwilhelm at openjdk.java.net Tue Jan 18 01:23:21 2022
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Tue, 18 Jan 2022 01:23:21 GMT
Subject: RFR: Merge jdk18
Message-ID: <8PYG3aNFWxaqHY3TBzCqtA9mdJedp5I3Sd7xp2ttgWo=.90a15cbc-0f8b-4c55-8df3-15838148f979@github.com>

Forwardport JDK 18 -> JDK 19

-------------

Commit messages:
 - Merge remote-tracking branch 'jdk18/master' into Merge_jdk18
 - 8279998: PPC64 debug builds fail with "untested: RangeCheckStub: predicate_failed_trap_id"
 - 8280034: ProblemList jdk/jfr/api/consumer/recordingstream/TestOnEvent.java on linux-x64
 - 8279924: [PPC64, s390] implement frame::is_interpreted_frame_valid checks
 - 8279702: [macosx] ignore xcodebuild warnings on M1
 - 8279930: Synthetic cast causes generation of store barriers when using heap segments
 - 8279597: [TESTBUG] ReturnBlobToWrongHeapTest.java fails with -XX:TieredStopAtLevel=1 on machines with many cores
 - 8278434: timeouts in test java/time/test/java/time/format/TestZoneTextPrinterParser.java

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=7119&range=00.0
 - jdk18: https://webrevs.openjdk.java.net/?repo=jdk&pr=7119&range=00.1

Changes: https://git.openjdk.java.net/jdk/pull/7119/files
Stats: 403 lines in 9 files changed: 361 ins; 4 del; 38 mod
Patch: https://git.openjdk.java.net/jdk/pull/7119.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7119/head:pull/7119

PR: https://git.openjdk.java.net/jdk/pull/7119

From jwilhelm at openjdk.java.net Tue Jan 18 02:00:29 2022
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Tue, 18 Jan 2022 02:00:29 GMT
Subject: Integrated: Merge jdk18
In-Reply-To:
<8PYG3aNFWxaqHY3TBzCqtA9mdJedp5I3Sd7xp2ttgWo=.90a15cbc-0f8b-4c55-8df3-15838148f979@github.com> References: <8PYG3aNFWxaqHY3TBzCqtA9mdJedp5I3Sd7xp2ttgWo=.90a15cbc-0f8b-4c55-8df3-15838148f979@github.com> Message-ID: On Tue, 18 Jan 2022 01:13:45 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 18 -> JDK 19 This pull request has now been integrated. Changeset: 39f140a2 Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/39f140a20120300074167597580f9be34e812cad Stats: 403 lines in 9 files changed: 361 ins; 4 del; 38 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/7119 From yyang at openjdk.java.net Tue Jan 18 02:50:06 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 18 Jan 2022 02:50:06 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: Message-ID: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. 
jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 
'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
The pull request contains one new commit since the last revision:

8275775 Add VM.classes to print details of all classes

-------------

Changes:
 - all: https://git.openjdk.java.net/jdk/pull/7105/files
 - new: https://git.openjdk.java.net/jdk/pull/7105/files/c26a62be..80a3d22b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=00-01

Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/7105.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105

PR: https://git.openjdk.java.net/jdk/pull/7105

From kbarrett at openjdk.java.net Tue Jan 18 03:37:32 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 18 Jan 2022 03:37:32 GMT
Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty
In-Reply-To:
References:
Message-ID: <_J8GTxkxVdXkVX9ZD2gdNnBXJ2MFPnzlv6QpX-bK__0=.6a5a1157-4c82-4e84-9aec-ef76074c272d@github.com>

On Mon, 17 Jan 2022 22:42:29 GMT, David Holmes wrote:

> Hi Kim,
>
> I haven't looked at this implementation in detail before, but what you are describing as a bug seems to be a known "property" of the implementation:
>
> ```
> // A queue may temporarily appear to be empty even though elements have been
> // added and not removed. For example, after running the following program,
> // the value of r may be NULL.
> //
> // thread1: q.push(a); r = q.pop();
> // thread2: q.push(b);
> //
> // This can occur if the push of b started before the push of a, but didn't
> // complete until after the pop.
> ```
>
> ??
>
> David

That's a different situation, and seems unavoidable without a pretty different design.

The case we're trying to squash with this change can exhibit even without any concurrent push/append, i.e.

1. phase1 collects entries in the queue.
2. phase change ensures phase1 is complete and phase2 not yet started.
3. phase2 takes entries from the queue.
During phase2, the bug being addressed could lead concurrent try_pops to have some (possibly even all but one) threads conclude there are no more entries and so no more work to do, even though there are actually lots of entries and all those threads just lost the race for the next one. That isn't necessary or intended behavior. ------------- PR: https://git.openjdk.java.net/jdk18/pull/106 From dholmes at openjdk.java.net Tue Jan 18 04:21:29 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 18 Jan 2022 04:21:29 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Tue, 18 Jan 2022 02:50:06 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8275775 Add VM.classes to print details of all classes Hi, This seems reasonable in general but a few details to work through - see comments below. Thanks, David src/hotspot/share/oops/instanceKlass.cpp line 2069: > 2067: ResourceMark rm; > 2068: _st->print("%-18s", "KlassAddr"); > 2069: _st->print(" "); Can't you just print the two spaces in the previous line: _st->print("%-18s ", "KlassAddr"); and save all the additional print calls. This applies throughout where you have " ". src/hotspot/share/oops/instanceKlass.cpp line 2085: > 2083: ResourceMark rm; > 2084: // klass pointer > 2085: _st->print("" INTPTR_FORMAT "", p2i(k)); Why do you need the two empty string literals ?? 
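Both remarks above come down to how C and C++ treat string literals: adjacent literals are concatenated at compile time, so empty literals around a format macro add nothing, and a separator can be folded into the preceding format string. A small sketch of both points follows. `DEMO_PTR_FORMAT` is a hypothetical stand-in for a pointer format macro, and the helper functions are invented for illustration rather than taken from the patch.

```cpp
#include <cinttypes>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a pointer format macro such as INTPTR_FORMAT.
#define DEMO_PTR_FORMAT "0x%016" PRIxPTR

// Adjacent string literals are concatenated by the compiler, so wrapping a
// format macro in empty literals yields exactly the same string.
inline bool empty_literals_change_nothing() {
  return std::strcmp("" DEMO_PTR_FORMAT "", DEMO_PTR_FORMAT) == 0;
}

// Column header built in two steps: padded name first, separator appended after.
inline std::string header_two_steps() {
  char name[32];
  std::snprintf(name, sizeof(name), "%-18s", "KlassAddr");
  return std::string(name) + "  ";
}

// Same header with the two-space separator folded into the format string.
inline std::string header_one_step() {
  char name[32];
  std::snprintf(name, sizeof(name), "%-18s  ", "KlassAddr");
  return std::string(name);
}
```

Both header helpers produce the same 20-character string: the name padded to 18 columns plus the two-space separator.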
src/hotspot/share/oops/instanceKlass.cpp line 2100: > 2098: char buf[10]; > 2099: int i = 0; > 2100: if (k->has_finalizer()) buf[i++] = 'F'; Where is the meaning of these flags documented? src/hotspot/share/oops/instanceKlass.cpp line 2103: > 2101: if (k->has_final_method()) buf[i++] = 'f'; > 2102: if (k->has_vanilla_constructor()) buf[i++] = 'V'; > 2103: if (k->is_instance_klass()) { Don't the properties queried in L2100 to L2102 only apply to instance classes? src/hotspot/share/oops/instanceKlass.cpp line 3425: > 3423: > 3424: static const char* state_names[] = { > 3425: "alloc", "load", "link", "initing", "inited", "init_err" I don't like these short forms - they are mostly computerese not real words. Why can't we print the original full names? (Though I'd prefer "initializing" and "initialized" to "being_initialized" and "fully_initialized".) src/hotspot/share/services/diagnosticCommand.hpp line 870: > 868: } > 869: static const char* description() { > 870: return "Prints list of all loaded java classes"; s/java/Java/ src/hotspot/share/services/diagnosticCommand.hpp line 873: > 871: } > 872: static const char* impact() { > 873: return "Medium: Depends on Java content."; I would think impact is High due to the number of classes. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7105 From pli at openjdk.java.net Tue Jan 18 07:19:26 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Tue, 18 Jan 2022 07:19:26 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Fri, 14 Jan 2022 12:08:33 GMT, Jatin Bhateja wrote: >> Can any C2 compiler expert help review this? I updated copyright year to 2022 and renamed a function in latest commit. 
>
> Hi @pfustc,
> Apologies for being late in my response on this. The following is the performance data of the JMH micro (included with the report) operating over vectors of various primitive types with and without the optimization.
> [http://cr.openjdk.java.net/~jbhateja/post_loop_multiversioning/perf_post_loop_multiversioning_CLX.xlsx](http://cr.openjdk.java.net/~jbhateja/post_loop_multiversioning/perf_post_loop_multiversioning_CLX.xlsx)
> Observations:
> - Data shows a reduction in cycles, dynamic instruction count, and branches with the optimization.
> - Addition of the tail loop iteration has an impact on JIT code size; this may affect other optimizations like procedure inlining.
> - Scores are better for sub-word types (byte and short) since they have a relatively long tail.
>
> Best Regards,
> Jatin

Hi @jatin-bhateja,

Thank you for the performance data. I repeated your JMH tests on AVX-512 and have the comments below.

- JIT code size increases after PostLoopMultiversioning is enabled. It is true but not related to this PR. The increase is caused by the creation of multi-versioned post loops. Hence, the code size still increases even if we don't vectorize the post loop. To get rid of this side effect, I think we may directly vectorize the RCE'd post loop without doing the multiversioning (preventing generation of any scalar tail; I see you have mentioned this in JBS comments). That's an enhancement we can do next.

- JMH shows some obvious performance regression when the loop iteration count is small. I have reproduced this regression in my repeated tests on AVX-512. But I don't really understand why this could happen with reduced CPU cycles and reduced dynamic instruction count. I heard that AVX-512 CPUs may run at a lower frequency when some SIMD instructions are executed[1]. Is this a cause of the regression?

Please let me know if you have further comments.
[1] https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency Thanks, Pengfei ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From tschatzl at openjdk.java.net Tue Jan 18 10:28:27 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 18 Jan 2022 10:28:27 GMT Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: <4KHnaiabB-z96_si_LaCLN18L7I2vCYOMZSJ-vvqTSY=.40e781d3-fd29-4e39-b85d-673386fc26a0@github.com> On Mon, 17 Jan 2022 08:23:37 GMT, Kim Barrett wrote: > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. > > Testing: > mach5 tier1 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 144: > 142: // (1) next_node is the extension of the queue's list. > 143: // (2) next_node is NULL, because a competing try_pop took result. > 144: // (3) next_node is the extension of some unrelated list, because a Would it be possible to name these cases a/b/c instead of numbering them? The comments below also refer to these subcases as 1a-1c. Cases a) and c) in this list seem to be in a different order than in the code, would be nice if they matched (just exchange). I do not understand (a linguistic problem) why for case 3, `next_node` is the extension of some "unrelated" list, i.e. I do not understand the use of the adjective "unrelated" here, probably taking the word too literally. The `result` to be popped must be in the same list as the list other threads pop from (e.g. 
in this case we always pop from the `_head` of the completed buffers list, racing with threads popping from the `_head` of the same list. There does not seem to be a way to compete with other, different lists here). Even if we got some element of a sub-list of the original list, it and its sublists seem still "related". ------------- PR: https://git.openjdk.java.net/jdk18/pull/106 From ayang at openjdk.java.net Tue Jan 18 12:10:37 2022 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Tue, 18 Jan 2022 12:10:37 GMT Subject: RFR: 8280136: Serial: Remove unnecessary use of ExpandHeap_lock Message-ID: <6zRTvGcJCD7VNEf1_U5RkTE9lg6I3mFFQYKtAb3WRqo=.e5df3ea9-693d-42ba-a7e7-7724f9fc3ad1@github.com> This PR consists of two commits: 1. remove `ExpandHeap_lock` in Serial GC code. 2. rename it to `ParallelExpandHeap_lock` to indicate it's Parallel-GC only. Test: tier1-6 ------------- Commit messages: - rename - serial-lock Changes: https://git.openjdk.java.net/jdk/pull/7124/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7124&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280136 Stats: 24 lines in 7 files changed: 8 ins; 2 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/7124.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7124/head:pull/7124 PR: https://git.openjdk.java.net/jdk/pull/7124 From lkorinth at openjdk.java.net Tue Jan 18 13:13:32 2022 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Tue, 18 Jan 2022 13:13:32 GMT Subject: RFR: 8269537: memset() is called after operator new [v4] In-Reply-To: References: Message-ID: On Wed, 20 Oct 2021 09:36:38 GMT, Leo Korinth wrote: >> The basic problem is that we are relying on undefined behaviour, as documented in the code: >> >> // This whole business of passing information from ResourceObj::operator new >> // to the ResourceObj constructor via fields in the "object" is technically UB. 
>> // But it seems to work within the limitations of HotSpot usage (such as no
>> // multiple inheritance) with the compilers and compiler options we're using.
>> // And it gives some possibly useful checking for misuse of ResourceObj.
>>
>>
>> I am removing the undefined behaviour by passing the type of allocation through a thread local variable.
>>
>> This solution has some advantages:
>> 1) it is not UB
>> 2) it is simpler and easier to understand
>> 3) it uses less memory (I could make it use even less if I made the enum `allocation_type` a u8)
>> 4) in the *very* unlikely situation that stack memory (or embedded) already equals the data calculated from the address of the object, the code will also work.
>>
>> When doing the change, I also updated `allocated_on_stack()` to the new name `allocated_on_stack_or_embedded()` which is much harder to misinterpret.
>>
>> I also disallow "faking" the memory type by explicitly calling `ResourceObj::set_allocation_type`.
>>
>> This forced me to change two places that were faking the allocation type of an embedded `GrowableArray` from `STACK_OR_EMBEDDED` to `C_HEAP`. The faking of the type is hard to understand as a `STACK_OR_EMBEDDED` `GrowableArray` can allocate any type of object. My guess is that `GrowableArray` has changed behaviour, or maybe that it was hard to understand because of the old naming of `allocated_on_stack()`.
>>
>> I have also tried to update the comments. In doing that I not only changed the comments for this change, but also for the *incorrect* advice to always delete objects you allocate with new.
>>
>> Testing on debug build tier1-3
>> Testing on release build tier1
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
>
> review updates

This comment will keep this pull request alive a bit longer.
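The quoted description passes the allocation type from `operator new` to the constructor through a thread-local variable instead of through the not-yet-constructed object. One way that idea could look is sketched below. This is not the code in the pull request: `DemoObj`, `AllocType` and `_pending` are hypothetical names, and the sketch ignores arenas and resource areas.

```cpp
#include <cstddef>
#include <new>

// Hypothetical allocation kinds; the real enum in HotSpot has more values.
enum class AllocType { StackOrEmbedded, CHeap };

class DemoObj {
  // Set by operator new, consumed by the next constructor on this thread.
  static thread_local AllocType _pending;
  AllocType _type;

 public:
  static void* operator new(std::size_t size) {
    void* p = ::operator new(size);
    _pending = AllocType::CHeap;  // record how the upcoming object was allocated
    return p;
  }
  static void operator delete(void* p) { ::operator delete(p); }

  // Reads the pending type, then resets it so that stack or embedded
  // objects constructed later see the default again.
  DemoObj() : _type(_pending) { _pending = AllocType::StackOrEmbedded; }

  AllocType type() const { return _type; }
};

thread_local AllocType DemoObj::_pending = AllocType::StackOrEmbedded;
```

Nothing is written into the object's storage before its constructor runs, which is the part the quoted comment calls technically UB. The sketch relies on the constructor of the heap object being the first `DemoObj` constructor to run after `operator new` on that thread.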
------------- PR: https://git.openjdk.java.net/jdk/pull/5387 From aph at openjdk.java.net Tue Jan 18 13:46:28 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 18 Jan 2022 13:46:28 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: On Fri, 14 Jan 2022 16:15:06 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); 
>> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). >> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. 
>> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). >> >> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. >> >> >> stp x29, x30, [sp, #-N]! >> mov x29, sp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| - | expand >> | | >> . . . . . | | direction >> _ _ _ _ _ _ | | >> | | | N | >> | ret addr | | | >> callee |_ _ _ _ _ _| | | >> | | - V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. >> >> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. >> >> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. >> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. >> >> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. >> >> Any input is appreciated. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > fix pfl() crash problem and rename from_thread to from_anchor So, here's my thinking for now. `_from_anchor` really means _this SP is trustworthy_, and perhaps we need a different name which suggests that. `sp_ok_to_use()` or `sp_is_trusted()` or somesuch? We do at least need a comment which explains that unless this boolean is true, the SP value in a frame is basically garbage, although it will point to somewhere within the stack. With that change, this patch can be integrated. 
In the longer term, I think we should look at using libunwind to obtain a precise native stack trace, and then we can get rid of all the old kludges. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From ayang at openjdk.java.net Tue Jan 18 14:15:52 2022 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Tue, 18 Jan 2022 14:15:52 GMT Subject: RFR: 8280146: Parallel: Remove time log tag Message-ID: Simple change of removing some unhelpful logs in Parallel GC. Then, `time` log tag becomes unused and is removed as well. Test: hotspot_gc ------------- Commit messages: - time-tag Changes: https://git.openjdk.java.net/jdk/pull/7128/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7128&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280146 Stats: 28 lines in 4 files changed: 0 ins; 28 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7128.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7128/head:pull/7128 PR: https://git.openjdk.java.net/jdk/pull/7128 From hseigel at openjdk.java.net Tue Jan 18 15:14:51 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 18 Jan 2022 15:14:51 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp Message-ID: Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. 
Thanks, Harold ------------- Commit messages: - 8279887: 2 Null pointer dereference defect groups in os_posix.cpp Changes: https://git.openjdk.java.net/jdk/pull/7129/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7129&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279887 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7129.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7129/head:pull/7129 PR: https://git.openjdk.java.net/jdk/pull/7129 From hseigel at openjdk.java.net Tue Jan 18 15:49:08 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 18 Jan 2022 15:49:08 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's [v2] In-Reply-To: References: Message-ID: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> > Please review this small change to call os:: API's. the changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change. 
> > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: restore abort() call ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7055/files - new: https://git.openjdk.java.net/jdk/pull/7055/files/ccef3b37..ff0b9192 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7055&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7055&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7055.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7055/head:pull/7055 PR: https://git.openjdk.java.net/jdk/pull/7055 From mdoerr at openjdk.java.net Tue Jan 18 16:06:01 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 18 Jan 2022 16:06:01 GMT Subject: [jdk18] RFR: 8280155: [PPC64, s390] frame size checks are not yet correct Message-ID: Fix frame size check and do null check earlier as described in the JBS issue. ------------- Commit messages: - Update Copyright years. - [PPC64, s390] frame size checks are not yet correct Changes: https://git.openjdk.java.net/jdk18/pull/107/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=107&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280155 Stats: 11 lines in 3 files changed: 4 ins; 1 del; 6 mod Patch: https://git.openjdk.java.net/jdk18/pull/107.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/107/head:pull/107 PR: https://git.openjdk.java.net/jdk18/pull/107 From mbaesken at openjdk.java.net Tue Jan 18 16:06:02 2022 From: mbaesken at openjdk.java.net (Matthias Baesken) Date: Tue, 18 Jan 2022 16:06:02 GMT Subject: [jdk18] RFR: 8280155: [PPC64, s390] frame size checks are not yet correct In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:51:22 GMT, Martin Doerr wrote: > Fix frame size check and do null check earlier as described in the JBS issue. 
Hi Martin, this looks good to me; but please fix the copyright years in src/hotspot/os_cpu/linux_ppc/thread_linux_ppc.cpp . ------------- Marked as reviewed by mbaesken (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/107 From duke at openjdk.java.net Tue Jan 18 16:20:56 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 18 Jan 2022 16:20:56 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v3] In-Reply-To: References: Message-ID: <9_F8iOfxnwzaJddJd6lYKtkdrk7lKP4YR2_81LiHKp0=.92efd808-fb0e-4fa8-82a3-cc2c76f44a9e@github.com> > Deprecated ExtendedDTraceProbes. > Edited help messages and man pages accordingly. > Removed `/src/hotspot/share/services/dtraceAttacher.hpp`: only contained declarations that are never defined or used. > > Checked that tests are not affected. Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8278423' of https://github.com/eme64/jdk into JDK-8278423 - added flag to VMDeprecatedOptions Test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7110/files - new: https://git.openjdk.java.net/jdk/pull/7110/files/ecbee3a6..e936a0df Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=01-02 Stats: 3 lines in 2 files changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7110.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110 PR: https://git.openjdk.java.net/jdk/pull/7110 From duke at openjdk.java.net Tue Jan 18 16:21:00 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 18 Jan 2022 16:21:00 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v2] In-Reply-To: References: <3iQtzpQwG6LqOx6Vf7TBHtYkmGBEAK95IzLss6unghE=.b9070f3d-e025-4f74-a415-aa82e4d2289d@github.com> Message-ID: On Tue, 18 Jan 2022 13:09:57 GMT, David Holmes wrote: >> Emanuel Peter has 
updated the pull request incrementally with two additional commits since the last revision: >> >> - Mode of speech edit >> - Fix formatting issue > > src/hotspot/share/runtime/arguments.cpp line 2883: > >> 2881: } else if (match_option(option, "-XX:+ExtendedDTraceProbes")) { >> 2882: #if defined(DTRACE_ENABLED) >> 2883: warning("-XX:+ExtendedDTraceProbes is deprecated. Use a combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead."); > > Because the flag is deprecated, a deprecation warning will already be generated for it, so this will cause two warnings to be produced. You either put the flag in the special_jvm_flags table and accept the standard deprecation message, or else you don't put it in the table and handle the deprecation (and later obsoletion) warning directly. In this case the direct approach seems best. @dholmes-ora : The warning is actually not already generated, since this flag is handled separately, the handling is only done if none of the cases matches, and we land in the `} else if (match_option(option, "-XX:", &tail)) { // -XX:xxxx` case. I had to add the warning separately now. I think this is a bug though, @tobiasholenstein thought so too. All options should first be checked if they are deprecated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7110 From duke at openjdk.java.net Tue Jan 18 16:21:00 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 18 Jan 2022 16:21:00 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v2] In-Reply-To: References: <3iQtzpQwG6LqOx6Vf7TBHtYkmGBEAK95IzLss6unghE=.b9070f3d-e025-4f74-a415-aa82e4d2289d@github.com> Message-ID: On Tue, 18 Jan 2022 16:17:05 GMT, Emanuel Peter wrote: >> src/hotspot/share/runtime/arguments.cpp line 2883: >> >>> 2881: } else if (match_option(option, "-XX:+ExtendedDTraceProbes")) { >>> 2882: #if defined(DTRACE_ENABLED) >>> 2883: warning("-XX:+ExtendedDTraceProbes is deprecated. 
Use a combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead."); >> >> Because the flag is deprecated, a deprecation warning will already be generated for it, so this will cause two warnings to be produced. You either put the flag in the special_jvm_flags table and accept the standard deprecation message, or else you don't put it in the table and handle the deprecation (and later obsoletion) warning directly. In this case the direct approach seems best. > > @dholmes-ora : The warning is actually not already generated, since this flag is handled separately, the handling is only done if none of the cases matches, and we land in the `} else if (match_option(option, "-XX:", &tail)) { // -XX:xxxx` case. > I had to add the warning separately now. I think this is a bug though, @tobiasholenstein thought so too. All options should first be checked if they are deprecated. Thank you very much, I added the flag to the Test. ------------- PR: https://git.openjdk.java.net/jdk/pull/7110 From lucy at openjdk.java.net Tue Jan 18 16:24:37 2022 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 18 Jan 2022 16:24:37 GMT Subject: [jdk18] RFR: 8280155: [PPC64, s390] frame size checks are not yet correct In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:51:22 GMT, Martin Doerr wrote: > Fix frame size check and do null check earlier as described in the JBS issue. Changes look good to me! ------------- Marked as reviewed by lucy (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/107 From duke at openjdk.java.net Tue Jan 18 16:39:04 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 18 Jan 2022 16:39:04 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4] In-Reply-To: References: Message-ID: > Deprecated ExtendedDTraceProbes. > Edited help messages and man pages accordingly. > Added flag to VMDeprecatedOptions test. 
> Removed `/src/hotspot/share/services/dtraceAttacher.hpp`: only contained declarations that are never defined or used. > > Checked that tests are not affected. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: moved deprecated flag to deprecated section in manpages ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7110/files - new: https://git.openjdk.java.net/jdk/pull/7110/files/e936a0df..0f161b01 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7110&range=02-03 Stats: 18 lines in 1 file changed: 9 ins; 9 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7110.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7110/head:pull/7110 PR: https://git.openjdk.java.net/jdk/pull/7110 From duke at openjdk.java.net Tue Jan 18 16:39:07 2022 From: duke at openjdk.java.net (Emanuel Peter) Date: Tue, 18 Jan 2022 16:39:07 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v2] In-Reply-To: References: <3iQtzpQwG6LqOx6Vf7TBHtYkmGBEAK95IzLss6unghE=.b9070f3d-e025-4f74-a415-aa82e4d2289d@github.com> Message-ID: On Tue, 18 Jan 2022 13:13:01 GMT, David Holmes wrote: >> Emanuel Peter has updated the pull request incrementally with two additional commits since the last revision: >> >> - Mode of speech edit >> - Fix formatting issue > > src/java.base/share/man/java.1 line 2978: > >> 2976: .TP >> 2977: .B \f[CB]\-XX:+ExtendedDTraceProbes\f[R] >> 2978: Deprecated. Use combination of these flags instead: -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes, -XX:+DTraceMonitorProbes > > When a flag is deprecated we move it to the "Deprecated Java Options" section of the manpage. 
Thank you @dholmes-ora ------------- PR: https://git.openjdk.java.net/jdk/pull/7110 From tschatzl at openjdk.java.net Tue Jan 18 16:47:26 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 18 Jan 2022 16:47:26 GMT Subject: RFR: 8280146: Parallel: Remove time log tag In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 14:09:53 GMT, Albert Mingkun Yang wrote: > Simple change of removing some unhelpful logs in Parallel GC. Then, `time` log tag becomes unused and is removed as well. > > Test: hotspot_gc Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7128 From sviswanathan at openjdk.java.net Tue Jan 18 17:52:35 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 18 Jan 2022 17:52:35 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > use movddup for 128-bit vectors Marked as reviewed by sviswanathan (Reviewer). Patch looks good to me. 
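A minimal sketch, not part of the patch under review, of the identity quoted above: `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. Flipping the sign bit of both operands maps unsigned order onto signed order, so a signed comparison then gives the unsigned result. The class and method names (`UnsignedCompareSketch`, `compareUnsignedViaXor`) are hypothetical and chosen only for illustration; the sketch relies only on the standard `Integer.compare`, `Long.compare`, and `Integer.compareUnsigned` methods.

```java
// Illustrative only: scalar check of the sign-bit-flip identity.
// The class and method names are assumptions, not names from the PR.
final class UnsignedCompareSketch {

    // Unsigned int comparison: flip the sign bit, then compare as signed.
    static int compareUnsignedViaXor(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    // Same identity for long operands.
    static int compareUnsignedViaXor(long x, long y) {
        return Long.compare(x ^ Long.MIN_VALUE, y ^ Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, -42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                // Compare signs only, since compare methods promise a sign, not a magnitude.
                int expected = Integer.signum(Integer.compareUnsigned(x, y));
                int actual = Integer.signum(compareUnsignedViaXor(x, y));
                if (expected != actual) {
                    throw new AssertionError("mismatch for " + x + " and " + y);
                }
            }
        }
        System.out.println("identity holds for all sample pairs");
    }
}
```

The same trick applies lane-wise in a vector setting: XOR each lane with the minimum value of its element type, then use the signed compare that the hardware already provides.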
------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From sviswanathan at openjdk.java.net Tue Jan 18 17:52:36 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Tue, 18 Jan 2022 17:52:36 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: <3g_0wswqxMZETfxdxnoVorNke-TruQ3-CeoaEwg_EfQ=.462147d4-7ae3-4f24-b4a4-e58b5cd22fa3@github.com> On Sun, 16 Jan 2022 08:04:31 GMT, Quan Anh Mai wrote: >> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3254: >> >>> 3252: vpcmpCC(dst, nds, src, eq_cond_enc, width, vector_len); >>> 3253: vallones(xtmp, vector_len); >>> 3254: vpxor(dst, xtmp, dst, vector_len); >> >> This would add extra overhead of doing vallones every time versus what we had before. > > Thanks a lot for the review. > uiCA shows that both result in 3 uops being executed if the all ones is reachable. If the external address is not reachable however, an extra `mov` instruction would need to be emitted in the current approach, leading to 1 extra uop. > Trying `_mm256_xor_si256(src, _mm256_set1_epi32(-1))` it seems that both gcc and clang use `vcmpeqd`. @merykitty Thanks for the explanation. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From foths.kounelhs at gmail.com Tue Jan 18 17:55:00 2022 From: foths.kounelhs at gmail.com (Fotios Kounelis) Date: Tue, 18 Jan 2022 17:55:00 +0000 Subject: jvmci return array Message-ID: Hello, I am trying to create a new native function inside jvmciRuntime.cpp. I want this function to return an integer array. While I have found on JNI a similar example with jintArray and SetIntArrayRegion() function, in the jvmci, the API is different. I was able to return an integer and read it in the GraalVM compiler but I am struggling with an array. Would the program for jvmci be similar to the JNI? 
If yes, which is the equivalent of SetIntArrayRegion() for this API, assuming I have the code below? jintArray result; JRT_BLOCK_ENTRY(jintArray , JVMCIRuntime::object_hash_get(JavaThread * thread, jint * ar1)) JRT_BLOCK; jintArray result; result = oopFactory::new_intArray(valueArraySize, CHECK_0); int* valueArray; //this is the array that contains the data to fill result // fill result with values in the thread return result; JRT_BLOCK_END; JRT_END Otherwise, could you give me an example of how to return an int array? Best regards, Fotis From hseigel at openjdk.java.net Tue Jan 18 18:45:27 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 18 Jan 2022 18:45:27 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 16:39:04 GMT, Emanuel Peter wrote: >> Deprecated ExtendedDTraceProbes. >> Edited help messages and man pages accordingly. >> Added flag to VMDeprecatedOptions test. >> Removed `/src/hotspot/share/services/dtraceAttacher.hpp`: only contained declarations that are never defined or used. >> >> Checked that tests are not affected. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > moved deprecated flag to deprecated section in manpages Can you replace the use of -XX:+ExtendedDTraceProbes in test/hotspot/jtreg/serviceability/7170638/SDTProbesGNULinuxTest.java with the three new flags? ------------- PR: https://git.openjdk.java.net/jdk/pull/7110 From vladimir.kozlov at oracle.com Tue Jan 18 19:18:53 2022 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 18 Jan 2022 11:18:53 -0800 Subject: jvmci return array In-Reply-To: References: Message-ID: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> CCing to Doug, JVMCI expert. Thanks, Vladimir K On 1/18/22 9:55 AM, Fotios Kounelis wrote: > Hello, > > I am trying to create a new native function inside jvmciRuntime.cpp.
I want this function to return an integer array. > While I have found on JNI a similar example with jintArray and SetIntArrayRegion() function, in the jvmci, the API is > different. > > I was able to return an integer and read it in the GraalVM compiler but I am struggling with an array. Would the program > for jvmci be similar to the JNI? If yes, which is the equivalent of SetIntArrayRegion() for this API, assuming I have > the code below? > > jintArray result; > JRT_BLOCK_ENTRY(jintArray , JVMCIRuntime::object_hash_get(JavaThread * thread, jint * ar1)) > JRT_BLOCK; > jintArray result; > result = oopFactory::new_intArray(valueArraySize, CHECK_0); > int* valueArray; //this is the array that contains the data to fill result > // fill result with values in the thread > return result; > JRT_BLOCK_END; > JRT_END > > > Otherwise, could you give me an example of how to return an int array? > > Best regards, > > Fotis > From kvn at openjdk.java.net Tue Jan 18 19:34:36 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 18 Jan 2022 19:34:36 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. >> >> Thank you very much.
> > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > use movddup for 128-bit vectors I submitted testing. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From cjplummer at openjdk.java.net Tue Jan 18 19:46:28 2022 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Tue, 18 Jan 2022 19:46:28 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Tue, 18 Jan 2022 02:50:06 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > 8275775 Add VM.classes to print details of all classes It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From iwalulya at openjdk.java.net Tue Jan 18 19:48:23 2022 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 18 Jan 2022 19:48:23 GMT Subject: RFR: 8280146: Parallel: Remove time log tag In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 14:09:53 GMT, Albert Mingkun Yang wrote: > Simple change of removing some unhelpful logs in Parallel GC. 
Then, `time` log tag becomes unused and is removed as well. > > Test: hotspot_gc Marked as reviewed by iwalulya (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7128 From tom.rodriguez at oracle.com Tue Jan 18 21:15:59 2022 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 18 Jan 2022 21:15:59 +0000 Subject: jvmci return array In-Reply-To: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> References: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> Message-ID: > On Jan 18, 2022, at 11:18 AM, Vladimir Kozlov wrote: > > CCing to Doug, JVMCI expert. > > Thanks, > Vladimir K > > On 1/18/22 9:55 AM, Fotios Kounelis wrote: >> Hello, >> I am trying to create a new native function inside jvmciRuntime.cpp. I want this function to return an integer array. While I have found on JNI a similar example with jintArray and SetIntArrayRegion() function, in the jvmci, the API is different. >> I was able to return an integer and read it in the GraalVM compiler but I am struggling with an array. Would the program for jvmci be similar to the JNI? If yes, which is the equivalent of SetIntArrayRegion() for this API, assuming I have the code below? >> jintArray result; >> JRT_BLOCK_ENTRY(jintArray , JVMCIRuntime::object_hash_get(JavaThread * thread, jint * ar1)) >> JRT_BLOCK; >> jintArray result; >> result = oopFactory::new_intArray(valueArraySize, CHECK_0); >> int* valueArray; //this is the array that contains the data to fill result >> // fill result with values in the thread >> return result; >> JRT_BLOCK_END; >> JRT_END >> Otherwise, could you give me an example of how to return an int array? I'm not quite clear what you're trying to do here but you're mixing JNI concepts with HotSpot internals. jintArray is a JNI return type but new_intArray returns a typeArrayOop which is the HotSpot internal representation of a primitive array.
To fill the values into the result you'll need to use the int_at_put methods with manually computed offsets or you could possibly copy the logic from the Set*ArrayRegion macros implemented in jni.cpp. Also to return an oop from a function like this you need to pass it out using JavaThread::set_vm_result which requires proper unpacking logic by a stub caller. The real return type would then be void. Look for other uses of set_vm_result in that file for examples. tom >> Best regards, >> Fotis From dholmes at openjdk.java.net Tue Jan 18 21:35:23 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 18 Jan 2022 21:35:23 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v2] In-Reply-To: References: <3iQtzpQwG6LqOx6Vf7TBHtYkmGBEAK95IzLss6unghE=.b9070f3d-e025-4f74-a415-aa82e4d2289d@github.com> Message-ID: On Tue, 18 Jan 2022 16:17:43 GMT, Emanuel Peter wrote: >> @dholmes-ora : The warning is actually not already generated, since this flag is handled separately, the handling is only done if none of the cases matches, and we land in the `} else if (match_option(option, "-XX:", &tail)) { // -XX:xxxx` case. >> I had to add the warning separately now. I think this is a bug though, @TobiHartmann thought so too. All options should first be checked if they are deprecated. > > Thank you very much, I added the flag to the Test. @eme64 thanks for that correction - yes I thought we checked all args up front and then processed the specialized logic. The opposite does make some sense though as it means we can use the table and do specialized handling. When the flag is obsoleted we delete the specialized handling.
------------- PR: https://git.openjdk.java.net/jdk/pull/7110 From coleenp at openjdk.java.net Tue Jan 18 21:57:34 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 18 Jan 2022 21:57:34 GMT Subject: RFR: 8248404: AArch64: Remove uses of long and unsigned long [v13] In-Reply-To: References: Message-ID: On Mon, 17 Jan 2022 17:46:58 GMT, Coleen Phillimore wrote: >> Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Revert macroAssembler_aarch64.cpp Thank you for the help and reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From coleenp at openjdk.java.net Tue Jan 18 22:01:39 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 18 Jan 2022 22:01:39 GMT Subject: Integrated: 8248404: AArch64: Remove uses of long and unsigned long In-Reply-To: References: Message-ID: On Tue, 11 Jan 2022 01:46:40 GMT, Coleen Phillimore wrote: > Tested with mach5 on linux-aarch64 and macosx-aarch64 on tier1-3 and below GHA for windows-aarch64 (once I open this PR). This pull request has now been integrated. Changeset: 1a206287 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/1a206287576ec55d50d33c68b54647efc7fe32b0 Stats: 30 lines in 4 files changed: 4 ins; 14 del; 12 mod 8248404: AArch64: Remove uses of long and unsigned long Reviewed-by: kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/7023 From foths.kounelhs at gmail.com Tue Jan 18 22:31:47 2022 From: foths.kounelhs at gmail.com (Fotios Kounelis) Date: Tue, 18 Jan 2022 22:31:47 +0000 Subject: jvmci return array In-Reply-To: References: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> Message-ID: <2122c376-37e5-3cf2-68d0-42cf70fc113b@gmail.com> Hi Tom, Thank you for your reply. My bad example was an attempt to use jni in the HotSpot internals. 
What I am trying to do is: a function returning an int[] array, filled with values of another existing array. So, my function 1) needs to return void 2) have an oop obj initialized with new_intArray 3) fill the values using int_at_put(0...size-1, arrayWithValues[0...size-1]) and 4) use the set_vm_result() to return my array. I am not sure what you meant by "proper unpacking logic by a stub caller" for the set_vm_result, as in the same file the common use is something like "thread->set_vm_result(obj)", initializing just the oop obj properly with a new_typeArray. I would appreciate a quick example in case the above is not how it should work. Thank you for your time! I couldn't find (or maybe I didn't understand) the set*ArrayRegion initialization in the jni.cpp with the macros, that is why I didn't mention it above. Best regards, Fotis On 18/01/2022 21:15, Tom Rodriguez wrote: > proper unpacking logic by a stub caller From doug.simon at oracle.com Tue Jan 18 22:40:52 2022 From: doug.simon at oracle.com (Douglas Simon) Date: Tue, 18 Jan 2022 22:40:52 +0000 Subject: [External] : Re: jvmci return array In-Reply-To: <2122c376-37e5-3cf2-68d0-42cf70fc113b@gmail.com> References: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> <2122c376-37e5-3cf2-68d0-42cf70fc113b@gmail.com> Message-ID: <32053EF1-60D5-4C63-BF3F-4350030157AE@oracle.com> I think we're going to be able to better help if you put up a PR on GitHub (preferably a PR on your personal fork of https://github.com/openjdk/jdk). That way, there's no need to guess at missing details of your problem. -Doug > On 19 Jan 2022, at 08:31, Fotios Kounelis wrote: > > Hi Tom, > > Thank you for your reply. My bad example was an attempt to use jni in the HotSpot internals. What I am trying to do is: a function returning an int[] array, filled with values of another existing array.
> > So, my function 1) needs to return void 2) have an oop obj initialized with new_intArray 3) fill the values using int_at_put(0...size-1, arrayWithValues[0...size-1]) and 4) use the set_vm_result() to return my array. > > I am not sure what you meant by "proper unpacking logic by a stub caller" for the set_vm_result, as in the same file the common use is something like "thread->set_vm_result(obj)", initializing just the oop obj properly with a new_typeArray. > > I would appreciate a quick example in case the above is not how it should work. > > Thank you for your time! > > I couldn't find (or maybe I didn't understand) the set*ArrayRegion initialization in the jni.cpp with the macros, that is why I didn't mention it above. > > Best regards, > > Fotis > > On 18/01/2022 21:15, Tom Rodriguez wrote: >> proper unpacking logic by a stub caller From cushon at openjdk.java.net Tue Jan 18 23:31:55 2022 From: cushon at openjdk.java.net (Liam Miller-Cushon) Date: Tue, 18 Jan 2022 23:31:55 GMT Subject: RFR: 8280182: HotSpot Style Guide has stale link to chromium style guide Message-ID: Update links to the chromium style guide in the HotSpot Style Guide.
------------- Commit messages: - 8280182: HotSpot Style Guide has stale link to chromium style guide Changes: https://git.openjdk.java.net/jdk/pull/7138/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7138&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280182 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7138.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7138/head:pull/7138 PR: https://git.openjdk.java.net/jdk/pull/7138 From duke at openjdk.java.net Tue Jan 18 23:32:59 2022 From: duke at openjdk.java.net (Yi-Fan Tsai) Date: Tue, 18 Jan 2022 23:32:59 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase Message-ID: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase ------------- Commit messages: - Limit the change local - Merge branch 'openjdk:master' into pushpop - 8278036: Remove redundant push/pop in verify_heapbase Changes: https://git.openjdk.java.net/jdk/pull/7067/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7067&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8278036 Stats: 15 lines in 2 files changed: 11 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7067.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7067/head:pull/7067 PR: https://git.openjdk.java.net/jdk/pull/7067 From coleenp at openjdk.java.net Tue Jan 18 23:53:21 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 18 Jan 2022 23:53:21 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:00:50 GMT, Harold Seigel wrote: > Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. 
> > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Looks good to me. Seems like returning NULL for an impossible case is ok, since that's what UnPark does. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7129 From dholmes at openjdk.java.net Wed Jan 19 01:28:33 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 01:28:33 GMT Subject: RFR: 8278423: ExtendedDTraceProbes should be deprecated [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 16:39:04 GMT, Emanuel Peter wrote: >> Deprecated ExtendedDTraceProbes. >> Edited help messages and man pages accordingly. >> Added flag to VMDeprecatedOptions test. >> Removed `/src/hotspot/share/services/dtraceAttacher.hpp`: only contained declarations that are never defined or used. >> >> Checked that tests are not affected. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > moved deprecated flag to deprecated section in manpages A few more tweaks below. Thanks, David src/hotspot/os/aix/attachListener_aix.cpp line 31: > 29: #include "runtime/os.inline.hpp" > 30: #include "services/attachListener.hpp" > 31: #include "services/dtraceAttacher.hpp" These changes are somewhat independent of the deprecation issue and could be split out into a separate RFE. The serviceability folk may have an opinion. src/hotspot/share/runtime/arguments.cpp line 2884: > 2882: #if defined(DTRACE_ENABLED) > 2883: warning("Option ExtendedDTraceProbes was deprecated in version 19 and will likely be removed in a future release."); > 2884: warning("Use a combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead."); s/a/the/ Applies to all three uses. src/java.base/share/man/java.1 line 4001: > 3999: .TP > 4000: .B \f[CB]\-XX:+ExtendedDTraceProbes\f[R] > 4001: Deprecated. 
Use combination of these flags instead: -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes, -XX:+DTraceMonitorProbes Delete "Deprecated" as we are in the deprecated options section. The wording also needs updating as per the warning text ... though that might read a little odd here so I suggest a tweak: Use the combination of -XX:+DTraceMethodProbes, -XX:+DTraceAllocProbes and -XX:+DTraceMonitorProbes instead of this deprecated flag. I would also move that new text to the end, so we still describe the flag first (otherwise it again reads a little odd.) We will also need to add those flags to the "ADVANCED SERVICEABILITY OPTIONS FOR JAVA" section. ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7110 From yyang at openjdk.java.net Wed Jan 19 02:24:32 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 19 Jan 2022 02:24:32 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: <3zl-jNbxBp_52T2RfuGIpJdslh3GLggDsNoqjqPiN3c=.0009ba39-59ed-46f2-9e0a-f4e1d190ac02@github.com> On Tue, 18 Jan 2022 03:04:15 GMT, David Holmes wrote: >> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8275775 Add VM.classes to print details of all classes > > src/hotspot/share/oops/instanceKlass.cpp line 2103: > >> 2101: if (k->has_final_method()) buf[i++] = 'f'; >> 2102: if (k->has_vanilla_constructor()) buf[i++] = 'V'; >> 2103: if (k->is_instance_klass()) { > > Don't the properties queried in L2100 to L2102 only apply to instance classes? 
These methods belong to `Klass` > src/hotspot/share/services/diagnosticCommand.hpp line 873: > >> 871: } >> 872: static const char* impact() { >> 873: return "Medium: Depends on Java content."; > > I would think impact is High due to the number of classes. Thanks for reviews! Since ClassHierarchyDCmd uses `"Medium: Depends on number of loaded classes."`, so I'm going to change the impact description but keeping as `Medium` level. Now it looks like: KlassAddr Size State Flags LoaderName ClassName 0x0000000800c0b400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 0x0000000800c0b000 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 0x0000000800c0ac00 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 0x0000000800c0a800 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a800 0x0000000800c0a400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a400 ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Wed Jan 19 02:32:32 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 19 Jan 2022 02:32:32 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Tue, 18 Jan 2022 03:10:11 GMT, David Holmes wrote: >> Yi Yang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> 8275775 Add VM.classes to print details of all classes > > src/hotspot/share/oops/instanceKlass.cpp line 2100: > >> 2098: char buf[10]; >> 2099: int i = 0; >> 2100: if (k->has_finalizer()) buf[i++] = 'F'; > > Where is the meaning of these flags documented? 
I don't find a proper place to document these flags, do you have any suggestions? I do think we can output flag explanations as well, but it looks somewhat strange.. Flags: V=..., W=... KlassAddr Size State Flags LoaderName ClassName 0x0000000800c0b400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 0x0000000800c0b000 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 0x0000000800c0ac00 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 0x0000000800c0a800 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a800 0x0000000800c0a400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a400 ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Wed Jan 19 02:37:10 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 19 Jan 2022 02:37:10 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v3] In-Reply-To: References: Message-ID: > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. 
jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 
'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... 
Yi Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/80a3d22b..7f0bdd23 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=01-02 Stats: 27 lines in 2 files changed: 0 ins; 11 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Wed Jan 19 02:45:26 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 19 Jan 2022 02:45:26 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Tue, 18 Jan 2022 19:43:12 GMT, Chris Plummer wrote: > It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, and `jcmd VM.class XX` is not quite right either, because we want to print all classes by default (`jcmd VM.class`). An alternative for now is to use `jcmd VM.classes verbose | grep XX`.
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From cjplummer at openjdk.java.net Wed Jan 19 02:53:32 2022 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Wed, 19 Jan 2022 02:53:32 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 02:42:13 GMT, Yi Yang wrote: > > It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. > > This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, `jcmd VM.class XX` is also not much proper, because we desire to print all classes in default(`jcmd VM.class`). an alternative is to use `jcmd VM.classes verbose | grep XX` currently. I was thinking the syntax would look like: `jcmd VM.classes [verbose [classname]]` Your grep solution doesn't work because each class has multiple lines of output. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From kbarrett at openjdk.java.net Wed Jan 19 04:47:36 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 19 Jan 2022 04:47:36 GMT Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: On Mon, 17 Jan 2022 08:23:37 GMT, Kim Barrett wrote: > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. > > Testing: > mach5 tier1 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. 
After discussion with Thomas, I'm going to withdraw this PR and defer the fix to JDK 19 rather than try to rush things. ------------- PR: https://git.openjdk.java.net/jdk18/pull/106 From kbarrett at openjdk.java.net Wed Jan 19 04:47:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 19 Jan 2022 04:47:37 GMT Subject: [jdk18] RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: <4KHnaiabB-z96_si_LaCLN18L7I2vCYOMZSJ-vvqTSY=.40e781d3-fd29-4e39-b85d-673386fc26a0@github.com> References: <4KHnaiabB-z96_si_LaCLN18L7I2vCYOMZSJ-vvqTSY=.40e781d3-fd29-4e39-b85d-673386fc26a0@github.com> Message-ID: On Tue, 18 Jan 2022 09:53:22 GMT, Thomas Schatzl wrote: >> Please review this improvement to NonblockingQueue::try_pop. The old code >> returned an indication that the queue was empty in some cases where that >> wasn't true. In particular, contending try_pop operations could result in >> some incorrectly indicating empty. The change fixes that and improves the >> interaction between contending try_pops. >> >> Testing: >> mach5 tier1 >> >> Lots of testing of this change in conjunction with others as part of >> investigating and fixing JDK-8273383. > > src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 144: > >> 142: // (1) next_node is the extension of the queue's list. >> 143: // (2) next_node is NULL, because a competing try_pop took result. >> 144: // (3) next_node is the extension of some unrelated list, because a > > Would it be possible to name these cases a/b/c instead of numbering them? The comments below also refer to these subcases as 1a-1c. > > Cases a) and c) in this list seem to be in a different order than in the code, would be nice if they matched (just exchange). > > I do not understand (a linguistic problem) why for case 3, `next_node` is the extension of some "unrelated" list, i.e. I do not understand the use of the adjective "unrelated" here, probably taking the word too literally. 
> > The `result` to be popped must be in the same list as the list other threads pop from (e.g. in this case we always pop from the `_head` of the completed buffers list, racing with threads popping from the `_head` of the same list. There does not seem to be a way to compete with other, different lists here). Even if we got some element of a sub-list of the original list, it and its sublists seem still "related". The labeling is intentionally different. The two groups are not congruent. I thought there was enough commentary there to make things clear, but apparently not. ------------- PR: https://git.openjdk.java.net/jdk18/pull/106 From kbarrett at openjdk.java.net Wed Jan 19 04:47:37 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 19 Jan 2022 04:47:37 GMT Subject: [jdk18] Withdrawn: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: <6B_9eM1S6Rk86r0K2VU7tTc23fkojz3GKybEJPhecxg=.70ee9a86-7b50-4786-8066-1ef21a3a257a@github.com> On Mon, 17 Jan 2022 08:23:37 GMT, Kim Barrett wrote: > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. > > Testing: > mach5 tier1 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk18/pull/106 From iklam at openjdk.java.net Wed Jan 19 05:47:57 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 05:47:57 GMT Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime [v4] In-Reply-To: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> Message-ID: > **Background:** > > In the Java Language, Enums can be tested for equality, so the constants in an Enum type must be unique. Javac compiles an enum declaration like this: > > > public enum Day { SUNDAY, MONDAY ... } > > > to > > > public class Day extends java.lang.Enum { > public static final Day SUNDAY = new Day("SUNDAY"); > public static final Day MONDAY = new Day("MONDAY"); ... > } > > > With CDS archived heap objects, `Day::<clinit>` is executed twice: once during `java -Xshare:dump`, and once during normal JVM execution. If the archived heap objects reference one of the Enum constants created at dump time, we will violate the uniqueness requirements of the Enum constants at runtime. See the test case in the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731) > > **Fix:** > > During -Xshare:dump, if we discover that an Enum constant of type X is archived, we archive all constants of type X. At Runtime, type X will skip the normal execution of `X::<clinit>`. Instead, we run `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that were saved at dump time. > > This is safe as we know that `X::<clinit>` has no observable side effect -- it only creates the constants of type X, as well as the synthetic value `X::$VALUES`, which cannot be observed until X is fully initialized.
> > **Verification:** > > To avoid future problems, I added a new tool, CDSHeapVerifier, to look for similar problems where the archived heap objects reference a static field that may be recreated at runtime. There are some manual steps involved, but I analyzed the potential problems found by the tool and they are all safe (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example trace of this tool can be found at https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt > > **Testing:** > > Passed Oracle CI tiers 1-4. Will run tier 5 as well. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Use InstanceKlass::do_local_static_fields for some field iterations ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6653/files - new: https://git.openjdk.java.net/jdk/pull/6653/files/6e160057..e27d3523 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=02-03 Stats: 150 lines in 2 files changed: 82 ins; 59 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/6653.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653 PR: https://git.openjdk.java.net/jdk/pull/6653 From iklam at openjdk.java.net Wed Jan 19 05:48:08 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 05:48:08 GMT Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3] In-Reply-To: <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com> References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com> Message-ID: <4CLwCQdc_haGT_ueBQGZKzJVasGK26B6iYcO7VtOfAs=.02f3deb9-7ac7-45fd-9a7c-37b0fe4a8ea2@github.com> On Mon, 17 Jan 2022 18:36:35 GMT, Coleen Phillimore wrote: >> Ioi
Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'master' into 8275731-heapshared-enum >> - added exclusions needed by "java -Xshare:dump -ea -esa" >> - Comments from @calvinccheung off-line >> - 8275731: CDS archived enums objects are recreated at runtime > > src/hotspot/share/cds/cdsHeapVerifier.cpp line 165: > >> 163: >> 164: ResourceMark rm; >> 165: for (JavaFieldStream fs(ik); !fs.done(); fs.next()) { > > Can this call instead > void InstanceKlass::do_local_static_fields(void f(fieldDescriptor*, Handle, TRAPS), Handle mirror, TRAPS) { > and have this next few lines in the function? I moved the code inside a new class CDSHeapVerifier::CheckStaticFields so I can call InstanceKlass::do_local_static_fields > src/hotspot/share/cds/cdsHeapVerifier.cpp line 254: > >> 252: InstanceKlass* ik = InstanceKlass::cast(k); >> 253: for (JavaFieldStream fs(ik); !fs.done(); fs.next()) { >> 254: if (!fs.access_flags().is_static()) { > > same here. It only saves a couple of lines but then you can have the function outside this large function. You actually found a bug here. I am iterating over non-static fields and should walk the inherited fields as well. I changed the code to call InstanceKlass::do_nonstatic_fields() > src/hotspot/share/cds/cdsHeapVerifier.hpp line 52: > >> 50: mtClassShared, >> 51: HeapShared::oop_hash> _table; >> 52: > > Is this only used inside cdsHeapVerifier? if so it should be in the .cpp file. There's also an ArchiveableStaticFieldInfo. Not sure how they are related. This `_table` is part of the CDSHeapVerifier instance, which is stack allocated. So I need to declare it as part of the CDSHeapVerifier class declaration in the hpp file. 
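The refactoring described in the replies above (moving a loop body into a closure class so that a field-iteration helper can drive it) can be sketched in standalone form. This is a toy model under assumptions: `ToyField`, `ToyFieldClosure`, `toy_do_local_static_fields` and `CollectStaticNames` are hypothetical names that only mimic the shape of the code under review, not the real `InstanceKlass` API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model; these types only mimic the shape of the code under review.
struct ToyField {
  std::string name;
  bool is_static;
};

// Closure interface: the helper calls do_field() once per matching field.
class ToyFieldClosure {
 public:
  virtual void do_field(const ToyField& f) = 0;
  virtual ~ToyFieldClosure() = default;
};

// The helper owns the loop and the static/non-static filter,
// so callers no longer repeat either of them.
void toy_do_local_static_fields(const std::vector<ToyField>& fields,
                                ToyFieldClosure* cl) {
  for (const ToyField& f : fields) {
    if (f.is_static) {
      cl->do_field(f);
    }
  }
}

// The former loop body lives here; state that used to be local
// variables around the loop becomes members of the closure.
class CollectStaticNames : public ToyFieldClosure {
 public:
  std::vector<std::string> names;
  void do_field(const ToyField& f) override { names.push_back(f.name); }
};
```

The cost of this style is visible in the last class: every variable the loop body touched has to become a member of the closure, which gets awkward when the body references many outer variables.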
> src/hotspot/share/cds/heapShared.cpp line 433: > >> 431: oop mirror = k->java_mirror(); >> 432: int i = 0; >> 433: for (JavaFieldStream fs(k); !fs.done(); fs.next()) { > > This seems like it should also use InstanceKlass::do_local_static_fields. Converting this to InstanceKlass::do_nonstatic_fields() is difficult because the loop body references 7 different variables declared outside of the loop. One thing I tried is to add a new version of do_nonstatic_fields2() that supports C++ lambdas. You can see my experiment from here: https://github.com/openjdk/jdk/compare/master...iklam:lambda-for-instanceklass-do_local_static_fields2?expand=1 I changed all my new code to use the do_nonstatic_fields2() function with lambda. > src/hotspot/share/cds/heapShared.cpp line 482: > >> 480: copy_open_objects(open_regions); >> 481: >> 482: CDSHeapVerifier::verify(); > > Should all this be DEBUG_ONLY ? I changed CDSHeapVerifier::verify() to a NOT_DEBUG_RETURN function. > src/hotspot/share/cds/heapShared.hpp line 236: > >> 234: oop _referrer; >> 235: oop _obj; >> 236: CachedOopInfo() :_subgraph_info(), _referrer(), _obj() {} > > Should these be initialized to nullptr? does this do this? These three fields are initialized with the default initializer (empty parenthesis) so they will be initialized to the null pointer. 
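The last answer relies on C++ value-initialization: a member named in the constructor's initializer list with empty parentheses is zero-initialized when it is a pointer or a scalar. A minimal sketch, assuming plain raw pointers; `Node` and `Info` are hypothetical stand-ins, and HotSpot's real field types are not modeled here.

```cpp
#include <cassert>

struct Node {};  // hypothetical placeholder type

// Sketch with raw pointers and an int; not the real CachedOopInfo.
struct Info {
  Node* _referrer;
  Node* _obj;
  int   _count;
  // Empty parentheses value-initialize each member:
  // pointers become null and the int becomes 0.
  Info() : _referrer(), _obj(), _count() {}
};
```

Leaving a member out of the initializer list would instead leave a pointer or int member with an indeterminate value, so the empty parentheses are doing real work here.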
------------- PR: https://git.openjdk.java.net/jdk/pull/6653 From iklam at openjdk.java.net Wed Jan 19 05:54:30 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 05:54:30 GMT Subject: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3] In-Reply-To: <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com> References: <9XdQFi_-JzM91ET0nN1gRCp8ZfMGBz1BwXglxqb8phg=.c643d5a5-b99a-4ce2-8616-9c1472e521b7@github.com> <7c6mh2-s3SkpfGG1WptyZsJjTfcDy1wX0Ll0713MLkU=.7df74a01-7ea5-49c1-9bda-f73798df3852@github.com> Message-ID: On Mon, 17 Jan 2022 19:22:23 GMT, Coleen Phillimore wrote: > I don't really know this code well enough to do a good code review. I had some comments though. Hi Coleen, thanks for taking a look. This PR has two major parts: 1. Check for inappropriate references to static fields. This is mainly done in cdsHeapVerifier.cpp. These checks don't affect the contents of the CDS archive. They just print out warnings if problems are found. 2. Special initialization of enum classes. Essentially if any instance of an enum class `X` is archived, then `X::<clinit>` will not be executed, and we'll take this path instead (in instanceKlass.cpp): // This is needed to ensure the consistency of the archived heap objects. if (has_archived_enum_objs()) { assert(is_shared(), "must be"); bool initialized = HeapShared::initialize_enum_klass(this, CHECK); if (initialized) { return; } } Could you check if (2) is correct?
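Part (2) can be illustrated with a toy model of why re-running the static initializer breaks identity and why reusing the archived constants does not. Everything below is an assumption made for illustration: `ToyConstant`, `ConstantRef` and `ToyEnumKlass` are hypothetical, shared pointers stand in for heap references, and only the name `has_archived_enum_objs` echoes the quoted snippet.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy model, not HotSpot code.
struct ToyConstant { std::string name; };
using ConstantRef = std::shared_ptr<ToyConstant>;

struct ToyEnumKlass {
  std::vector<ConstantRef> archived;   // constants saved at dump time
  std::vector<ConstantRef> constants;  // constants visible after init

  bool has_archived_enum_objs() const { return !archived.empty(); }

  // Stand-in for the static initializer: always allocates fresh objects.
  void run_static_initializer() {
    constants = {std::make_shared<ToyConstant>(ToyConstant{"SUNDAY"}),
                 std::make_shared<ToyConstant>(ToyConstant{"MONDAY"})};
  }

  // Mirrors the shape of the quoted check: reuse the archived
  // constants when present, otherwise initialize normally.
  void initialize() {
    if (has_archived_enum_objs()) {
      constants = archived;
      return;
    }
    run_static_initializer();
  }
};
```

In this model, an object that captured `SUNDAY` at dump time still compares identical to the runtime constant only when `initialize()` takes the archived branch; a fresh run of the initializer yields an equal-looking but distinct object.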
------------- PR: https://git.openjdk.java.net/jdk/pull/6653 From dholmes at openjdk.java.net Wed Jan 19 07:03:29 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 07:03:29 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: <3zl-jNbxBp_52T2RfuGIpJdslh3GLggDsNoqjqPiN3c=.0009ba39-59ed-46f2-9e0a-f4e1d190ac02@github.com> References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> <3zl-jNbxBp_52T2RfuGIpJdslh3GLggDsNoqjqPiN3c=.0009ba39-59ed-46f2-9e0a-f4e1d190ac02@github.com> Message-ID: On Wed, 19 Jan 2022 02:19:02 GMT, Yi Yang wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 2103: >> >>> 2101: if (k->has_final_method()) buf[i++] = 'f'; >>> 2102: if (k->has_vanilla_constructor()) buf[i++] = 'V'; >>> 2103: if (k->is_instance_klass()) { >> >> Don't the properties queried in L2100 to L2102 only apply to instance classes? > > These methods belong to `Klass` That is true (though I wonder if it should be) but the question remains can these ever be present on a non-instance class? Somewhat related can you show the output for an array class please. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Wed Jan 19 07:08:28 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 07:08:28 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 02:29:22 GMT, Yi Yang wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 2100: >> >>> 2098: char buf[10]; >>> 2099: int i = 0; >>> 2100: if (k->has_finalizer()) buf[i++] = 'F'; >> >> Where is the meaning of these flags documented? > > I don't find a proper place to document these flags, do you have any suggestions? 
> > I do think we can output flag explanations as well, but it looks somewhat strange.. > > Flags: V=..., W=... > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > 0x0000000800c0a800 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a800 > 0x0000000800c0a400 62 fully_initialized W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0a400 Printing a legend line would be good, but there should also be actual documentation in the help output I think. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From iklam at openjdk.java.net Wed Jan 19 07:21:27 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 07:21:27 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 02:50:16 GMT, Chris Plummer wrote: > > > It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. > > > > > > This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, `jcmd VM.class XX` is also not much proper, because we desire to print all classes in default(`jcmd VM.class`). an alternative is to use `jcmd VM.classes verbose | grep XX` currently. > > I was thinking the syntax would look like: `jcmd VM.classes [verbose [classname]]` > > Your grep solution doesn't work because each class has multiple lines of output. How about this: jcmd VM.classes -verbose classname classname ... 
-verbose is optional more than one classnames can be specified. if no classnames are specified, all classes are printed ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From dholmes at openjdk.java.net Wed Jan 19 07:31:25 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 07:31:25 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:00:50 GMT, Harold Seigel wrote: > Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. > > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Hi Harold, This is a false positive - you can only park the current thread hence it can never be NULL. Cheers, David ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From dholmes at openjdk.java.net Wed Jan 19 07:33:24 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 07:33:24 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's [v2] In-Reply-To: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> References: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> Message-ID: On Tue, 18 Jan 2022 15:49:08 GMT, Harold Seigel wrote: >> Please review this small change to call os:: API's. the changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change. 
>> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > restore abort() call Looks good! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7055 From mdoerr at openjdk.java.net Wed Jan 19 08:32:34 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 19 Jan 2022 08:32:34 GMT Subject: [jdk18] RFR: 8280155: [PPC64, s390] frame size checks are not yet correct In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:51:22 GMT, Martin Doerr wrote: > Fix frame size check and do null check earlier as described in the JBS issue. Thanks for the reviews! ------------- PR: https://git.openjdk.java.net/jdk18/pull/107 From mdoerr at openjdk.java.net Wed Jan 19 08:32:35 2022 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 19 Jan 2022 08:32:35 GMT Subject: [jdk18] Integrated: 8280155: [PPC64, s390] frame size checks are not yet correct In-Reply-To: References: Message-ID: <8CUlMMfPYbdJ7GXvnBk11phQ98AqAW4BW3q5W0CHhak=.ff1dfcb3-babf-48a5-a779-c916790393fe@github.com> On Tue, 18 Jan 2022 15:51:22 GMT, Martin Doerr wrote: > Fix frame size check and do null check earlier as described in the JBS issue. This pull request has now been integrated. 
Changeset: f37bfead Author: Martin Doerr URL: https://git.openjdk.java.net/jdk18/commit/f37bfeadcf036a75defc64ad7f4a9f5596cd7407 Stats: 11 lines in 3 files changed: 4 ins; 1 del; 6 mod 8280155: [PPC64, s390] frame size checks are not yet correct Reviewed-by: mbaesken, lucy ------------- PR: https://git.openjdk.java.net/jdk18/pull/107 From jbhateja at openjdk.java.net Wed Jan 19 08:38:26 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 19 Jan 2022 08:38:26 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Fri, 14 Jan 2022 12:08:33 GMT, Jatin Bhateja wrote: >> Can any C2 compiler expert help review this? I updated copyright year to 2022 and renamed a function in latest commit. > > Hi @pfustc , > Apologies for being late in my response over this, following is the performance data of JMH micro (included with the report) operating over vectors of various primitive types with and without optimization. > [http://cr.openjdk.java.net/~jbhateja/post_loop_multiversioning/perf_post_loop_multiversioning_CLX.xlsx](http://cr.openjdk.java.net/~jbhateja/post_loop_multiversioning/perf_post_loop_multiversioning_CLX.xlsx > ) > Observations: > - Data shows reduction in cycles , dynamic instruction count, branches with optimization. > - Addition of tail loop iteration has impact on JIT code size, this may effect other optimizations like procedure in-lining. > - Scores are better for sub-word types (byte and short) since they have relatively long tail. > > Best Regards, > Jatin > Hi @jatin-bhateja , > > Thank you for the performance data. I repeat your JMH tests on AVX-512 and have below comments. > > * JIT code size increases after PostLoopMultiversioning is enabled. It is true but not related to this PR. The increase is caused by creation of multi-versioned post loops. 
Hence, the code size still increases even if we don't vectorize the post loop. To get rid of this side effect, I think we may directly vectorize RCE'd post loop without doing the multiversioning (prevent generation of any scalar tail - I see you have mentioned this in JBS comments). That's an enhancement we can do next. > * JMH shows some obvious performance regression when loop iteration count is small. I do have reproduced this regression in my repeated tests on AVX-512. But I don't really understand why this could happen with reduced CPU cycles and reduced dynamic instruction count. I heard that AVX-512 CPUs may run with lower frequency when some SIMD instructions are executed[1]. Is this a cause of the regression? > > Please let me know if you have further comments. > > [1] https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency > > Thanks, Pengfei
Hi @pfustc , Some more observations: 1) Since SLP aligns vector operations w.r.t only one dst array, so other vector loads and stores may incur cache line split penalty. 2) If vector size is equal to cache line size (64 bytes) then un-aligned vector operations will have greater penalty. 3) Frequency penalty is associated with vector size, a sequence which is based on ZMM register will operate at reduced frequency on CLX and prior generations. So if post vector tail loop iteration which is a clone of atomic vector loop is based on ZMM vectors may show degraded performance in case we jump over it after pre-loop i.e. in case of small unknow array lengths. One can restrict vector size to 32 bytes using -XX:MaxVectorSize=32 to circumvent this. BTW why have you kept a constraint on the vector size of post tail loop to match MaxVectorSize ? Thanks, Jatin ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From xliu at openjdk.java.net Wed Jan 19 08:43:29 2022 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 19 Jan 2022 08:43:29 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 07:18:03 GMT, Ioi Lam wrote: > > > > It seems it would be useful to support the verbose output with just a single class that is specified, although that would suggest that the dcmd name should then be something other than `VM.classes`. > > > > > > > > > This is a good idea, but `jcmd VM.classes verbose=XX` looks strange, `jcmd VM.class XX` is also not much proper, because we desire to print all classes in default(`jcmd VM.class`). an alternative is to use `jcmd VM.classes verbose | grep XX` currently.
> > > > > > I was thinking the syntax would look like: `jcmd VM.classes [verbose [classname]]` > > Your grep solution doesn't work because each class has multiple lines of output. > > How about this: > > ``` > jcmd VM.classes -verbose classname classname ... > ``` > > -verbose is optional > > more than one classnames can be specified. > > if no classnames are specified, all classes are printed If the class name here means the "fully-qualified" class name, I guess it's not practical to input multiple classnames like "java.lang.invoke.LambdaForm$MH/0x0000000800c0b400" in cmdline. The main cost of VM_PrintClasses should be the traversal of all classes. I feel a filter won't save much runtime time. We can leave it to the external awk scripts. What do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From shade at openjdk.java.net Wed Jan 19 08:53:45 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 19 Jan 2022 08:53:45 GMT Subject: [jdk18] RFR: 8280234: AArch64 "core" variant does not build after JDK-8270947 Message-ID: Trivial (?) 
fix for AArch64 build failure: === Output from failing command(s) repeated here === * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: ------------- Commit messages: - Fix Changes: https://git.openjdk.java.net/jdk18/pull/108/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk18&pr=108&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280234 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk18/pull/108.diff Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/108/head:pull/108 PR: https://git.openjdk.java.net/jdk18/pull/108 From kbarrett at openjdk.java.net Wed Jan 19 09:02:29 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 19 Jan 2022 09:02:29 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's [v2] In-Reply-To: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> References: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> Message-ID: On Tue, 18 Jan 2022 15:49:08 GMT, Harold Seigel wrote: >> Please review this small change to call os:: API's. the changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > restore abort() call Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7055 From adinn at openjdk.java.net Wed Jan 19 09:40:28 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Wed, 19 Jan 2022 09:40:28 GMT Subject: [jdk18] RFR: 8280234: AArch64 "core" variant does not build after JDK-8270947 In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 08:46:56 GMT, Aleksey Shipilev wrote: > Trivial (?) fix for AArch64 build failure: > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: Trivially good to go! ------------- Marked as reviewed by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk18/pull/108 From aph at openjdk.java.net Wed Jan 19 09:53:36 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 19 Jan 2022 09:53:36 GMT Subject: [jdk18] RFR: 8280234: AArch64 "core" variant does not build after JDK-8270947 In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 08:46:56 GMT, Aleksey Shipilev wrote: > Trivial (?) fix for AArch64 build failure: > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: Hmm, I guess that's right. I do wonder whether Thread::is_Compiler_thread() is well-defined in a "core" build, but it does seem to be. ------------- PR: https://git.openjdk.java.net/jdk18/pull/108 From ddong at openjdk.java.net Wed Jan 19 09:58:08 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 19 Jan 2022 09:58:08 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v3] In-Reply-To: References: Message-ID: > Hi, > > I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. > > The following steps can quick reproduce the problem: > > 1. 
apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc)
>
> index 39e99bdd5ed..4fc768e94aa 100644
> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
>      __ store_klass_gap(r0, zr); // zero klass gap for compressed oops
>      __ store_klass(r0, r4); // store klass last
>
> +/**
>      {
>        SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
>        // Trigger dtrace event for fastpath
> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
>        __ pop(atos); // restore the return value
>
>      }
> +*/
>      __ b(done);
>    }
>
> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
> index 19530b7c57c..15b0509da4c 100644
> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
>      Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
>      __ store_klass(rax, rcx, tmp_store_klass); // klass
>
> +/**
>      {
>        SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
>        // Trigger dtrace event for fastpath
> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
>             CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax);
>        __ pop(atos);
>      }
> +*/
>
>      __ jmp(done);
>    }
> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
> index a5de65ea5ab..60b4bd3bcc8 100644
> --- a/src/hotspot/share/runtime/sharedRuntime.cpp
> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
>   * 6254741. Once that is fixed we can remove the dummy return value.
>   */
>  int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
> +  *(int*)0 = 1;
>    return dtrace_object_alloc(Thread::current(), o, o->size());
>  }
>
>
> 2.
`java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`
>
> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect.
>
> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong.
>
> After some investigation, I found that this problem is related to the layout of the stack.
>
> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame).
>
>
> push %rbp
> mov %rsp,%rbp
>
>           _ _ _ _ _ _
>          |           |
>          |           |                   |
>          |_ _ _ _ _ _|                   |
>          |           |                   |
>  caller  |           | <- caller sp      |
>  _ _ _   |_ _ _ _ _ _|                   | expand
>          |           |                   |
>          | ret addr  |                   | direction
>  callee  |_ _ _ _ _ _|                   |
>          |           |                   V
>          | caller fp | <- fp
>          |_ _ _ _ _ _|
>
>
>
> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).
>
> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>
>
> stp x29, x30, [sp, #-N]!
> mov x29, sp
>
>           _ _ _ _ _ _
>          |           |
>          |           |                   |
>          |_ _ _ _ _ _|                   |
>          |           |                   |
>  caller  |           | <- caller sp      |
>  _ _ _   |_ _ _ _ _ _|          -        | expand
>                                 |        |
>           . . . . .             |        | direction
>           _ _ _ _ _ _           |        |
>          |           |          | N      |
>          | ret addr  |          |        |
>  callee  |_ _ _ _ _ _|          |        |
>          |           |          -        V
>          | caller fp | <- fp
>          |_ _ _ _ _ _|
>
>
>
> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.
> > Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. > > Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. > Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. > > This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. > > Any input is appreciated. > > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: update ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6597/files - new: https://git.openjdk.java.net/jdk/pull/6597/files/0996bbe7..2342f438 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=01-02 Stats: 13 lines in 4 files changed: 3 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/6597.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6597/head:pull/6597 PR: https://git.openjdk.java.net/jdk/pull/6597 From ddong at openjdk.java.net Wed Jan 19 09:59:31 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 19 Jan 2022 09:59:31 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: On Fri, 14 Jan 2022 16:15:06 GMT, Denghui Dong wrote:
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > fix pfl() crash problem and rename from_thread to from_anchor Thank you. > and perhaps we need a different name which suggests that. sp_ok_to_use() or sp_is_trusted() or somesuch? We do at least need a comment which explains that unless this boolean is true, the SP value in a frame is basically garbage, although it will point to somewhere within the stack. With that change, this patch can be integrated. Good suggestion, updated. > In the longer term, I think we should look at using libunwind to obtain a precise native stack trace, and then we can get rid of all the old kludges. Totally agree, I will take a deep look at libunwind when I have time.
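The difference between the two frame layouts in the PR description can be illustrated with a small simulation. This is only a sketch under stated assumptions: the stack is modelled as a vector of 8-byte words, `enter_x86_style`, `enter_aarch64_style` and the two `sender_sp_*` helpers are hypothetical names, and none of this is HotSpot's `frame` code. It shows why `fp + 2` recovers the caller's sp after a `push %rbp; mov %rsp,%rbp` prologue, but not after `stp x29, x30, [sp, #-N]!; mov x29, sp`, where the frame size N is also needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack: one element per 8-byte word; a higher index is a higher address.
using Stack = std::vector<std::uint64_t>;

// Models "call; push %rbp; mov %rsp,%rbp". Returns the callee's fp (as a word index).
std::size_t enter_x86_style(Stack& s, std::size_t caller_sp,
                            std::size_t caller_fp, std::uint64_t ret_addr) {
  assert(caller_sp >= 2 && caller_sp <= s.size());
  s[caller_sp - 1] = ret_addr;   // pushed by the call instruction
  s[caller_sp - 2] = caller_fp;  // push %rbp
  return caller_sp - 2;          // mov %rsp,%rbp
}

// Models "bl; stp x29, x30, [sp, #-N]!; mov x29, sp" for a frame of frame_words words
// (N is in bytes in the real prologue; words are used here for simplicity).
std::size_t enter_aarch64_style(Stack& s, std::size_t caller_sp,
                                std::size_t caller_fp, std::uint64_t ret_addr,
                                std::size_t frame_words) {
  assert(frame_words >= 2 && frame_words <= caller_sp && caller_sp <= s.size());
  std::size_t fp = caller_sp - frame_words;  // sp after the pre-indexed store
  s[fp] = caller_fp;                         // x29
  s[fp + 1] = ret_addr;                      // x30
  return fp;                                 // mov x29, sp
}

// The fixed-offset rule: correct only when the record is the top of the callee frame.
std::size_t sender_sp_fixed_offset(std::size_t fp) {
  return fp + 2;
}

// What is needed when the record sits at the low end of the frame.
std::size_t sender_sp_with_frame_size(std::size_t fp, std::size_t frame_words) {
  return fp + frame_words;
}
```

In both layouts the saved fp and the return address can still be read at `fp` and `fp + 1`; what the fixed offset cannot recover in the second layout is the caller's sp.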
------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From aph at openjdk.java.net Wed Jan 19 10:52:32 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 19 Jan 2022 10:52:32 GMT Subject: [jdk18] RFR: 8280234: AArch64 "core" variant does not build after JDK-8270947 In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 08:46:56 GMT, Aleksey Shipilev wrote: > Trivial (?) fix for AArch64 build failure: > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk18/pull/108 From dholmes at openjdk.java.net Wed Jan 19 11:55:28 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 19 Jan 2022 11:55:28 GMT Subject: RFR: 8280182: HotSpot Style Guide has stale link to chromium style guide In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 23:09:12 GMT, Liam Miller-Cushon wrote: > Update links to the chromium style guide in the HotSpot Style Guide. Looks good. Thanks for noticing the problem and fixing it. David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7138 From shade at openjdk.java.net Wed Jan 19 12:04:25 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 19 Jan 2022 12:04:25 GMT Subject: [jdk18] RFR: 8280234: AArch64 "core" variant does not build after JDK-8270947 In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 08:46:56 GMT, Aleksey Shipilev wrote: > Trivial (?) fix for AArch64 build failure: > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: All right, AArch64 GHAs are clean, I am integrating under triviality rule. 
------------- PR: https://git.openjdk.java.net/jdk18/pull/108 From shade at openjdk.java.net Wed Jan 19 12:04:26 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 19 Jan 2022 12:04:26 GMT Subject: [jdk18] Integrated: 8280234: AArch64 "core" variant does not build after JDK-8270947 In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 08:46:56 GMT, Aleksey Shipilev wrote: > Trivial (?) fix for AArch64 build failure: > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-core_libjvm_objs_macroAssembler_aarch64.o: This pull request has now been integrated. Changeset: 28e02fa2 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk18/commit/28e02fa2cb40267136c88a507696ec3e610e95a3 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8280234: AArch64 "core" variant does not build after JDK-8270947 Reviewed-by: adinn, aph ------------- PR: https://git.openjdk.java.net/jdk18/pull/108 From foths.kounelhs at gmail.com Wed Jan 19 12:57:11 2022 From: foths.kounelhs at gmail.com (Fotios Kounelis) Date: Wed, 19 Jan 2022 12:57:11 +0000 Subject: [External] : Re: jvmci return array In-Reply-To: <32053EF1-60D5-4C63-BF3F-4350030157AE@oracle.com> References: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> <2122c376-37e5-3cf2-68d0-42cf70fc113b@gmail.com> <32053EF1-60D5-4C63-BF3F-4350030157AE@oracle.com> Message-ID: Hello, Please find the pull request in the following link: https://github.com/fotiskoun/jdk11u-dev/pulls As you can see in the commit tab, this pull request contains only the hash_get function, as the rest of my logic is already in the master tab. To quickly describe what my general changes in jvmciRuntime.cpp do: I want to create two native functions, a hash_put and a hash_get. The arguments of hash_put() are 2 int[] arrays and their sizes and the behavior is to add these arrays in a local hash map. 
Respectively, the hash_get() has an int[] array as argument and by searching the map, using the argument as key, returns an array from the map.

You could also have a look at hash_put for more and let me know if there is something wrong about it, too.

Best regards,
Fotis

On 18/01/2022 22:40, Douglas Simon wrote:
> I think we're going to be able to better help if you put up a PR on GitHub (preferably a PR on your personal fork of https://github.com/openjdk/jdk). That way, there's no need to guess at missing details of your problem.
>
> -Doug
>
>> On 19 Jan 2022, at 08:31, Fotios Kounelis wrote:
>>
>> Hi Tom,
>>
>> Thank you for your reply. My bad example was an attempt to use jni in the HotSpot internals. What I am trying to do is: a function returning an int[] array, filled with values of another existing array.
>>
>> So, my function 1) needs to return void 2) have an oop obj initialized with new_intArray 3) fill the values using int_at_put(0...size-1, arrayWithValues[0...size-1]) and 4) use the set_vm_result() to return my array.
>>
>> I am not sure what you meant by "proper unpacking logic by a stub caller" for the set_vm_result, as in the same file the common use is something like "thread->set_vm_result(obj)", initializing just the oop obj properly with a new_typeArray.
>>
>> I would appreciate a quick example in case the above is not how it should work.
>>
>> Thank you for your time!
>>
>> I couldn't find (or maybe I didn't understand) the set*ArrayRegion initialization in the jni.cpp with the macros, that is why I didn't mention it above.
>> >> Best regards, >> >> Fotis >> >> On 18/01/2022 21:15, Tom Rodriguez wrote: >>> proper unpacking logic by a stub caller From hseigel at openjdk.java.net Wed Jan 19 13:54:27 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 19 Jan 2022 13:54:27 GMT Subject: RFR: 8279936: Change shared code to use os:: system API's [v2] In-Reply-To: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> References: <9dd-4_5r5HM3_Dys_O3NVqOPdKdtnVYXpqWhnhLgFLU=.56a9a8e2-a8f7-4a6f-bd7c-333ce4276e83@github.com> Message-ID: On Tue, 18 Jan 2022 15:49:08 GMT, Harold Seigel wrote: >> Please review this small change to call os:: API's. the changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > restore abort() call Thanks David and Kim for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/7055 From hseigel at openjdk.java.net Wed Jan 19 13:54:27 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 19 Jan 2022 13:54:27 GMT Subject: Integrated: 8279936: Change shared code to use os:: system API's In-Reply-To: References: Message-ID: On Wed, 12 Jan 2022 21:56:18 GMT, Harold Seigel wrote: > Please review this small change to call os:: API's. the changes were tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Most of these changes were for I/O related calls. Changes to memory allocation calls such as malloc and free will be handled in a future change. > > Thanks, Harold This pull request has now been integrated. 
Changeset: 96114315 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/96114315cf91b03aeca7e12f225e4c76862f1be7 Stats: 23 lines in 8 files changed: 1 ins; 0 del; 22 mod 8279936: Change shared code to use os:: system API's Reviewed-by: dholmes, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/7055 From coleenp at openjdk.java.net Wed Jan 19 15:32:44 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 19 Jan 2022 15:32:44 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 07:28:20 GMT, David Holmes wrote: > This is a false positive - you can only park the current thread hence it can never be NULL. I don't think the static analysis tools can know that, but adding a comment and returning NULL seems to be a fix that can please both the static analysis tools and readers of the code. ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From iklam at openjdk.java.net Wed Jan 19 17:32:37 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 17:32:37 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v2] In-Reply-To: References: <_Pw-D6A2BD-4wx0mZ5lFvFlxBRylbA5WT9y5xgtDBvk=.fa85a604-e6a0-463b-8a4a-4ae7e210661a@github.com> Message-ID: On Wed, 19 Jan 2022 08:39:34 GMT, Xin Liu wrote: > > How about this: > > ``` > > jcmd VM.classes -verbose classname classname ... > > ``` > > > > -verbose is optional > > more than one classnames can be specified. > > if no classnames are specified, all classes are printed > > If the class name here means the "fully-qualified" class name, I guess it's not practical to input multiple classnames like "java.lang.invoke.LambdaForm$MH/0x0000000800c0b400" in cmdline. > > The main cost of VM_PrintClasses should be the traversal of all classes. I feel a filter won't save much runtime time. We can leave it to the external awk scripts. What do you think? 
That sounds fair. I think filtering is best left to external tools.

I think we should use `-verbose` as the optional argument, to be consistent with other jcmds such as `VM.symboltable`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7105

From jbhateja at openjdk.java.net Wed Jan 19 17:38:25 2022
From: jbhateja at openjdk.java.net (Jatin Bhateja)
Date: Wed, 19 Jan 2022 17:38:25 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v2]
In-Reply-To:
References:
Message-ID:

> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
> - Test creation using new IR testing framework.
>
> Following are the performance numbers of a JMH micro included with the patch
>
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>
> Benchmark | ARRAYLEN | Baseline AVX2 (ops/ms) | WithOpt AVX2 (ops/ms) | Gain (opt/baseline) | Baseline AVX3 (ops/ms) | WithOpt AVX3 (ops/ms) | Gain (opt/baseline)
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887
> FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697
> FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829
> FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8279508: Adding a test for scalar intrinsification.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7094/files - new: https://git.openjdk.java.net/jdk/pull/7094/files/0fe01504..575d2935 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=00-01 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7094.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094 PR: https://git.openjdk.java.net/jdk/pull/7094 From kvn at openjdk.java.net Wed Jan 19 18:47:57 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 19 Jan 2022 18:47:57 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > use movddup for 128-bit vectors Good. Testing passed. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/6966 From hseigel at openjdk.java.net Wed Jan 19 19:19:25 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 19 Jan 2022 19:19:25 GMT Subject: RFR: 8280178: Remove os:: API's that just call system API's Message-ID: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8280178: Remove os:: API's that just call system API's Changes: https://git.openjdk.java.net/jdk/pull/7145/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7145&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280178 Stats: 68 lines in 19 files changed: 0 ins; 23 del; 45 mod Patch: https://git.openjdk.java.net/jdk/pull/7145.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7145/head:pull/7145 PR: https://git.openjdk.java.net/jdk/pull/7145 From iklam at openjdk.java.net Wed Jan 19 19:39:02 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 19 Jan 2022 19:39:02 GMT Subject: RFR: 8280178: Remove os:: API's that just call system API's In-Reply-To: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> References: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Message-ID: On Wed, 19 Jan 2022 19:03:43 GMT, Harold Seigel wrote: > Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. 
> > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7145 From ccheung at openjdk.java.net Wed Jan 19 19:51:53 2022 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 19 Jan 2022 19:51:53 GMT Subject: RFR: 8280178: Remove os:: API's that just call system API's In-Reply-To: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> References: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Message-ID: On Wed, 19 Jan 2022 19:03:43 GMT, Harold Seigel wrote: > Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. > > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold LGTM ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7145 From hseigel at openjdk.java.net Wed Jan 19 19:56:21 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 19 Jan 2022 19:56:21 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2] In-Reply-To: References: Message-ID: > Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. > > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. 
> > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: add comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7129/files - new: https://git.openjdk.java.net/jdk/pull/7129/files/fcde3737..8e79c91a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7129&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7129&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7129.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7129/head:pull/7129 PR: https://git.openjdk.java.net/jdk/pull/7129 From cushon at openjdk.java.net Wed Jan 19 20:22:12 2022 From: cushon at openjdk.java.net (Liam Miller-Cushon) Date: Wed, 19 Jan 2022 20:22:12 GMT Subject: Integrated: 8280182: HotSpot Style Guide has stale link to chromium style guide In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 23:09:12 GMT, Liam Miller-Cushon wrote: > Update links to the chromium style guide in the HotSpot Style Guide. This pull request has now been integrated. Changeset: dac15efc Author: Liam Miller-Cushon URL: https://git.openjdk.java.net/jdk/commit/dac15efc1be8fe49d2f6365f9adfb31dc3ea74ba Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8280182: HotSpot Style Guide has stale link to chromium style guide Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7138 From xliu at openjdk.java.net Wed Jan 19 21:05:48 2022 From: xliu at openjdk.java.net (Xin Liu) Date: Wed, 19 Jan 2022 21:05:48 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 17:44:57 GMT, Yi-Fan Tsai wrote: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase LGTM. I am not a reviewer, so we still need other reviewer to approve it. ------------- Marked as reviewed by xliu (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/7067

From vlivanov at openjdk.java.net Wed Jan 19 21:37:50 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Wed, 19 Jan 2022 21:37:50 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jan 2022 17:38:25 GMT, Jatin Bhateja wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>>
>> Following are the performance number of a JMH micro included with the patch
>>
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>>
>> Benchmark | ARRAYLEN | Baseline AVX2 (ops/ms) | WithOpt AVX2 (ops/ms) | Gain (opt/baseline) | Baseline AVX3 (ops/ms) | WithOpt AVX3 (ops/ms) | Gain (opt/baseline)
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887
>> FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697
>> FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829
>> FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
>   8279508: Adding a test for scalar intrinsification.

There are already `RoundFloat`, `RoundDouble`, and `RoundDoubleMode` nodes defined. Though `RoundFloat` and `RoundDouble` are legacy nodes used only on x86-32, `RoundDoubleMode` supports multiple rounding modes and is amenable to auto-vectorization.
What do you think about the following alternative? Reuse `RoundDoubleMode` (with a new rounding mode) and introduce `RoundFloatMode`.

Special rounding rules are not the only peculiarity of `Math.round()`. It also converts the result to an integral type. It can be represented as `ConvF2I (RoundFloatMode f #rmode)` / `ConvD2L (RoundDoubleMode d #rmode)`. In the scalar case, it can be matched as a single AD instruction. The auto-vectorizer can then convert it to `VectorCastF2X (RoundFloatModeV vf #rmode)` / `VectorCastD2X (RoundDoubleModeV vd #rmode)` and match it in a similar manner.

test/hotspot/jtreg/compiler/c2/cr6340864/TestFloatVect.java line 33:

> 31: * @run main/othervm -Xbatch -XX:CompileCommand=exclude,*::test() -Xmx128m -XX:MaxVectorSize=16 compiler.c2.cr6340864.TestFloatVect
> 32: * @run main/othervm -Xbatch -XX:CompileCommand=exclude,*::test() -Xmx128m -XX:MaxVectorSize=32 compiler.c2.cr6340864.TestFloatVect
> 33: * @run main/othervm -Xbatch -XX:CompileCommand=exclude,*::test() -XX:TieredStopAtLevel=2 -Xmx128m -XX:MaxVectorSize=32 compiler.c2.cr6340864.TestFloatVect

What's the purpose of `-XX:TieredStopAtLevel=2` from a testing perspective?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From vlivanov at openjdk.java.net Wed Jan 19 21:59:49 2022
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Wed, 19 Jan 2022 21:59:49 GMT
Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3]
In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com>
References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com>
Message-ID:

On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote:

>> Hi,
>>
>> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`.
>>
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
>   use movddup for 128-bit vectors

Looks good.

src/hotspot/cpu/x86/x86.ad line 7383:

> 7381:
> 7382: if (vlen_enc == Assembler::AVX_128bit) {
> 7383: __ vmovddup($xtmp$$XMMRegister, flip_bit, vlen_enc, $scratch$$Register);

Scratch register is not needed here. `InternalAddress` should always be reachable.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6966

From darcy at openjdk.java.net Wed Jan 19 22:12:51 2022
From: darcy at openjdk.java.net (Joe Darcy)
Date: Wed, 19 Jan 2022 22:12:51 GMT
Subject: RFR: 8279508: Auto-vectorize Math.round API [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jan 2022 17:38:25 GMT, Jatin Bhateja wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>>
>> Following are the performance numbers of a JMH micro included with the patch
>>
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>>
>> Benchmark | ARRAYLEN | Baseline AVX2 (ops/ms) | WithOpt AVX2 (ops/ms) | Gain (opt/baseline) | Baseline AVX3 (ops/ms) | WithOpt AVX3 (ops/ms) | Gain (opt/baseline)
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887
>> FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697
>> FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829
>> FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
>   8279508: Adding a test for scalar intrinsification.

The testing for this PR doesn't look adequate to me. I don't see any testing for the values where the behavior of round has been redefined at points in the last decade. See JDK-8010430 and JDK-6430675, both of which have regression tests in the core libs area.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From dholmes at openjdk.java.net Wed Jan 19 22:43:49 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 19 Jan 2022 22:43:49 GMT
Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jan 2022 19:56:21 GMT, Harold Seigel wrote:

>> Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated.
>>
>> The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64.
>>
>> Thanks, Harold
>
> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision:
>
>   add comment

When the static analysis tool is wrong we ignore it and give feedback to the tool authors.
We don't pollute our code with unnecessary checks for impossible situations. We have hit this many times in the past with parfait because it could not track state across the Java->VM boundary, for example. ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From kbarrett at openjdk.java.net Wed Jan 19 22:50:14 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 19 Jan 2022 22:50:14 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty Message-ID: Please review this improvement to NonblockingQueue::try_pop. The old code returned an indication that the queue was empty in some cases where that wasn't true. In particular, contending try_pop operations could result in some incorrectly indicating empty. The change fixes that and improves the interaction between contending try_pops. Testing: mach5 tier1-3 Lots of testing of this change in conjunction with others as part of investigating and fixing JDK-8273383. ------------- Commit messages: - fix Changes: https://git.openjdk.java.net/jdk/pull/7149/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7149&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279294 Stats: 44 lines in 1 file changed: 23 ins; 3 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/7149.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7149/head:pull/7149 PR: https://git.openjdk.java.net/jdk/pull/7149 From coleenp at openjdk.java.net Wed Jan 19 23:27:16 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 19 Jan 2022 23:27:16 GMT Subject: RFR: 8276472: align_metadata_size is a nop Message-ID: I just added a comment to this and took out the align_up call. I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. Tested with tier1 on Oracle supported platforms. 
------------- Commit messages: - 8276472: align_metadata_size is a nop Changes: https://git.openjdk.java.net/jdk/pull/7150/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7150&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8276472 Stats: 8 lines in 2 files changed: 1 ins; 3 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7150.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7150/head:pull/7150 PR: https://git.openjdk.java.net/jdk/pull/7150 From jwilhelm at openjdk.java.net Thu Jan 20 00:40:42 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 20 Jan 2022 00:40:42 GMT Subject: RFR: Merge jdk18 Message-ID: Forwardport JDK 18 -> JDK 19 ------------- Commit messages: - Merge - 8280233: Temporarily disable Unix domain sockets in Windows PipeImpl - 8278834: Error "Cannot read field "sym" because "this.lvar[od]" is null" when compiling - 8272058: 25 Null pointer dereference defect groups in 4 files - 8280234: AArch64 "core" variant does not build after JDK-8270947 - 8280155: [PPC64, s390] frame size checks are not yet correct - 8273383: vmTestbase/vm/gc/containers/Combination05/TestDescription.java crashes verifying length of DCQS - 8279654: jdk/incubator/vector/Vector256ConversionTests.java crashes randomly with SVE - 8278417: Closed test fails after JDK-8276108 on aarch64 - 8274096: Improve decoding of image files - ... 
and 30 more: https://git.openjdk.java.net/jdk/compare/98d96a77...e0d83a07 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=7151&range=00.0 - jdk18: https://webrevs.openjdk.java.net/?repo=jdk&pr=7151&range=00.1 Changes: https://git.openjdk.java.net/jdk/pull/7151/files Stats: 1732 lines in 67 files changed: 933 ins; 606 del; 193 mod Patch: https://git.openjdk.java.net/jdk/pull/7151.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7151/head:pull/7151 PR: https://git.openjdk.java.net/jdk/pull/7151 From duke at openjdk.java.net Thu Jan 20 01:04:27 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 20 Jan 2022 01:04:27 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v4] In-Reply-To: References: Message-ID: > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. 
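
For illustration, the identity quoted in the description above can be checked in plain Java. This is a minimal sketch under stated assumptions, not code from the patch: the class name `UnsignedCompareSketch` and the method `compareUnsignedViaFlip` are hypothetical, and it covers only the scalar `int` case, whereas the patch itself concerns the AVX vector implementation. The idea is that flipping the sign bit of both operands maps unsigned order onto signed order, so an ordinary signed compare gives the unsigned result.

```java
// Illustrative sketch only: the names below are hypothetical and do not appear in the patch.
final class UnsignedCompareSketch {

    // Unsigned comparison expressed through a signed comparison:
    // XOR with MIN_VALUE flips the sign bit of each operand.
    static int compareUnsignedViaFlip(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE, 0x80000001};
        for (int x : samples) {
            for (int y : samples) {
                // Compare only the sign of the result against the library method.
                int expected = Integer.signum(Integer.compareUnsigned(x, y));
                int actual = Integer.signum(compareUnsignedViaFlip(x, y));
                if (expected != actual) {
                    throw new AssertionError("mismatch for x=" + x + ", y=" + y);
                }
            }
        }
        System.out.println("identity holds for all sample pairs");
    }
}
```

The `long` case would follow the same pattern with `Long.MIN_VALUE` and `Long.compare`.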
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: remove scratch ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6966/files - new: https://git.openjdk.java.net/jdk/pull/6966/files/59d1fa35..6514dd15 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6966&range=02-03 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/6966.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6966/head:pull/6966 PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Thu Jan 20 01:04:30 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 20 Jan 2022 01:04:30 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Sun, 9 Jan 2022 01:48:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: > > use movddup for 128-bit vectors Thank you very much for the reviews and testing. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From duke at openjdk.java.net Thu Jan 20 01:04:32 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 20 Jan 2022 01:04:32 GMT Subject: RFR: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 [v3] In-Reply-To: References: <6YEvoLUipNu1l5haifwWwgQyfrjpRxh1o9qaIElSKQs=.d06b8dea-ed32-4cb5-ab09-28ce9a58e524@github.com> Message-ID: On Wed, 19 Jan 2022 21:52:19 GMT, Vladimir Ivanov wrote: >> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: >> >> use movddup for 128-bit vectors > > src/hotspot/cpu/x86/x86.ad line 7383: > >> 7381: >> 7382: if (vlen_enc == Assembler::AVX_128bit) { >> 7383: __ vmovddup($xtmp$$XMMRegister, flip_bit, vlen_enc, $scratch$$Register); > > Scratch register is not needed here. `InternalAddress` should always be reachable. Changed to `noreg` here, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From jwilhelm at openjdk.java.net Thu Jan 20 01:21:56 2022 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 20 Jan 2022 01:21:56 GMT Subject: Integrated: Merge jdk18 In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 00:28:55 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 18 -> JDK 19 This pull request has now been integrated. 
Changeset: 4616c13c Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/4616c13c2f1ced8a8bdeed81f0469523932e91b5 Stats: 1732 lines in 67 files changed: 933 ins; 606 del; 193 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/7151 From dholmes at openjdk.java.net Thu Jan 20 01:30:48 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 20 Jan 2022 01:30:48 GMT Subject: RFR: 8280178: Remove os:: API's that just call system API's In-Reply-To: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> References: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Message-ID: On Wed, 19 Jan 2022 19:03:43 GMT, Harold Seigel wrote: > Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. > > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Good cleanup! Thanks Harold! David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7145 From dholmes at openjdk.java.net Thu Jan 20 01:43:49 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 20 Jan 2022 01:43:49 GMT Subject: RFR: 8276472: align_metadata_size is a nop In-Reply-To: References: Message-ID: <66JKC33QOOSCq1wdnR-9bcyqZqL6y7BKZSPY0Ds4lLU=.b20c4f5c-ed74-4a9c-8800-e1ca05950fd2@github.com> On Wed, 19 Jan 2022 23:20:37 GMT, Coleen Phillimore wrote: > I just added a comment to this and took out the align_up call. I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. > Tested with tier1 on Oracle supported platforms. Hi Coleen, Why not just delete align_metadata_size? 
There are only 9 callers. Cheers, David ------------- PR: https://git.openjdk.java.net/jdk/pull/7150 From yyang at openjdk.java.net Thu Jan 20 07:34:25 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 20 Jan 2022 07:34:25 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v4] In-Reply-To: References: Message-ID: > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. 
interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... 
Yi Yang has updated the pull request incrementally with two additional commits since the last revision: - -verbose and help doc - -verbose ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/7f0bdd23..697c944c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=02-03 Stats: 11 lines in 2 files changed: 9 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Thu Jan 20 07:34:28 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 20 Jan 2022 07:34:28 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v3] In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 02:37:10 GMT, Yi Yang wrote:
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Thank you all for suggestions and feedback. I add the help document for flags and use `-verbose` to print detailed content of classes, to be consistent with other jcmd commands.
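
For readers who want to post-process the summary table shown in the PR description (columns `KlassAddr Size State Flags LoaderName ClassName`), the following is a minimal Java sketch of a row parser. It is not part of the patch. The class and method names (`VmClassesRowParser`, `ClassRow`, `parse`) are hypothetical, and the sketch assumes that every row has exactly six whitespace-separated fields, as in the three sample rows; a row with an empty `Flags` column or a loader name containing spaces would be rejected.

```java
import java.util.Optional;

// Hypothetical helper, not part of the patch: parses one row of the
// summary table printed in the PR description.
final class VmClassesRowParser {

    // Field names mirror the column headers of the sample output.
    static final class ClassRow {
        final long klassAddr;
        final int size;
        final String state;
        final String flags;
        final String loaderName;
        final String className;

        ClassRow(long klassAddr, int size, String state, String flags,
                 String loaderName, String className) {
            this.klassAddr = klassAddr;
            this.size = size;
            this.state = state;
            this.flags = flags;
            this.loaderName = loaderName;
            this.className = className;
        }
    }

    // Returns an empty Optional for the header row, verbose detail lines,
    // and anything else that is not a six-field row starting with "0x".
    static Optional<ClassRow> parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 6 || !f[0].startsWith("0x")) {
            return Optional.empty();
        }
        try {
            long addr = Long.parseUnsignedLong(f[0].substring(2), 16);
            int size = Integer.parseInt(f[1]);
            return Optional.of(new ClassRow(addr, size, f[2], f[3], f[4], f[5]));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String row = "0x0000000800c0b400 62 inited W bootstrap "
                   + "java.lang.invoke.LambdaForm$MH/0x0000000800c0b400";
        parse(row).ifPresent(r ->
            System.out.println(r.className + " klass size=" + r.size));
    }
}
```

Detail lines from the verbose output (for example `- instance size: 2`) do not match the six-field shape, so the sketch simply skips them.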
------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From yyang at openjdk.java.net Thu Jan 20 09:47:31 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 20 Jan 2022 09:47:31 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v5] In-Reply-To: References: Message-ID:
Yi Yang has updated the pull request incrementally with one additional commit since the last revision: fix test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7105/files - new: https://git.openjdk.java.net/jdk/pull/7105/files/697c944c..b4da2ddc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7105.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105 PR: https://git.openjdk.java.net/jdk/pull/7105 From chagedorn at openjdk.java.net Thu Jan 20 10:07:48 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 20 Jan 2022 10:07:48 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2] In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 19:56:21 GMT, Harold Seigel wrote: >> Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. >> >> The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add comment Could turning this into a sanity assertion check with the comment as a failure message be an option instead? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From pli at openjdk.java.net Thu Jan 20 10:07:53 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 20 Jan 2022 10:07:53 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: <9rZsoAl3ZTZudCIDzeT7iV8eR6bl1OYTGFWjH2imFEw=.a8db687b-0c2f-456a-8cc9-f8f573bad90e@github.com> On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. 
But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. 
>> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. 
This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. 
>> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. 
This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. 
It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. 
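
The idea of a single masked post-loop iteration described above can be illustrated with a scalar emulation in Java. This is only a sketch under stated assumptions, not the C2 implementation: `VLEN`, `maskGen` and `add` are hypothetical names, the boolean array stands in for the mask that a `VectorMaskGenNode` would produce, and the guarded lane loop stands in for `LoadVectorMasked`/`StoreVectorMasked`.

```java
import java.util.Arrays;

// Scalar emulation of "main loop with full vectors + one masked post-loop
// iteration". All names here are hypothetical and for illustration only.
final class MaskedPostLoopSketch {
    static final int VLEN = 8; // assumed number of lanes per vector

    // Emulates mask generation: lane i is active iff i < remaining.
    static boolean[] maskGen(int remaining) {
        boolean[] mask = new boolean[VLEN];
        for (int lane = 0; lane < VLEN; lane++) {
            mask[lane] = lane < remaining;
        }
        return mask;
    }

    // c[i] = a[i] + b[i] for all i, processed as full vectors plus a masked tail.
    static void add(int[] a, int[] b, int[] c) {
        int n = c.length;
        int i = 0;
        // Main loop: every lane of each VLEN-wide chunk is active.
        for (; i + VLEN <= n; i += VLEN) {
            for (int lane = 0; lane < VLEN; lane++) {
                c[i + lane] = a[i + lane] + b[i + lane];
            }
        }
        // Post loop: exactly one iteration; inactive lanes are neither
        // loaded nor stored, so nothing past index n - 1 is touched.
        boolean[] mask = maskGen(n - i);
        for (int lane = 0; lane < VLEN; lane++) {
            if (mask[lane]) {
                c[i + lane] = a[i + lane] + b[i + lane];
            }
        }
    }

    public static void main(String[] args) {
        int n = 21; // two full chunks of 8 plus a 5-element tail
        int[] a = new int[n], b = new int[n], c = new int[n];
        for (int i = 0; i < n; i++) { a[i] = i; b[i] = 2 * i; }
        add(a, b, c);
        System.out.println(Arrays.toString(c));
    }
}
```

Because the mask is an explicit input to the load and store steps rather than a value held in a reserved register, the same shape applies to any platform that can consume such a mask, which is the motivation given above for moving the support into the mid-end.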
> > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. 
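
To make the "partially vectorizable" shape of Issue-1 concrete, here is a small self-contained Java sketch of the reproducer loop next to a hand-split version of it. It does not show how C2 transforms the loop; the class and method names are hypothetical. It only makes visible that the array statement is independent per iteration while `k = 3 * k + 1` is a loop-carried chain, which is why a whole-body scalar-to-vector replacement cannot cover both statements.

```java
import java.util.Arrays;

// Hypothetical illustration of the Issue-1 loop shape; not code from the patch.
final class PartialVectorizableSketch {

    // Same shape as the Issue-1 reproducer; returns the final value of k.
    static int kernel(int[] res, int[] a, int[] b, int k) {
        for (int i = 0; i < res.length; i++) {
            res[i] = a[i] * b[i]; // independent per iteration: lane-wise candidate
            k = 3 * k + 1;        // loop-carried: needs the previous iteration's k
        }
        return k;
    }

    // The same computation with the two statements split into separate loops.
    static int split(int[] res, int[] a, int[] b, int k) {
        for (int i = 0; i < res.length; i++) {
            res[i] = a[i] * b[i];
        }
        for (int i = 0; i < res.length; i++) {
            k = 3 * k + 1; // int overflow wraps, identically in both versions
        }
        return k;
    }

    public static void main(String[] args) {
        int n = 10000;
        int[] a = new int[n], b = new int[n];
        for (int i = 0; i < n; i++) { a[i] = i; b[i] = i + 1; }
        int[] r1 = new int[n], r2 = new int[n];
        int k1 = kernel(r1, a, b, 1);
        int k2 = split(r2, a, b, 1);
        System.out.println(Arrays.equals(r1, r2) && k1 == k2);
    }
}
```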
> > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. 
But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. 
Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Hi Jatin, Thanks for your observations. > BTW why have you kept a constraint on the vector size of post tail loop to match MaxVectorSize ? That's a really good question. Yes, the MaxVectorSize constraint was not added by me. 
It exists in Intel's previous implementation, and I just moved it into my `SuperWord::create_post_loop_vmask()` function in order to put the scattered vectorizability checks together. I tried removing that constraint as you suggested, but after that the jtreg test `hotspot:compiler/c2/cr6340864/TestByteVect.java` failed on x86 AVX-512 with the `-XX:MaxVectorSize=8` configuration. This failure only appears on x86, not on AArch64 SVE. I'm currently investigating the cause. As I'm not very familiar with AVX-512 instructions, could you help look at this problem if you have some bandwidth? (BTW, the code I removed is the following lines in `superword.cpp`.)

    if (unique_size * vlen != MaxVectorSize) {
      return NULL;
    }

Thanks,
Pengfei

-------------

PR: https://git.openjdk.java.net/jdk/pull/6828

From ayang at openjdk.java.net Thu Jan 20 12:38:51 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 20 Jan 2022 12:38:51 GMT
Subject: RFR: 8280146: Parallel: Remove time log tag
In-Reply-To:
References:
Message-ID:

On Tue, 18 Jan 2022 14:09:53 GMT, Albert Mingkun Yang wrote:

> Simple change of removing some unhelpful logs in Parallel GC. Then, `time` log tag becomes unused and is removed as well.
>
> Test: hotspot_gc

Thanks for the review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7128

From ayang at openjdk.java.net Thu Jan 20 12:38:51 2022
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 20 Jan 2022 12:38:51 GMT
Subject: Integrated: 8280146: Parallel: Remove time log tag
In-Reply-To:
References:
Message-ID:

On Tue, 18 Jan 2022 14:09:53 GMT, Albert Mingkun Yang wrote:

> Simple change of removing some unhelpful logs in Parallel GC. Then, `time` log tag becomes unused and is removed as well.
>
> Test: hotspot_gc

This pull request has now been integrated.
Changeset: 98b157a7 Author: Albert Mingkun Yang URL: https://git.openjdk.java.net/jdk/commit/98b157a79af3e76f028bccd04a5e505642aae7a4 Stats: 28 lines in 4 files changed: 0 ins; 28 del; 0 mod 8280146: Parallel: Remove time log tag Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.java.net/jdk/pull/7128 From dholmes at openjdk.java.net Thu Jan 20 12:56:48 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 20 Jan 2022 12:56:48 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 10:04:18 GMT, Christian Hagedorn wrote: > Could turning this into a sanity assertion check with the comment as a failure message be an option instead? You mean as a way to circumvent the tool? Perhaps - though I would not want to do that either as this is a problem that could be seen in many call-chains that use JVM_ENTRY (and other cases) to get the "thread". In any case this has been reported as a false positive now. What I may do is a RFE to add `current_thread_from_jni_environment` which will never return NULL and so avoid the issue altogether. Cheers, David ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From coleenp at openjdk.java.net Thu Jan 20 12:58:49 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 20 Jan 2022 12:58:49 GMT Subject: RFR: 8276472: align_metadata_size is a nop In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 23:20:37 GMT, Coleen Phillimore wrote: > I just added a comment to this and took out the align_up call. I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. > Tested with tier1 on Oracle supported platforms. I don't want to delete it because I want there to be the knowledge that we're aligning metadata to some boundary, and those were the places we have to do it. 
We may have had double-word alignment on 32 bits once. ------------- PR: https://git.openjdk.java.net/jdk/pull/7150 From hseigel at openjdk.java.net Thu Jan 20 13:13:50 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 20 Jan 2022 13:13:50 GMT Subject: RFR: 8280178: Remove os:: API's that just call system API's In-Reply-To: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> References: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Message-ID: <0RXvxgYKmhEQwhyysmqQ_ivDzZQUo7us2cHmDHauG3Q=.8ea11747-8fd4-4bba-b07b-4d442b1f5da0@github.com> On Wed, 19 Jan 2022 19:03:43 GMT, Harold Seigel wrote: > Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. > > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Thanks Ioi, Calvin, and David for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/7145 From hseigel at openjdk.java.net Thu Jan 20 13:13:50 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 20 Jan 2022 13:13:50 GMT Subject: Integrated: 8280178: Remove os:: API's that just call system API's In-Reply-To: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> References: <5sSEcCn30s5AytmuX6NptmzSZt-G3mHP9hTooGUTI4k=.35cb4a9b-5b03-43d3-9ad0-899ef76ddaf6@github.com> Message-ID: On Wed, 19 Jan 2022 19:03:43 GMT, Harold Seigel wrote: > Please review this change to remove unneeded definitions for close(), read(), and socket() from class os in os.hpp. The definitions aren't needed because, for all platforms, these functions just call the host operating system versions. 
> > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. Changeset: a4d20190 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/a4d201909c8919b7465dee72594d718252c6344e Stats: 68 lines in 19 files changed: 0 ins; 23 del; 45 mod 8280178: Remove os:: API's that just call system API's Reviewed-by: iklam, ccheung, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7145 From duke at openjdk.java.net Thu Jan 20 14:22:21 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Thu, 20 Jan 2022 14:22:21 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v12] In-Reply-To: References: Message-ID: > PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One > of its uses is to protect against ROP based attacks. This is done by > signing the Link Register whenever it is stored on the stack, and > authenticating the value when it is loaded back from the stack. If an > attacker were to try to change control flow by editing the stack then > the authentication check of the Link Register will fail, causing a > segfault when the function returns. > > On a system with PAC enabled, it is expected that all applications will > be compiled with ROP protection. Fedora 33 and upwards already provide > this. By compiling for ARMv8.0, GCC and LLVM will only use the set of > PAC instructions that exist in the NOP space - on hardware without PAC, > these instructions act as NOPs, allowing backward compatibility for > negligible performance cost (2 NOPs per non-leaf function). > > Hardware is currently limited to the Apple M1 MacBooks. All testing has > been done within a Fedora Docker image. A run of SpecJVM showed no > difference to that of noise - which was surprising. > > The most important part of this patch is simply compiling using branch > protection provided by GCC/LLVM. 
This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
>
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
>
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs
> and entry points are stored in memory as signed pointers. These changes
> are simple to make (they can be reduced to a type change in common code
> and a few additional sign/auth calls in the backend), but there are a
> lot of them and the total code change is fairly large. I'm happy to
> provide a few work in progress patches.
>
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.

Alan Hayward has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits:

 - Merge master
 - Fix assembler for post-merge
 - Change UseROPProtection to UseBranchProtection

   Change-Id: I31c5e1bb5c285262f262459c13057a46221682f1
   CustomizedGitHooks: yes
 - Remove BSD/Apple specific code
 - Default to building without branch-protection
 - Fix up UseROPProtection flag
 - Merge master
 - Merge master
 - Rename pauth_authenticate_or_strip_return_address
 - Fix windows aarch64 by restoring pauth file split
 - ...
and 9 more: https://git.openjdk.java.net/jdk/compare/cf977e88...f6f80412 ------------- Changes: https://git.openjdk.java.net/jdk/pull/6334/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=11 Stats: 1367 lines in 25 files changed: 491 ins; 31 del; 845 mod Patch: https://git.openjdk.java.net/jdk/pull/6334.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334 PR: https://git.openjdk.java.net/jdk/pull/6334 From chagedorn at openjdk.java.net Thu Jan 20 14:22:49 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 20 Jan 2022 14:22:49 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 12:53:45 GMT, David Holmes wrote: > > Could turning this into a sanity assertion check with the comment as a failure message be an option instead? > > You mean as a way to circumvent the tool? Yes. > Perhaps - though I would not want to do that either as this is a problem that could be seen in many call-chains that use JVM_ENTRY (and other cases) to get the "thread". In any case this has been reported as a false positive now. I see, thanks for the explanation David! Then I think it was the right way to report it as false positive. ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From hseigel at openjdk.java.net Thu Jan 20 14:55:56 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 20 Jan 2022 14:55:56 GMT Subject: Withdrawn: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 15:00:50 GMT, Harold Seigel wrote: > Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. 
> > The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From hseigel at openjdk.java.net Thu Jan 20 14:55:56 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 20 Jan 2022 14:55:56 GMT Subject: RFR: 8279887: 2 Null pointer dereference defect groups in os_posix.cpp [v2] In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 19:56:21 GMT, Harold Seigel wrote: >> Please review this small fix to prevent possible Null pointer dereferences. The fix adds a Null check to prevent trying to park a Null thread. The Null check is needed because UNSAFE_ENTRY calls thread_from_jni_environment(), which returns NULL if the env thread is terminated. >> >> The fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > add comment Closing this pull request because this is not an issue. It's a false positive reported by the static analysis tool. ------------- PR: https://git.openjdk.java.net/jdk/pull/7129 From zgu at openjdk.java.net Thu Jan 20 15:39:19 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 20 Jan 2022 15:39:19 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info Message-ID: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
For example:

NMT=summary

(gdb) call pp(0x7f2d9803db70)
"Executing pp"
0x00007f2d9803db70 malloc'd 1576 bytes by Internal

(gdb) call pp(0x00007f4300a20000)
"Executing pp"
0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC

NMT=detail

(gdb) call pp(0x7f2d9803db70)
"Executing pp"
0x00007f2d9803db70 malloc'd 1576 bytes by Internal
[0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b
[0x00007f2d9f98a855] universe_init()+0x85
[0x00007f2d9f2e0a97] init_globals()+0x37
[0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db

(gdb) call pp(0x00007f4300a20000)
"Executing pp"
0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC
[0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f
[0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b
[0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7
[0x00007f433d949d40] ShenandoahHeap::initialize()+0x340

-------------

Commit messages:
 - More comment fixing
 - Fix comment
 - Fix minimal build
 - Merge branch 'master' into JDK-8280289-nmt-pp
 - Fix
 - Update comments
 - Cleanup and update copyright years
 - v0

Changes: https://git.openjdk.java.net/jdk/pull/7160/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280289
  Stats: 89 lines in 5 files changed: 76 ins; 4 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7160.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7160/head:pull/7160

PR: https://git.openjdk.java.net/jdk/pull/7160

From duke at openjdk.java.net Thu Jan 20 15:58:09 2022
From: duke at openjdk.java.net (Bhavana-Kilambi)
Date: Thu, 20 Jan 2022 15:58:09 GMT
Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and should be removed [v4]
In-Reply-To:
References:
Message-ID:
<2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com> > The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. > But as it's not used anywhere, removing this option from the JDK source. Bhavana-Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: - Merge master - 8239927: Product variable PrefetchFieldsAhead is unused and should be removed ------------- Changes: https://git.openjdk.java.net/jdk/pull/6783/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6783&range=03 Stats: 13 lines in 3 files changed: 1 ins; 10 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/6783.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6783/head:pull/6783 PR: https://git.openjdk.java.net/jdk/pull/6783 From stumon01 at arm.com Thu Jan 20 16:35:02 2022 From: stumon01 at arm.com (Stuart Monteith) Date: Thu, 20 Jan 2022 16:35:02 +0000 Subject: RFR: 8239927: Product variable PrefetchFieldsAhead is unused and should be removed [v4] In-Reply-To: <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com> References: <2xky5BN4tTu-ZmRfO-Um0JjHBDQkJtnekjfT-ax6ZIg=.34e819ac-0e87-490f-a8a5-5fbe857bc9d6@github.com> Message-ID: <7271df41-6d92-fc56-16d0-1e9db54af584@arm.com> On 20/01/2022 15:58, Bhavana-Kilambi wrote: >> The product variable "PrefetchFieldsAhead" is defined in gc_globals.hpp and set in vm_version_x86.cpp. >> But as it's not used anywhere, removing this option from the JDK source. > > Bhavana-Kilambi has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: > > - Merge master > - 8239927: Product variable PrefetchFieldsAhead is unused and should be removed > > ------------- > > Changes: https://git.openjdk.java.net/jdk/pull/6783/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6783&range=03 > Stats: 13 lines in 3 files changed: 1 ins; 10 del; 2 mod > Patch: https://git.openjdk.java.net/jdk/pull/6783.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/6783/head:pull/6783 > > PR: https://git.openjdk.java.net/jdk/pull/6783 > Hello Kim, while Ningsheng has reviewed this, would you be able to take a look too? Then the CSR can go ahead. Thanks, Stuart IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. From duke at openjdk.java.net Thu Jan 20 17:10:39 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Thu, 20 Jan 2022 17:10:39 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v13] In-Reply-To: References: Message-ID: > PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One > of its uses is to protect against ROP based attacks. This is done by > signing the Link Register whenever it is stored on the stack, and > authenticating the value when it is loaded back from the stack. If an > attacker were to try to change control flow by editing the stack then > the authentication check of the Link Register will fail, causing a > segfault when the function returns. > > On a system with PAC enabled, it is expected that all applications will > be compiled with ROP protection. Fedora 33 and upwards already provide > this. 
By compiling for ARMv8.0, GCC and LLVM will only use the set of
> PAC instructions that exist in the NOP space - on hardware without PAC,
> these instructions act as NOPs, allowing backward compatibility for
> negligible performance cost (2 NOPs per non-leaf function).
>
> Hardware is currently limited to the Apple M1 MacBooks. All testing has
> been done within a Fedora Docker image. A run of SpecJVM showed no
> difference to that of noise - which was surprising.
>
> The most important part of this patch is simply compiling using branch
> protection provided by GCC/LLVM. This protects all C++ code from being
> used in ROP attacks, removing all static ROP gadgets from use.
>
> The remainder of the patch adds ROP protection to runtime generated
> code, in both stubs and compiled Java code. Attacks here are much harder
> as ROP gadgets must be found dynamically at runtime. If/when AOT
> compilation is added to JDK, then all stubs and compiled Java will be
> susceptible to ROP gadgets being found by static analysis and therefore
> potentially as vulnerable as C++ code.
>
> There are a number of places where the VM changes control flow by
> rewriting the stack or otherwise. I've done some analysis as to how
> these could also be used for attacks (which I didn't want to post here).
> These areas can be protected by ensuring the pointers to various stubs
> and entry points are stored in memory as signed pointers. These changes
> are simple to make (they can be reduced to a type change in common code
> and a few additional sign/auth calls in the backend), but there are a
> lot of them and the total code change is fairly large. I'm happy to
> provide a few work in progress patches.
>
> In order to match the security benefits of the Apple Arm64e ABI across
> the whole of JDK, then all the changes mentioned above would be
> required.
Alan Hayward has updated the pull request incrementally with two additional commits since the last revision: - Fix jvmci tests - Fix GC issues ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6334/files - new: https://git.openjdk.java.net/jdk/pull/6334/files/f6f80412..14799421 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=11-12 Stats: 50 lines in 11 files changed: 32 ins; 2 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/6334.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334 PR: https://git.openjdk.java.net/jdk/pull/6334 From duke at openjdk.java.net Thu Jan 20 17:10:44 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Thu, 20 Jan 2022 17:10:44 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v12] In-Reply-To: References: Message-ID: <6ITO7SRTa4es3yv8HZrbJB2AoHQjjYUC0PBYG9kndgY=.52b4e580-3ade-49d3-a37d-8d048dd3bc7c@github.com> On Thu, 20 Jan 2022 14:22:21 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. 
By compiling for ARMv8.0, GCC and LLVM will only use the set of
>> PAC instructions that exist in the NOP space - on hardware without PAC,
>> these instructions act as NOPs, allowing backward compatibility for
>> negligible performance cost (2 NOPs per non-leaf function).
>>
>> Hardware is currently limited to the Apple M1 MacBooks. All testing has
>> been done within a Fedora Docker image. A run of SpecJVM showed no
>> difference to that of noise - which was surprising.
>>
>> The most important part of this patch is simply compiling using branch
>> protection provided by GCC/LLVM. This protects all C++ code from being
>> used in ROP attacks, removing all static ROP gadgets from use.
>>
>> The remainder of the patch adds ROP protection to runtime generated
>> code, in both stubs and compiled Java code. Attacks here are much harder
>> as ROP gadgets must be found dynamically at runtime. If/when AOT
>> compilation is added to JDK, then all stubs and compiled Java will be
>> susceptible to ROP gadgets being found by static analysis and therefore
>> potentially as vulnerable as C++ code.
>>
>> There are a number of places where the VM changes control flow by
>> rewriting the stack or otherwise. I've done some analysis as to how
>> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs
>> and entry points are stored in memory as signed pointers. These changes
>> are simple to make (they can be reduced to a type change in common code
>> and a few additional sign/auth calls in the backend), but there are a
>> lot of them and the total code change is fairly large. I'm happy to
>> provide a few work in progress patches.
>>
>> In order to match the security benefits of the Apple Arm64e ABI across
>> the whole of JDK, then all the changes mentioned above would be
>> required.
>
> Alan Hayward has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 19 commits: > > - Merge master > - Fix assembler for post-merge > - Change UseROPProtection to UseBranchProtection > > Change-Id: I31c5e1bb5c285262f262459c13057a46221682f1 > CustomizedGitHooks: yes > - Remove BSD/Apple specific code > - Default to building without branch-protection > - Fix up UseROPProtection flag > - Merge master > - Merge master > - Rename pauth_authenticate_or_strip_return_address > - Fix windows aarch64 by restoring pauth file split > - ... and 9 more: https://git.openjdk.java.net/jdk/compare/cf977e88...f6f80412 The new commits fix all the GC issues and fixes up the jvmci test. There are still a few nsk tests that need fixing. I'm looking at those now. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From duke at openjdk.java.net Thu Jan 20 18:28:52 2022 From: duke at openjdk.java.net (Quan Anh Mai) Date: Thu, 20 Jan 2022 18:28:52 GMT Subject: Integrated: 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 In-Reply-To: References: Message-ID: <5T2nZzlLLC07Bfax18CCEr1qkPVihqj_Z8L_t7dpM4A=.8f139496-4213-4b29-aa12-07f28cd04b1c@github.com> On Wed, 5 Jan 2022 10:30:46 GMT, Quan Anh Mai wrote: > Hi, > > Currently, unsigned comparison on AVX is implemented by zero extending elements and comparing the results. This leads to unnecessary complexity. This patch changes the implementation to use the identity existing in `Integer/Long.compareUnsigned`, that is `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`. > > Thank you very much. This pull request has now been integrated. 
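The identity quoted in the 8279282 description above, `compareUnsigned(x, y) == compare(x ^ min_value, y ^ min_value)`, can be checked at the scalar level. The following is only a minimal sketch of that identity, not the x86 vector code from the patch; the class name `UnsignedCompareIdentity` and the helper `compareUnsignedViaFlip` are hypothetical names chosen for illustration.

```java
public class UnsignedCompareIdentity {
    // Hypothetical helper: flipping the sign bit of both operands maps
    // unsigned ordering onto signed ordering, so a signed compare suffices.
    static int compareUnsignedViaFlip(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE, 42, -42};
        // Compare the sketch against the JDK's own Integer.compareUnsigned
        // for every pair of sample values.
        for (int x : samples) {
            for (int y : samples) {
                if (Integer.compareUnsigned(x, y) != compareUnsignedViaFlip(x, y)) {
                    throw new AssertionError(x + " vs " + y);
                }
            }
        }
        System.out.println("identity holds for all sample pairs");
    }
}
```

The same trick applies to `long` with `Long.MIN_VALUE`; in a vector setting it presumably amounts to one XOR per operand followed by a signed compare, instead of zero-extending the lanes.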
Changeset: 02390c79 Author: Quan Anh Mai Committer: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/02390c79b1acff1a953d29c6f70623f3b7838698 Stats: 195 lines in 8 files changed: 65 ins; 99 del; 31 mod 8279282: [vectorapi] Matcher::supports_vector_comparison_unsigned is not needed on x86 Reviewed-by: kvn, sviswanathan, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/6966 From phh at openjdk.java.net Thu Jan 20 20:35:52 2022 From: phh at openjdk.java.net (Paul Hohensee) Date: Thu, 20 Jan 2022 20:35:52 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 17:44:57 GMT, Yi-Fan Tsai wrote: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase verify_heapbase() can only be called in a 64-bit JVM because compressed oops are only supported in a 64-bit JVM. Thus, the #ifdef _LP64 isn't needed. CheckCompressedOops is only meaningful in a 64-bit JVM, but it's not conditionalized by _LP64 in globals.hpp, even though UseCompressedOops is. This is a bug, imo, that should be addressed in another PR. ------------- Changes requested by phh (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7067 From iwalulya at openjdk.java.net Thu Jan 20 20:52:52 2022 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Thu, 20 Jan 2022 20:52:52 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 22:43:25 GMT, Kim Barrett wrote: > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. 
> > Testing: > mach5 tier1-3 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. Lgtm! Suggestion: With the comments growing after each change, maybe we rename `result` to `old_head` ------------- Marked as reviewed by iwalulya (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7149 From coleenp at openjdk.java.net Thu Jan 20 21:49:49 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 20 Jan 2022 21:49:49 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 17:44:57 GMT, Yi-Fan Tsai wrote: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4633: > 4631: if (CheckCompressedOops) { > 4632: Label ok; > 4633: #ifdef _LP64 Really? This makes a difference in performance? If it does, maybe one of the verify_heapbase() calls is in a place where it is called too frequently. ------------- PR: https://git.openjdk.java.net/jdk/pull/7067 From dholmes at openjdk.java.net Thu Jan 20 22:11:48 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 20 Jan 2022 22:11:48 GMT Subject: RFR: 8276472: align_metadata_size is a nop In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 23:20:37 GMT, Coleen Phillimore wrote: > I just added a comment to this and took out the align_up call. I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. > Tested with tier1 on Oracle supported platforms. Okay. I'm not really sure why we need to make any changes in that case. If the incoming alignment of values were to change then you would want to keep the `align_up` call wouldn't you? At a minimum should we not have an assertion (static assert?) that the incoming values are actually aligned as expected? 
Thanks,
David

Also the issue title is misleading to me as it suggests that being a nop is a problem and that it will be fixed, but it is still a nop after this change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7150

From psandoz at openjdk.java.net Thu Jan 20 22:20:53 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 20 Jan 2022 22:20:53 GMT
Subject: RFR: 8277155: Compress and expand vector operations
In-Reply-To:
References:
Message-ID:

On Wed, 24 Nov 2021 19:20:08 GMT, Paul Sandoz wrote:

> Add two new cross-lane vector operations, `compress` and `expand`.
>
> An example of such usage might be code that selects elements from array `a` and stores those selected elements in array `z`:
>
>
> int[] a = ...;
>
> int[] z = ...;
> int ai = 0, zi = 0;
> while (ai < a.length) {
>     IntVector av = IntVector.fromArray(SPECIES, a, ai);
>     // query over elements of vector av
>     // returning a mask marking elements of interest
>     VectorMask<Integer> m = interestingBits(av, ...);
>     IntVector zv = av.compress(m);
>     zv.intoArray(z, zi, m.compress());
>     ai += SPECIES.length();
>     zi += m.trueCount();
> }
>
>
> (There's also a more sophisticated version using `unslice` to coalesce matching elements with non-masked stores.)
>
> Given RDP 1 for 18 is getting close, 2021/12/09, we may not get this reviewed in time and included in [JEP 417](https://openjdk.java.net/jeps/417). Still, I think it is worth starting the review now (the CSR is marked provisional).

Withdrawing and will create a new PR corresponding to a new incubating JEP.
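As a rough illustration of what the quoted `compress` loop achieves, here is a scalar sketch that runs without the incubating Vector API. It only models the overall effect on the arrays (selected elements of `a` are packed, in order, at the front of `z`), not the per-vector lane semantics; the class name `ScalarCompress`, the `boolean[]` mask, and the `a[i] > 0` predicate are assumptions made for the example.

```java
import java.util.Arrays;

public class ScalarCompress {
    // Hypothetical scalar stand-in for the vector loop: copies a[i] into z
    // for every i where mask[i] is true, preserving order, and returns the
    // number of elements written.
    static int compress(int[] a, boolean[] mask, int[] z) {
        int zi = 0;
        for (int ai = 0; ai < a.length; ai++) {
            if (mask[ai]) {
                z[zi++] = a[ai];
            }
        }
        return zi;
    }

    public static void main(String[] args) {
        int[] a = {5, -3, 8, 0, -7, 2};
        boolean[] mask = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            mask[i] = a[i] > 0; // illustrative "elements of interest" predicate
        }
        int[] z = new int[a.length];
        int n = compress(a, mask, z);
        // Prints: 3 [5, 8, 2]
        System.out.println(n + " " + Arrays.toString(Arrays.copyOf(z, n)));
    }
}
```

In the vector version, `zi += m.trueCount()` plays the role of `zi++` here, advancing the output position by the number of selected lanes per iteration.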
-------------

PR: https://git.openjdk.java.net/jdk/pull/6545

From psandoz at openjdk.java.net Thu Jan 20 22:20:53 2022
From: psandoz at openjdk.java.net (Paul Sandoz)
Date: Thu, 20 Jan 2022 22:20:53 GMT
Subject: Withdrawn: 8277155: Compress and expand vector operations
In-Reply-To:
References:
Message-ID: <2F6yBpkaX7fN2nCCxN5z28qFdG-awS9F-CXa1mGysQg=.3a5e1c30-2812-47e3-a7da-1e3d084ca096@github.com>

On Wed, 24 Nov 2021 19:20:08 GMT, Paul Sandoz wrote:

> Add two new cross-lane vector operations, `compress` and `expand`.
>
> An example of such usage might be code that selects elements from array `a` and stores those selected elements in array `z`:
>
>
> int[] a = ...;
>
> int[] z = ...;
> int ai = 0, zi = 0;
> while (ai < a.length) {
>     IntVector av = IntVector.fromArray(SPECIES, a, ai);
>     // query over elements of vector av
>     // returning a mask marking elements of interest
>     VectorMask<Integer> m = interestingBits(av, ...);
>     IntVector zv = av.compress(m);
>     zv.intoArray(z, zi, m.compress());
>     ai += SPECIES.length();
>     zi += m.trueCount();
> }
>
>
> (There's also a more sophisticated version using `unslice` to coalesce matching elements with non-masked stores.)
>
> Given RDP 1 for 18 is getting close, 2021/12/09, we may not get this reviewed in time and included in [JEP 417](https://openjdk.java.net/jeps/417). Still, I think it is worth starting the review now (the CSR is marked provisional).

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6545

From coleenp at openjdk.java.net Thu Jan 20 22:44:52 2022
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Thu, 20 Jan 2022 22:44:52 GMT
Subject: RFR: 8276472: align_metadata_size documents metadata alignment
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jan 2022 23:20:37 GMT, Coleen Phillimore wrote:

> I just added a comment to this and took out the align_up call.
I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. > Tested with tier1 on Oracle supported platforms. How about this: 8276472: align_metadata_size documents metadata alignment #7150 ------------- PR: https://git.openjdk.java.net/jdk/pull/7150 From kbarrett at openjdk.java.net Thu Jan 20 23:41:50 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 20 Jan 2022 23:41:50 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 20:49:53 GMT, Ivan Walulya wrote: > Suggestion: With the comments growing after each change, maybe we rename `result` to `old_head` I've been thinking that `result` might have been a poor naming choice here. I think I'd like to do such a variable renaming separately though. ------------- PR: https://git.openjdk.java.net/jdk/pull/7149 From phh at openjdk.java.net Fri Jan 21 00:01:55 2022 From: phh at openjdk.java.net (Paul Hohensee) Date: Fri, 21 Jan 2022 00:01:55 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 17:44:57 GMT, Yi-Fan Tsai wrote: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase Hi, Coleen. This patch might not make much of a performance difference, but every little bit helps a debug JVM go a bit faster so as to maybe detect concurrency issues that would otherwise be masked. Plus, it's a starter bug for Yi-Fan to start to understand the macro assembler. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7067 From sviswanathan at openjdk.java.net Fri Jan 21 00:51:48 2022 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Fri, 21 Jan 2022 00:51:48 GMT Subject: RFR: 8279508: Auto-vectorize Math.round API [v2] In-Reply-To: References: Message-ID: <2TVKx_BFFyAK2ooOWKpdsEIMFzJngYxlWjbgeZ2y4Mc=.5deb2173-8107-476d-92ca-1835d69ce336@github.com> On Wed, 19 Jan 2022 17:38:25 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. >> - Test creation using new IR testing framework. >> >> Following are the performance number of a JMH micro included with the patch >> >> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server) >> >> ? | ? | BASELINE AVX2 | WithOpt AVX2 | Gain (opt/baseline) | Baseline AVX3 | Withopt AVX3 | Gain (opt/baseline) >> -- | -- | -- | -- | -- | -- | -- | -- >> Benchmark | ARRAYLEN | Score (ops/ms) | Score (ops/ms) | ? | Score (ops/ms) | Score (ops/ms) | ? >> FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887 >> FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697 >> FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829 >> FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907 >> >> Kindly review and share your feedback. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > 8279508: Adding a test for scalar intrinsification. 
The JVM currently initializes the x86 mxcsr to round to nearest even, see below in stubGenerator_x86_64.cpp:

    // Round to nearest (even), 64-bit mode, exceptions masked
    StubRoutines::x86::_mxcsr_std = 0x1F80;

The above works for Math.rint which is specified to be round to nearest even.
Please see: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html : section 4.8.4
The rounding mode needed for Math.round is round to positive infinity which needs a different x86 mxcsr initialization (0x5F80).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094

From duke at openjdk.java.net Fri Jan 21 00:52:18 2022
From: duke at openjdk.java.net (Yi-Fan Tsai)
Date: Fri, 21 Jan 2022 00:52:18 GMT
Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase [v2]
In-Reply-To:
References:
Message-ID:

> 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase

Yi-Fan Tsai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - Merge branch 'openjdk:master' into pushpop - Limit the change local - Merge branch 'openjdk:master' into pushpop - 8278036: Remove redundant push/pop in verify_heapbase ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7067/files - new: https://git.openjdk.java.net/jdk/pull/7067/files/258300d4..efabb85c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7067&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7067&range=00-01 Stats: 9100 lines in 370 files changed: 6194 ins; 1889 del; 1017 mod Patch: https://git.openjdk.java.net/jdk/pull/7067.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7067/head:pull/7067 PR: https://git.openjdk.java.net/jdk/pull/7067 From kbarrett at openjdk.java.net Fri Jan 21 02:23:49 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 21 Jan 2022 02:23:49 GMT Subject: RFR: 8280182: HotSpot Style Guide has stale link to chromium style guide In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 23:09:12 GMT, Liam Miller-Cushon wrote: > Update links to the chromium style guide in the HotSpot Style Guide. The current change process for the style guide requires voting by HotSpot group members and a decision by the group lead (Vladimir). That seems overly heavyweight for small editorial changes like this. I've been meaning to propose a change to the change process, but haven't gotten around to it. (And I have several similar editorial changes pending.) 
------------- PR: https://git.openjdk.java.net/jdk/pull/7138 From duke at openjdk.java.net Fri Jan 21 02:48:17 2022 From: duke at openjdk.java.net (Yi-Fan Tsai) Date: Fri, 21 Jan 2022 02:48:17 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase [v3] In-Reply-To: References: Message-ID: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Remvoe unnecessary _LP64 checks ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7067/files - new: https://git.openjdk.java.net/jdk/pull/7067/files/efabb85c..3dc444b0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7067&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7067&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7067.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7067/head:pull/7067 PR: https://git.openjdk.java.net/jdk/pull/7067 From stuefe at openjdk.java.net Fri Jan 21 09:38:45 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 21 Jan 2022 09:38:45 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Thu, 20 Jan 2022 15:31:12 GMT, Zhengyu Gu wrote: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
>
> For example:
>
> NMT=summary
>
> (gdb) call pp(0x7f2d9803db70)
> "Executing pp"
> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal
>
> (gdb) call pp(0x00007f4300a20000)
> "Executing pp"
> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC
>
>
> NMT=detail
>
>
> (gdb) call pp(0x7f2d9803db70)
> "Executing pp"
> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal
> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b
> [0x00007f2d9f98a855] universe_init()+0x85
> [0x00007f2d9f2e0a97] init_globals()+0x37
> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db
>
> (gdb) call pp(0x00007f4300a20000)
> "Executing pp"
> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC
> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f
> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b
> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7
> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340

Hi Zhengyu,

this is nice :)

I have a small proposal though. How about splitting check_block_integrity into two functions, one which checks and returns error information, one which does the assert? That way seems a bit cleaner, and we can re-use the checking functionality without the assert in other places.

I tested an addition to your patch; what I meant was something like this (I probably could have suggested the change in your PR but don't know how):

https://github.com/openjdk/jdk/compare/pr/7160...tstuefe:zhengyu-pp-proposal

And inside debug.cpp, in pp(), we could actually print out corruption information for corrupted blocks. But I did not do this in my proposal, since I was not sure how to distinguish "it's a malloced block, I know it, so show me if it's broken" from "it's just some pointer, may not be malloced at all". Thought about printing something like "if this was a malloced block, it's broken" but that seemed weird for the cases where the user knows the pointer is not malloced. What do you think?

..Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/7160

From zgu at openjdk.java.net Fri Jan 21 14:25:19 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Fri, 21 Jan 2022 14:25:19 GMT
Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v2]
In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com>
References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com>
Message-ID:

> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds.
>
> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT.
>
> For example:
>
> NMT=summary
>
> (gdb) call pp(0x7f2d9803db70)
> "Executing pp"
> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal
>
> (gdb) call pp(0x00007f4300a20000)
> "Executing pp"
> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC
>
>
> NMT=detail
>
>
> (gdb) call pp(0x7f2d9803db70)
> "Executing pp"
> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal
> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b
> [0x00007f2d9f98a855] universe_init()+0x85
> [0x00007f2d9f2e0a97] init_globals()+0x37
> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db
>
> (gdb) call pp(0x00007f4300a20000)
> "Executing pp"
> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC
> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f
> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b
> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned
long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Thomas' comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7160/files - new: https://git.openjdk.java.net/jdk/pull/7160/files/789e69e0..65fcfd4d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=00-01 Stats: 48 lines in 3 files changed: 15 ins; 8 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/7160.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7160/head:pull/7160 PR: https://git.openjdk.java.net/jdk/pull/7160 From jiefu at openjdk.java.net Fri Jan 21 14:54:22 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 21 Jan 2022 14:54:22 GMT Subject: RFR: 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding Message-ID: Hi all, `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. It would be better to remove one of them. The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. Thanks. 
Best regards, Jie ------------- Commit messages: - 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding Changes: https://git.openjdk.java.net/jdk/pull/7176/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7176&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280457 Stats: 28 lines in 4 files changed: 0 ins; 17 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/7176.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7176/head:pull/7176 PR: https://git.openjdk.java.net/jdk/pull/7176 From vlivanov at openjdk.java.net Fri Jan 21 15:04:37 2022 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 21 Jan 2022 15:04:37 GMT Subject: RFR: 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 14:46:07 GMT, Jie Fu wrote: > Hi all, > > `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. > It would be better to remove one of them. > > The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. > > Thanks. > Best regards, > Jie Looks good and trivial. ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7176 From shade at openjdk.java.net Fri Jan 21 15:20:54 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 21 Jan 2022 15:20:54 GMT Subject: RFR: 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 14:46:07 GMT, Jie Fu wrote: > Hi all, > > `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. > It would be better to remove one of them. > > The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. > > Thanks. > Best regards, > Jie Changes requested by shade (Reviewer). 
src/hotspot/share/opto/graphKit.cpp line 2355: > 2353: // the call, precision_rounding does gvn.transform > 2354: Node *arg = argument(j); > 2355: arg = precision_rounding(arg); Hold on. This should be `dprecision_rounding`, with a `d`? ------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From amenkov at openjdk.java.net Fri Jan 21 15:48:11 2022 From: amenkov at openjdk.java.net (Alex Menkov) Date: Fri, 21 Jan 2022 15:48:11 GMT Subject: RFR: 8240908: RetransformClass does not know about MethodParameters attribute Message-ID: Changes: - ClassFileReconstituter is updated to restore "MethodParameters" attribute; - handling of the attribute in VM_RedefineClasses is moved to be consistent with other code (like local variable table); - copied ClassTransformer class (from test/jdk/com/sun/jdi/lib/jdb) to /test/lib as it's used by tests from hotspot and jdk (and also by test from Valhalla repo); Will file a follow up issues to updates tests and remove the class from test/jdk/com/sun/jdi/lib/jdb ------------- Commit messages: - JDK-8240908 Changes: https://git.openjdk.java.net/jdk/pull/7180/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7180&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8240908 Stats: 431 lines in 5 files changed: 411 ins; 15 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/7180.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7180/head:pull/7180 PR: https://git.openjdk.java.net/jdk/pull/7180 From coleenp at openjdk.java.net Fri Jan 21 16:08:44 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 21 Jan 2022 16:08:44 GMT Subject: RFR: 8276472: align_metadata_size documents metadata alignment In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 22:08:14 GMT, David Holmes wrote: > At a minimum should we not have an assertion (static assert?) that the incoming values are actually aligned as expected? The incoming values are an int, ie number of words. so word aligned ?? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7150 From coleenp at openjdk.java.net Fri Jan 21 16:30:47 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 21 Jan 2022 16:30:47 GMT Subject: Withdrawn: 8276472: align_metadata_size documents metadata alignment In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 23:20:37 GMT, Coleen Phillimore wrote: > I just added a comment to this and took out the align_up call. I like align_metadata_size where it is and changing to WordSize doesn't add more safety imo. Also found unused function while looking for units of metadata size. > Tested with tier1 on Oracle supported platforms. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7150 From vlivanov at openjdk.java.net Fri Jan 21 16:55:42 2022 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 21 Jan 2022 16:55:42 GMT Subject: RFR: 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding In-Reply-To: References: Message-ID: <6HSz3CFPLM6xteskytlvv0_9NLeH86WVXSpx5Geu9UU=.1f502d05-77ee-4923-961a-f9f9b235e98d@github.com> On Fri, 21 Jan 2022 15:16:49 GMT, Aleksey Shipilev wrote: >> Hi all, >> >> `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. >> It would be better to remove one of them. >> >> The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. >> >> Thanks. >> Best regards, >> Jie > > src/hotspot/share/opto/graphKit.cpp line 2355: > >> 2353: // the call, precision_rounding does gvn.transform >> 2354: Node *arg = argument(j); >> 2355: arg = precision_rounding(arg); > > Hold on. This should be `dprecision_rounding`, with a `d`? Good catch! 
------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From shade at openjdk.java.net Fri Jan 21 17:09:43 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 21 Jan 2022 17:09:43 GMT Subject: RFR: 8280457: Duplicate implementaion of dprecision_rounding and dstore_rounding In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 14:46:07 GMT, Jie Fu wrote: > Hi all, > > `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. > It would be better to remove one of them. > > The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. > > Thanks. > Best regards, > Jie Also, synopsis: "implementaion" -> "implementation". ------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From zgu at openjdk.java.net Fri Jan 21 17:22:12 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 21 Jan 2022 17:22:12 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
> > For example: > > NMT=summary > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > > > NMT=detail > > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b > [0x00007f2d9f98a855] universe_init()+0x85 > [0x00007f2d9f2e0a97] init_globals()+0x37 > [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f > [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b > [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: Missing include file ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7160/files - new: https://git.openjdk.java.net/jdk/pull/7160/files/65fcfd4d..3ee1ce22 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7160.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7160/head:pull/7160 PR: https://git.openjdk.java.net/jdk/pull/7160 From coleenp at openjdk.java.net Fri Jan 21 17:46:42 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 21 Jan 2022 17:46:42 GMT Subject: RFR: 8278036: Saving rscratch1 is 
optional in MacroAssembler::verify_heapbase [v3] In-Reply-To: References: Message-ID: <0s-gWNZe3oidFhKsUVZRTydBRJU4lINVTF0K_peGJSQ=.8a8ee5af-82a5-42e4-9fa6-1d6c11826465@github.com> On Fri, 21 Jan 2022 02:48:17 GMT, Yi-Fan Tsai wrote: >> 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Remvoe unnecessary _LP64 checks Ok, then, why not? ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7067 From phh at openjdk.java.net Fri Jan 21 18:08:41 2022 From: phh at openjdk.java.net (Paul Hohensee) Date: Fri, 21 Jan 2022 18:08:41 GMT Subject: RFR: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase [v3] In-Reply-To: References: Message-ID: <-uUucONQCXfmwR0uM2sAKBcMJsuA-KvTVizYaUwfzp0=.d3a2d75e-5b82-4345-8d86-f7d9d7482ac8@github.com> On Fri, 21 Jan 2022 02:48:17 GMT, Yi-Fan Tsai wrote: >> 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Remvoe unnecessary _LP64 checks Lgtm. ------------- Marked as reviewed by phh (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7067 From duke at openjdk.java.net Fri Jan 21 18:12:47 2022 From: duke at openjdk.java.net (Yi-Fan Tsai) Date: Fri, 21 Jan 2022 18:12:47 GMT Subject: Integrated: 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase In-Reply-To: References: Message-ID: On Thu, 13 Jan 2022 17:44:57 GMT, Yi-Fan Tsai wrote: > 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase This pull request has now been integrated. 
Changeset: 2920ce54 Author: Yi-Fan Tsai Committer: Paul Hohensee URL: https://git.openjdk.java.net/jdk/commit/2920ce54874c404126d9fd6bfbebee5f3da27dae Stats: 11 lines in 2 files changed: 7 ins; 0 del; 4 mod 8278036: Saving rscratch1 is optional in MacroAssembler::verify_heapbase Reviewed-by: xliu, phh, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/7067 From kvn at openjdk.java.net Fri Jan 21 19:17:23 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 21 Jan 2022 19:17:23 GMT Subject: RFR: 8280182: HotSpot Style Guide has stale link to chromium style guide In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 23:09:12 GMT, Liam Miller-Cushon wrote: > Update links to the chromium style guide in the HotSpot Style Guide. I agree that for small changes like this the process should be simplified. ------------- PR: https://git.openjdk.java.net/jdk/pull/7138 From jiefu at openjdk.java.net Fri Jan 21 23:29:54 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 21 Jan 2022 23:29:54 GMT Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2] In-Reply-To: References: Message-ID: > Hi all, > > `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. > It would be better to remove one of them. > > The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. > > Thanks. 
> Best regards,
> Jie

Jie Fu has updated the pull request incrementally with one additional commit since the last revision:

  precision_rounding --> dprecision_rounding

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7176/files
  - new: https://git.openjdk.java.net/jdk/pull/7176/files/7d79add2..963c369a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7176&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7176&range=00-01

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7176.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7176/head:pull/7176

PR: https://git.openjdk.java.net/jdk/pull/7176

From jiefu at openjdk.java.net Fri Jan 21 23:29:56 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Fri, 21 Jan 2022 23:29:56 GMT
Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2]
In-Reply-To: <6HSz3CFPLM6xteskytlvv0_9NLeH86WVXSpx5Geu9UU=.1f502d05-77ee-4923-961a-f9f9b235e98d@github.com>
References: <6HSz3CFPLM6xteskytlvv0_9NLeH86WVXSpx5Geu9UU=.1f502d05-77ee-4923-961a-f9f9b235e98d@github.com>
Message-ID:

On Fri, 21 Jan 2022 16:52:27 GMT, Vladimir Ivanov wrote:

>> src/hotspot/share/opto/graphKit.cpp line 2355:
>>
>>> 2353:       // the call, precision_rounding does gvn.transform
>>> 2354:       Node *arg = argument(j);
>>> 2355:       arg = precision_rounding(arg);
>>
>> Hold on. This should be `dprecision_rounding`, with a `d`?
>
> Good catch!

> Hold on. This should be `dprecision_rounding`, with a `d`?

Ah, yes.

Thanks @iwanowww and @shipilev for your review.
All the comments have been addressed.
-------------

PR: https://git.openjdk.java.net/jdk/pull/7176

From ddong at openjdk.java.net Sat Jan 22 02:40:41 2022
From: ddong at openjdk.java.net (Denghui Dong)
Date: Sat, 22 Jan 2022 02:40:41 GMT
Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> I found that the native stack frames in the hs log are sometimes not accurate on AArch64; I am not sure if this is a known issue or an issue worth fixing.
>
> The following steps quickly reproduce the problem:
>
> 1. apply the diff (comment out the dtrace_object_alloc call in the interpreter and force a crash in SharedRuntime::dtrace_object_alloc)
>
> index 39e99bdd5ed..4fc768e94aa 100644
> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
>        __ store_klass_gap(r0, zr);  // zero klass gap for compressed oops
>        __ store_klass(r0, r4);      // store klass last
>
> +/**
>      {
>        SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
>        // Trigger dtrace event for fastpath
> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
>        __ pop(atos); // restore the return value
>
>      }
> +*/
>      __ b(done);
>    }
>
> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
> index 19530b7c57c..15b0509da4c 100644
> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
>      Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
>      __ store_klass(rax, rcx, tmp_store_klass);  // klass
>
> +/**
>      {
>        SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
>        // Trigger dtrace event for fastpath
> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
>             CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax);
>        __ pop(atos);
>      }
> +*/
>
>      __ jmp(done);
>    }
> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
> index a5de65ea5ab..60b4bd3bcc8 100644
> --- a/src/hotspot/share/runtime/sharedRuntime.cpp
> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
>   * 6254741.  Once that is fixed we can remove the dummy return value.
>   */
>  int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
> +  *(int*)0 = 1;
>    return dtrace_object_alloc(Thread::current(), o, o->size());
>  }
>
>
> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`
>
> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect.
>
> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and on AArch64 it is wrong.
>
> After some investigation, I found that this problem is related to the layout of the stack.
>
> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, `callee` can always get the `caller sp` (aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid (see frame::sender_for_compiled_frame).
>
>
>     push   %rbp
>     mov    %rsp,%rbp
>
>               _ _ _ _ _ _
>              |           |
>              |           |               |
>              |_ _ _ _ _ _|               |
>              |           |               |
>      caller  |           | <- caller sp  |
>      _ _ _   |_ _ _ _ _ _|               | expand
>              |           |               |
>              | ret addr  |               | direction
>      callee  |_ _ _ _ _ _|               |
>              |           |               V
>              | caller fp | <- fp
>              |_ _ _ _ _ _|
>
>
>
> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
> Hence, we cannot use `fp + 2` to calculate the proper `caller sp` (although it is still implemented this way).
>
> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>
>
>     stp    x29, x30, [sp, #-N]!
>     mov    x29, sp
>
>               _ _ _ _ _ _
>              |           |
>              |           |                   |
>              |_ _ _ _ _ _|                   |
>              |           |                   |
>      caller  |           | <- caller sp      |
>      _ _ _   |_ _ _ _ _ _|  -                | expand
>                             |                |
>               . . . . .     |                | direction
>               _ _ _ _ _ _   |                |
>              |           |  | N              |
>              | ret addr  |  |                |
>      callee  |_ _ _ _ _ _|  |                |
>              |           |  -                V
>              | caller fp | <- fp
>              |_ _ _ _ _ _|
>
>
>
> I am not very familiar with AArch64 and have no idea how to fix this issue properly at present.
>
> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>
> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on the AArch64 platform.
>
> This patch changes the logic of the l_sender_sp calculation: it uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>
> Any input is appreciated.
> > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: update copyright year ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6597/files - new: https://git.openjdk.java.net/jdk/pull/6597/files/2342f438..3674f719 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=02-03 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/6597.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6597/head:pull/6597 PR: https://git.openjdk.java.net/jdk/pull/6597 From stuefe at openjdk.java.net Sat Jan 22 09:08:08 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 22 Jan 2022 09:08:08 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Fri, 21 Jan 2022 17:22:12 GMT, Zhengyu Gu wrote: >> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. >> >> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
>> >> For example: >> >> NMT=summary >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> >> >> NMT=detail >> >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b >> [0x00007f2d9f98a855] universe_init()+0x85 >> [0x00007f2d9f2e0a97] init_globals()+0x37 >> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f >> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b >> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 >> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Missing include file Some small remaining nitpicks. src/hotspot/share/services/virtualMemoryTracker.cpp line 683: > 681: > 682: bool do_allocation_site(const ReservedMemoryRegion* rgn) { > 683: if (_p >= rgn->base() && _p < rgn->base() + rgn->size()) { Use `VirtualMemoryRegion::contain_address()`? src/hotspot/share/services/virtualMemoryTracker.cpp line 693: > 691: }; > 692: > 693: const ReservedMemoryRegion* VirtualMemoryTracker::find_region(void* p) { Curious, is this thread-safe? Do we care if not? 
src/hotspot/share/utilities/debug.cpp line 496: > 494: p2i(p), p2i(rgn->base()), p2i(rgn->base() + rgn->size()), rgn->flag_name()); > 495: if (tracking_level == NMT_detail) { > 496: rgn->call_stack()->print_on(tty); Idea for later cleanup. In detail mode, malloc headers have no call stack (NULL) whereas regions have an empty stack (NativeCallStack::_empty_stack) ? If yes, may be nice to unify this behavior. Maybe also remove checks like this here, for MemTracker::tracking_level=detail, in favor of checking if rgn->call_stack() is NULL. That would feel more consistent and allow us (if we ever wanted) to have regions with and without stack. ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From jiefu at openjdk.java.net Sat Jan 22 09:13:16 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Sat, 22 Jan 2022 09:13:16 GMT Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2] In-Reply-To: References: Message-ID: <-GE6HIDVRSICTFGC7MIglTPbVupsdIMWDwGUacJho0M=.5e4b0c8e-dd68-4773-b3a1-7d0c7d8349d3@github.com> On Fri, 21 Jan 2022 23:29:54 GMT, Jie Fu wrote: >> Hi all, >> >> `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. >> It would be better to remove one of them. >> >> The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. >> >> Thanks. >> Best regards, >> Jie > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > precision_rounding --> dprecision_rounding `gc/logging/TestMetaSpaceLog.java` fails (time out) on MacOSX. But it passed on my local test. So I don't think the failure was caused by my change. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/7176

From zgu at openjdk.java.net Sat Jan 22 15:00:09 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Sat, 22 Jan 2022 15:00:09 GMT
Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3]
In-Reply-To:
References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com>
Message-ID:

On Sat, 22 Jan 2022 08:46:41 GMT, Thomas Stuefe wrote:

>> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Missing include file
>
> src/hotspot/share/services/virtualMemoryTracker.cpp line 693:
>
>> 691: };
>> 692:
>> 693: const ReservedMemoryRegion* VirtualMemoryTracker::find_region(void* p) {
>
> Curious, is this thread-safe? Do we care if not?

It is not thread-safe. But the command runs inside a debugger, so I don't see a concurrent case.

> src/hotspot/share/utilities/debug.cpp line 496:
>
>> 494:           p2i(p), p2i(rgn->base()), p2i(rgn->base() + rgn->size()), rgn->flag_name());
>> 495:         if (tracking_level == NMT_detail) {
>> 496:           rgn->call_stack()->print_on(tty);
>
> Idea for later cleanup. In detail mode, malloc headers have no call stack (NULL) whereas regions have an empty stack (NativeCallStack::_empty_stack) ? If yes, may be nice to unify this behavior.
>
> Maybe also remove checks like this here, for MemTracker::tracking_level=detail, in favor of checking if rgn->call_stack() is NULL. That would feel more consistent and allow us (if we ever wanted) to have regions with and without stack.

Yes, I noticed the inconsistency in call stack handling, and I would like to defer it to another CR.
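
The cleanup idea discussed above (print a call stack whenever one is attached, rather than consulting a global tracking level) can be sketched roughly as follows. This is a standalone illustration, not HotSpot code: `CallStack`, `Region`, and `describe` are hypothetical names, and a plain null pointer stands in for "no stack recorded".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a recorded native call stack.
struct CallStack {
    std::vector<std::string> frames;
};

// Hypothetical stand-in for a tracked region; the stack may be absent.
struct Region {
    std::string name;
    const CallStack* stack;  // nullptr means "no stack recorded"
};

// Builds a textual description. Frames are printed if and only if a
// stack is attached, independent of any global tracking level.
std::string describe(const Region& r) {
    std::string out = r.name + "\n";
    if (r.stack != nullptr) {
        for (const std::string& f : r.stack->frames) {
            out += "  " + f + "\n";
        }
    }
    return out;
}
```

With a scheme like this, regions with and without a stack could coexist, and the caller would not need to know which tracking mode produced them.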
------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Sat Jan 22 15:11:33 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sat, 22 Jan 2022 15:11:33 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v4] In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. > > For example: > > NMT=summary > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > > > NMT=detail > > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b > [0x00007f2d9f98a855] universe_init()+0x85 > [0x00007f2d9f2e0a97] init_globals()+0x37 > [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f > [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b > [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 Zhengyu Gu has updated the pull request incrementally with one additional commit 
since the last revision: Thomas' comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7160/files - new: https://git.openjdk.java.net/jdk/pull/7160/files/3ee1ce22..d6b53d42 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7160.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7160/head:pull/7160 PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Sat Jan 22 15:11:36 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Sat, 22 Jan 2022 15:11:36 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Sat, 22 Jan 2022 08:44:56 GMT, Thomas Stuefe wrote: >> Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: >> >> Missing include file > > src/hotspot/share/services/virtualMemoryTracker.cpp line 683: > >> 681: >> 682: bool do_allocation_site(const ReservedMemoryRegion* rgn) { >> 683: if (_p >= rgn->base() && _p < rgn->base() + rgn->size()) { > > Use `VirtualMemoryRegion::contain_address()`? Sure. ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From stuefe at openjdk.java.net Sat Jan 22 18:00:03 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 22 Jan 2022 18:00:03 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v4] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Sat, 22 Jan 2022 15:11:33 GMT, Zhengyu Gu wrote: >> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. 
>> >> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. >> >> For example: >> >> NMT=summary >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> >> >> NMT=detail >> >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b >> [0x00007f2d9f98a855] universe_init()+0x85 >> [0x00007f2d9f2e0a97] init_globals()+0x37 >> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f >> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b >> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 >> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Thomas' comment Looks good. Thanks for taking my suggestions! Cheers, Thomas ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7160 From stuefe at openjdk.java.net Sat Jan 22 18:03:27 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 22 Jan 2022 18:03:27 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible Message-ID: JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. 
This patch:
- replaces includes of allocation.hpp with allStatic.hpp where appropriate
- fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.

Changes are trivial but onerous. Done partly with a script, partly manually.

Test:
- Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
- GHAs

-------------

Commit messages:
 - fix copyrights
 - start

Changes: https://git.openjdk.java.net/jdk/pull/7188/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7188&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8280503
  Stats: 365 lines in 170 files changed: 32 ins; 0 del; 333 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7188.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7188/head:pull/7188

PR: https://git.openjdk.java.net/jdk/pull/7188

From foths.kounelhs at gmail.com Sat Jan 22 18:43:52 2022
From: foths.kounelhs at gmail.com (Fotios Kounelis)
Date: Sat, 22 Jan 2022 18:43:52 +0000
Subject: [External] : Re: jvmci return array
In-Reply-To:
References: <631cc1b2-931f-28ee-0e59-b87015b542a0@oracle.com> <2122c376-37e5-3cf2-68d0-42cf70fc113b@gmail.com> <32053EF1-60D5-4C63-BF3F-4350030157AE@oracle.com>
Message-ID: <72553a9a-0f08-807d-5e6b-bff005d7136c@gmail.com>

Hello,

I have this PR ( https://github.com/fotiskoun/jdk11u-dev/pull/1 ) for OpenJDK11, where I try to create a function (JVMCIRuntime::object_hash_get) in jvmciRuntime.cpp to return an int[] array. Could you please help me check whether the code works as expected or has any logical issues? (Tom and Doug, cc'ed on this email, already gave me some much appreciated hints!)

Best regards,

Fotis

On 19/01/2022 12:57, Fotios Kounelis wrote:
>
> Hello,
>
> Please find the pull request in the following link:
> https://github.com/fotiskoun/jdk11u-dev/pulls
>
> As you can see in the commit tab, this pull request contains only the
> hash_get function, as the rest of my logic is already in the master tab.
>
> To quickly describe what my general changes in jvmciRuntime.cpp do: I
> want to create two native functions, a hash_put and a hash_get. The
> arguments of hash_put() are 2 int[] arrays and their sizes and the
> behavior is to add these arrays in a local hash map. Respectively, the
> hash_get() has an int[] array as argument and by searching the map,
> using the argument as key, returns an array from the map.
>
> You could also have a look at hash_put for more and let me know if
> there is something wrong about it, too.
>
> Best regards,
>
> Fotis
>
> On 18/01/2022 22:40, Douglas Simon wrote:
>> I think we're going to be able to better help if you put up a PR on GitHub (preferably a PR on your personal fork of https://github.com/openjdk/jdk). That way, there's no need to guess at missing details of your problem.
>>
>> -Doug
>>
>>> On 19 Jan 2022, at 08:31, Fotios Kounelis wrote:
>>>
>>> Hi Tom,
>>>
>>> Thank you for your reply. My bad example was an attempt to use JNI in the HotSpot internals. What I am trying to do is: a function returning an int[] array, filled with values of another existing array.
>>>
>>> So, my function 1) needs to return void 2) have an oop obj initialized with new_intArray 3) fill the values using int_at_put(0...size-1, arrayWithValues[0...size-1]) and 4) use the set_vm_result() to return my array.
>>>
>>> I am not sure what you meant by "proper unpacking logic by a stub caller" for the set_vm_result, as in the same file the common use is something like "thread->set_vm_result(obj)", initializing just the oop obj properly with a new_typeArray.
>>>
>>> I would appreciate a quick example in case the above is not how it should work.
>>>
>>> Thank you for your time!
>>>
>>> I couldn't find (or maybe I didn't understand) the set*ArrayRegion initialization in the jni.cpp with the macros, that is why I didn't mention it above.
>>>
>>> Best regards,
>>>
>>> Fotis
>>>
>>> On 18/01/2022 21:15, Tom Rodriguez wrote:
>>>> proper unpacking logic by a stub caller

From iklam at openjdk.java.net Sat Jan 22 19:58:04 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Sat, 22 Jan 2022 19:58:04 GMT
Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3]
In-Reply-To:
References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com>
Message-ID:

On Sat, 22 Jan 2022 14:56:30 GMT, Zhengyu Gu wrote:

>> src/hotspot/share/services/virtualMemoryTracker.cpp line 693:
>>
>>> 691: };
>>> 692:
>>> 693: const ReservedMemoryRegion* VirtualMemoryTracker::find_region(void* p) {
>>
>> Curious, is this thread-safe? Do we care if not?
>
> It is not thread-safe. But the command runs inside a debugger, so I don't see a concurrent case.

Not sure about gdb, but I tried running a simple multithreaded program inside lldb on the Mac. When calling a function, lldb will resume the program, so other threads will execute concurrently.

In this case, `walk_virtual_memory` is protected by a `ThreadCritical`. However, the value returned by `find_region` may become invalid as soon as `walk_virtual_memory` returns because it could be deallocated by another thread.
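
One way to sidestep the lifetime problem described above is to copy the needed data out while the lock is still held, instead of returning a pointer into the shared table. The sketch below is a standalone illustration under stated assumptions: `RegionTable`, `Region`, `contains`, and `find` are hypothetical names, and `std::mutex` merely stands in for `ThreadCritical`. It is not the NMT implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a reserved region record.
struct Region {
    std::uintptr_t base;
    std::size_t size;

    // Half-open range check: base <= p < base + size.
    bool contains(std::uintptr_t p) const {
        return p >= base && p - base < size;
    }
};

// Hypothetical table of regions guarded by a single lock.
class RegionTable {
public:
    void add(Region r) {
        std::lock_guard<std::mutex> guard(lock_);
        regions_.push_back(r);
    }

    void remove(std::uintptr_t base) {
        std::lock_guard<std::mutex> guard(lock_);
        for (auto it = regions_.begin(); it != regions_.end(); ++it) {
            if (it->base == base) {
                regions_.erase(it);
                return;
            }
        }
    }

    // Copies the matching region into *out while the lock is held, so the
    // caller never keeps a pointer into storage that another thread may
    // free after the lock is released.
    bool find(std::uintptr_t p, Region* out) const {
        std::lock_guard<std::mutex> guard(lock_);
        for (const Region& r : regions_) {
            if (r.contains(p)) {
                *out = r;
                return true;
            }
        }
        return false;
    }

private:
    mutable std::mutex lock_;
    std::vector<Region> regions_;
};
```

The copy can still be stale by the time the caller prints it, but it can no longer dangle. Whether that trade-off is worth it for a debugger-only command is a separate question.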
-------------

PR: https://git.openjdk.java.net/jdk/pull/7160

From zgu at openjdk.java.net Sun Jan 23 00:33:00 2022
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Sun, 23 Jan 2022 00:33:00 GMT
Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3]
In-Reply-To:
References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com>
Message-ID:

On Sat, 22 Jan 2022 19:55:23 GMT, Ioi Lam wrote:

>> It is not thread-safe. But the command runs inside a debugger, so I don't see a concurrent case.
>
> Not sure about gdb, but I tried running a simple multithreaded program inside lldb on the Mac. When calling a function, lldb will resume the program, so other threads will execute concurrently.
>
> In this case, `walk_virtual_memory` is protected by a `ThreadCritical`. However, the value returned by `find_region` may become invalid as soon as `walk_virtual_memory` returns because it could be deallocated by another thread.

Then the malloc case is even worse: the block could be deallocated by another thread. Even `oop->print()` is questionable ...

-------------

PR: https://git.openjdk.java.net/jdk/pull/7160

From dholmes at openjdk.java.net Mon Jan 24 02:26:07 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 24 Jan 2022 02:26:07 GMT
Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 21 Jan 2022 23:29:54 GMT, Jie Fu wrote:

>> Hi all,
>>
>> `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicates.
>> It would be better to remove one of them.
>>
>> The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`.
>>
>> Thanks.
>> Best regards,
>> Jie
>
> Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
>
>   precision_rounding --> dprecision_rounding

src/hotspot/share/opto/graphKit.hpp line 790:

> 788:   Node* precision_rounding(Node* n);
> 789:
> 790:   // rounding for strict double precision conformance

Isn't the reference to "strict" here and in the cpp file no longer applicable?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7176

From dholmes at openjdk.java.net Mon Jan 24 02:35:09 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 24 Jan 2022 02:35:09 GMT
Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible
In-Reply-To:
References:
Message-ID: <-rdJfr4oGyamt9-FGQZtAWFm1MraLZMpvtA8rxMjaNY=.e5d31b31-4b10-4373-b95b-b5e27c8d4bde@github.com>

On Sat, 22 Jan 2022 13:33:24 GMT, Thomas Stuefe wrote:

> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies.
>
> This patch:
> - replaces includes of allocation.hpp with allStatic.hpp where appropriate
> - fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.
>
> Changes are trivial but onerous. Done partly with a script, partly manually.
>
> Test:
> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
> - GHAs

src/hotspot/share/cds/archiveUtils.cpp line 44:

> 42: #include "utilities/debug.hpp"
> 43: #include "utilities/formatBuffer.hpp"
> 44: #include "utilities/globalDefinitions.hpp"

Seems unrelated to this issue.
-------------

PR: https://git.openjdk.java.net/jdk/pull/7188

From jiefu at openjdk.java.net Mon Jan 24 02:37:07 2022
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 24 Jan 2022 02:37:07 GMT
Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 24 Jan 2022 02:23:11 GMT, David Holmes wrote:

> Isn't the reference to "strict" here and in the cpp file no longer applicable?

    // Advertise here if the CPU requires explicit rounding operations to implement strictfp mode.
    #ifdef _LP64
    static const bool strict_fp_requires_explicit_rounding = false;
    #else
    static const bool strict_fp_requires_explicit_rounding = true;
    #endif

It seems that only x86_32 needs explicit rounding for strict_fp operations.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7176

From dholmes at openjdk.java.net Mon Jan 24 02:46:06 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 24 Jan 2022 02:46:06 GMT
Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible
In-Reply-To:
References:
Message-ID:

On Sat, 22 Jan 2022 13:33:24 GMT, Thomas Stuefe wrote:

> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies.
>
> This patch:
> - replaces includes of allocation.hpp with allStatic.hpp where appropriate
> - fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.
>
> Changes are trivial but onerous. Done partly with a script, partly manually.
>
> Test:
> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
> - GHAs

Hi Thomas,

Seems okay - hard to validate (and I expect allocation.hpp to be included somewhere in most cases anyway).
Some of the changes seem to have nothing to do with: -#include "memory/allocation.hpp" +#include "memory/allStatic.hpp" ? Are these caused by transitive include changes? Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7188 From iklam at openjdk.java.net Mon Jan 24 04:18:06 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 04:18:06 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v5] In-Reply-To: References: Message-ID: <_OK_txWawZez34924dgvXieLKwMMBIFm5yjHotuj9MI=.d951a93b-00e1-4ce8-b8b7-f21935b1dcf8@github.com> On Thu, 20 Jan 2022 09:47:31 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748)
>>  - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0
>>  - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder
>>  - source file: 'LambdaForm$DMH'
>>  - class annotations: Array(0x0000000000000000)
>>  - class type annotations: Array(0x0000000000000000)
>>  - field annotations: Array(0x0000000000000000)
>>  - field type annotations: Array(0x0000000000000000)
>>  - inner classes: Array(0x00000008005af6d8)
>>  - nest members: Array(0x00000008005af6d8)
>>  - permitted subclasses: Array(0x00000008005af6d8)
>>  - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000'
>>  - vtable length 5 (start addr: 0x0000000800c0b1b8)
>>  - itable length 2 (start addr: 0x0000000800c0b1e0)
>>  - ---- static fields (1 words):
>>  - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112
>>  - ---- non-static fields (0 words):
>> ...
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix test

This looks generally OK to me. Some minor suggestions.

src/hotspot/share/oops/instanceKlass.cpp line 2106:

> 2104:     // classloader name
> 2105:     ClassLoaderData* cld = k->class_loader_data();
> 2106:     _st->print("%-12s ", cld->loader_name());

For custom class loaders, this will likely print a long class name that exceeds the 12-character field width, making the output somewhat hard to read.

    const char* ClassLoaderData::loader_name() const {
      if (_class_loader_klass == NULL) {
        return BOOTSTRAP_LOADER_NAME;
      } else if (_name != NULL) {
        return _name->as_C_string();
      } else {
        return _class_loader_klass->external_name();
      }
    }

Also, for custom loaders, printing out just the name of the loader class is not sufficient, as multiple loader instances may have the same type.

Maybe we should just remove line 2106?
If the user wants to know the class loader, they can use the "-verbose" option of this jcmd.

src/hotspot/share/services/diagnosticCommand.hpp line 870:

> 868:   }
> 869:   static const char* description() {
> 870:     return "Prints list of all loaded classes";

I think it's better to say "Print all loaded classes". Most commands in this file do not use the third-person singular verb ending. The words "list of" are redundant.

test/hotspot/jtreg/runtime/CommandLine/PrintClasses.java line 40:

> 38:     var pid = Long.toString(ProcessHandle.current().pid());
> 39:     var pb = new ProcessBuilder();
> 40:     pb.command(new String[] { JDKToolFinder.getJDKTool("jcmd"), pid, "VM.classes", "-verbose"});

For sanity, I think we should have two test cases, one with -verbose, and one without.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7105

From iklam at openjdk.java.net Mon Jan 24 04:22:08 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 24 Jan 2022 04:22:08 GMT
Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible
In-Reply-To:
References:
Message-ID:

On Sat, 22 Jan 2022 13:33:24 GMT, Thomas Stuefe wrote:

> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies.
>
> This patch:
> - replaces includes of allocation.hpp with allStatic.hpp where appropriate
> - fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.
>
> Changes are trivial but onerous. Done partly with a script, partly manually.
>
> Test:
> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
> - GHAs

Looks good to me.
I've validated with these builds locally on my machine:

    linux-x64-debug
    linux-aarch64-open-debug
    linux-arm32
    linux-ppc64le-debug
    linux-s390x-debug
    linux-aarch64-debug
    linux-arm32-open-debug
    linux-aarch64-lic

I am running a mach5 job for tier1 + builds-tier5. That should cover most of the builds done by the Oracle CI. I'll post the results when they are ready.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7188

From iklam at openjdk.java.net Mon Jan 24 04:26:08 2022
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 24 Jan 2022 04:26:08 GMT
Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible
In-Reply-To:
References:
Message-ID:

On Sat, 22 Jan 2022 13:33:24 GMT, Thomas Stuefe wrote:

> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies.
>
> This patch:
> - replaces includes of allocation.hpp with allStatic.hpp where appropriate
> - fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.
>
> Changes are trivial but onerous. Done partly with a script, partly manually.
>
> Test:
> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
> - GHAs

BTW, I have some scripts for checking how often a header file is included. See https://github.com/iklam/tools/tree/main/headers

count_hotspot_headers.tcl shows that allocation.hpp was included by 1006 .o files before this fix, and 996 files afterwards, so not a whole lot of reduction.
That's because we have over 300 headers that include allocation.hpp :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From iklam at openjdk.java.net Mon Jan 24 04:33:07 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 04:33:07 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 04:18:48 GMT, Ioi Lam wrote: > I am running a mach5 job for tier1 + builds-tier5. That should cover most of the builds done by the Oracle CI. I'll post the results when they are ready. Unfortunately I am seeing failures on macos and windows: macos: src/hotspot/os/bsd/gc/z/zNUMA_bsd.cpp:25: src/hotspot/share/gc/z/zNUMA.hpp:39:10: error: unknown type name 'uint32_t' windows: src\hotspot\os\windows\threadLocalStorage_windows.cpp(34): error C3861: 'assert': identifier not found ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From stuefe at openjdk.java.net Mon Jan 24 05:32:03 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 05:32:03 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible In-Reply-To: References: Message-ID: <_8EMCIOMzkM7nBVFjxdnC78Hly4vHMpdyxGuMfoyEN8=.ac9263e8-1859-47d0-8c7e-c1669a0367bb@github.com> On Mon, 24 Jan 2022 02:43:04 GMT, David Holmes wrote: > Hi Thomas, > > Seems okay - hard to validate (and I expect allocation.hpp to be included somewhere in most cases anyway). > It shouldn't, that's the idea of JDK-8249944. > Some of the changes seem to have nothing to do with: > > -#include "memory/allocation.hpp" +#include "memory/allStatic.hpp" > > ? Are these caused by transitive include changes? > See my comment in code. All places that needed to be fixed.
> Thanks, David Thanks for the review, ..Thomas > src/hotspot/share/cds/archiveUtils.cpp line 44: > >> 42: #include "utilities/debug.hpp" >> 43: #include "utilities/formatBuffer.hpp" >> 44: #include "utilities/globalDefinitions.hpp" > > Seems unrelated to this issue. archiveUtils.cpp needs debug.hpp (since it uses assert) and globalDefinitions.hpp (since it uses types from there) but was missing these includes. This had been hidden by one of its includes pulling in allocation.hpp. Same goes for the other seemingly unrelated fixups: all are missing includes that were hidden by including allocation.hpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From stuefe at openjdk.java.net Mon Jan 24 05:38:09 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 05:38:09 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 04:22:34 GMT, Ioi Lam wrote: > BTW, I have some scripts for checking how often a header file is included. See https://github.com/iklam/tools/tree/main/headers > > count_hotspot_headers.tcl shows that allocation.hpp was included by 1006 .o files before this fix, and 996 files afterwards, so not a whole lot of reduction. That's because we have over 300 headers that include allocation.hpp :-) Yes. allocation.hpp could be split up more. E.g. MEMFLAGS and the NMT categories really should live somewhere else; I saw some places where allocation.hpp was included only because of them. StackObj may also be a good candidate for moving to its own small header. Does your tool tell you include chokepoints? Maybe it's just one central include pulling in allocation.hpp?
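A minimal sketch of the kind of hidden dependency these fixups address, written against the standard library rather than HotSpot's own headers. This is an assumption for illustration only: in HotSpot the corresponding headers are utilities/debug.hpp and utilities/globalDefinitions.hpp, and `numa_node_count` is a made-up name. A file that uses `assert` and fixed-width integer types includes the headers that declare them itself, so it keeps compiling when some other include stops pulling them in transitively:

```cpp
// Include what you use: do not rely on another header dragging these in.
#include <cassert>   // declares the assert macro
#include <cstdint>   // declares std::uint32_t

// Hypothetical helper, not HotSpot code.
static std::uint32_t numa_node_count(std::uint32_t sockets,
                                     std::uint32_t nodes_per_socket) {
  assert(nodes_per_socket != 0 && "each socket has at least one node");
  return sockets * nodes_per_socket;
}
```

With both includes present, the file no longer depends on which headers happen to be reachable through its other includes, or on a precompiled header masking the gap.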
> Unfortunately I am seeing failures on macos and windows: > > macos: > > src/hotspot/os/bsd/gc/z/zNUMA_bsd.cpp:25: src/hotspot/share/gc/z/zNUMA.hpp:39:10: error: unknown type name 'uint32_t' > > windows: > > src\hotspot\os\windows\threadLocalStorage_windows.cpp(34): error C3861: 'assert': identifier not found Strange, since the GHAs went through. What are your build flags? We should be able to rely on GHAs for builds at least :( The bugs are easy to fix though. Thanks for testing. ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From stuefe at openjdk.java.net Mon Jan 24 05:44:50 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 05:44:50 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v2] In-Reply-To: References: Message-ID: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> > JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. > > This patch: > - replaces includes of allocation.hpp with allstatic.hpp where appropiate > - fixes up resulting errors since this changes uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions. > > Changes are trivial but onerous. Done partly with a script, partly manually. > > Test: > - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH. 
> - GHAs Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: add missing includes for macos, windows ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7188/files - new: https://git.openjdk.java.net/jdk/pull/7188/files/37fd8fdb..2f60b989 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7188&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7188&range=00-01 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7188.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7188/head:pull/7188 PR: https://git.openjdk.java.net/jdk/pull/7188 From iklam at openjdk.java.net Mon Jan 24 05:57:08 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 05:57:08 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v2] In-Reply-To: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> References: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> Message-ID: On Mon, 24 Jan 2022 05:44:50 GMT, Thomas Stuefe wrote: >> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. >> >> This patch: >> - replaces includes of allocation.hpp with allstatic.hpp where appropiate >> - fixes up resulting errors since this changes uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions. >> >> Changes are trivial but onerous. Done partly with a script, partly manually. >> >> Test: >> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH. 
>> - GHAs > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > add missing includes for macos, windows > > BTW, I have some scripts for checking how often a header file is included. See https://github.com/iklam/tools/tree/main/headers > > Does your tool tell you include chokepoints, maybe its just one central include pulling in allocation.hpp? > The whoincludes.tcl script can do that. Unfortunately it tells us that many popular headers (such as ostream.hpp, which was itself included 976 times) include allocation.hpp.
src/hotspot$ tclsh whoincludes.tcl allocation.hpp| head -20
scanning 997 allocation.hpp
2 found 976 ostream.hpp
3 found 960 exceptions.hpp
4 found 938 atomic.hpp
5 found 891 memRegion.hpp
6 found 877 iterator.hpp
7 found 871 arena.hpp
8 found 864 mutex.hpp
9 found 860 growableArray.hpp
10 found 855 mutexLocker.hpp
11 found 855 autoRestore.hpp
12 found 848 padded.hpp
13 found 841 linkedlist.hpp
14 found 835 jfrAllocation.hpp
15 found 832 resourceHash.hpp
16 found 829 gcUtil.hpp
17 found 825 threadHeapSampler.hpp
18 found 825 thread.hpp
19 found 825 filterQueue.hpp
20 found 679 symbol.hpp
------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From stuefe at openjdk.java.net Mon Jan 24 06:08:05 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 06:08:05 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v2] In-Reply-To: References: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> Message-ID: On Mon, 24 Jan 2022 05:54:18 GMT, Ioi Lam wrote: > > > BTW, I have some scripts for checking how often a header file is included. See https://github.com/iklam/tools/tree/main/headers > > > > > > Does your tool tell you include chokepoints, maybe its just one central include pulling in allocation.hpp? > > The whoincludes.tcl script can do that.
Unfortunately it tells us that many popular headers (such as ostream.hpp, which was itself included 976 times) include allocation.hpp.
>
> ```
> src/hotspot$ tclsh whoincludes.tcl allocation.hpp| head -20
> scanning 997 allocation.hpp
> 2 found 976 ostream.hpp
> 3 found 960 exceptions.hpp
> 4 found 938 atomic.hpp
> 5 found 891 memRegion.hpp
> 6 found 877 iterator.hpp
> 7 found 871 arena.hpp
> 8 found 864 mutex.hpp
> 9 found 860 growableArray.hpp
> 10 found 855 mutexLocker.hpp
> 11 found 855 autoRestore.hpp
> 12 found 848 padded.hpp
> 13 found 841 linkedlist.hpp
> 14 found 835 jfrAllocation.hpp
> 15 found 832 resourceHash.hpp
> 16 found 829 gcUtil.hpp
> 17 found 825 threadHeapSampler.hpp
> 18 found 825 thread.hpp
> 19 found 825 filterQueue.hpp
> 20 found 679 symbol.hpp
> ```
That's a useful script. Well, maybe we can break this up a bit more. BTW, can you send me your configure line for your breaking builds? I'd like to reproduce them locally. ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From iklam at openjdk.java.net Mon Jan 24 06:22:07 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 06:22:07 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Sun, 23 Jan 2022 00:30:23 GMT, Zhengyu Gu wrote: >> Not sure about gdb, but I tried running a simple multithreaded program inside lldb on the mac. When calling a function, lldb will resume the program so other threads will execute concurrently. >> >> In this case, `walk_virtual_memory` is protected by a `ThreadCritical`. However, the value returned by `find_region` may become invalid as soon as `walk_virtual_memory` returns because it could be deallocated by another thread. > > Then the malloc case is even worse, the block could be deallocated by another thread. Even `oop->print()` is questionable ...
Can the printing be done inside do_allocation_site() instead? That way we are holding the ThreadCritical that will prevent concurrent NMT record/remove operations. ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From stuefe at openjdk.java.net Mon Jan 24 07:41:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 07:41:44 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v3] In-Reply-To: References: Message-ID: <6q0y0ZvuU8cTpnKZqZ92m_aqezIzEYH1-q0jdsXGfcs=.6079f777-aa28-4f1e-8fca-87b00a29c488@github.com> > JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. > > This patch: > - replaces includes of allocation.hpp with allstatic.hpp where appropiate > - fixes up resulting errors since this changes uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions. > > Changes are trivial but onerous. Done partly with a script, partly manually. > > Test: > - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH. 
> - GHAs Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Add missing header to zNUMA.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7188/files - new: https://git.openjdk.java.net/jdk/pull/7188/files/2f60b989..e2b9b0d6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7188&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7188&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7188.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7188/head:pull/7188 PR: https://git.openjdk.java.net/jdk/pull/7188 From stuefe at openjdk.java.net Mon Jan 24 07:41:45 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 24 Jan 2022 07:41:45 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v2] In-Reply-To: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> References: <5hs6tTRypYuz6tjnj501Xk9rW8v7ORsyr_a9osqZye8=.1f674f0e-deb7-4683-9376-b55901fdc27b@github.com> Message-ID: On Mon, 24 Jan 2022 05:44:50 GMT, Thomas Stuefe wrote: >> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. >> >> This patch: >> - replaces includes of allocation.hpp with allstatic.hpp where appropiate >> - fixes up resulting errors since this changes uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions. >> >> Changes are trivial but onerous. Done partly with a script, partly manually. >> >> Test: >> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH. 
>> - GHAs > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > add missing includes for macos, windows Hi Ioi, I fixed Windows, but cannot test MacOS since I don't have the hardware. Could you please give this another try? Found out that the reason we don't see failing builds in GHAs is that GHAs build with precompiled headers. We should change this if possible. Thanks, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From shade at openjdk.java.net Mon Jan 24 09:32:03 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 24 Jan 2022 09:32:03 GMT Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2] In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 23:29:54 GMT, Jie Fu wrote: >> Hi all, >> >> `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. >> It would be better to remove one of them. >> >> The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. >> >> Thanks. >> Best regards, >> Jie > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > precision_rounding --> dprecision_rounding Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From shade at openjdk.java.net Mon Jan 24 09:32:04 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 24 Jan 2022 09:32:04 GMT Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 02:33:57 GMT, Jie Fu wrote: >> src/hotspot/share/opto/graphKit.hpp line 790: >> >>> 788: Node* precision_rounding(Node* n); >>> 789: >>> 790: // rounding for strict double precision conformance >> >> Isn't the reference to "strict" here and in the cpp file no longer applicable? 
> >> Isn't the reference to "strict" here and in the cpp file no longer applicable? > > > // Advertise here if the CPU requires explicit rounding operations to implement strictfp mode. > #ifdef _LP64 > static const bool strict_fp_requires_explicit_rounding = false; > #else > static const bool strict_fp_requires_explicit_rounding = true; > #endif > > It seems that only x86_32 needs explicit rounding for strict_fp operations. Yup, "strict" is still meaningful on x86_32 FPU. It might be confusing due to fact that JDK 17 is now strictfp-by-default, but rounding to support strictfp is still needed for some awkward arches. ------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From jiefu at openjdk.java.net Mon Jan 24 10:55:11 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Mon, 24 Jan 2022 10:55:11 GMT Subject: Integrated: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding In-Reply-To: References: Message-ID: On Fri, 21 Jan 2022 14:46:07 GMT, Jie Fu wrote: > Hi all, > > `GraphKit::dprecision_rounding` and `GraphKit::dstore_rounding` are duplicate. > It would be better to remove one of them. > > The patch removes `GraphKit::dstore_rounding` and replaces all the usages with `GraphKit::dprecision_rounding`. > > Thanks. > Best regards, > Jie This pull request has now been integrated. 
Changeset: 0567a84d Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/0567a84d49fccda139388c22d1fc14e4aea6002b Stats: 28 lines in 4 files changed: 0 ins; 17 del; 11 mod 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding Reviewed-by: vlivanov, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From rkennke at openjdk.java.net Mon Jan 24 11:20:09 2022 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 24 Jan 2022 11:20:09 GMT Subject: Integrated: 8279534: Consolidate and remove oopDesc::klass_gap methods In-Reply-To: References: Message-ID: On Mon, 10 Jan 2022 13:40:29 GMT, Roman Kennke wrote: > After JDK-8278568, these methods are unused: > inline int klass_gap() const; > inline void set_klass_gap(int z); > > Except Zero which uses set_klass_gap(int), but we agreed elsewhere (#5585) that we don't want to access partly initialized oops as such. We should use the HeapWord* initialization variants in Zero, too. > > Note: we could take that even further and replace the initialization in Zero with ObjAllocator::initialize() call, but that would also have to remove the storestore fence, and possibly adopt ObjAllocator to avoid clearing in already-zeroed TLABs, all of which would have wider consequences and would be a matter for separate PR. > > Testing: > - [x] Build (for klass_gap methods removal) > - [ ] GHA for Zero stuff This pull request has now been integrated. 
Changeset: afd2805e Author: Roman Kennke URL: https://git.openjdk.java.net/jdk/commit/afd2805ef2fe72aee04b84956dba5bb5c012ff3c Stats: 18 lines in 3 files changed: 3 ins; 13 del; 2 mod 8279534: Consolidate and remove oopDesc::klass_gap methods Reviewed-by: shade, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7008 From dholmes at openjdk.java.net Mon Jan 24 12:52:11 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 24 Jan 2022 12:52:11 GMT Subject: RFR: 8280457: Duplicate implementation of dprecision_rounding and dstore_rounding [v2] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 09:28:51 GMT, Aleksey Shipilev wrote: >>> Isn't the reference to "strict" here and in the cpp file no longer applicable? >> >> >> // Advertise here if the CPU requires explicit rounding operations to implement strictfp mode. >> #ifdef _LP64 >> static const bool strict_fp_requires_explicit_rounding = false; >> #else >> static const bool strict_fp_requires_explicit_rounding = true; >> #endif >> >> It seems that only x86_32 needs explicit rounding for strict_fp operations. > > Yup, "strict" is still meaningful on x86_32 FPU. It might be confusing due to fact that JDK 17 is now strictfp-by-default, but rounding to support strictfp is still needed for some awkward arches. My point was that the now deleted function claimed it was the rounding function for non-strict, while the function that has been kept is described as being for strict. But now there is only one function that has to cater for both so using "strict" in its description seems inappropriate. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7176 From chagedorn at openjdk.java.net Mon Jan 24 13:30:37 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 24 Jan 2022 13:30:37 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files Message-ID: When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d V [libjvm.so+0x12091c9] JavaThread::run()+0x167 V [libjvm.so+0x1206ada] Thread::call_run()+0x180 V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This makes it possible to follow the code without a deep dive into the spec. The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame, rather than risk cutting the hs_err file short because an assertion failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. **Testing:** Apart from manual testing, I've added two kinds of tests: - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (line numbers in particular can change quickly). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers.
- Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
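The "exit gracefully instead of asserting" approach described above, including the too-small-buffer case exercised by the gtests, could be sketched roughly as follows. This is an illustrative sketch only: `copy_filename` and its signature are hypothetical and not taken from the patch, and whether the real parser truncates or omits an over-long filename is not stated here (this sketch omits it).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not from the patch: copy a filename into a
// caller-supplied buffer. On any problem it returns false and leaves an
// empty string behind instead of asserting, so error reporting can continue.
static bool copy_filename(const char* src, char* buf, std::size_t buflen) {
  if (buf == nullptr || buflen == 0) {
    return false;                  // nowhere to write, give up quietly
  }
  buf[0] = '\0';                   // always leave a valid (empty) string
  if (src == nullptr) {
    return false;                  // no source information available
  }
  const std::size_t len = std::strlen(src);
  if (len + 1 > buflen) {
    return false;                  // buffer too small: omit the filename
  }
  std::memcpy(buf, src, len + 1);  // copy including the terminating NUL
  return true;
}
```

A caller printing a stack frame would simply skip the "(file:line)" suffix when the helper returns false, so a single bad frame does not stop the rest of the hs_err file from being written.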
Thanks, Christian ------------- Commit messages: - Better formatting of trace output - some code move and more cleanups - refactor .debug_abbrev - refactor .debug_aranges - extract some methods for dwarf file creation - Cleanup line number program: result reporting, parameter passing, method names - Fix copyright - Move JTreg test and minor cleanups - Enable JTreg test for release builds - Fix 32-bit build - ... and 36 more: https://git.openjdk.java.net/jdk/compare/a4d20190...f8c98a29 Changes: https://git.openjdk.java.net/jdk/pull/7126/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8242181 Stats: 2594 lines in 19 files changed: 2456 ins; 72 del; 66 mod Patch: https://git.openjdk.java.net/jdk/pull/7126.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7126/head:pull/7126 PR: https://git.openjdk.java.net/jdk/pull/7126 From zgu at openjdk.java.net Mon Jan 24 14:03:09 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 24 Jan 2022 14:03:09 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Mon, 24 Jan 2022 06:18:32 GMT, Ioi Lam wrote: >> Then malloc case is even worse, the block could be deallocated by another thread. Even `oop->print()` is questionable ... > > Can the printing be done inside do_allocation_site() instead? That way we are holding the ThreadCritical that will prevent concurrent NMT record/remove operations. Yes, we can. But I think dealing with malloc'd memory is more problematic, if call triggers concurrent execution. I am going to withdraw this for now and take another look how debuggers behave. 
Thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Mon Jan 24 14:03:10 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 24 Jan 2022 14:03:10 GMT Subject: Withdrawn: 8280289: Enhance debug pp() command with NMT info In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Thu, 20 Jan 2022 15:31:12 GMT, Zhengyu Gu wrote: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. > > For example: > > NMT=summary > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > > > NMT=detail > > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b > [0x00007f2d9f98a855] universe_init()+0x85 > [0x00007f2d9f2e0a97] init_globals()+0x37 > [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f > [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b > [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 This pull request has 
been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Mon Jan 24 15:11:13 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 24 Jan 2022 15:11:13 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v3] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Mon, 24 Jan 2022 13:59:09 GMT, Zhengyu Gu wrote: >> Can the printing be done inside do_allocation_site() instead? That way we are holding the ThreadCritical that will prevent concurrent NMT record/remove operations. > > Yes, we can. But I think dealing with malloc'd memory is more problematic, if call triggers concurrent execution. > > I am going to withdraw this for now and take another look how debuggers behave. > > Thanks gdb also resumes all threads upon function call. ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From duke at openjdk.java.net Mon Jan 24 15:56:06 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Mon, 24 Jan 2022 15:56:06 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: > PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One > of its uses is to protect against ROP based attacks. This is done by > signing the Link Register whenever it is stored on the stack, and > authenticating the value when it is loaded back from the stack. If an > attacker were to try to change control flow by editing the stack then > the authentication check of the Link Register will fail, causing a > segfault when the function returns. > > On a system with PAC enabled, it is expected that all applications will > be compiled with ROP protection. Fedora 33 and upwards already provide > this. 
By compiling for ARMv8.0, GCC and LLVM will only use the set of > PAC instructions that exist in the NOP space - on hardware without PAC, > these instructions act as NOPs, allowing backward compatibility for > negligible performance cost (2 NOPs per non-leaf function). > > Hardware is currently limited to the Apple M1 MacBooks. All testing has > been done within a Fedora Docker image. A run of SpecJVM showed no > difference to that of noise - which was surprising. > > The most important part of this patch is simply compiling using branch > protection provided by GCC/LLVM. This protects all C++ code from being > used in ROP attacks, removing all static ROP gadgets from use. > > The remainder of the patch adds ROP protection to runtime generated > code, in both stubs and compiled Java code. Attacks here are much harder > as ROP gadgets must be found dynamically at runtime. If/when AOT > compilation is added to JDK, then all stubs and compiled Java will be > susceptible to ROP gadgets being found by static analysis and therefore > potentially as vulnerable as C++ code. > > There are a number of places where the VM changes control flow by > rewriting the stack or otherwise. I've done some analysis as to how > these could also be used for attacks (which I didn't want to post here). > These areas can be protected by ensuring the pointers to various stubs and > entry points are stored in memory as signed pointers. These changes are > simple to make (they can be reduced to a type change in common code and > a few additional sign/auth calls in the backend), but there are a lot of them > and the total code change is fairly large. I'm happy to provide a few > work in progress patches. > > In order to match the security benefits of the Apple Arm64e ABI across > the whole of JDK, then all the changes mentioned above would be > required.
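The sign-on-store / authenticate-on-load idea in the description above can be illustrated with a toy model. This is only a sketch and not code from the patch: real PAC computes its code in hardware with a secret key, whereas `toy_code`, `toy_sign` and `toy_authenticate` are hypothetical names, the XOR fold is a deliberately weak stand-in, and the 48-bit address width and the use of the stack pointer as modifier are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only. Real PAC computes the code in hardware with a secret key;
// the XOR fold below is a deliberately weak, hypothetical stand-in.
constexpr int      kAddressBits = 48;  // assumed width of a raw address
constexpr uint64_t kAddressMask = (uint64_t{1} << kAddressBits) - 1;

// Fold 64 bits down to 16 so the code fits in the unused upper pointer bits.
inline uint64_t toy_code(uint64_t raw, uint64_t modifier, uint64_t key) {
  uint64_t x = raw ^ modifier ^ key;
  return (x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48)) & 0xFFFF;
}

// "Sign": place the code in bits 48..63 of a raw return address.
// The caller must pass a raw (unsigned) address that fits in kAddressBits.
inline uint64_t toy_sign(uint64_t raw, uint64_t modifier, uint64_t key) {
  return raw | (toy_code(raw, modifier, key) << kAddressBits);
}

// "Authenticate": recompute the code for the address part and compare.
// On success the raw address is written to *raw_out; on failure it is untouched.
inline bool toy_authenticate(uint64_t value, uint64_t modifier, uint64_t key,
                             uint64_t* raw_out) {
  uint64_t raw = value & kAddressMask;
  if (toy_sign(raw, modifier, key) != value) {
    return false;  // the stored value was modified after signing
  }
  *raw_out = raw;
  return true;
}
```

In this model, a value produced by `toy_sign` authenticates and yields the original address, while a stored value whose address bits were edited afterwards does not. On real hardware the failed authentication shows up as the segfault on return that the description mentions, rather than as a boolean.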
Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: Fix popframe failures ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6334/files - new: https://git.openjdk.java.net/jdk/pull/6334/files/14799421..0b476542 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6334&range=12-13 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/6334.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6334/head:pull/6334 PR: https://git.openjdk.java.net/jdk/pull/6334 From duke at openjdk.java.net Mon Jan 24 15:56:06 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Mon, 24 Jan 2022 15:56:06 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v13] In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 17:10:39 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. 
All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with two additional commits since the last revision: > > - Fix jvmci tests > - Fix GC issues That fixes all the jtreg failures - everything passes now.
------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From zgu at openjdk.java.net Mon Jan 24 16:35:44 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 24 Jan 2022 16:35:44 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v5] In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. > > For example: > > NMT=summary > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > > > NMT=detail > > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b > [0x00007f2d9f98a855] universe_init()+0x85 > [0x00007f2d9f2e0a97] init_globals()+0x37 > [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f > [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b > [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 Zhengyu Gu has updated the pull request with a new target base due to a merge or a 
rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: - Merge branch 'master' into JDK-8280289-nmt-pp - Snapshot mmap'd region to avoid race - Thomas' comment - Missing include file - Thomas' comments - More comment fixing - Fix comment - Fix minimal build - Merge branch 'master' into JDK-8280289-nmt-pp - Fix - ... and 3 more: https://git.openjdk.java.net/jdk/compare/e546f502...92364529 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7160/files - new: https://git.openjdk.java.net/jdk/pull/7160/files/d6b53d42..92364529 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7160&range=03-04 Stats: 4593 lines in 219 files changed: 2849 ins; 1098 del; 646 mod Patch: https://git.openjdk.java.net/jdk/pull/7160.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7160/head:pull/7160 PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Mon Jan 24 16:35:46 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 24 Jan 2022 16:35:46 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v4] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Sat, 22 Jan 2022 15:11:33 GMT, Zhengyu Gu wrote: >> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. >> >> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
>> >> For example: >> >> NMT=summary >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> >> >> NMT=detail >> >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b >> [0x00007f2d9f98a855] universe_init()+0x85 >> [0x00007f2d9f2e0a97] init_globals()+0x37 >> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f >> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b >> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 >> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 > > Zhengyu Gu has updated the pull request incrementally with one additional commit since the last revision: > > Thomas' comment I don't think the patch can make things worse than oop->print(), because the pp() command seems to examine a "live" pointer. As Ioi pointed out, we do need to be safe when examining the mmap bookkeeping info, as other threads may concurrently change it. I elected to take a snapshot of the region, instead of printing the info in place, to avoid contaminating VirtualMemoryTracker with printing code.
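A minimal sketch of the snapshot approach described above, under stated assumptions: `Region`, `RegionTracker` and `describe` are hypothetical stand-ins and not the NMT types touched by the patch, and a `std::mutex` plays the role of whatever lock guards the real bookkeeping. The point is only that the matching entry is copied while the lock is held and formatted afterwards from the private copy, so the tracker itself contains no printing code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for one tracked mmap'd region; not a HotSpot type.
struct Region {
  uintptr_t base;
  size_t size;
  std::string owner;  // e.g. "GC"
  bool contains(uintptr_t p) const { return p >= base && p - base < size; }
};

// Hypothetical tracker; the mutex guards the bookkeeping list.
class RegionTracker {
  std::mutex _lock;
  std::vector<Region> _regions;
public:
  void add(const Region& r) {
    std::lock_guard<std::mutex> guard(_lock);
    _regions.push_back(r);
  }
  // Copy the region containing p into *out while holding the lock.
  // Returns false if no tracked region contains p.
  bool snapshot(uintptr_t p, Region* out) {
    std::lock_guard<std::mutex> guard(_lock);
    for (const Region& r : _regions) {
      if (r.contains(p)) {
        *out = r;
        return true;
      }
    }
    return false;
  }
};

// Formatting works on the private copy, after the lock has been released.
std::string describe(RegionTracker& tracker, uintptr_t p) {
  Region copy;
  if (!tracker.snapshot(p, &copy)) {
    return "unknown";
  }
  char buf[128];
  std::snprintf(buf, sizeof(buf), "in mmap'd memory region [0x%llx - 0x%llx] by %s",
                (unsigned long long)copy.base,
                (unsigned long long)(copy.base + copy.size),
                copy.owner.c_str());
  return buf;
}
```

The trade-off in this sketch is one extra copy per lookup in exchange for keeping the lock hold time short and the tracker free of output code.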
------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From adinn at openjdk.java.net Mon Jan 24 16:39:12 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Mon, 24 Jan 2022 16:39:12 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. 
If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 835: > 833: __ stp(rlocals, rcpool, Address(sp, 2 * wordSize)); > 834: > 835: __ protect_return_address(); Most of the changes to fix the tests look fairly self-explanatory but I don't really understand why you relocated the call to protect_return_address from its previous location at line 801. Can you explain why it has been moved? ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From adinn at openjdk.java.net Mon Jan 24 16:47:11 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Mon, 24 Jan 2022 16:47:11 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9.
One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here).
>> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures These test fixes all look like they are doing the right thing and are localized to systems that have PAC_RET support. It was a bit of a surprise to realize that double-protecting was a faux pas and, in consequence, that there were places where a protected address was being encountered where a raw one was needed. It makes me wonder if there are other places where that might happen lying dormant which just happen not to have been found by any existing test. Do we have any way of systematically excluding that possibility (I suspect the answer is no but I have to ask).
------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From duke at openjdk.java.net Mon Jan 24 16:54:11 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Mon, 24 Jan 2022 16:54:11 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 16:36:18 GMT, Andrew Dinn wrote: >> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix popframe failures > > src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 835: > >> 833: __ stp(rlocals, rcpool, Address(sp, 2 * wordSize)); >> 834: >> 835: __ protect_return_address(); > > Most of the changes to fix the tests look fairly self-explanatory but I don't really understand why you relocated call to protect_return-_address from its previous location at line 801. Can you explain why it has been moved? I originally moved it as part of debugging (a GC load_at occurs during the load_mirror). Once all the GC changes went in (all the enter_subframe calls), this change was no longer required. Then, when I came to change it back, I realised it made more sense in the new place. The protect is now directly before the storing of lr to the stack. That's logically a better place and should make the assembler easier to read. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From duke at openjdk.java.net Mon Jan 24 17:11:17 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Mon, 24 Jan 2022 17:11:17 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 16:44:23 GMT, Andrew Dinn wrote: > These test fixes all look like they are doing the right thing and are localized to systems that have PAC_RET support. 
> > It was a bit of a surprise to realize that double-protecting was a faux pas and, in consequence, that there were places where a protected address was being encountered where a raw one was needed. It makes me wonder if there are other places where that might happen lying dormant which just happen not to have been found by any existing test. Do we have any way of systematically excluding that possibility (I suspect the answer is no but I have to ask). check_return_address() is being a huge help here. Before signing, the check is used to confirm it's a raw value. Meaning we get a segfault at the point of the second sign. It's a fairly straightforward process to then figure out that issue. (Without the check, the segfault would occur on return after the authentication. Which could be a long way away from the signing. From experience, that's not a fun debug journey.) There might still be some places where that code path is currently not executed today by any of the tests. There's no obvious way to catch those issues. Once this is in, getting some regular testing will be fairly crucial to make sure it doesn't rot. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From iklam at openjdk.java.net Mon Jan 24 19:23:17 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 19:23:17 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info [v5] In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Mon, 24 Jan 2022 16:35:44 GMT, Zhengyu Gu wrote: >> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. >> >> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT.
>> >> For example: >> >> NMT=summary >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> >> >> NMT=detail >> >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b >> [0x00007f2d9f98a855] universe_init()+0x85 >> [0x00007f2d9f2e0a97] init_globals()+0x37 >> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f >> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b >> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 >> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 > > Zhengyu Gu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: > > - Merge branch 'master' into JDK-8280289-nmt-pp > - Snapshot mmap'd region to avoid race > - Thomas' comment > - Missing include file > - Thomas' comments > - More comment fixing > - Fix comment > - Fix minimal build > - Merge branch 'master' into JDK-8280289-nmt-pp > - Fix > - ... and 3 more: https://git.openjdk.java.net/jdk/compare/cf63dd86...92364529 LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7160 From iklam at openjdk.java.net Mon Jan 24 20:36:06 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 24 Jan 2022 20:36:06 GMT Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v3] In-Reply-To: <6q0y0ZvuU8cTpnKZqZ92m_aqezIzEYH1-q0jdsXGfcs=.6079f777-aa28-4f1e-8fca-87b00a29c488@github.com> References: <6q0y0ZvuU8cTpnKZqZ92m_aqezIzEYH1-q0jdsXGfcs=.6079f777-aa28-4f1e-8fca-87b00a29c488@github.com> Message-ID: On Mon, 24 Jan 2022 07:41:44 GMT, Thomas Stuefe wrote: >> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies. >> >> This patch: >> - replaces includes of allocation.hpp with allstatic.hpp where appropiate >> - fixes up resulting errors since this changes uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions. >> >> Changes are trivial but onerous. Done partly with a script, partly manually. >> >> Test: >> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH. >> - GHAs > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Add missing header to zNUMA.hpp All builds in our CI passed. ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7188 From dholmes at openjdk.java.net Mon Jan 24 22:50:54 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 24 Jan 2022 22:50:54 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 13:19:39 GMT, Christian Hagedorn wrote: > When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: > > Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f > > This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
> > This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): > > Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) > > For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. > > The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf > I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. > > The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. > > Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. > > **Testing:** > Apart from manual testing, I've added two kinds of tests: > - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. > > On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. > > To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. > > I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). > > Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! > > Thanks, > Christian Hi Christian, This will be really useful - thank you. :) That said I have two general concerns both related to executing non-async-signal-safe code in the signal handler via the error reporting logic. 
Now I know we already do a ton of stuff in error reporting not guaranteed (in any way) to be safe to do in a signal handler, but whenever we add something new I have to ask the question: how likely is this additional code to lead to secondary failures (hangs or crashes)?

Secondly, on the same issue, the use of unified logging within this code seems even more likely to be problematic - I'm not aware of us currently using UL during error reporting. It may work in basic use cases, but if it triggers logfile rotation or other more complex actions, what then?

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/7126

From minqi at openjdk.java.net Tue Jan 25 00:30:51 2022
From: minqi at openjdk.java.net (Yumin Qi)
Date: Tue, 25 Jan 2022 00:30:51 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call
Message-ID:

Please review,
When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll, since zip.dll is not on PATH. The failure occurs after we get NULL from GetModuleHandle("zip.dll"); the subsequent LoadLibrary("zip.dll") then has the same result.
The fix is to call ClassLoader's load_zip_library first: if the zip library is already loaded, it just returns the cached handle for later use; if not, it loads the zip library and caches the handle.

Tests: tier1,4,7 in test
Manually tested the user's case, and checked the output of jimage list for jlinked files using --compress=2.
Thanks Yumin ------------- Commit messages: - 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call Changes: https://git.openjdk.java.net/jdk/pull/7206/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7206&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8278753 Stats: 49 lines in 6 files changed: 18 ins; 14 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/7206.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7206/head:pull/7206 PR: https://git.openjdk.java.net/jdk/pull/7206 From zgu at openjdk.java.net Tue Jan 25 01:34:35 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 25 Jan 2022 01:34:35 GMT Subject: RFR: 8280289: Enhance debug pp() command with NMT info In-Reply-To: References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: On Fri, 21 Jan 2022 09:35:15 GMT, Thomas Stuefe wrote: >> JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. >> >> This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
>> >> For example: >> >> NMT=summary >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> >> >> NMT=detail >> >> >> (gdb) call pp(0x7f2d9803db70) >> "Executing pp" >> 0x00007f2d9803db70 malloc'd 1576 bytes by Internal >> [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b >> [0x00007f2d9f98a855] universe_init()+0x85 >> [0x00007f2d9f2e0a97] init_globals()+0x37 >> [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db >> >> (gdb) call pp(0x00007f4300a20000) >> "Executing pp" >> 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC >> [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f >> [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b >> [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 >> [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 > > Hi Zhengyu, > > this is nice :) > > I have a small proposal though. How about splitting check_block_integrity into two functions, one which checks and returns error information, one which does the assert? That way seems a bit cleaner, and we can re-use the checking functionality without the assert in other places. > > I tested an addition to your patch, what I meant was something like this (I probably could have suggested the change in your PR but don't know how): > > https://github.com/openjdk/jdk/compare/pr/7160...tstuefe:zhengyu-pp-proposal > > And inside debug.cpp, in pp(), we could actually print out corruption information for corrupted blocks. 
But I did not do this in my proposal, since I was not sure how to distinguish "its a malloced block, I know it, so show me if its broken" from "its just some pointer, may not be malloced at all". Thought about printing something like "if this was a malloced block, its broken" but that seemed weird for the cases where the user knows the pointer is not malloced. > > What do you think? > > ..Thomas Thanks, @tstuefe @iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From zgu at openjdk.java.net Tue Jan 25 01:34:36 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 25 Jan 2022 01:34:36 GMT Subject: Integrated: 8280289: Enhance debug pp() command with NMT info In-Reply-To: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> References: <9O0tj4bQ9tF7-4xVCrTQBG4BY5N113BFA4g5B5PZTOo=.3a151c20-bedc-46e0-a915-54b6fe12a70b@github.com> Message-ID: <0ET_37Kp7vC3TTBPokG1wmlxl8mw9vTh5IK5w4JbUM0=.50c57a12-d11a-4af9-8c5f-c0e348f4f71c@github.com> On Thu, 20 Jan 2022 15:31:12 GMT, Zhengyu Gu wrote: > JDK-8275320 enhanced NMT malloc header, provided ability to identify if a pointer points to a malloc'd memory. Further, JDK-8277822 enabled NMT for debug builds. > > This is a good opportunity to integrate NMT to debug pp() command to provide useful information collected by NMT. 
> > For example: > > NMT=summary > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > > > NMT=detail > > > (gdb) call pp(0x7f2d9803db70) > "Executing pp" > 0x00007f2d9803db70 malloc'd 1576 bytes by Internal > [0x00007f2d9f1b784b] G1Arguments::create_heap()+0x1b > [0x00007f2d9f98a855] universe_init()+0x85 > [0x00007f2d9f2e0a97] init_globals()+0x37 > [0x00007f2d9f960acb] Threads::create_vm(JavaVMInitArgs*, bool*)+0x3db > > (gdb) call pp(0x00007f4300a20000) > "Executing pp" > 0x00007f4300a20000 in mmap'd memory region [0x00007f4300a20000 - 0x00007f4310000000] by GC > [0x00007f433dc49c7f] reserve_memory(char*, unsigned long, unsigned long, int, bool)+0x17f > [0x00007f433dc4cf0b] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x14b > [0x00007f433dc4d527] ReservedSpace::initialize(unsigned long, unsigned long, unsigned long, char*, bool)+0x1c7 > [0x00007f433d949d40] ShenandoahHeap::initialize()+0x340 This pull request has now been integrated. Changeset: a59d717f Author: Zhengyu Gu URL: https://git.openjdk.java.net/jdk/commit/a59d717fd65d523bb6f4fc57949054e904a149f1 Stats: 119 lines in 5 files changed: 89 ins; 6 del; 24 mod 8280289: Enhance debug pp() command with NMT info Reviewed-by: stuefe, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/7160 From dholmes at openjdk.java.net Tue Jan 25 01:40:35 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 25 Jan 2022 01:40:35 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote: > Please review, > When jlink with --compress=2, zip is used to compress the files while doing copy. 
The user case failed to load zip.dll, since zip.dll is not set in PATH. This failure is after we get NULL from GetModuleHandle("zip.dll"), then do LoadLibrary("zip.dll") will have same result. > The fix is calling load_zip_library of ClassLoader first --- if zip library already loaded just return the cached handle for following usage, if not, load zip library and cached the handle. > > Tests: tier1,4,7 in test > Manually tested user case, and checked output of jimage list for jlinked files using --compress=2. > > Thanks > Yumin This needs reviewing by the jimage folk too. ------------- PR: https://git.openjdk.java.net/jdk/pull/7206 From dholmes at openjdk.java.net Tue Jan 25 02:03:36 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 25 Jan 2022 02:03:36 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote: > Please review, > When jlink with --compress=2, zip is used to compress the files while doing copy. The user case failed to load zip.dll, since zip.dll is not set in PATH. This failure is after we get NULL from GetModuleHandle("zip.dll"), then do LoadLibrary("zip.dll") will have same result. > The fix is calling load_zip_library of ClassLoader first --- if zip library already loaded just return the cached handle for following usage, if not, load zip library and cached the handle. > > Tests: tier1,4,7 in test > Manually tested user case, and checked output of jimage list for jlinked files using --compress=2. > > Thanks > Yumin Hi Yumin, So let me see if I have this clear. - The jimage code was using the OS code (dlopen/loadlibrary etc) to try and load the zip library when needed. - The VM, which is always loaded first, always used to load the zip library unconditionally, hence the OS would simply return the JVM's zip handle to the jimage code. 
- When we changed the VM to only load the zip library if needed (not realizing jimage may also need it) then the jimage code would now only succeed if the zip library was in the appropriate lookup paths for the OS.
- The fix is to change the jimage code so that it asks the JVM for the zip library, as the JVM is always set up correctly to find it.

Does that sum it up?

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From minqi at openjdk.java.net Tue Jan 25 05:06:39 2022
From: minqi at openjdk.java.net (Yumin Qi)
Date: Tue, 25 Jan 2022 05:06:39 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call
In-Reply-To:
References:
Message-ID: <1iUBAAAsXbvt0D1eL1MfN_7H9Ou3T8RiDIEHQhp_5bI=.7a9bf5be-407f-452f-887d-879d92bebc70@github.com>

On Tue, 25 Jan 2022 01:59:56 GMT, David Holmes wrote:

> * The jimage code was using the OS code (dlopen/loadlibrary etc) to try and load the zip library when needed.

Yes. The zip library has to be in PATH.

> * The VM, which is always loaded first, always used to load the zip library unconditionally, hence the OS would simply return the JVM's zip handle to the jimage code.

Yes. After the fix, jimage will use the version that the JVM is using.

> * When we changed the VM to only load the zip library if needed (not realizing jimage may also need it) then the jimage code would now only succeed if the zip library was in the appropriate lookup paths for the OS.

Yes. Before https://bugs.openjdk.java.net/browse/JDK-8237750 the zip library was always loaded in the JVM, so jimage in fact got the loaded zip handle from the JVM. The exception is when the user sets PATH (other than jdk(jre)\bin) to contain the "zip.dll | libzip.so | libzip.dylib": then jimage will load and use that version. After this fix, jimage will use the same version as the JVM.

> * The fix is to change the jimage code so that it asks the JVM for the zip library, as the JVM is always set up correctly to find it.

Yes.

Thanks for taking a detailed look.
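
For readers following the thread, the "load once, then hand out the cached handle" behaviour agreed on above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `LibraryCache`, `LoaderFn`, `fake_zip_loader` and `demo_load_count` are hypothetical names made up for this sketch, and the fake loader merely stands in for whatever platform call actually locates the zip library.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a platform library handle (HMODULE, dlopen handle).
using LibHandle = void*;
// Hypothetical loader callback; assumed to know where the JDK keeps its libraries.
using LoaderFn = LibHandle (*)();

// Loads a library at most once and returns the cached handle afterwards.
class LibraryCache {
 public:
  explicit LibraryCache(LoaderFn loader) : _loader(loader) {}

  // Returns the cached handle, calling the loader on first use only.
  // A failed load (nullptr) is cached too and is not retried.
  LibHandle get() {
    std::lock_guard<std::mutex> guard(_lock);
    if (!_attempted) {
      _handle = _loader();
      _attempted = true;
    }
    return _handle;
  }

 private:
  LoaderFn   _loader;
  std::mutex _lock;
  bool       _attempted = false;
  LibHandle  _handle = nullptr;
};

// Fake loader for the demo: counts how often it is invoked.
static int g_fake_load_calls = 0;
static LibHandle fake_zip_loader() {
  static int dummy_library;  // its address plays the role of a handle
  ++g_fake_load_calls;
  return &dummy_library;
}

// Two callers (think "VM" and "jimage") ask the same cache for the library.
// Returns how many times the loader actually ran.
int demo_load_count() {
  g_fake_load_calls = 0;
  LibraryCache cache(fake_zip_loader);
  LibHandle vm_view = cache.get();
  LibHandle jimage_view = cache.get();
  assert(vm_view == jimage_view);  // both see the same handle
  return g_fake_load_calls;
}
```

The point of the design is that the second caller never performs its own OS lookup, so it cannot fail just because the library is missing from the OS search path.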
-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From dholmes at openjdk.java.net Tue Jan 25 06:14:42 2022
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 25 Jan 2022 06:14:42 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call
In-Reply-To:
References:
Message-ID:

On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote:

> Please review,
> When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll, since zip.dll is not on PATH. The failure occurs after we get NULL from GetModuleHandle("zip.dll"); the subsequent LoadLibrary("zip.dll") then has the same result.
> The fix is to call ClassLoader's load_zip_library first: if the zip library is already loaded, it just returns the cached handle for later use; if not, it loads the zip library and caches the handle.
>
> Tests: tier1,4,7 in test
> Manually tested the user's case, and checked the output of jimage list for jlinked files using --compress=2.
>
> Thanks
> Yumin

Seems okay as long as there are no concerns from jimage folk about usage scenarios we may not be aware of.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7206

From stuefe at openjdk.java.net Tue Jan 25 09:18:44 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 25 Jan 2022 09:18:44 GMT
Subject: Integrated: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible
In-Reply-To:
References:
Message-ID:

On Sat, 22 Jan 2022 13:33:24 GMT, Thomas Stuefe wrote:

> JDK-8249944 moved AllStatic to its own header. We should use that one instead of allocation.hpp where possible to reduce header dependencies.
>
> This patch:
> - replaces includes of allocation.hpp with allStatic.hpp where appropriate
> - fixes up resulting errors, since this change uncovers missing dependencies. Mainly, missing includes of debug.hpp, of globalDefinitions.hpp, and missing outputStream definitions.
>
> Changes are trivial but onerous. Done partly with a script, partly manually.
>
> Test:
> - Checked the build with gtests on Linux x86, x64, minimal, zero, aarch64, for both fastdebug and release. All builds of course without PCH.
> - GHAs

This pull request has now been integrated.

Changeset: 2155afe2
Author: Thomas Stuefe
URL: https://git.openjdk.java.net/jdk/commit/2155afe2a87d718757b009d712361d7a63946a7f
Stats: 368 lines in 172 files changed: 35 ins; 0 del; 333 mod

8280503: Use allStatic.hpp instead of allocation.hpp where possible

Reviewed-by: dholmes, iklam

-------------

PR: https://git.openjdk.java.net/jdk/pull/7188

From stuefe at openjdk.java.net Tue Jan 25 09:18:44 2022
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Tue, 25 Jan 2022 09:18:44 GMT
Subject: RFR: JDK-8280503: Use allStatic.hpp instead of allocation.hpp where possible [v3]
In-Reply-To:
References: <6q0y0ZvuU8cTpnKZqZ92m_aqezIzEYH1-q0jdsXGfcs=.6079f777-aa28-4f1e-8fca-87b00a29c488@github.com>
Message-ID:

On Mon, 24 Jan 2022 20:32:56 GMT, Ioi Lam wrote:

> All builds in our CI passed.

Thanks a lot, Ioi!
------------- PR: https://git.openjdk.java.net/jdk/pull/7188 From chagedorn at openjdk.java.net Tue Jan 25 09:43:34 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Tue, 25 Jan 2022 09:43:34 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files In-Reply-To: References: Message-ID: <2vxIb1vN8LxdnG_zim6JI_RzovAOSLpJJMgxbgu1pnI=.f6cbbebc-2dc0-44ca-bd4b-4d6d3fc18b0f@github.com> On Tue, 18 Jan 2022 13:19:39 GMT, Christian Hagedorn wrote: > When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: > > Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f > > This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
> > This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): > > Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) > > For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. > > The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf > I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. > > The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. > > Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. > > **Testing:** > Apart from manual testing, I've added two kinds of tests: > - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. > > On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. > > To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. > > I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). > > Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! > > Thanks, > Christian Hi David > This will be really useful - thank you. :) I'm glad to hear that! :-) Thanks for your overall comments! > All build file changes need to be seen by the build team. Right, thanks for adding it again. > That said I have two general concerns both related to executing non-async-signal-safe code in the signal handler via the error reporting logic. 
Now I know we already do a ton of stuff in error reporting not guaranteed (in any way) to be safe to do in a signal handler, but whenever we add something new I have to ask the question: how likely is this additional code to lead to secondary failures (hangs or crashes)?

That's a valid concern. I also asked myself this question when I initially started using some assertions. We should not crash again during error reporting. I've therefore tried to be as conservative as possible and added bailouts instead, also in loops when reading data. But of course, this is just a best effort and by no means a guarantee to be safe (especially in terms of crashes). What could be alternatives to make this better?

> Secondly, on the same issue, the use of unified logging within this code seems even more likely to be problematic - I'm not aware of us currently using UL during error reporting. It may work in basic use cases, but if it triggers logfile rotation or other more complex actions, what then?

I haven't thought about this before. To be honest, I think UL printing of the `dwarf` tag is only useful during development when adding something new to the parser or when debugging. I don't see much value in these messages otherwise - even less for a Java user. As a first step, I could change the logs from `log_X()` to `log_develop_X()`, but that just shifts the problem to non-product builds. Another option (or an additional measure) could be to guard the log messages with a new develop flag that's disabled by default. By setting it for development, we accept that it might be unsafe, which should be fine.
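
To make the "bail out instead of assert" approach more concrete, here is a minimal sketch of what a conservative read could look like. It is an illustration under stated assumptions, not code from the patch: `BoundedReader` and `check_reader_example` are hypothetical names, and unsigned LEB128 is used as the example only because DWARF encodes many of its values that way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bounds-checked reader; NOT the reader class from the patch.
class BoundedReader {
 public:
  BoundedReader(const uint8_t* data, size_t size)
      : _data(data), _size(size), _pos(0) {}

  // Decodes one unsigned LEB128 value into *result.
  // Returns false (bails out) on truncated input or too many continuation
  // bytes instead of asserting, so the caller can skip the source info.
  bool read_uleb128(uint64_t* result) {
    uint64_t value = 0;
    unsigned shift = 0;
    while (_pos < _size) {            // never read past the buffer
      if (shift >= 64) {
        return false;                 // more bytes than a 64-bit value needs
      }
      const uint8_t byte = _data[_pos++];
      value |= static_cast<uint64_t>(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) {       // high bit clear: this was the last byte
        *result = value;
        return true;
      }
      shift += 7;
    }
    return false;                     // ran out of data in the middle of a value
  }

 private:
  const uint8_t* _data;
  size_t _size;
  size_t _pos;
};

// Test-only helper; the reader itself never asserts.
inline void check_reader_example() {
  const uint8_t bytes[] = {0xE5, 0x8E, 0x26};  // ULEB128 encoding of 624485
  uint64_t value = 0;
  assert(BoundedReader(bytes, sizeof(bytes)).read_uleb128(&value));
  assert(value == 624485);
}
```

A caller would treat a `false` return as "no source information for this frame" and carry on writing the hs_err file, which matches the graceful-exit goal described in the PR text.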
Thanks, Christian ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From pli at openjdk.java.net Tue Jan 25 09:47:37 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Tue, 25 Jan 2022 09:47:37 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. 
To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. 
>> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. 
>> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. 
But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." 
and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. 
But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. 
> This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. > > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. 
> > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. > > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. 
For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". 
The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future. Hi Jatin, > BTW why have you kept a constraint on the vector size of post tail loop to match MaxVectorSize ? I just did more investigation about why there was a MaxVectorSize constraint. By looking at the failure after removing that, I found in some smaller MaxVectorSize configurations, the main loop may unroll more times than the vector lane count. In this scenario, the vector drain loop is not cloned from the atomic vector main loop. Without the vector drain loop, 1 iteration would be **NOT** enough for the vector masked tail loop so the test case generates incorrect result. 
Hence, we need to keep the MaxVectorSize constraint to make sure that at most 1 iteration of the vector masked tail loop is enough. I'd be glad if you have more suggestions. Hi @vnkozlov, would you like to review this? I see you were the reviewer of Intel's original implementation. ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From stuefe at openjdk.java.net Tue Jan 25 10:32:37 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 25 Jan 2022 10:32:37 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote: > Please review, > When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll, since zip.dll is not on the PATH. The failure happens after we get NULL from GetModuleHandle("zip.dll"); the following LoadLibrary("zip.dll") then gives the same result. > The fix is to call load_zip_library of ClassLoader first: if the zip library is already loaded, it just returns the cached handle for later use; if not, it loads the zip library and caches the handle. > > Tests: tier1,4,7 in test > Manually tested user case, and checked output of jimage list for jlinked files using --compress=2. > > Thanks > Yumin I'm curious, under what circumstances would, before https://bugs.openjdk.java.net/browse/JDK-8237750, we ever hit the LoadLibrary in imageDecompressor.cpp? Did this ever work? Was there ever a scenario where the JVM was not involved and hence the zip.dll was not loaded already? For me, the code looks good unless I miss a scenario where we don't have the JVM loaded already at this point. ------------- Marked as reviewed by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7206 From adinn at openjdk.java.net Tue Jan 25 11:03:40 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Tue, 25 Jan 2022 11:03:40 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 17:07:54 GMT, Alan Hayward wrote: > check_return_address() is being a huge help here Yes, I appreciate that rejecting double signing is a good model. The surprise was solely a reflection of my limited understanding. > There's no obvious way to catch those issues. Ok, well in that case I guess I'm all right for this to be pushed as is. However, the other reviewers really need to give a nod before you proceed. > Once this is in, getting some regular testing will be fairly crucial to make sure it doesn't rot. I am hoping we will get that *informally* as Java developers get their hands on shiny new M1 Macs and start using them in anger. However, I agree that having regular test runs for mac-aarch64 is a necessity. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From adinn at openjdk.java.net Tue Jan 25 11:11:44 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Tue, 25 Jan 2022 11:11:44 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 16:51:26 GMT, Alan Hayward wrote: >> src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 835: >> >>> 833: __ stp(rlocals, rcpool, Address(sp, 2 * wordSize)); >>> 834: >>> 835: __ protect_return_address(); >> >> Most of the changes to fix the tests look fairly self-explanatory but I don't really understand why you relocated the call to protect_return_address from its previous location at line 801. Can you explain why it has been moved? > > I originally moved it as part of debugging (a GC load_at occurs during the load_mirror).
> Once all the GC changes went in (all the enter_subframe calls), this change was no longer required. > Then, when I came to change it back, I realised it made more sense in the new place. The protect is now directly before the storing of lr to the stack. That's logically a better place and should make the assembler easier to read. Ok, I thought the new place looked better but was not clear why it had not been there in the first place. Thanks for clarifying. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From aph at openjdk.java.net Tue Jan 25 13:44:35 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 25 Jan 2022 13:44:35 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v2] In-Reply-To: <3ViGybkSVRbuD_wN398vEFGxNJfiuS1wA_SdLkGtM18=.86e45177-8525-42dc-b27f-c22a67489108@github.com> References: <3ViGybkSVRbuD_wN398vEFGxNJfiuS1wA_SdLkGtM18=.86e45177-8525-42dc-b27f-c22a67489108@github.com> Message-ID: On Thu, 11 Nov 2021 18:15:08 GMT, Florian Weimer wrote: > > > > Am I right is saying that for Macos, all generated code is remapped RO before execution? > > > > > > > > > Ah, no, it seems the code cache is not RWX all the time as far as Java threads are concerned. The Macos/AArch64 code is strategically calling pthread_jit_write_protect_np at Java <-> JVM transition points. > > > > > > And this requires magic kernel support. I did mention it to a kernel engineer who wasn't very impressed, but I think it's pretty cool. > > It's possible to emulate this to some extent with memory protection keys on POWER and (recent) x86. See `pkey_alloc`. I don't think this does exactly what we need, because (at least according to the docs) it does it for the whole process, not just the jit threads. Unless I've read the docs wrongly. 
------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From aph at openjdk.java.net Tue Jan 25 13:44:36 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 25 Jan 2022 13:44:36 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v9] In-Reply-To: References: <8qhvLwNTzv5KxwJo93xrYA3GQSAX9NJm24EmbqFc3l8=.ba92bad8-0983-4519-9255-6913569f2638@github.com> <5g4s-czewXTVHX027JYGJIXapsXAjGYmScabO9Nk8nA=.6bc890fd-9394-4b77-9c87-890c8364d980@github.com> <1O5M3usjaNAhxthALcIb-fLeJUMrNiLc9OQ5nrlXMkg=.d7c5dc66-61b9-4fb6-813e-e74f9d536baf@github.com> Message-ID: On Sat, 11 Dec 2021 09:30:32 GMT, Andrew Haley wrote: >> Ok, I think that's fine. How about on a non pac system allowing it for development only ? > > Maybe. Mind you, a lot of the time I'm looking at the output from production systems. > From a rather philosophical point of view, I assume that if the user of a computer asks for something that isn't going to break anything or confuse anyone, we should honour their request. Was this ever resolved? ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From aph at openjdk.java.net Tue Jan 25 13:49:37 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 25 Jan 2022 13:49:37 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: <7E6HbfQpDC20d4BJTl8NpafM8_ts3zUJYHoQs5yfe4A=.96b5172e-7540-4f1c-989e-23391d07c9a4@github.com> On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. 
If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers.
These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures I've reverted my approval pending my question which wasn't resolved. ------------- Changes requested by aph (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6334 From erikj at openjdk.java.net Tue Jan 25 14:03:28 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 25 Jan 2022 14:03:28 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 13:19:39 GMT, Christian Hagedorn wrote: > When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: > > Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 > V [libjvm.so+0x1012e55]
thread_native_entry(Thread*)+0x18f > > This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. > > This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): > > Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) > > For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. 
> > The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf > I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. > > The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. > > Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. > > **Testing:** > Apart from manual testing, I've added two kinds of tests: > - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". 
On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. > - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. > > On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. > > To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. > > I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). > > Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! > > Thanks, > Christian Build changes look good. The reference to the new env variable is missing the initial underscore in two places. 
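For reference, the sanity checks mentioned in the description ("found a non-empty file" and "found a non-zero line number") could be sketched roughly as below. This is only an illustration and not the code in TestDwarf.java: the class name NativeFrameSource, its methods and the regular expression are hypothetical, and the pattern assumes the source information is appended as a trailing "(file:line)", as in the sample frames quoted above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the patch or of TestDwarf.java.
public class NativeFrameSource {

    /** File name and line number taken from one native frame line. */
    public static final class Source {
        public final String file;
        public final int line;

        Source(String file, int line) {
            this.file = file;
            this.line = line;
        }
    }

    // Assumed format: the frame line ends with "(<file>:<line>)".
    // The file part excludes parentheses, colons and whitespace, so parameter
    // lists such as "(ciEnv*, ciMethod*)" or "()" are never taken for it.
    private static final Pattern SOURCE_INFO =
            Pattern.compile("\\(([^():\\s]+):(\\d{1,9})\\)\\s*$");

    /** Returns the trailing source information, or empty if the line has none. */
    public static Optional<Source> parse(String frameLine) {
        Matcher m = SOURCE_INFO.matcher(frameLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Source(m.group(1), Integer.parseInt(m.group(2))));
    }

    /** The two sanity checks: a non-empty file name and a non-zero line number. */
    public static boolean hasPlausibleSource(String frameLine) {
        return parse(frameLine)
                .map(s -> !s.file.isEmpty() && s.line > 0)
                .orElse(false);
    }

    public static void main(String[] args) {
        String withSource =
                "V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)";
        String withoutSource =
                "V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64";
        System.out.println(hasPlausibleSource(withSource));
        System.out.println(hasPlausibleSource(withoutSource));
    }
}
```

A check of this shape stays valid when HotSpot sources move or line numbers shift, which is why exact values only make sense for the custom C files that do not change.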
test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java line 30: > 28: * @summary Test DWARF parser with various crashes if debug symbols are available. If the libjvm debug symbols are not > 29: * in the same directory as the libjvm.so file, in a subdirectory called .debug, or in the path specified > 30: * by the environment variable JVM_DWARF_PATH, then no verification of the hs_err_file is done for libjvm.so. Suggestion: * by the environment variable _JVM_DWARF_PATH, then no verification of the hs_err_file is done for libjvm.so. test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java line 165: > 163: Asserts.assertTrue(matcher.find(), "Could not find filename or line number in \"" + line + "\""); > 164: System.out.println("Did not find symbols for " + library + ". If they are not in the same directory as " + library + " consider setting " + > 165: "the environmental variable JVM_DWARF_PATH to point to the debug symbols directory."); Suggestion: "the environmental variable _JVM_DWARF_PATH to point to the debug symbols directory."); ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From duke at openjdk.java.net Tue Jan 25 14:25:34 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Tue, 25 Jan 2022 14:25:34 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v9] In-Reply-To: References: <8qhvLwNTzv5KxwJo93xrYA3GQSAX9NJm24EmbqFc3l8=.ba92bad8-0983-4519-9255-6913569f2638@github.com> <5g4s-czewXTVHX027JYGJIXapsXAjGYmScabO9Nk8nA=.6bc890fd-9394-4b77-9c87-890c8364d980@github.com> <1O5M3usjaNAhxthALcIb-fLeJUMrNiLc9OQ5nrlXMkg=.d7c5dc66-61b9-4fb6-813e-e74f9d536baf@github.com> Message-ID: <7WX8q1UcPHempROiRWX2Z38tiIrKnyM3iiWifPfgnJk=.06701f7e-5f02-45ec-beb5-e876974fb4a5@github.com> On Tue, 25 Jan 2022 13:41:38 GMT, Andrew Haley wrote: >> Maybe. Mind you, a lot of the time I'm looking at the output from production systems. 
>> From a rather philosophical point of view, I assume that if the user of a computer asks for something that isn't going to break anything or confuse anyone, we should honour their request. > > Was this ever resolved? Sort of. That code has changed quite a bit: UseROPProtection is now a string field, not a bool. "none" or not set: PAC disabled. "pac-ret": PAC always enabled. "standard": PAC enabled if the CPU and OS support it. Also, the PAC instructions used aren't all in the NOP space, so it will crash on a non-PAC machine. It might be possible to change it so that it only uses NOP-space instructions, but I don't think that will be optimal (need to double check). ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From chagedorn at openjdk.java.net Tue Jan 25 15:10:11 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Tue, 25 Jan 2022 15:10:11 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: > When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: > > Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f > > This
makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. > > This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): > > Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) > V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) > V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) > V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) > V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) > V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) > V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) > V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) > V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) > > For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. 
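As an illustration of the frame format shown above, the following is a minimal sketch of how the trailing `(file:line)` suffix could be picked out of a native frame line. It is not code from the patch: the class `NativeFrameSource`, its `parse` method and the regular expression are hypothetical names and choices made only for this example, and it assumes the suffix always appears at the very end of the line.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the patch: extracts the "(file:line)" suffix
// that the new Linux stack traces append to each native frame.
final class NativeFrameSource {
    // Matches a trailing "(name.ext:123)" at the end of a frame line.
    // The line number is capped at 9 digits so Integer.parseInt cannot overflow.
    private static final Pattern SOURCE_SUFFIX =
            Pattern.compile("\\(([^():\\s]+):(\\d{1,9})\\)\\s*$");

    final String file;
    final int line;

    private NativeFrameSource(String file, int line) {
        this.file = file;
        this.line = line;
    }

    // Returns the source location if the frame carries one, otherwise empty
    // (for example when no debug symbols were available).
    static Optional<NativeFrameSource> parse(String frameLine) {
        Matcher m = SOURCE_SUFFIX.matcher(frameLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new NativeFrameSource(m.group(1), Integer.parseInt(m.group(2))));
    }

    public static void main(String[] args) {
        String withSource =
                "V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64  (c1_Compilation.cpp:607)";
        String withoutSource =
                "V  [libjvm.so+0x620d86]  Compilation::~Compilation()+0x64";
        parse(withSource).ifPresent(s -> System.out.println(s.file + " line " + s.line));
        System.out.println(parse(withoutSource).isPresent());
    }
}
```

Returning an empty `Optional` for frames without a suffix mirrors the case where no debug symbols are available and the frame is printed in the old format.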
> > The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf > I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. > > The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. > > Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. > > **Testing:** > Apart from manual testing, I've added two kinds of tests: > - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". 
On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. > - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. > > On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. > > To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. > > I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). > > Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
> > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with two additional commits since the last revision: - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7126/files - new: https://git.openjdk.java.net/jdk/pull/7126/files/f8c98a29..7ddb7737 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7126&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/7126.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7126/head:pull/7126 PR: https://git.openjdk.java.net/jdk/pull/7126 From chagedorn at openjdk.java.net Tue Jan 25 15:10:13 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Tue, 25 Jan 2022 15:10:13 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 14:00:49 GMT, Erik Joelsson wrote: > Build changes look good. The reference to the new env variable is missing the initial underscore in two places. Thanks Erik for reviewing the build changes. I updated the places. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From aph at openjdk.java.net Tue Jan 25 15:30:41 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 25 Jan 2022 15:30:41 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v9] In-Reply-To: <7WX8q1UcPHempROiRWX2Z38tiIrKnyM3iiWifPfgnJk=.06701f7e-5f02-45ec-beb5-e876974fb4a5@github.com> References: <8qhvLwNTzv5KxwJo93xrYA3GQSAX9NJm24EmbqFc3l8=.ba92bad8-0983-4519-9255-6913569f2638@github.com> <5g4s-czewXTVHX027JYGJIXapsXAjGYmScabO9Nk8nA=.6bc890fd-9394-4b77-9c87-890c8364d980@github.com> <1O5M3usjaNAhxthALcIb-fLeJUMrNiLc9OQ5nrlXMkg=.d7c5dc66-61b9-4fb6-813e-e74f9d536baf@github.com> <7WX8q1UcPHempROiRWX2Z38tiIrKnyM3iiWifPfgnJk=.06701f7e-5f02-45ec-beb5-e876974fb4a5@github.com> Message-ID: On Tue, 25 Jan 2022 14:22:15 GMT, Alan Hayward wrote: >> Was this ever resolved? > > Sort of. That code has changed quite a bit - UseROPProtection now is a string field not a bool. > "none" or not set - pac disabled. > "pac-ret" - pac always enabled > "standard" - pac enabled if the cpu+os support it. > > Also, the pac instructions used aren't all in the NOP space. So, it will crash on a non-pac machine. It might be possible to change it so it does only use nop space instructions, but I don't think it'll be optimal (need to double check). OK, cool. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From aph at openjdk.java.net Tue Jan 25 16:29:42 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 25 Jan 2022 16:29:42 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: <9cG3vLZ7nzDen6cG_H3D8QMEUb1MixD-Q2av3wKVPBU=.e33c1f78-d1de-49e6-a7ab-dd15f8c9195c@github.com> On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. 
This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. 
These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From jbhateja at openjdk.java.net Tue Jan 25 16:45:33 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 25 Jan 2022 16:45:33 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Tue, 25 Jan 2022 09:41:01 GMT, Pengfei Li wrote: > Hi Jatin, > > > BTW why have you kept a constraint on the vector size of post tail loop to match MaxVectorSize ? > > I just did more investigation about why there was a MaxVectorSize constraint. By looking at the failure after removing that, I found in some smaller MaxVectorSize configurations, the main loop may unroll more times than the vector lane count. In this scenario, the vector drain loop is not cloned from the atomic vector main loop. Without the vector drain loop, 1 iteration would be **NOT** enough for the vector masked tail loop so the test case generates an incorrect result. Hence, we need to keep the MaxVectorSize constraint to make sure it's enough for the vector masked tail loop to run at most 1 iteration. > > I'd be glad if you have more suggestions. 
Hi Pengfei, Thanks for the clarification. Can we not prevent generating the post vector masked tail iteration when there is no vector atomic loop after the unrolled main vector loop? This can guard against any performance penalties associated with the use of 64-byte vectors in the drain loop if the main loop operates over smaller vectors. Regards, Jatin ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From minqi at openjdk.java.net Tue Jan 25 17:26:34 2022 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 25 Jan 2022 17:26:34 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 10:29:48 GMT, Thomas Stuefe wrote: > I'm curious, under what circumstances would, before https://bugs.openjdk.java.net/browse/JDK-8237750, we ever hit the LoadLibrary in imageDecompressor.cpp? Did this ever work? Was there ever a scenario where the JVM was not involved and hence the zip.dll was not loaded already? > > For me, the code looks good unless I miss a scenario where we don't have the JVM loaded already at this point. Thanks for the review. Before 8237750, the zip library was always loaded, so jimage just got the handle of the loaded zip by calling . After that change, zip is loaded on demand, so if the JVM does not use zip, it will not be loaded. Unfortunately, that change did not account for jimage using this zip, so jimage failed to get the handle. There is one exception: if zip.dll is in PATH, the subsequent fix 8244495 uses LoadLibrary("zip.dll") as a rescue. If zip.dll is not in PATH, the call still fails to load zip. This is the current issue. As things stand, if the user loaded zip from native code before the jimage code is executed (I could not imagine a scenario in which this can happen), GetModuleHandle("zip.dll") may return a handle to it, but there is no assurance that it is for the intended "zip.dll": if multiple instances exist, the returned handle is not guaranteed to be the one you want. 
With this change, if the user loads zip from native code (with a different version), the JVM does not notice; it will still load zip from $JDK or $JRE, and jimage still uses the handle returned from the JVM. The only failure case is when the JVM fails to load the zip library: if (_zip_handle == NULL) { vm_exit_during_initialization("Unable to load zip library", path); } No progress can be made after that failure. ------------- PR: https://git.openjdk.java.net/jdk/pull/7206 From pli at openjdk.java.net Wed Jan 26 01:56:33 2022 From: pli at openjdk.java.net (Pengfei Li) Date: Wed, 26 Jan 2022 01:56:33 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. 
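To make the idea of a "1-iteration vectorized loop with vector mask" from the Background section concrete, here is a scalar model in plain Java. It is only a sketch under stated assumptions: arrays stand in for vector registers, the lane count `VLEN` is an arbitrary illustrative value, and the names `MaskedTailModel`, `maskFor` and `multiply` are hypothetical and do not correspond to anything in C2.

```java
// Scalar model only: plain Java arrays stand in for vector registers and masks.
// VLEN and all method names are illustrative assumptions, not taken from C2.
final class MaskedTailModel {
    static final int VLEN = 8; // assumed number of lanes per vector

    // Lane j is active while j < remaining, like a generated vector mask.
    static boolean[] maskFor(int remaining) {
        boolean[] mask = new boolean[VLEN];
        for (int lane = 0; lane < VLEN; lane++) {
            mask[lane] = lane < remaining;
        }
        return mask;
    }

    // Computes res[i] = a[i] * b[i]; a and b must be at least as long as res.
    static void multiply(int[] a, int[] b, int[] res) {
        int i = 0;
        // Main loop: full vectors, no mask needed.
        for (; i + VLEN <= res.length; i += VLEN) {
            for (int lane = 0; lane < VLEN; lane++) {
                res[i + lane] = a[i + lane] * b[i + lane];
            }
        }
        // Post loop: a single masked "vector" iteration covers the remainder,
        // which is always smaller than VLEN at this point.
        boolean[] mask = maskFor(res.length - i);
        for (int lane = 0; lane < VLEN; lane++) {
            if (mask[lane]) {
                res[i + lane] = a[i + lane] * b[i + lane];
            }
        }
    }

    public static void main(String[] args) {
        int n = 19; // deliberately not a multiple of VLEN
        int[] a = new int[n];
        int[] b = new int[n];
        int[] res = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i + 1;
            b[i] = 2 * i - 5;
        }
        multiply(a, b, res);
        System.out.println(java.util.Arrays.toString(res));
    }
}
```

The point of the model is that inactive lanes never touch memory, so the tail needs no scalar clean-up loop as long as the remainder fits into one vector.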
>> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. >> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. 
These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. >> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. 
With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. 
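The restrictions described for Issue-2, Issue-3 and Issue-4 can be summarised as a small eligibility predicate. The sketch below is a hypothetical model, not the C2 implementation (which performs these checks with SWPointer): the class `PostLoopChecks`, the `Access` type and the `eligible` method are invented for illustration, each array access is reduced to an index scale and an element size in bytes, and a scale of zero is assumed to be rejected because it is neither positive nor negative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the restrictions described above. The real checks live
// in C2 and use SWPointer; none of these names come from the patch.
final class PostLoopChecks {
    // One array access whose index has the form scale * i + offset and whose
    // elements are elementBytes wide.
    static final class Access {
        final int scale;
        final int elementBytes;

        Access(int scale, int elementBytes) {
            this.scale = scale;
            this.elementBytes = elementBytes;
        }
    }

    static boolean eligible(int stride, List<Access> accesses) {
        // Issue-3: only loops with a stride of 1 or -1 are considered.
        if (stride != 1 && stride != -1) {
            return false;
        }
        int elementBytes = -1;
        for (Access a : accesses) {
            // Issue-2: the scale sign must match the stride sign so that vectors
            // grow towards higher addresses (assumption: scale 0 is rejected).
            if (Integer.signum(a.scale) != Integer.signum(stride)) {
                return false;
            }
            // Issue-4: all accesses must share one element size.
            if (elementBytes == -1) {
                elementBytes = a.elementBytes;
            } else if (elementBytes != a.elementBytes) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // int + float (4 bytes each) in a counting-up loop
        System.out.println(eligible(1, Arrays.asList(new Access(1, 4), new Access(1, 4))));
        // int + double (4 and 8 bytes) in the same loop
        System.out.println(eligible(1, Arrays.asList(new Access(1, 4), new Access(1, 8))));
    }
}
```

Comparing element sizes rather than element types follows the description that "int + float" is acceptable while "int + double" is not.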
>> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. 
So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. 
To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. 
> > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. 
> > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. 
It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. 
We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future.

Don't know why your reply is not displayed on the GitHub page.

> "Thanks for clarification, can we not prevent generating post vector masked tail iteration in case there is no vector atomic loop after unrolled main vector loop. This can guard against any performance penalties associated with usage of 64 byte vectors in drain loop if main loop operates over smaller vectors."

So far I haven't found a good way to prevent generating the vector masked tail loop if no atomic vector drain loop exists, because the scalar RCE'd post loop is inserted before the main loop is unrolled, but the vector drain loop is inserted in a later ideal loop phase iteration after superword is done -- see the "insert_scalar_rced_post_loop()" and "insert_vector_post_loop()" call sites in loopTransform.cpp. But your suggestion is good. I believe we should fix that performance penalty before making post loop vectorization non-experimental. I have another idea: making the vector masked tail loop run more than 1 iteration. That way, the cloned vector drain loop is no longer required. But it requires some bigger refactoring work. As this patch focuses on "fix and re-enablement" and already contains a lot of fixes for previously accumulated issues, I'd like to try my idea in later patches.
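To illustrate why a single masked tail iteration currently depends on the vector drain loop, here is a minimal scalar sketch in plain Java. It is only a model under stated assumptions, not what C2 actually generates: the class `MaskedTailSketch`, the method `copy`, and the parameters `lanes` and `unroll` are hypothetical names, and `System.arraycopy` merely stands in for a vector load/store.

```java
import java.util.Arrays;

// Hypothetical scalar model of "main loop + drain loop + one masked tail
// iteration". It does not use any HotSpot or Vector API code.
public class MaskedTailSketch {

    // Copies b into a new array. The main loop handles lanes * unroll
    // elements per iteration, the optional drain loop handles `lanes`
    // elements per iteration, and exactly one masked tail iteration
    // covers at most `lanes` of the remaining elements.
    static int[] copy(int[] b, int lanes, int unroll, boolean withDrainLoop) {
        int[] a = new int[b.length];
        int i = 0;
        int step = lanes * unroll;

        // Unrolled main vector loop.
        for (; i + step <= b.length; i += step) {
            System.arraycopy(b, i, a, i, step);
        }

        // Vector drain loop: one full vector per iteration.
        if (withDrainLoop) {
            for (; i + lanes <= b.length; i += lanes) {
                System.arraycopy(b, i, a, i, lanes);
            }
        }

        // Single masked tail iteration: only in-range lanes are active.
        int active = Math.min(lanes, b.length - i);
        System.arraycopy(b, i, a, i, active);
        return a;
    }

    public static void main(String[] args) {
        int[] b = new int[23];
        Arrays.setAll(b, k -> k + 1);

        // 23 = 16 (main loop) + 4 (drain loop) + 3 (masked tail): complete copy.
        System.out.println(Arrays.equals(b, copy(b, 4, 4, true)));   // prints "true"

        // Without the drain loop, 7 elements are left but one masked
        // iteration covers only 4, so the copy is incomplete.
        System.out.println(Arrays.equals(b, copy(b, 4, 4, false)));  // prints "false"
    }
}
```

In this model the incorrect result appears exactly when more than `lanes` elements remain after the main loop, which is the situation that either a drain loop or a multi-iteration masked tail loop would have to handle.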
Thanks, Pengfei From: Jatin Bhateja ***@***.***> Sent: Wednesday, January 26, 2022 00:42 To: openjdk/jdk ***@***.***> Cc: Pengfei Li ***@***.***>; Mention ***@***.***> Subject: Re: [openjdk/jdk] 8183390: Fix and re-enable post loop vectorization (PR #6828) Hi Jatin, BTW why have you kept a constraint on the vector size of post tail loop to match MaxVectorSize ? I just did more investigation about why there was a MaxVectorSize constraint. By looking at the failure after removing that, I found in some smaller MaxVectorSize configurations, the main loop may unroll more times than the vector lane count. In this scenario, the vector drain loop is not cloned from the atomic vector main loop. Without the vector drain loop, 1 iteration would be NOT enough for the vector masked tail loop so the test case generates incorrect result. Hence, we need to keep the MaxVectorSize constraint to make sure it's enough for the vector masked tail loop to run at most 1 iteration. I'd be glad if you have more suggestions. Hi Pengfie, Thanks for clarification, can we not prevent generating post vector masked tail iteration in case there is no vector atomic loop after unrolled main vector loop. This can guard against any performance penalties associated with usage of 64 byte vectors in drain loop if main loop operates over smaller vectors. Regards, Jatin
------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From stuefe at openjdk.java.net Wed Jan 26 05:54:30 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 05:54:30 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 17:22:54 GMT, Yumin Qi wrote: >> I'm curious, under what circumstances would, before https://bugs.openjdk.java.net/browse/JDK-8237750, we ever hit the LoadLibrary in imageDecompressor.cpp? Did this ever work? Was there ever a scenario where the JVM was not involved and hence the zip.dll was not loaded already? >> >> For me, the code looks good unless I miss a scenario where we don't have the JVM loaded already at this point. > >> I'm curious, under what circumstances would, before https://bugs.openjdk.java.net/browse/JDK-8237750, we ever hit the LoadLibrary in imageDecompressor.cpp? Did this ever work? Was there ever a scenario where the JVM was not involved and hence the zip.dll was not loaded already? >> >> For me, the code looks good unless I miss a scenario where we don't have the JVM loaded already at this point. > > Thanks for review. Before 8237750, the zip library is always loaded so jimage just get the handle of the loaded zip by calling . After that, zip is loaded at need, so if jvm does not use zip, it will not be loaded. Unfortunately, it does not realize that jimage is using this zip, so it failed to get the handle. But there exists a case, if the zip is in PATH, the following fix 8244495 used LoadLibrary("zip.dll") for a rescue. If zip.dll is not in PATH, the call still failed to load zip. This is the current issue.
> > So far, if the user loaded zip from native code before the jimage code is executed (I cannot imagine a scenario in which this happens), GetModuleHandle("zip.dll") may return a handle to it, but there is no assurance that it is the intended "zip.dll": if multiple instances exist, the returned handle is not guaranteed to be the one you want. > > With this change, if the user loads zip from native code (with a different version), the JVM does not notice; it will still load zip from $JDK or $JRE, and jimage still uses the handle returned from the JVM. The only failure case is the JVM failing to load the zip library: > > if (_zip_handle == NULL) { > vm_exit_during_initialization("Unable to load zip library", path); > } > > You cannot make any progress after that failure. Thanks for the explanation, @yminqi. Change looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/7206 From stuefe at openjdk.java.net Wed Jan 26 06:11:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 06:11:44 GMT Subject: RFR: JDK-8280583: Always build NMT Message-ID: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option.
------------- Commit messages: - remove INCLUDE_NMT and dependend code Changes: https://git.openjdk.java.net/jdk/pull/7213/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7213&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280583 Stats: 217 lines in 28 files changed: 10 ins; 192 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/7213.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7213/head:pull/7213 PR: https://git.openjdk.java.net/jdk/pull/7213 From stuefe at openjdk.java.net Wed Jan 26 06:23:41 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 06:23:41 GMT Subject: RFR: JDK-8280583: Always build NMT [v2] In-Reply-To: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: > After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html > > Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. 
Thomas Stuefe has updated the pull request incrementally with 15 additional commits since the last revision: - 8280377: MethodHandleProxies does not correctly invoke default methods with varags Reviewed-by: alanb - 8213905: reflection not working for type annotations applied to exception types in the inner class constructor Reviewed-by: jlahoda - 8279242: Reflection newInstance() error message when constructor has no access modifiers could use improvement Reviewed-by: iris, dholmes, mchung - 8269542: JDWP: EnableCollection support is no longer spec compliant after JDK-8255987 8258071: Fix for JDK-8255987 can be subverted with ObjectReference.EnableCollection Reviewed-by: dholmes, pliden - 8280166: Extend java/lang/instrument/GetObjectSizeIntrinsicsTest.java test cases Reviewed-by: sspitsyn, lmesnik - 8280041: Retry loop issues in java.io.ClassCache Co-authored-by: Peter Levart Reviewed-by: rkennke, rriggs, plevart - 8280168: Add Objects.toIdentityString Reviewed-by: alanb, mchung, rriggs, smarks - 8279946: (ch) java.nio.channels.FileChannel tryLock and write methods are missing @throws NonWritableChannelException Reviewed-by: alanb - 8280396: G1: Full gc mark stack draining should prefer to make work available to other threads Reviewed-by: sjohanss, ayang - 8280414: Memory leak in DefaultProxySelector Reviewed-by: dfuchs - ... 
and 5 more: https://git.openjdk.java.net/jdk/compare/a19176e6...44478392 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7213/files - new: https://git.openjdk.java.net/jdk/pull/7213/files/a19176e6..44478392 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7213&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7213&range=00-01 Stats: 948 lines in 56 files changed: 717 ins; 90 del; 141 mod Patch: https://git.openjdk.java.net/jdk/pull/7213.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7213/head:pull/7213 PR: https://git.openjdk.java.net/jdk/pull/7213 From ddong at openjdk.java.net Wed Jan 26 07:07:30 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 26 Jan 2022 07:07:30 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: <1Jp1yw4AlLwueqCZQKRwOInHJ4HPvIyRVAPqDP2WUr4=.67c937a7-7ef2-40de-8f8e-ecc7da20e469@github.com> On Tue, 18 Jan 2022 13:43:25 GMT, Andrew Haley wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> fix pfl() crash problem and rename from_thread to from_anchor > > So, here's my thinking for now. `_from_anchor` really means _this SP is trustworthy_, and perhaps we need a different name which suggests that. `sp_ok_to_use()` or `sp_is_trusted()` or somesuch? We do at least need a comment which explains that unless this boolean is true, the SP value in a frame is basically garbage, although it will point to somewhere within the stack. With that change, this patch can be integrated. > In the longer term, I think we should look at using libunwind to obtain a precise native stack trace, and then we can get rid of all the old kludges. @theRealAph Do you have other comments on the current patch? 
------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From duke at openjdk.java.net Wed Jan 26 07:17:51 2022 From: duke at openjdk.java.net (KIRIYAMA Takuya) Date: Wed, 26 Jan 2022 07:17:51 GMT Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device. Message-ID:

I think JFR should report an error message and the JVM should shut down safely instead of failing a guarantee. For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops the JVM, as shown below, by using JfrJavaSupport::abort().

[0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
[0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp)
[0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM...

I modified StreamWriterHost to call JfrJavaSupport::abort() instead of failing the guarantee. I added an argument to JfrJavaSupport::abort() that tells os::abort() not to write a core dump, because there is no space left on the device. Could you please review the fix?

------------- Commit messages: - 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device. - 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device. - 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device.
Changes: https://git.openjdk.java.net/jdk/pull/7227/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7227&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280684 Stats: 146 lines in 4 files changed: 140 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/7227.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7227/head:pull/7227 PR: https://git.openjdk.java.net/jdk/pull/7227 From shade at openjdk.java.net Wed Jan 26 08:29:33 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 26 Jan 2022 08:29:33 GMT Subject: RFR: JDK-8280583: Always build NMT [v2] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 06:23:41 GMT, Thomas Stuefe wrote: >> After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. >> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html >> >> Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. 
> > Thomas Stuefe has updated the pull request incrementally with 15 additional commits since the last revision: > > - 8280377: MethodHandleProxies does not correctly invoke default methods with varags > > Reviewed-by: alanb > - 8213905: reflection not working for type annotations applied to exception types in the inner class constructor > > Reviewed-by: jlahoda > - 8279242: Reflection newInstance() error message when constructor has no access modifiers could use improvement > > Reviewed-by: iris, dholmes, mchung > - 8269542: JDWP: EnableCollection support is no longer spec compliant after JDK-8255987 > 8258071: Fix for JDK-8255987 can be subverted with ObjectReference.EnableCollection > > Reviewed-by: dholmes, pliden > - 8280166: Extend java/lang/instrument/GetObjectSizeIntrinsicsTest.java test cases > > Reviewed-by: sspitsyn, lmesnik > - 8280041: Retry loop issues in java.io.ClassCache > > Co-authored-by: Peter Levart > Reviewed-by: rkennke, rriggs, plevart > - 8280168: Add Objects.toIdentityString > > Reviewed-by: alanb, mchung, rriggs, smarks > - 8279946: (ch) java.nio.channels.FileChannel tryLock and write methods are missing @throws NonWritableChannelException > > Reviewed-by: alanb > - 8280396: G1: Full gc mark stack draining should prefer to make work available to other threads > > Reviewed-by: sjohanss, ayang > - 8280414: Memory leak in DefaultProxySelector > > Reviewed-by: dfuchs > - ... and 5 more: https://git.openjdk.java.net/jdk/compare/a19176e6...44478392 There is some major weirdness in this PR: it includes a lot of unrelated changes. Consider rebasing to current master and force-pushing? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7213 From stuefe at openjdk.java.net Wed Jan 26 08:36:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 08:36:44 GMT Subject: RFR: JDK-8280583: Always build NMT [v3] In-Reply-To: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: > After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html > > Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - fix copyrights - remove INCLUDE_NMT and dependend code ------------- Changes: https://git.openjdk.java.net/jdk/pull/7213/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7213&range=02 Stats: 243 lines in 28 files changed: 10 ins; 192 del; 41 mod Patch: https://git.openjdk.java.net/jdk/pull/7213.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7213/head:pull/7213 PR: https://git.openjdk.java.net/jdk/pull/7213 From stuefe at openjdk.java.net Wed Jan 26 08:36:44 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 08:36:44 GMT Subject: RFR: JDK-8280583: Always build NMT [v2] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: <4nRn_jOASRPVRKX-Up0w8psR5-Ar4OWvOBcLWTi_p_k=.1ad61ef7-6628-4995-b891-2636efffac5b@github.com> On Wed, 26 Jan 2022 08:26:27 GMT, Aleksey Shipilev wrote: > There is some major weirdness in this PR: it includes a lot of unrelated changes. Consider rebasing to current master and force-pushing? Done. I wondered about that, I did merge master - normally, this shows up as a single merge change. Not sure what happened. ------------- PR: https://git.openjdk.java.net/jdk/pull/7213 From shade at openjdk.java.net Wed Jan 26 08:45:33 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 26 Jan 2022 08:45:33 GMT Subject: RFR: JDK-8280583: Always build NMT [v3] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 08:36:44 GMT, Thomas Stuefe wrote: >> After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. 
>> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html >> >> Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - fix copyrights > - remove INCLUDE_NMT and dependend code This looks fine to me. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7213 From alanb at openjdk.java.net Wed Jan 26 09:02:34 2022 From: alanb at openjdk.java.net (Alan Bateman) Date: Wed, 26 Jan 2022 09:02:34 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote: > Please review, > When jlink with --compress=2, zip is used to compress the files while doing copy. The user case failed to load zip.dll, since zip.dll is not set in PATH. This failure is after we get NULL from GetModuleHandle("zip.dll"), then do LoadLibrary("zip.dll") will have same result. > The fix is calling load_zip_library of ClassLoader first --- if zip library already loaded just return the cached handle for following usage, if not, load zip library and cached the handle. > > Tests: tier1,4,7 in test > Manually tested user case, and checked output of jimage list for jlinked files using --compress=2. > > Thanks > Yumin I think this looks okay but I think @JimLaskey and/or @sundararajana should look at this because it creates a dependency on a JVM_* function. I'm trying to think if there are any interop issues when using jrtfs. Jim/Sundar can correct me but I think we are okay there because a tool on say JDK 8 (or 11 or 17) that accesses a JDK 19 run-time image will use the BasicImageReader and won't use libjimage in the target VM. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7206 From stuefe at openjdk.java.net Wed Jan 26 09:27:33 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 26 Jan 2022 09:27:33 GMT Subject: RFR: JDK-8280583: Always build NMT [v3] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 08:42:46 GMT, Aleksey Shipilev wrote: > This looks fine to me. Thanks, Aleksey! ------------- PR: https://git.openjdk.java.net/jdk/pull/7213 From tschatzl at openjdk.java.net Wed Jan 26 09:33:37 2022 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 26 Jan 2022 09:33:37 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty In-Reply-To: References: Message-ID: On Wed, 19 Jan 2022 22:43:25 GMT, Kim Barrett wrote: > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. > > Testing: > mach5 tier1-3 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7149 From aph at openjdk.java.net Wed Jan 26 09:59:30 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 26 Jan 2022 09:59:30 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: On Sat, 22 Jan 2022 02:40:41 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. 
>> >> The following steps can quick reproduce the problem: >> >> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); >> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. 
>> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). >> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). >> >> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. >> >> >> stp x29, x30, [sp, #-N]! >> mov x29, sp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| - | expand >> | | >> . . . . . 
| | direction >> _ _ _ _ _ _ | | >> | | | N | >> | ret addr | | | >> callee |_ _ _ _ _ _| | | >> | | - V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. >> >> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. >> >> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. >> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. >> >> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. >> >> Any input is appreciated. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright year Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From ihse at openjdk.java.net Wed Jan 26 10:40:29 2022 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 26 Jan 2022 10:40:29 GMT Subject: RFR: JDK-8280583: Always build NMT [v3] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 08:36:44 GMT, Thomas Stuefe wrote: >> After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. 
>> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html >> >> Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - fix copyrights > - remove INCLUDE_NMT and dependend code Build changes look fine. Good riddance! ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7213 From ddong at openjdk.java.net Wed Jan 26 10:47:31 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 26 Jan 2022 10:47:31 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: On Sat, 22 Jan 2022 02:40:41 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1. 
apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); >> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. 
`java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). >> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). >> >> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. >> >> >> stp x29, x30, [sp, #-N]! >> mov x29, sp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| - | expand >> | | >> . . . . . 
| | direction >> _ _ _ _ _ _ | | >> | | | N | >> | ret addr | | | >> callee |_ _ _ _ _ _| | | >> | | - V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. >> >> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. >> >> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. >> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. >> >> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. >> >> Any input is appreciated. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright year May I have a second reviewer? ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From jbhateja at openjdk.java.net Wed Jan 26 11:59:33 2022 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Wed, 26 Jan 2022 11:59:33 GMT Subject: RFR: 8183390: Fix and re-enable post loop vectorization [v3] In-Reply-To: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> References: <1sV-WcCv0-H2lHh1FFwAocRL8EIVzZ_att5tUeY5M-c=.ee351f2b-c9b9-4294-9205-cf16da3e896d@github.com> Message-ID: On Mon, 10 Jan 2022 06:20:01 GMT, Pengfei Li wrote: >> ### Background >> >> Post loop vectorization is a C2 compiler optimization in an experimental >> VM feature called PostLoopMultiversioning. It transforms the range-check >> eliminated post loop to a 1-iteration vectorized loop with vector mask. >> This optimization was contributed by Intel in 2016 to support x86 AVX512 >> masked vector instructions. 
However, it was disabled soon after an issue >> was found. Due to insufficient maintenance in these years, multiple bugs >> have been accumulated inside. But we (Arm) still think this is a useful >> framework for vector mask support in C2 auto-vectorized loops, for both >> x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable >> post loop vectorization. >> >> ### Changes in this patch >> >> This patch reworks post loop vectorization. The most significant change >> is removing vector mask support in C2 x86 backend and re-implementing >> it in the mid-end. With this, we can re-enable post loop vectorization >> for platforms other than x86. >> >> Previous implementation hard-codes x86 k1 register as a reserved AVX512 >> opmask register and defines two routines (setvectmask/restorevectmask) >> to set and restore the value of k1. But after [JDK-8211251](https://bugs.openjdk.java.net/browse/JDK-8211251) which encodes >> AVX512 instructions as unmasked by default, generated vector masks are >> no longer used in AVX512 vector instructions. To fix incorrect codegen >> and add vector mask support for more platforms, we turn to add a vector >> mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode >> to generate a mask and replace all Load/Store nodes in the post loop >> into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This >> IR form is exactly the same to those which are used in VectorAPI mask >> support. For now, we only add mask inputs for Load/Store nodes because >> we don't have reduction operations supported in post loop vectorization. >> After this change, the x86 k1 register is no longer reserved and can be >> allocated when PostLoopMultiversioning is enabled. >> >> Besides this change, we have fixed a compiler crash and five incorrect >> result issues with post loop vectorization. 
>> >> **I) C2 crashes with segmentation fault in strip-mined loops** >> >> Previous implementation was done before C2 loop strip-mining was merged >> into JDK master so it didn't take strip-mined loops into consideration. >> In C2's strip mined loops, post loop is not the sibling of the main loop >> in ideal loop tree. Instead, it's the sibling of the main loop's parent. >> This patch fixed a SIGSEGV issue caused by NULL pointer when locating >> post loop from strip-mined main loop. >> >> **II) Incorrect result issues with post loop vectorization** >> >> We have also fixed five incorrect vectorization issues. Some of them are >> hidden deep and can only be reproduced with corner cases. These issues >> have a common cause that it assumes the post loop can be vectorized if >> the vectorization in corresponding main loop is successful. But in many >> cases this assumption is wrong. Below are details. >> >> - **[Issue-1] Incorrect vectorization for partial vectorizable loops** >> >> This issue can be reproduced by below loop where only some operations in >> the loop body are vectorizable. >> >> for (int i = 0; i < 10000; i++) { >> res[i] = a[i] * b[i]; >> k = 3 * k + 1; >> } >> >> In the main loop, superword can work well if parts of the operations in >> loop body are not vectorizable since those parts can be unrolled only. >> But for post loops, we don't create vectors through combining scalar IRs >> generated from loop unrolling. Instead, we are doing scalars to vectors >> replacement for all operations in the loop body. Hence, all operations >> should be either vectorized together or not vectorized at all. To fix >> this kind of cases, we add an extra field "_slp_vector_pack_count" in >> CountedLoopNode to record the eventual count of vector packs in the main >> loop. This value is then passed to post loop and compared with post loop >> pack count. Vectorization will be bailed out in post loop if it creates >> more vector packs than in the main loop. 
>> >> - **[Issue-2] Incorrect result in loops with growing-down vectors** >> >> This issue appears with growing-down vectors, that is, vectors that grow >> to smaller memory address as the loop iterates. It can be reproduced by >> below counting-up loop with negative scale value in array index. >> >> for (int i = 0; i < 10000; i++) { >> a[MAX - i] = b[MAX - i]; >> } >> >> Cause of this issue is that for a growing-down vector, generated vector >> mask value has reversed vector-lane order so it masks incorrect vector >> lanes. Note that if negative scale value appears in counting-down loops, >> the vector will be growing up. With this rule, we fix the issue by only >> allowing positive array index scales in counting-up loops and negative >> array index scales in counting-down loops. This check is done with the >> help of SWPointer by comparing scale values in each memory access in the >> loop with loop stride value. >> >> - **[Issue-3] Incorrect result in manually unrolled loops** >> >> This issue can be reproduced by below manually unrolled loop. >> >> for (int i = 0; i < 10000; i += 2) { >> c[i] = a[i] + b[i]; >> c[i + 1] = a[i + 1] * b[i + 1]; >> } >> >> In this loop, operations in the 2nd statement duplicate those in the 1st >> statement with a small memory address offset. Vectorization in the main >> loop works well in this case because C2 does further unrolling and pack >> combination. But we cannot vectorize the post loop through replacement >> from scalars to vectors because it creates duplicated vector operations. >> To fix this, we restrict post loop vectorization to loops with stride >> values of 1 or -1. >> >> - **[Issue-4] Incorrect result in loops with mixed vector element sizes** >> >> This issue is found after we enable post loop vectorization for AArch64. >> It's reproducible by multiple array operations with different element >> sizes inside a loop. 
On x86, there is no issue because the values of x86 >> AVX512 opmasks only depend on which vector lanes are active. But AArch64 >> is different - the values of SVE predicates also depend on lane size of >> the vector. Hence, on AArch64 SVE, if a loop has mixed vector element >> sizes, we should use different vector masks. For now, we just support >> loops with only one vector element size, i.e., "int + float" vectors in >> a single loop is ok but "int + double" vectors in a single loop is not >> vectorizable. This fix also enables subword vectors support to make all >> primitive type array operations vectorizable. >> >> - **[Issue-5] Incorrect result in loops with potential data dependence** >> >> This issue can be reproduced by below corner case on AArch64 only. >> >> for (int i = 0; i < 10000; i++) { >> a[i] = x; >> a[i + OFFSET] = y; >> } >> >> In this case, two stores in the loop have data dependence if the OFFSET >> value is smaller than the vector length. So we cannot do vectorization >> through replacing scalars to vectors. But the main loop vectorization >> in this case is successful on AArch64 because AArch64 has partial vector >> load/store support. It splits vector fill with different values in lanes >> to several smaller-sized fills. In this patch, we add additional data >> dependence check for this kind of cases. The check is also done with the >> help of SWPointer class. In this check, we require that every two memory >> accesses (with at least one store) of the same element type (or subword >> size) in the loop has the same array index expression. >> >> ### Tests >> >> So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with >> experimental VM option "PostLoopMultiversioning" turned on. We found no >> issue in all tests. We notice that those existing cases are not enough >> because some of above issues are not spotted by them. 
We would like to >> add some new cases but we found existing vectorization tests are a bit >> cumbersome - golden results must be pre-calculated and hard-coded in the >> test code for correctness verification. Thus, in this patch, we propose >> a new vectorization testing framework. >> >> Our new framework brings a simpler way to add new cases. For a new test >> case, we only need to create a new method annotated with "@Test". The >> test runner will invoke each annotated method twice automatically. First >> time it runs in the interpreter and second time it's forced compiled by >> C2. Then the two return results are compared. So in this framework each >> test method should return a primitive value or an array of primitives. >> In this way, no extra verification code for vectorization correctness is >> required. This test runner is still jtreg-based and takes advantages of >> the jtreg WhiteBox API, which enables test methods running at specific >> compilation levels. Each test class inside is also jtreg-based. It just >> need to inherit from the test runner class and run with two additional >> options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". >> >> ### Summary & Future work >> >> In this patch, we reworked post loop vectorization. We made it platform >> independent and fixed several issues inside. We also implemented a new >> vectorization testing framework with many test cases inside. Meanwhile, >> we did some code cleanups. >> >> This patch only touches C2 code guarded with PostLoopMultiversioning, >> except a few data structure changes. So, there's no behavior change when >> experimental VM option PostLoopMultiversioning is off. Also, to reduce >> risks, we still propose to keep post loop vectorization experimental for >> now. But if it receives positive feedback, we would like to change it to >> non-experimental in the future. > > Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Update copyright year and rename a function > > Change-Id: I15845ebd3982edebd4c151284cc6f2ff727630bb > - Merge branch 'master' into postloop > > Change-Id: Ie639c79c9cf016dc68ebf2c0031b60453b45e9a4 > - Fix issues in newly added test framework > > Change-Id: I6e61abf05e9665325cb3abaf407360b18355c6b1 > - Merge branch 'master' into postloop > > Change-Id: I9bb5a808d7540426dedb141fd198d25eb1f569e6 > - 8183390: Fix and re-enable post loop vectorization > > ** Background > > Post loop vectorization is a C2 compiler optimization in an experimental > VM feature called PostLoopMultiversioning. It transforms the range-check > eliminated post loop to a 1-iteration vectorized loop with vector mask. > This optimization was contributed by Intel in 2016 to support x86 AVX512 > masked vector instructions. However, it was disabled soon after an issue > was found. Due to insufficient maintenance in these years, multiple bugs > have been accumulated inside. But we (Arm) still think this is a useful > framework for vector mask support in C2 auto-vectorized loops, for both > x86 AVX512 and AArch64 SVE. Hence, we propose this to fix and re-enable > post loop vectorization. > > ** Changes in this patch > > This patch reworks post loop vectorization. The most significant change > is removing vector mask support in C2 x86 backend and re-implementing > it in the mid-end. With this, we can re-enable post loop vectorization > for platforms other than x86. > > Previous implementation hard-codes x86 k1 register as a reserved AVX512 > opmask register and defines two routines (setvectmask/restorevectmask) > to set and restore the value of k1. But after JDK-8211251 which encodes > AVX512 instructions as unmasked by default, generated vector masks are > no longer used in AVX512 vector instructions. 
To fix incorrect codegen > and add vector mask support for more platforms, we turn to add a vector > mask input to C2 mid-end IRs. Specifically, we use a VectorMaskGenNode > to generate a mask and replace all Load/Store nodes in the post loop > into LoadVectorMasked/StoreVectorMasked nodes with that mask input. This > IR form is exactly the same to those which are used in VectorAPI mask > support. For now, we only add mask inputs for Load/Store nodes because > we don't have reduction operations supported in post loop vectorization. > After this change, the x86 k1 register is no longer reserved and can be > allocated when PostLoopMultiversioning is enabled. > > Besides this change, we have fixed a compiler crash and five incorrect > result issues with post loop vectorization. > > - 1) C2 crashes with segmentation fault in strip-mined loops > > Previous implementation was done before C2 loop strip-mining was merged > into JDK master so it didn't take strip-mined loops into consideration. > In C2's strip mined loops, post loop is not the sibling of the main loop > in ideal loop tree. Instead, it's the sibling of the main loop's parent. > This patch fixed a SIGSEGV issue caused by NULL pointer when locating > post loop from strip-mined main loop. > > - 2) Incorrect result issues with post loop vectorization > > We have also fixed five incorrect vectorization issues. Some of them are > hidden deep and can only be reproduced with corner cases. These issues > have a common cause that it assumes the post loop can be vectorized if > the vectorization in corresponding main loop is successful. But in many > cases this assumption is wrong. Below are details. > > [Issue-1] Incorrect vectorization for partial vectorizable loops > > This issue can be reproduced by below loop where only some operations in > the loop body are vectorizable. 
> > for (int i = 0; i < 10000; i++) { > res[i] = a[i] * b[i]; > k = 3 * k + 1; > } > > In the main loop, superword can work well if parts of the operations in > loop body are not vectorizable since those parts can be unrolled only. > But for post loops, we don't create vectors through combining scalar IRs > generated from loop unrolling. Instead, we are doing scalars to vectors > replacement for all operations in the loop body. Hence, all operations > should be either vectorized together or not vectorized at all. To fix > this kind of cases, we add an extra field "_slp_vector_pack_count" in > CountedLoopNode to record the eventual count of vector packs in the main > loop. This value is then passed to post loop and compared with post loop > pack count. Vectorization will be bailed out in post loop if it creates > more vector packs than in the main loop. > > [Issue-2] Incorrect result in loops with growing-down vectors > > This issue appears with growing-down vectors, that is, vectors that grow > to smaller memory address as the loop iterates. It can be reproduced by > below counting-up loop with negative scale value in array index. > > for (int i = 0; i < 10000; i++) { > a[MAX - i] = b[MAX - i]; > } > > Cause of this issue is that for a growing-down vector, generated vector > mask value has reversed vector-lane order so it masks incorrect vector > lanes. Note that if negative scale value appears in counting-down loops, > the vector will be growing up. With this rule, we fix the issue by only > allowing positive array index scales in counting-up loops and negative > array index scales in counting-down loops. This check is done with the > help of SWPointer by comparing scale values in each memory access in the > loop with loop stride value. > > [Issue-3] Incorrect result in manually unrolled loops > > This issue can be reproduced by below manually unrolled loop. 
> > for (int i = 0; i < 10000; i += 2) { > c[i] = a[i] + b[i]; > c[i + 1] = a[i + 1] * b[i + 1]; > } > > In this loop, operations in the 2nd statement duplicate those in the 1st > statement with a small memory address offset. Vectorization in the main > loop works well in this case because C2 does further unrolling and pack > combination. But we cannot vectorize the post loop through replacement > from scalars to vectors because it creates duplicated vector operations. > To fix this, we restrict post loop vectorization to loops with stride > values of 1 or -1. > > [Issue-4] Incorrect result in loops with mixed vector element sizes > > This issue is found after we enable post loop vectorization for AArch64. > It's reproducible by multiple array operations with different element > sizes inside a loop. On x86, there is no issue because the values of x86 > AVX512 opmasks only depend on which vector lanes are active. But AArch64 > is different - the values of SVE predicates also depend on lane size of > the vector. Hence, on AArch64 SVE, if a loop has mixed vector element > sizes, we should use different vector masks. For now, we just support > loops with only one vector element size, i.e., "int + float" vectors in > a single loop is ok but "int + double" vectors in a single loop is not > vectorizable. This fix also enables subword vectors support to make all > primitive type array operations vectorizable. > > [Issue-5] Incorrect result in loops with potential data dependence > > This issue can be reproduced by below corner case on AArch64 only. > > for (int i = 0; i < 10000; i++) { > a[i] = x; > a[i + OFFSET] = y; > } > > In this case, two stores in the loop have data dependence if the OFFSET > value is smaller than the vector length. So we cannot do vectorization > through replacing scalars to vectors. But the main loop vectorization > in this case is successful on AArch64 because AArch64 has partial vector > load/store support. 
It splits vector fill with different values in lanes > to several smaller-sized fills. In this patch, we add additional data > dependence check for this kind of cases. The check is also done with the > help of SWPointer class. In this check, we require that every two memory > accesses (with at least one store) of the same element type (or subword > size) in the loop has the same array index expression. > > ** Tests > > So far we have tested full jtreg on both x86 AVX512 and AArch64 SVE with > experimental VM option "PostLoopMultiversioning" turned on. We found no > issue in all tests. We notice that those existing cases are not enough > because some of above issues are not spotted by them. We would like to > add some new cases but we found existing vectorization tests are a bit > cumbersome - golden results must be pre-calculated and hard-coded in the > test code for correctness verification. Thus, in this patch, we propose > a new vectorization testing framework. > > Our new framework brings a simpler way to add new cases. For a new test > case, we only need to create a new method annotated with "@Test". The > test runner will invoke each annotated method twice automatically. First > time it runs in the interpreter and second time it's forced compiled by > C2. Then the two return results are compared. So in this framework each > test method should return a primitive value or an array of primitives. > In this way, no extra verification code for vectorization correctness is > required. This test runner is still jtreg-based and takes advantages of > the jtreg WhiteBox API, which enables test methods running at specific > compilation levels. Each test class inside is also jtreg-based. It just > need to inherit from the test runner class and run with two additional > options "-Xbootclasspath/a:." and "-XX:+WhiteBoxAPI". > > ** Summary & Future work > > In this patch, we reworked post loop vectorization. We made it platform > independent and fixed several issues inside. 
We also implemented a new > vectorization testing framework with many test cases inside. Meanwhile, > we did some code cleanups. > > This patch only touches C2 code guarded with PostLoopMultiversioning, > except a few data structure changes. So, there's no behavior change when > experimental VM option PostLoopMultiversioning is off. Also, to reduce > risks, we still propose to keep post loop vectorization experimental for > now. But if it receives positive feedback, we would like to change it to > non-experimental in the future.

Hi Pengfei, I further analyzed why we see a post vector tail loop operating at a wider vector size than the main vector loop. This problem occurs only for small profiled trip counts. Consider the following example:

```
class add {
  public static int LEN = 23;

  public static void micro(int [] a , int [] b, int [] r) {
    for (int j = 0 ; j < LEN ; j++)
      r[j] = a[j] + b[j];
  }

  public static void main(String [] args) {
    int [] a = new int[LEN];
    int [] b = new int[LEN];
    int [] r = new int[LEN];
    for (int i = 0 ; i < LEN-1; i++) {
      a[i] = (int)-i;
      b[i] = (int)i;
    }
    for (int i = 0 ; i < 100000 ; i++)
      micro(a, b, r);
    System.out.println("Res = " + r[5]);
  }
}
```

Unrolling analysis will unroll micro 4 times before it hits the residual iteration limit set by LoopPercentProfileLimit.
Following is an excerpt from the generated code:

  0x00007efd15030330: vmovdqu 0x10(%rdx,%r9,4),%xmm0
  0x00007efd15030337: vpaddd 0x10(%rsi,%r9,4),%xmm0,%xmm0
  0x00007efd1503033e: vmovdqu %xmm0,0x10(%rcx,%r9,4)
  0x00007efd15030345: add $0x4,%r9d
  0x00007efd15030349: cmp %ebx,%r9d
  0x00007efd1503034c: jl 0x00007efd15030330
  0x00007efd1503034e: mov 0x390(%r15),%rbx
  0x00007efd15030355: test %eax,(%rbx)
  0x00007efd15030357: cmp %r8d,%r9d
  0x00007efd15030360: jl 0x00007efd15030317
  0x00007efd15030362: cmp %ebp,%r9d
  0x00007efd15030365: jge 0x00007efd150303d7
  0x00007efd1503036b: cmp %eax,%r9d
  0x00007efd1503036e: jae 0x00007efd1503050c
  0x00007efd15030374: cmp %r13d,%r9d
  0x00007efd15030377: jae 0x00007efd15030534
  0x00007efd15030380: cmp %r10d,%r9d
  0x00007efd15030383: jae 0x00007efd1503055c
  0x00007efd15030389: mov %ebp,%r8d
  0x00007efd1503038c: sub %r9d,%r8d
  0x00007efd1503038f: movslq %r9d,%r11
  0x00007efd15030392: movslq %r8d,%r8
  0x00007efd15030395: shl $0x2,%r11
  0x00007efd15030399: movabs $0xffffffffffffffff,%r9
  0x00007efd150303a3: bzhi %r8,%r9,%r9
  0x00007efd150303a8: kmovq %r9,%k7
  0x00007efd150303ad: vmovdqu32 0x10(%rsi,%r11,1),%zmm0{%k7}{z}
  0x00007efd150303b8: vmovdqu32 0x10(%rdx,%r11,1),%zmm1{%k7}{z}
  0x00007efd150303c3: vpaddd %zmm1,%zmm0,%zmm0
  0x00007efd150303c9: vmovdqu32 %zmm0,0x10(%rcx,%r11,1){%k7}

SLP kicks in only after no further major optimizations are seen over loops, hence the main loop gets vectorized using 16-byte vectors. The post masked vector loop, however, generates code based on MaxVectorSize (64 bytes on AVX512) and shows a performance degradation. As a quick check, if we increase LoopPercentProfileLimit to a higher value, it should facilitate further unrolling, and SLP can then infer a wider vector for the main loop.
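For readers following the excerpt above, the `movabs`/`bzhi`/`kmovq` sequence builds an opmask with one bit set per remaining iteration. The sketch below is only a plain-Java model of that computation, not HotSpot code: `TailMaskSketch`, `tailMask`, and `maskedAdd` are hypothetical names, the 16-lane width assumes 64-byte vectors of `int`, and the starting index of 20 is an assumed exit point of the main loop.

```java
public class TailMaskSketch {
    // Lanes in one 64-byte vector of 4-byte ints (assumes MaxVectorSize = 64).
    static final int LANES = 16;

    // Models "bzhi remaining, allOnes": keep only the low `remaining` bits set.
    // For counts of 64 or more, bzhi leaves the all-ones source unchanged.
    static long tailMask(int remaining) {
        if (remaining <= 0) return 0L;
        if (remaining >= 64) return -1L;
        return (1L << remaining) - 1;
    }

    // Models one masked vector iteration: only lanes whose mask bit is set
    // are loaded, added, and stored; inactive lanes never touch memory.
    static void maskedAdd(int[] a, int[] b, int[] r, int start, long mask) {
        for (int lane = 0; lane < LANES; lane++) {
            if (((mask >>> lane) & 1L) != 0) {
                r[start + lane] = a[start + lane] + b[start + lane];
            }
        }
    }

    public static void main(String[] args) {
        int len = 23;
        int[] a = new int[len], b = new int[len], r = new int[len];
        for (int i = 0; i < len; i++) { a[i] = i; b[i] = 2 * i; }
        int start = 20; // assumption: the main loop already covered [0, 20)
        long mask = tailMask(len - start);
        maskedAdd(a, b, r, start, mask);
        System.out.println(Long.toBinaryString(mask) + " " + r[22]);
    }
}
```

With `len = 23` and the assumed start of 20, three low mask bits are set, so a single 16-lane masked iteration touches only indices 20 to 22.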
If we record the unroll_factor at which the initial SLP iteration triggered and then generate post vector loops at that granularity instead of slp_max_unroll, it should prevent any performance degradation, provided that there is an atomic post vector loop for correctness which consumes all but the last vector iteration. Another solution could be to prevent generating a masked vector iteration if there is no atomic loop OR if the main vector loop is not unrolled further and is operating on vector sizes less than MaxVectorSize. As you suggested, we can make these changes in subsequent incremental patches before making this feature non-experimental. Best Regards, Jatin ------------- PR: https://git.openjdk.java.net/jdk/pull/6828 From zgu at openjdk.java.net Wed Jan 26 13:27:31 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 26 Jan 2022 13:27:31 GMT Subject: RFR: JDK-8280583: Always build NMT [v3] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 08:36:44 GMT, Thomas Stuefe wrote: >> After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. >> >> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html >> >> Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - fix copyrights > - remove INCLUDE_NMT and dependend code Looks good to me. ------------- Marked as reviewed by zgu (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/7213 From zgu at openjdk.java.net Wed Jan 26 14:12:35 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Wed, 26 Jan 2022 14:12:35 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 15:10:11 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
>>
>> Thanks,
>> Christian
>
> Christian Hagedorn has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
>
>    Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>
>  - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java
>
>    Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com>

Personally, I am in favor of this project. Actually, I was experimenting with it using libdwarf.

I would like to add some historical background on this topic, just for consideration.

We had a dwarf parser over a decade ago, a little after the elf parser, but it never made it to mainline. There were several reasons at the time. Good news, some no longer apply today :-)

- At the time, Solaris still used the STABS format, and we could not get support from the Solaris compiler team.
- If one platform does not support a feature, no one can have it. That's why we could have had it on Windows from day one, but did not enable it until much later.
- Different compilers (and different versions of the same compiler) can generate different DWARF versions, which may not be compatible with each other, as DWARF allows custom fields.
- Maintenance cost to catch up with DWARF spec/compiler changes.
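
The version concern in the list above can be made concrete with a small check. The following is only a minimal sketch, not code from this patch: it assumes a unit header laid out as in the DWARF spec (an initial length, then a 2-byte version), assumes the section bytes are in host byte order, and uses hypothetical names (`DwarfVersionCheck`, `check_unit_version`). A parser could use something like it to skip units it does not understand instead of misreading them.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type for this sketch; it is not a HotSpot type.
enum class DwarfVersionCheck { Supported, Unsupported, Malformed };

// Reads the version field that follows the initial length of a DWARF unit
// header and reports whether it lies in [min_version, max_version].
// Assumption: 'data' points at the start of a unit header and the bytes are
// in host byte order.
DwarfVersionCheck check_unit_version(const uint8_t* data, size_t size,
                                     uint16_t min_version, uint16_t max_version) {
  if (data == nullptr || size < 4) {
    return DwarfVersionCheck::Malformed;
  }
  uint32_t initial_length;
  std::memcpy(&initial_length, data, sizeof(initial_length));

  size_t version_offset = 4;            // 32-bit DWARF: 4-byte length
  if (initial_length == 0xffffffffu) {
    version_offset = 12;                // 64-bit DWARF: escape + 8-byte length
  } else if (initial_length >= 0xfffffff0u) {
    return DwarfVersionCheck::Malformed; // reserved initial-length values
  }

  if (size < version_offset + 2) {
    return DwarfVersionCheck::Malformed;
  }
  uint16_t version;
  std::memcpy(&version, data + version_offset, sizeof(version));

  return (version >= min_version && version <= max_version)
             ? DwarfVersionCheck::Supported
             : DwarfVersionCheck::Unsupported;
}
```

Returning a status instead of asserting keeps the caller free to simply omit source information for a frame when the version is not one it handles.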
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From erikj at openjdk.java.net Wed Jan 26 14:26:40 2022 From: erikj at openjdk.java.net (Erik Joelsson) Date: Wed, 26 Jan 2022 14:26:40 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: <9kJSXjQdxkkNKDyFOZ3FiHkSda_BVgW632KkKyz3k14=.024fac33-1147-4d22-bdb8-91cefc913e25@github.com> On Tue, 25 Jan 2022 15:10:11 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
>> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with two additional commits since the last revision: > > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7126 From adinn at openjdk.java.net Wed Jan 26 14:55:45 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Wed, 26 Jan 2022 14:55:45 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: <_7m4HHMiUBoRRawqc7DPJbfRXZSD1JyVoHLDKPaAddk=.d4c38e5a-f92a-4768-8f60-684d97a99d81@github.com> On Wed, 26 Jan 2022 10:44:20 GMT, Denghui Dong wrote: > May I have a second reviewer? I am looking at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From kbarrett at openjdk.java.net Wed Jan 26 17:12:08 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 26 Jan 2022 17:12:08 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty [v2] In-Reply-To: References: Message-ID: <0WudjqjLMxZNGV4oWMkJYmAfhflDfq6OtP8WFLO2wHY=.882632a5-5386-4634-83a9-3279f3e1cdd5@github.com> > Please review this improvement to NonblockingQueue::try_pop. The old code > returned an indication that the queue was empty in some cases where that > wasn't true. In particular, contending try_pop operations could result in > some incorrectly indicating empty. The change fixes that and improves the > interaction between contending try_pops. > > Testing: > mach5 tier1-3 > > Lots of testing of this change in conjunction with others as part of > investigating and fixing JDK-8273383. 
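
The general hazard described above, a pop attempt that loses a race being reported as "empty", can be illustrated with a much simpler structure. The sketch below is not the HotSpot NonblockingQueue and not the code of this change: it is a plain LIFO list built on `std::atomic`, with hypothetical names (`SketchStack`, `PopStatus`, `pop_retrying`), and it deliberately ignores ABA and node reclamation. It only shows how a three-way result lets callers tell "empty" apart from "lost a race, try again".

```cpp
#include <atomic>

// Hypothetical node type for this sketch.
struct Node {
  int value;
  Node* next;
};

// Three-way result so that contention is never confused with emptiness.
enum class PopStatus { Popped, Empty, Contended };

// Simplified lock-free LIFO list. Assumption: nodes are never freed or
// reused while the list is in use, so ABA is out of scope here.
class SketchStack {
  std::atomic<Node*> _head{nullptr};

public:
  void push(Node* n) {
    Node* old_head = _head.load(std::memory_order_relaxed);
    do {
      n->next = old_head;
    } while (!_head.compare_exchange_weak(old_head, n,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // One attempt only. Empty is reported solely when the head was observed
  // to be null; a failed compare-exchange is reported as Contended.
  PopStatus try_pop(Node** out) {
    Node* old_head = _head.load(std::memory_order_acquire);
    if (old_head == nullptr) {
      *out = nullptr;
      return PopStatus::Empty;
    }
    Node* next = old_head->next;
    if (_head.compare_exchange_strong(old_head, next,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire)) {
      *out = old_head;
      return PopStatus::Popped;
    }
    *out = nullptr;
    return PopStatus::Contended;
  }
};

// Retries on contention; returns nullptr only when the list was really
// observed empty.
Node* pop_retrying(SketchStack& stack) {
  Node* result = nullptr;
  while (stack.try_pop(&result) == PopStatus::Contended) {
    // Lost a race with another thread; try again.
  }
  return result;
}
```

A caller that treated `Contended` like `Empty` could stop draining while elements remain, which is the kind of misreport the description warns about.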
Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into fix-try-pop2 - fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7149/files - new: https://git.openjdk.java.net/jdk/pull/7149/files/562729a1..fc29432d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7149&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7149&range=00-01 Stats: 7328 lines in 498 files changed: 4297 ins; 1612 del; 1419 mod Patch: https://git.openjdk.java.net/jdk/pull/7149.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7149/head:pull/7149 PR: https://git.openjdk.java.net/jdk/pull/7149 From kbarrett at openjdk.java.net Wed Jan 26 17:12:10 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 26 Jan 2022 17:12:10 GMT Subject: RFR: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jan 2022 20:49:53 GMT, Ivan Walulya wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: >> >> - Merge branch 'master' into fix-try-pop2 >> - fix > > Lgtm! > > Suggestion: > With the comments growing after each change, maybe we rename `result` to `old_head` Thanks @walulyai and @tschatzl for reviews. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/7149

From kbarrett at openjdk.java.net Wed Jan 26 17:12:10 2022
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 26 Jan 2022 17:12:10 GMT
Subject: Integrated: 8279294: NonblockingQueue::try_pop may improperly indicate queue is empty
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jan 2022 22:43:25 GMT, Kim Barrett wrote:

> Please review this improvement to NonblockingQueue::try_pop. The old code
> returned an indication that the queue was empty in some cases where that
> wasn't true. In particular, contending try_pop operations could result in
> some incorrectly indicating empty. The change fixes that and improves the
> interaction between contending try_pops.
>
> Testing:
> mach5 tier1-3
>
> Lots of testing of this change in conjunction with others as part of
> investigating and fixing JDK-8273383.

This pull request has now been integrated.

Changeset: 4b2370e5
Author: Kim Barrett
URL: https://git.openjdk.java.net/jdk/commit/4b2370e57698e7413fef053afe9d22bb0bc86041
Stats: 44 lines in 1 file changed: 23 ins; 3 del; 18 mod

8279294: NonblockingQueue::try_pop may improperly indicate queue is empty

Reviewed-by: iwalulya, tschatzl

-------------

PR: https://git.openjdk.java.net/jdk/pull/7149

From minqi at openjdk.java.net Wed Jan 26 18:13:35 2022
From: minqi at openjdk.java.net (Yumin Qi)
Date: Wed, 26 Jan 2022 18:13:35 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call
In-Reply-To:
References:
Message-ID:

On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote:

> Please review,
> When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll because zip.dll is not on PATH. The failure happens after we get NULL from GetModuleHandle("zip.dll"); the LoadLibrary("zip.dll") call that follows has the same result.
> The fix is to call load_zip_library of ClassLoader first: if the zip library is already loaded, it just returns the cached handle for later use; if not, it loads the zip library and caches the handle.
>
> Tests: tier1,4,7 in test
> Manually tested the user's case, and checked the output of jimage list for jlinked files using --compress=2.
>
> Thanks
> Yumin

Update: tier1 and tier4 passed; tier7 failed on:
test/hotspot/jtreg/serviceability/sa/ClhsdbThreadContext.java
That is not related to the change since the test does not use zip.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From dcubed at openjdk.java.net Wed Jan 26 20:53:29 2022
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Wed, 26 Jan 2022 20:53:29 GMT
Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call
In-Reply-To:
References:
Message-ID:

On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi wrote:

> Please review,
> When jlink is run with --compress=2, zip is used to compress the files while copying them. The user's case failed to load zip.dll because zip.dll is not on PATH. The failure happens after we get NULL from GetModuleHandle("zip.dll"); the LoadLibrary("zip.dll") call that follows has the same result.
> The fix is to call load_zip_library of ClassLoader first: if the zip library is already loaded, it just returns the cached handle for later use; if not, it loads the zip library and caches the handle.
>
> Tests: tier1,4,7 in test
> Manually tested the user's case, and checked the output of jimage list for jlinked files using --compress=2.
>
> Thanks
> Yumin

Your Tier7 failure is likely this known bug:

JDK-8280601 ClhsdbThreadContext.java test is triggering codecache related assert in PointerFinder.find()
https://bugs.openjdk.java.net/browse/JDK-8280601

-------------

PR: https://git.openjdk.java.net/jdk/pull/7206

From duke at openjdk.java.net Thu Jan 27 04:29:35 2022
From: duke at openjdk.java.net (duke)
Date: Thu, 27 Jan 2022 04:29:35 GMT
Subject: Withdrawn: 8273563: Improve performance of implicit exceptions with -XX:-OmitStackTraceInFastThrow
In-Reply-To:
References:
Message-ID:

On Mon, 13 Sep 2021 10:05:16 GMT, Volker Simonis wrote:

> Currently, if running with `-XX:-OmitStackTraceInFastThrow`, C2 has no way to create implicit exceptions like AIOOBE, NullPointerException, etc. in compiled code. This means that such methods will always be deoptimized and re-executed in the interpreter when such exceptions occur.
>
> If implicit exceptions are used for normal control flow, that can have a dramatic impact on performance. A prominent example for such code is [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274):
>
>     public static boolean isAlpha(int c) {
>         try {
>             return IS_ALPHA[c];
>         } catch (ArrayIndexOutOfBoundsException ex) {
>             return false;
>         }
>     }
>
>
> ### Solution
>
> Instead of deoptimizing and resorting to the interpreter, we can generate code which allocates and initializes the corresponding exceptions right in compiled code. This results in a roughly tenfold performance improvement for the above code:
>
>     -XX:-OmitStackTraceInFastThrow -XX:-OptimizeImplicitExceptions
>     Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
>     ImplicitExceptions.bench                     0.0  avgt    5      1.430 ±    0.353  ns/op
>     ImplicitExceptions.bench                    0.33  avgt    5   3563.038 ±   77.358  ns/op
>     ImplicitExceptions.bench                    0.66  avgt    5   8609.693 ± 1205.104  ns/op
>     ImplicitExceptions.bench                    1.00  avgt    5  12842.401 ± 1022.728  ns/op
>
>     -XX:-OmitStackTraceInFastThrow -XX:+OptimizeImplicitExceptions
>     Benchmark                 (exceptionProbability)  Mode  Cnt      Score      Error  Units
>     ImplicitExceptions.bench                     0.0  avgt    5      1.432 ±    0.352  ns/op
>     ImplicitExceptions.bench                    0.33  avgt    5    355.723 ±   16.641  ns/op
>     ImplicitExceptions.bench                    0.66  avgt    5    887.068 ±  166.728  ns/op
>     ImplicitExceptions.bench                    1.00  avgt    5   1274.418 ±   88.235  ns/op
>
>
> ### Implementation details
>
> - The new optimization is guarded by the option `OptimizeImplicitExceptions` which is on by default.
> - In `GraphKit::builtin_throw()` we can't simply use `CallGenerator::for_direct_call()` to create a `DirectCallGenerator` for the call to the exception's `<init>` function because `DirectCallGenerator` assumes in various places that calls are only issued at `invoke*` bytecodes. This is not true in general for bytecodes which can cause an implicit exception.
> - Instead, we manually wire up the call based on the code in `DirectCallGenerator::generate()`.
> - We use a trick similar to the one for method handle intrinsics, where the callee from the bytecode is replaced by a direct call and this fact is recorded in the call's `_override_symbolic_info` field. For calling constructors of implicit exceptions I've introduced the new field `_implicit_exception_init`. This field is also used in various assertions to prevent queries for the bytecode's symbolic method information, which doesn't exist because we're not at an `invoke*` bytecode at the place where we generate the call.
> - The PR contains a micro-benchmark which compares the old and the new implementation for [Tomcat's `HttpParser::isAlpha()` method](https://github.com/apache/tomcat/blob/26ba86cdbd40ca718e43b82e62b3eb49d004c3d6/java/org/apache/tomcat/util/http/parser/HttpParser.java#L266-L274). Except for the trivial case where the exception probability is 0 (i.e. no exceptions are happening at all) the new implementation is about 10 times faster.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5488

From chagedorn at openjdk.java.net Thu Jan 27 08:41:34 2022
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Thu, 27 Jan 2022 08:41:34 GMT
Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2]
In-Reply-To:
References:
Message-ID: <54FxD8Y6tYN9qIxG9kM1609F8U5qX1L2q5k36XCYnzs=.776977b4-80f0-41f9-99d3-6829f8e1d067@github.com>

On Tue, 25 Jan 2022 15:06:41 GMT, Christian Hagedorn wrote:

> Build changes look good.

Thanks Erik!

> Personally, I am in favor of this project. Actually, I was experimenting with it using libdwarf.
>
> I would like to add some historical background on this topic, just for consideration.

Thanks Zhengyu for sharing some background!

> We had a dwarf parser over a decade ago, a little after the elf parser, but it never made it to mainline. There were several reasons at the time. Good news, some no longer apply today :-)

That's interesting. Is this implementation still around somewhere? I'm glad that some of the mentioned things are not a problem anymore.

> * Different compilers (and different versions of the same compiler) can generate different DWARF versions, which may not be compatible with each other, as DWARF allows custom fields.
> * Maintenance cost to catch up with DWARF spec/compiler changes.

Facing different DWARF versions is indeed a problem. For this parser, I tried to support the current default of GCC 10.x which is DWARF 4. This standard was introduced in 2010 and is probably used by most compilers nowadays at least (if not already DWARF 5 which was introduced in 2017). However, even with GCC 10.x, it emitted DWARF 3 for one of the sections (I'm not sure why) which I also needed to support - thus you can never be sure.
DWARF 5 is still experimental for GCC 10.x and had some issues when I tried that out back there - so I stayed away from implementing parsing steps for it. But now with GCC 11.x, DWARF 5 seems to have become the default. I might have to try out what's being emitted for HotSpot. But I think for now, it is better to only focus on DWARF 4 instead of trying to support various versions in one patch - we could still come back to that later if it becomes widely used. Even if DWARF 5 is emitted, GCC could be configured, for example, to emit DWARF 4 only which is probably an acceptable workaround for testing environments. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From yyang at openjdk.java.net Thu Jan 27 09:17:09 2022 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 27 Jan 2022 09:17:09 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: > Add VM.classes to print details of all classes, output looks like: > > 1. jcmd VM.classes > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 > ... > > 2. 
jcmd VM.classes verbose > > KlassAddr Size State Flags LoaderName ClassName > 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 > java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - super: 'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841f210) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 > - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder > - source file: 'LambdaForm$MH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' > - vtable length 5 (start addr: 0x0000000800c0b5b8) > - itable length 2 (start addr: 0x0000000800c0b5e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > - non-static oop maps: > 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 > java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} > - instance size: 2 > - klass size: 62 > - access: final synchronized > - state: inited > - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - super: 
'java/lang/Object' > - sub: > - arrays: NULL > - methods: Array(0x00007f620841ea68) > - method ordering: Array(0x0000000800a7e5a8) > - default_methods: Array(0x0000000000000000) > - local interfaces: Array(0x00000008005af748) > - trans. interfaces: Array(0x00000008005af748) > - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 > - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder > - source file: 'LambdaForm$DMH' > - class annotations: Array(0x0000000000000000) > - class type annotations: Array(0x0000000000000000) > - field annotations: Array(0x0000000000000000) > - field type annotations: Array(0x0000000000000000) > - inner classes: Array(0x00000008005af6d8) > - nest members: Array(0x00000008005af6d8) > - permitted subclasses: Array(0x00000008005af6d8) > - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' > - vtable length 5 (start addr: 0x0000000800c0b1b8) > - itable length 2 (start addr: 0x0000000800c0b1e0) > - ---- static fields (1 words): > - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 > - ---- non-static fields (0 words): > ... 
Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  fix

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7105/files
  - new: https://git.openjdk.java.net/jdk/pull/7105/files/b4da2ddc..4d2538be

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7105&range=04-05

  Stats: 12 lines in 3 files changed: 6 ins; 4 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7105.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7105/head:pull/7105

PR: https://git.openjdk.java.net/jdk/pull/7105

From yyang at openjdk.java.net Thu Jan 27 09:17:12 2022
From: yyang at openjdk.java.net (Yi Yang)
Date: Thu, 27 Jan 2022 09:17:12 GMT
Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v5]
In-Reply-To: <_OK_txWawZez34924dgvXieLKwMMBIFm5yjHotuj9MI=.d951a93b-00e1-4ce8-b8b7-f21935b1dcf8@github.com>
References: <_OK_txWawZez34924dgvXieLKwMMBIFm5yjHotuj9MI=.d951a93b-00e1-4ce8-b8b7-f21935b1dcf8@github.com>
Message-ID:

On Mon, 24 Jan 2022 04:12:22 GMT, Ioi Lam wrote:

>> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   fix test
>
> src/hotspot/share/oops/instanceKlass.cpp line 2106:
>
>> 2104:   // classloader name
>> 2105:   ClassLoaderData* cld = k->class_loader_data();
>> 2106:   _st->print("%-12s ", cld->loader_name());
>
> For custom class loaders, this will likely print a long class name that will exceed the 12-character limit, making the output somewhat hard to read.
> > > > const char* ClassLoaderData::loader_name() const { > if (_class_loader_klass == NULL) { > return BOOTSTRAP_LOADER_NAME; > } else if (_name != NULL) { > return _name->as_C_string(); > } else { > return _class_loader_klass->external_name(); > } > } > > > Also, for custom loaders, printing out just the name of the loader class is not sufficient, as multiple loader instances may have the same type. > > Maybe we should just remove line 2106? If the user wants to know the class loader, they can use the "-verbose" option of this jcmd. Thanks for review! All comments are addressed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7105 From stuefe at openjdk.java.net Thu Jan 27 09:21:38 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 27 Jan 2022 09:21:38 GMT Subject: RFR: JDK-8280583: Always build NMT [v2] In-Reply-To: References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Wed, 26 Jan 2022 08:26:27 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request incrementally with 15 additional commits since the last revision: >> >> - 8280377: MethodHandleProxies does not correctly invoke default methods with varags >> >> Reviewed-by: alanb >> - 8213905: reflection not working for type annotations applied to exception types in the inner class constructor >> >> Reviewed-by: jlahoda >> - 8279242: Reflection newInstance() error message when constructor has no access modifiers could use improvement >> >> Reviewed-by: iris, dholmes, mchung >> - 8269542: JDWP: EnableCollection support is no longer spec compliant after JDK-8255987 >> 8258071: Fix for JDK-8255987 can be subverted with ObjectReference.EnableCollection >> >> Reviewed-by: dholmes, pliden >> - 8280166: Extend java/lang/instrument/GetObjectSizeIntrinsicsTest.java test cases >> >> Reviewed-by: sspitsyn, lmesnik >> - 8280041: Retry loop issues in java.io.ClassCache >> >> Co-authored-by: Peter Levart >> Reviewed-by: 
rkennke, rriggs, plevart >> - 8280168: Add Objects.toIdentityString >> >> Reviewed-by: alanb, mchung, rriggs, smarks >> - 8279946: (ch) java.nio.channels.FileChannel tryLock and write methods are missing @throws NonWritableChannelException >> >> Reviewed-by: alanb >> - 8280396: G1: Full gc mark stack draining should prefer to make work available to other threads >> >> Reviewed-by: sjohanss, ayang >> - 8280414: Memory leak in DefaultProxySelector >> >> Reviewed-by: dfuchs >> - ... and 5 more: https://git.openjdk.java.net/jdk/compare/a19176e6...44478392 > > There is some major weirdness in this PR: it includes a lot of unrelated changes. Consider rebasing to current master and force-pushing? Thanks @shipilev, @magicus and @zhengyu123 ! ------------- PR: https://git.openjdk.java.net/jdk/pull/7213 From stuefe at openjdk.java.net Thu Jan 27 09:21:39 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 27 Jan 2022 09:21:39 GMT Subject: Integrated: JDK-8280583: Always build NMT In-Reply-To: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> References: <_RJK80WSxCYhWSqjjoQYbVPr2_uEnlNqp_UdwniQp4c=.d3b51bfa-ffe7-4a72-97fe-64dbb25e1f5a@github.com> Message-ID: On Tue, 25 Jan 2022 10:57:14 GMT, Thomas Stuefe wrote: > After discussing this on hotspot-runtime-dev [1], the general opinion seems to be that it would be worthwhile to get rid of INCLUDE_NMT and make NMT unconditional. This affects minimal builds only. As pointed out in the mail thread, the overhead is very small and it would get rid of one configuration to build and test. > > [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2022-January/053504.html > > Patch removes INCLUDE_NMT from hotspot, as well as dependent macros, as well as the associated build option. This pull request has now been integrated. 
Changeset: cab59051 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/cab590517bf705418c7376edd5d7066b13b6dde8 Stats: 243 lines in 28 files changed: 10 ins; 192 del; 41 mod 8280583: Always build NMT Reviewed-by: shade, ihse, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/7213 From stuefe at openjdk.java.net Thu Jan 27 09:29:30 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 27 Jan 2022 09:29:30 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 15:10:11 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
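The "exit gracefully instead of asserting" approach described above, together with the undersized-buffer case mentioned for the gtests, can be illustrated with a minimal sketch. The helper name `copy_filename` is hypothetical and is not taken from the patch; what the real `get_source()` does with a buffer that is too small is not stated in this mail, so returning `false` here is an assumption.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, NOT code from the patch: copies `filename` into a
// caller-provided buffer. It reports failure through the return value
// instead of asserting, so an error reporter could simply omit the source
// information for that frame and keep writing the hs_err file.
bool copy_filename(const char* filename, char* buf, std::size_t buflen) {
  if (filename == nullptr || buf == nullptr || buflen == 0) {
    return false;  // unusable arguments: give up quietly
  }
  const std::size_t len = std::strlen(filename);
  if (len + 1 > buflen) {
    buf[0] = '\0';  // leave a valid empty string behind
    return false;   // buffer too small (assumed behaviour for this sketch)
  }
  std::memcpy(buf, filename, len + 1);  // copy including the terminator
  return true;
}
```

A caller would print the plain frame line when `copy_filename` returns `false` and append the file name and line number only on success.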
>> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with two additional commits since the last revision: > > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> Hi Christian, this is very nice and useful! Two general remarks. One concern I have is that the new functionality should be super stable, since nothing is more annoying than to crash during stack dumping in hs-err file; I much rather have a call stack without bells and whistles than an abridged one. Maybe we could, in hs-err printing, if we got secondary crashes during callstack dumping, repeat the step with all optional features (also name demangling) disabled? This could also be done in a separate RFE. We'll know when this happens, we can react then. Another small concern, we parse the Elf file while dumping the stack, right? I remember having a lot of problems on Solaris when dumping callstacks, because there parsing the elf file was really slow. And that delayed call stack printing by a lot, so much that the ErrorCrashTimeout often kicked in and spoiled the crash logs for us. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From andreas.rosenberg at apis.de Thu Jan 27 10:19:32 2022 From: andreas.rosenberg at apis.de (Andreas Rosenberg) Date: Thu, 27 Jan 2022 10:19:32 +0000 Subject: Fix proposal for bug JDK-8221642 Message-ID: Hi, this is my first posting regarding to JDK contribution, so this may be the wrong place to ask. Please point me in the right direction in this case. We are using Java rather heavily via JNI on a custom application. For a long time we did stick to JRE 1.8 for various reasons. 
My task is to plan an upgrade to a more recent JDK version and while doing some tests I encountered bugs related to this: JDK-8227491 (JNI - caller sensitive methods). We are parsing Java class files to auto-generate the JNI code for our application, and are also using reflection. The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL which leads to NPEs. My idea is this: create an internal proxy class inside "java.base" that reflects this case (e.g. "java.lang.NativeCall" or "java.lang.NativeCode"). This class is final and implements nothing. Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL in case of a JNI call, it should do this to answer the class proxy: return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); This would have the following advantages: - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class - it would be a more expressive way on the Java side to detect "the callee is native code" than checking for null - it would fit better into the framework I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us. As there are probably security considerations involved, advice from experts is required. But from my understanding the Java security model is designed for the main app being written in Java. In this case there are always Java stack frames available as parents for caller sensitive methods, so the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers NULL for the JNI case. This needs verification.
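The difference between the current and the proposed behaviour can be shown with a toy model. This is only a sketch: `ToyClass`, `caller_class_current` and `caller_class_proposed` are hypothetical stand-ins and not HotSpot or JNI types; only the proxy name `java/lang/NativeCall` comes from the proposal above.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a loaded class; NOT a HotSpot type.
struct ToyClass {
  std::string name;
};

// Stand-in for the proposed proxy class (name taken from the proposal).
static const ToyClass kNativeCallProxy{"java/lang/NativeCall"};

// Behaviour as described in the mail: with no Java frame on the stack
// there is no caller class, so the lookup answers null.
const ToyClass* caller_class_current(
    const std::vector<const ToyClass*>& java_frames) {
  return java_frames.empty() ? nullptr : java_frames.back();
}

// Proposed behaviour: answer the proxy class instead of null, so Java-side
// code always receives a class to run its checks against.
const ToyClass* caller_class_proposed(
    const std::vector<const ToyClass*>& java_frames) {
  return java_frames.empty() ? &kNativeCallProxy : java_frames.back();
}
```

As soon as at least one Java frame exists, both functions answer the same class, which mirrors the claim that the change would only be visible to pure JNI callers.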
If the main app is native code which uses JNI, the Java security model can only affect the Java part and as soon as an additional Java stack frame has been generated a regular Java class will be found and the "standard behavior" should apply again. Comments appreciated. It this fix looks reasonable, what are the steps to get it implemented and integrated into the official source tree? Best regards, Andy From adinn at openjdk.java.net Thu Jan 27 12:25:29 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Thu, 27 Jan 2022 12:25:29 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: On Sat, 22 Jan 2022 02:40:41 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register 
tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); >> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). 
>> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). >> >> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. >> >> >> stp x29, x30, [sp, #-N]! >> mov x29, sp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| - | expand >> | | >> . . . . . | | direction >> _ _ _ _ _ _ | | >> | | | N | >> | ret addr | | | >> callee |_ _ _ _ _ _| | | >> | | - V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. >> >> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. >> >> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. >> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. >> >> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. >> >> Any input is appreciated. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright year This looks ok apart from the comment in sender_for_compiled_frame needing updating. 
src/hotspot/cpu/aarch64/frame_aarch64.cpp line 470: > 468: // in C2 code but it will have been pushed onto the stack. so we > 469: // have to find it relative to the unextended sp > 470: The comment above this change needs to be updated to explain when and why it is correct to use 1) the unextended sp and frame size or 2) the sender sp. ------------- Changes requested by adinn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/6597 From zgu at openjdk.java.net Thu Jan 27 13:46:29 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 27 Jan 2022 13:46:29 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: <54FxD8Y6tYN9qIxG9kM1609F8U5qX1L2q5k36XCYnzs=.776977b4-80f0-41f9-99d3-6829f8e1d067@github.com> References: <54FxD8Y6tYN9qIxG9kM1609F8U5qX1L2q5k36XCYnzs=.776977b4-80f0-41f9-99d3-6829f8e1d067@github.com> Message-ID: On Thu, 27 Jan 2022 08:38:02 GMT, Christian Hagedorn wrote: > > Build changes look good. > > Thanks Erik! > > > Personally, I am in favor of this project. Actually, I were experimenting it with libdwarf. > > I would like to add some historical background on this topic, just for consideration. > > Thanks Zhengyu for sharing some background! > > > We had a dwarf parser over a decade ago, a little after elf parser, but never made to mainline. There were several reasons at the time. Good news, some are no longer applied today :-) > > That's interesting. Is this implementation still around somewhere? I'm glad that some of the mentioned things are not a problem anymore. > Not I know. IIRC, it was based on DWARF 2. > > * Different compiler (and different version of the same compiler) can generate DWARF with different version, may not be compatible with each other, as DWARF allows custom fields. > > * Maintenance cost to catch up DWARF spec/compiler changes. > > That's indeed a problem of facing different DWARF versions. 
For this parser, I tried to support the current default of GCC 10.x which is DWARF 4. This standard was introduced in 2010 and is probably used by most compilers nowadays at least (if not already DWARF 5 which was introduced in 2017). However, even with GCC 10.x, it emitted DWARF 3 for one of the sections (I'm not sure why) which I also needed to support - thus you can never be sure. > > DWARF 5 is still experimental for GCC 10.x and had some issues when I tried that out back there - so I stayed away from implementing parsing steps for it. But now with GCC 11.x, DWARF 5 seems to have become the default. I might have to try out what's being emitted for HotSpot. But I think for now, it is better to only focus on DWARF 4 instead of trying to support various versions in one patch - we could still come back to that later if it becomes widely used. Even if DWARF 5 is emitted, GCC could be configured, for example, to emit DWARF 4 only which is probably an acceptable workaround for testing environments. I think maintenance and test could be major pain points. Based on build.html, we can use gcc version anywhere between 5.0 and 10.2, it could be a challenge to ensure all supported version work correctly. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stefank at openjdk.java.net Thu Jan 27 14:42:51 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 27 Jan 2022 14:42:51 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops Message-ID: While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. 
------------- Commit messages: - 8280784: VM_Cleanup unnecessarily processes all thread oops Changes: https://git.openjdk.java.net/jdk/pull/7246/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7246&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280784 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7246.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7246/head:pull/7246 PR: https://git.openjdk.java.net/jdk/pull/7246 From stefank at openjdk.java.net Thu Jan 27 14:42:51 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 27 Jan 2022 14:42:51 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 14:35:08 GMT, Stefan Karlsson wrote: > While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. Tested with Oracle tier1-3 ------------- PR: https://git.openjdk.java.net/jdk/pull/7246 From eosterlund at openjdk.java.net Thu Jan 27 15:10:34 2022 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Thu, 27 Jan 2022 15:10:34 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 14:35:08 GMT, Stefan Karlsson wrote: > While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. Looks good.
------------- Marked as reviewed by eosterlund (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7246 From ddong at openjdk.java.net Thu Jan 27 15:15:15 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Thu, 27 Jan 2022 15:15:15 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v5] In-Reply-To: References: Message-ID: > Hi, > > I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. > > The following steps can quick reproduce the problem: > > 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) > > index 39e99bdd5ed..4fc768e94aa 100644 > --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { > __ store_klass_gap(r0, zr); // zero klass gap for compressed oops > __ store_klass(r0, r4); // store klass last > > +/** > { > SkipIfEqual skip(_masm, &DTraceAllocProbes, false); > // Trigger dtrace event for fastpath > @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { > __ pop(atos); // restore the return value > > } > +*/ > __ b(done); > } > > diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp > index 19530b7c57c..15b0509da4c 100644 > --- a/src/hotspot/cpu/x86/templateTable_x86.cpp > +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp > @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { > Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); > __ store_klass(rax, rcx, tmp_store_klass); // klass > > +/** > { > SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); > // Trigger dtrace event for fastpath > @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { > CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); > __ pop(atos); > } > +*/ > > __ jmp(done); > 
} > diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp > index a5de65ea5ab..60b4bd3bcc8 100644 > --- a/src/hotspot/share/runtime/sharedRuntime.cpp > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp > @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { > * 6254741. Once that is fixed we can remove the dummy return value. > */ > int SharedRuntime::dtrace_object_alloc(oopDesc* o) { > + *(int*)0 = 1; > return dtrace_object_alloc(Thread::current(), o, o->size()); > } > > > 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` > > On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. > > In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. > > After some investigation, I found that this problem is related to the layout of the stack. > > On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). > > > push %rbp > mov %rsp,%rbp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| | expand > | | | > | ret addr | | direction > callee |_ _ _ _ _ _| | > | | V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). 
> > When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. > > > stp x29, x30, [sp, #-N]! > mov x29, sp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| - | expand > | | > . . . . . | | direction > _ _ _ _ _ _ | | > | | | N | > | ret addr | | | > callee |_ _ _ _ _ _| | | > | | - V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. > > Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. > > Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. > Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. > > This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. > > Any input is appreciated. 
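The two prologues quoted above can be replayed on a toy stack to see why `fp + 2` recovers the caller's sp in one case but not in the other. This is a sketch under simplifying assumptions: the stack is a vector of words indexed from low to high, "addresses" are indices, and all names (`ToyFrame`, `enter_x86_style`, `enter_aarch64_style`, `sender_sp_from_fp`) are hypothetical and not HotSpot code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uintptr_t;

struct ToyFrame {
  std::size_t caller_sp;  // caller's sp (as an index) just before the call
  std::size_t fp;         // where the callee's frame pointer ends up
};

// x86_64 style: `call` pushes the return address, then
//   push %rbp ; mov %rsp,%rbp
ToyFrame enter_x86_style(std::vector<Word>& stack, std::size_t sp,
                         Word ret_addr, Word caller_fp) {
  const std::size_t caller_sp = sp;
  stack[--sp] = ret_addr;   // pushed by `call`
  stack[--sp] = caller_fp;  // push %rbp
  return ToyFrame{caller_sp, sp};  // mov %rsp,%rbp
}

// AArch64 C/C++ style: the return address is in x30, then
//   stp x29, x30, [sp, #-N]! ; mov x29, sp
// frame_words is N expressed in 8-byte words.
ToyFrame enter_aarch64_style(std::vector<Word>& stack, std::size_t sp,
                             Word ret_addr, Word caller_fp,
                             std::size_t frame_words) {
  const std::size_t caller_sp = sp;
  sp -= frame_words;         // pre-decrement by N
  stack[sp] = caller_fp;     // x29 stored at the new sp
  stack[sp + 1] = ret_addr;  // x30 stored right above it
  return ToyFrame{caller_sp, sp};  // mov x29, sp
}

// The rule under discussion: sender sp = fp + 2 words.
std::size_t sender_sp_from_fp(std::size_t fp) { return fp + 2; }
```

In this model the rule is exact for the x86_64-style frame, while for the AArch64-style frame it only holds when the whole frame is two words (N = 16); for any larger N the computed value points into the callee's own frame.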
> > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: add comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6597/files - new: https://git.openjdk.java.net/jdk/pull/6597/files/3674f719..bdc3901f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=03-04 Stats: 11 lines in 1 file changed: 8 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/6597.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6597/head:pull/6597 PR: https://git.openjdk.java.net/jdk/pull/6597 From ddong at openjdk.java.net Thu Jan 27 15:21:42 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Thu, 27 Jan 2022 15:21:42 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 12:19:12 GMT, Andrew Dinn wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyright year > > src/hotspot/cpu/aarch64/frame_aarch64.cpp line 470: > >> 468: // in C2 code but it will have been pushed onto the stack. so we >> 469: // have to find it relative to the unextended sp >> 470: > > The comment above this change needs to be updated to explain when and why it is correct to use 1) the unextended sp and frame size or 2) the sender sp. Thanks. I changed the comment, but it seems this cannot be clearly explained in one or two sentences. Please feel free to suggest how to refine the comment.
------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From iklam at openjdk.java.net Thu Jan 27 16:05:38 2022 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 27 Jan 2022 16:05:38 GMT Subject: RFR: 8275775: Add jcmd VM.classes to print details of all classes [v6] In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 09:17:09 GMT, Yi Yang wrote: >> Add VM.classes to print details of all classes, output looks like: >> >> 1. jcmd VM.classes >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> 0x0000000800c0ac00 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0ac00 >> ... >> >> 2. jcmd VM.classes verbose >> >> KlassAddr Size State Flags LoaderName ClassName >> 0x0000000800c0b400 62 inited W bootstrap java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 >> java.lang.invoke.LambdaForm$MH/0x0000000800c0b400 {0x0000000800c0b400} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841f210) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [41] {0x00007f620841f030} for 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' cache=0x00007f620841f380 >> - class loader data: loader data: 0x00007f61c804a690 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$MH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b3968} = 'java/lang/invoke/LambdaForm$MH+0x0000000800c0b400' >> - vtable length 5 (start addr: 0x0000000800c0b5b8) >> - itable length 2 (start addr: 0x0000000800c0b5e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> - non-static oop maps: >> 0x0000000800c0b000 62 inited W bootstrap java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 >> java.lang.invoke.LambdaForm$DMH/0x0000000800c0b000 {0x0000000800c0b000} >> - instance size: 2 >> - klass size: 62 >> - access: final synchronized >> - state: inited >> - name: 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - super: 'java/lang/Object' >> - sub: >> - arrays: NULL >> - methods: Array(0x00007f620841ea68) >> - method ordering: Array(0x0000000800a7e5a8) >> - default_methods: Array(0x0000000000000000) >> - local interfaces: Array(0x00000008005af748) >> - trans. 
interfaces: Array(0x00000008005af748) >> - constants: constant pool [49] {0x00007f620841e838} for 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' cache=0x00007f620841ebe0 >> - class loader data: loader data: 0x00007f61c804a750 of 'bootstrap' has a class holder >> - source file: 'LambdaForm$DMH' >> - class annotations: Array(0x0000000000000000) >> - class type annotations: Array(0x0000000000000000) >> - field annotations: Array(0x0000000000000000) >> - field type annotations: Array(0x0000000000000000) >> - inner classes: Array(0x00000008005af6d8) >> - nest members: Array(0x00000008005af6d8) >> - permitted subclasses: Array(0x00000008005af6d8) >> - java mirror: a 'java/lang/Class'{0x000000011f4b0968} = 'java/lang/invoke/LambdaForm$DMH+0x0000000800c0b000' >> - vtable length 5 (start addr: 0x0000000800c0b1b8) >> - itable length 2 (start addr: 0x0000000800c0b1e0) >> - ---- static fields (1 words): >> - static final '_D_0' 'Ljava/lang/invoke/LambdaForm;' @112 >> - ---- non-static fields (0 words): >> ... > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > fix LGTM. One minor nit, Could you update the PR description to include examples of the final output format. src/hotspot/share/oops/instanceKlass.cpp line 2081: > 2079: _st->print(INTPTR_FORMAT " ", p2i(k)); > 2080: // klass size > 2081: _st->print("%-4d ", k->size()); Should be `%4d` so that the numbers are aligned correctly. ------------- Marked as reviewed by iklam (Reviewer). 
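The `%-4d` versus `%4d` remark in the review above is about printf field justification. As a minimal sketch (the helper names `size_left_aligned` and `size_right_aligned` are made up for illustration and are not part of the patch), the two formats pad a klass size like this:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Left-justified in a 4-character field, as in the line under review.
// Hypothetical helper, not HotSpot code.
std::string size_left_aligned(int size) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%-4d ", size);
  return buf;
}

// Right-justified in a 4-character field, as the review suggests.
// Hypothetical helper, not HotSpot code.
std::string size_right_aligned(int size) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%4d ", size);
  return buf;
}
```

With `%-4d` the padding goes after the digits, so the last digits of 62 and 1024 end up in different columns; with `%4d` the padding goes in front and the last digits line up.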
PR: https://git.openjdk.java.net/jdk/pull/7105 From shade at openjdk.java.net Thu Jan 27 16:05:36 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 27 Jan 2022 16:05:36 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 14:35:08 GMT, Stefan Karlsson wrote: > While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. Looks fine, but doesn't that apply to other "empty" VMOps: `VM_None`, `VM_ForceSafepoint`, `VM_ThreadSuspend`, etc? Probably some code commoning is in order there... ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7246 From stefank at openjdk.java.net Thu Jan 27 16:19:33 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 27 Jan 2022 16:19:33 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 16:02:09 GMT, Aleksey Shipilev wrote: >> While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. > > Looks fine, but doesn't that apply to other "empty" VMOps: `VM_None`, `VM_ForceSafepoint`, `VM_ThreadSuspend`, etc? Probably some code commoning is in order there... @shipilev Yeah, probably. None of the listed VM operations seemed time-sensitive to me, so I left them as-is. But I'll try to unify this and re-run the tests. 
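The idea discussed above, letting an operation that never touches thread oops opt out of the per-thread walk, can be sketched as a toy model. Everything here (`ThreadModel`, `VmOpModel`, `CleanupOpModel`, `needs_thread_oops`, `run_at_safepoint`) is a hypothetical name chosen for illustration; this is not the HotSpot code and not necessarily how the patch does it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Java thread; it only counts how often
// its oops were processed.
struct ThreadModel {
  int oops_processed = 0;
};

// Hypothetical base class: by default an operation asks for the thread
// oops to be processed.
struct VmOpModel {
  virtual ~VmOpModel() = default;
  virtual bool needs_thread_oops() const { return true; }
};

// Cleanup-style operation: none of its tasks use thread oops, so it opts out.
struct CleanupOpModel : VmOpModel {
  bool needs_thread_oops() const override { return false; }
};

// Runs the per-thread oop processing only when the operation asks for it.
// Returns the number of threads that were processed.
std::size_t run_at_safepoint(const VmOpModel& op,
                             std::vector<ThreadModel>& threads) {
  if (!op.needs_thread_oops()) {
    return 0;  // skip the walk over all threads entirely
  }
  for (ThreadModel& t : threads) {
    ++t.oops_processed;
  }
  return threads.size();
}
```

In this model the cost of the cleanup operation no longer depends on the number of threads, which is the effect the RFR is after for the >20000-thread benchmark.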
------------- PR: https://git.openjdk.java.net/jdk/pull/7246 From shade at openjdk.java.net Thu Jan 27 16:45:32 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 27 Jan 2022 16:45:32 GMT Subject: RFR: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 16:02:09 GMT, Aleksey Shipilev wrote: >> While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms. It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. > > Looks fine, but doesn't that apply to other "empty" VMOps: `VM_None`, `VM_ForceSafepoint`, `VM_ThreadSuspend`, etc? Probably some code commoning is in order there... > @shipilev Yeah, probably. None of the listed VM operations seemed time-sensitive to me, so I left them as-is. But I'll try to unify this and re-run the tests. No need to do the unification here, I think. This point change is cleanly backportable. Unification in some future RFR would be nice. ------------- PR: https://git.openjdk.java.net/jdk/pull/7246 From mandy.chung at oracle.com Thu Jan 27 16:45:59 2022 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 27 Jan 2022 08:45:59 -0800 Subject: Fix proposal for bug JDK-8221642 In-Reply-To: References: Message-ID: <0ab483b9-05ea-6ca3-668c-4bae665190cc@oracle.com> Hi Andreas, What methods are you calling that throw NPE? Do you have the stack trace to share? The spec of AccessibleObject was updated by JDK-8221530 for the case where there is no caller frame when calling from JNI: "The check when invoked by JNI code with no Java class on the stack only succeeds if the member and the declaring class are public, and the class is in a package that is exported to all modules."
I think AccessibleObject::canAccess, setAccessible, trySetAccessible should follow the same rule. Mandy On 1/27/22 2:19 AM, Andreas Rosenberg wrote: > Hi, > > this is my first posting regarding JDK contribution, so this may be the wrong place to ask. > Please point me in the right direction in this case. > > We are using Java rather heavily via JNI on a custom application. For a long time we stuck with JRE 1.8 > for various reasons. My task is to plan an upgrade to a more recent JDK version and while doing some > tests I encountered bugs related to this: JDK-8227491 (JNI - caller sensitive methods). > > We are parsing Java class files to auto-generate the JNI code for our application, and are also using reflection. > The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. > > The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of > a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL > which leads to NPEs. > > My idea is this: create an internal proxy class inside "java.base" that reflects this case > (e.g. "java.lang.NativeCall" or "java.lang.NativeCode"). > This class is final and implements nothing. > > Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL > in case of a JNI call, it should do this to answer the class proxy: > > return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); > > This would have the following advantages: > - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class > - it would be a more expressive way on the Java side to detect "the callee is native code" than checking for null > - it would fit better into the framework > > I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us.
> > As there are probably security considerations involved, advice from experts is required. > But from my understanding the Java security model is designed for the main app being written in Java. > In this case there are always Java stack frames available as parents for caller sensitive methods, so > the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers > NULL for the JNI case. This needs verification. > > If the main app is native code which uses JNI, the Java security model can only affect the Java part and > as soon as an additional Java stack frame has been generated a regular Java class will be found and > the "standard behavior" should apply again. > > Comments appreciated. > > If this fix looks reasonable, what are the steps to get it implemented and integrated into the official > source tree? > > Best regards, > Andy > > From adinn at openjdk.java.net Thu Jan 27 17:28:35 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Thu, 27 Jan 2022 17:28:35 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v5] In-Reply-To: References: Message-ID: <6mjXiSUnxOCmNY_kSjf9EUYb1Io5aXhy2Ih5Cckpieg=.4c32b6bd-6cfb-4d88-a0b5-8bbd8e26ca7a@github.com> On Thu, 27 Jan 2022 15:15:15 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1.
apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); >> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. 
`java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). >> >> >> push %rbp >> mov %rsp,%rbp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| | expand >> | | | >> | ret addr | | direction >> callee |_ _ _ _ _ _| | >> | | V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). >> >> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. >> >> >> stp x29, x30, [sp, #-N]! >> mov x29, sp >> >> _ _ _ _ _ _ >> | | >> | | | >> |_ _ _ _ _ _| | >> | | | >> caller | | <- caller sp | >> _ _ _ |_ _ _ _ _ _| - | expand >> | | >> . . . . . 
| | direction >> _ _ _ _ _ _ | | >> | | | N | >> | ret addr | | | >> callee |_ _ _ _ _ _| | | >> | | - V >> | caller fp | <- fp >> |_ _ _ _ _ _| >> >> >> >> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. >> >> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. >> >> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. >> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. >> >> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. >> >> Any input is appreciated. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > add comment Marked as reviewed by adinn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From adinn at openjdk.java.net Thu Jan 27 17:28:36 2022 From: adinn at openjdk.java.net (Andrew Dinn) Date: Thu, 27 Jan 2022 17:28:36 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v4] In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 15:18:01 GMT, Denghui Dong wrote: >> src/hotspot/cpu/aarch64/frame_aarch64.cpp line 470: >> >>> 468: // in C2 code but it will have been pushed onto the stack. so we >>> 469: // have to find it relative to the unextended sp >>> 470: >> >> The comment above this change needs to be updated to explain when and why it is correct to use 1) the unextended sp and frame size or 2) the sender sp. > > Thanks. > I changed the comment, but this seems cannot be clearly explained in one or two sentences. > Please feel free to give any advice to refine the comment. Thanks. 
That's actually very clear and I don't think I can improve it. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From hseigel at openjdk.java.net Thu Jan 27 19:25:56 2022 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 27 Jan 2022 19:25:56 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability Message-ID: Please review this new attempt to resolve JDK-8214976. This fix adds Pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function. The fix includes changes to calls in non-shared code because it is cleaner than adding PRAGMAs and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE. This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions. Changes to Windows code is left for a future RFE. This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390. 
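The message above mentions RESTARTABLE as one piece of added value in the os:: wrappers. A minimal sketch of that retry-on-EINTR idea, written as a function template rather than a macro, might look as follows. `restartable` and `FakeSyscall` are hypothetical names for illustration only, and the fake call stands in for a real system call so the behaviour can be checked without signals.

```cpp
#include <cassert>
#include <cerrno>

// Retries `call` for as long as it fails with -1 and errno == EINTR.
// Hypothetical helper in the spirit of the RESTARTABLE idiom; not HotSpot code.
template <typename Call>
long restartable(Call& call) {
  long result;
  do {
    result = call();
  } while (result == -1 && errno == EINTR);
  return result;
}

// Fake system call for demonstration: reports EINTR `interruptions` times,
// then succeeds and returns 42.
struct FakeSyscall {
  int interruptions;
  int calls;

  explicit FakeSyscall(int n) : interruptions(n), calls(0) {}

  long operator()() {
    ++calls;
    if (interruptions > 0) {
      --interruptions;
      errno = EINTR;
      return -1;
    }
    return 42;
  }
};
```

A caller that goes through such a wrapper never sees the transient EINTR failures, which is one reason for routing native calls through a common layer instead of calling the system function directly.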
Thanks, Harold ------------- Commit messages: - 8214976: Warn about uses of functions replaced for portability Changes: https://git.openjdk.java.net/jdk/pull/7248/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7248&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8214976 Stats: 148 lines in 15 files changed: 105 ins; 0 del; 43 mod Patch: https://git.openjdk.java.net/jdk/pull/7248.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7248/head:pull/7248 PR: https://git.openjdk.java.net/jdk/pull/7248 From kbarrett at openjdk.java.net Thu Jan 27 21:06:38 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 27 Jan 2022 21:06:38 GMT Subject: RFR: 8280828: Improve invariants in NonblockingQueue::append Message-ID: <5RadwfEH_n0x_cLSSZePRdiQ5W6nRhfNJ_ns3ajDZtQ=.6a3a1eaa-55b3-4154-a5a3-6e2985a1ceaf@github.com> Please review this change to NonblockingQueue to improve invariants in the append operation by making a change in try_pop. When taking the last entry in the queue, try_pop needs to do some cleanup of the queue fields, setting them to NULL. The order of those cleanups doesn't matter for correctness. However, setting first _head then _tail permits append to assert that _head is NULL when it finds _tail was NULL. The current order (set _tail first, then _head) doesn't permit such an assertion. Testing: mach5 tier1-3 I also did lots of testing with this change included while investigating JDK-8273383. 
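The ordering argument above can be checked with a small sequential model. This is only a sketch of the reasoning, not the NonblockingQueue code: it has no atomics and no real concurrency, and all names (`ClearOrder`, `QueueFieldsModel`, `append_invariant_holds`, `invariant_holds_during_clear`) are hypothetical. It inspects the two fields at the point between the two stores, where a concurrent append could observe them.

```cpp
#include <cassert>

// The two orders in which a try_pop-style cleanup could clear the fields
// after taking the last entry.
enum class ClearOrder { HeadThenTail, TailThenHead };

// Sequential model of the queue's two fields; non-null means "still set".
struct QueueFieldsModel {
  const void* head;
  const void* tail;
};

// The invariant an append-style observer would like to assert:
// whenever tail is seen as null, head is null as well.
bool append_invariant_holds(const QueueFieldsModel& q) {
  return q.tail != nullptr || q.head == nullptr;
}

// Clears both fields in the given order and checks the invariant at the
// intermediate point, where another thread could look at the fields.
// Returns true if the invariant held at every step.
bool invariant_holds_during_clear(ClearOrder order) {
  static const int entry = 0;              // the single remaining entry
  QueueFieldsModel q{&entry, &entry};
  bool ok = append_invariant_holds(q);
  if (order == ClearOrder::HeadThenTail) {
    q.head = nullptr;
    ok = ok && append_invariant_holds(q);  // tail still set: nothing to assert
    q.tail = nullptr;
  } else {
    q.tail = nullptr;
    ok = ok && append_invariant_holds(q);  // tail null while head is still set
    q.head = nullptr;
  }
  return ok && append_invariant_holds(q);
}
```

Clearing head first keeps the invariant at every step, while clearing tail first exposes a window in which tail is null but head is not, which matches the claim that only the head-then-tail order permits the assertion.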
------------- Commit messages: - append invariant Changes: https://git.openjdk.java.net/jdk/pull/7250/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7250&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280828 Stats: 44 lines in 1 file changed: 19 ins; 6 del; 19 mod Patch: https://git.openjdk.java.net/jdk/pull/7250.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7250/head:pull/7250 PR: https://git.openjdk.java.net/jdk/pull/7250 From ddong at openjdk.java.net Fri Jan 28 00:50:12 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 00:50:12 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2] In-Reply-To: References: Message-ID: On Tue, 18 Jan 2022 13:43:25 GMT, Andrew Haley wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> fix pfl() crash problem and rename from_thread to from_anchor > > So, here's my thinking for now. `_from_anchor` really means _this SP is trustworthy_, and perhaps we need a different name which suggests that. `sp_ok_to_use()` or `sp_is_trusted()` or somesuch? We do at least need a comment which explains that unless this boolean is true, the SP value in a frame is basically garbage, although it will point to somewhere within the stack. With that change, this patch can be integrated. > In the longer term, I think we should look at using libunwind to obtain a precise native stack trace, and then we can get rid of all the old kludges. Thanks. 
@theRealAph @adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From ddong at openjdk.java.net Fri Jan 28 00:52:14 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 00:52:14 GMT Subject: Integrated: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash In-Reply-To: References: Message-ID: On Mon, 29 Nov 2021 17:40:43 GMT, Denghui Dong wrote: > Hi, > > I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. > > The following steps can quick reproduce the problem: > > 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) > > index 39e99bdd5ed..4fc768e94aa 100644 > --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp > @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { > __ store_klass_gap(r0, zr); // zero klass gap for compressed oops > __ store_klass(r0, r4); // store klass last > > +/** > { > SkipIfEqual skip(_masm, &DTraceAllocProbes, false); > // Trigger dtrace event for fastpath > @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { > __ pop(atos); // restore the return value > > } > +*/ > __ b(done); > } > > diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp > index 19530b7c57c..15b0509da4c 100644 > --- a/src/hotspot/cpu/x86/templateTable_x86.cpp > +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp > @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { > Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); > __ store_klass(rax, rcx, tmp_store_klass); // klass > > +/** > { > SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); > // Trigger dtrace event for fastpath > @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { > CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); > __ pop(atos); 
> } > +*/ > > __ jmp(done); > } > diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp > index a5de65ea5ab..60b4bd3bcc8 100644 > --- a/src/hotspot/share/runtime/sharedRuntime.cpp > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp > @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { > * 6254741. Once that is fixed we can remove the dummy return value. > */ > int SharedRuntime::dtrace_object_alloc(oopDesc* o) { > + *(int*)0 = 1; > return dtrace_object_alloc(Thread::current(), o, o->size()); > } > > > 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` > > On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. > > In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. > > After some investigation, I found that this problem is related to the layout of the stack. > > On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). > > > push %rbp > mov %rsp,%rbp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| | expand > | | | > | ret addr | | direction > callee |_ _ _ _ _ _| | > | | V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack. > Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way). 
> > When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it. > > > stp x29, x30, [sp, #-N]! > mov x29, sp > > _ _ _ _ _ _ > | | > | | | > |_ _ _ _ _ _| | > | | | > caller | | <- caller sp | > _ _ _ |_ _ _ _ _ _| - | expand > | | > . . . . . | | direction > _ _ _ _ _ _ | | > | | | N | > | ret addr | | | > callee |_ _ _ _ _ _| | | > | | - V > | caller fp | <- fp > |_ _ _ _ _ _| > > > > I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current. > > Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled. > > Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment. > Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform. > > This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled. > > Any input is appreciated. > > Thanks, > Denghui This pull request has now been integrated. Changeset: 094db1a3 Author: Denghui Dong URL: https://git.openjdk.java.net/jdk/commit/094db1a3eeb3709c8218d8d26f13699024ec2943 Stats: 32 lines in 4 files changed: 23 ins; 0 del; 9 mod 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash Reviewed-by: aph, adinn ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From mandy.chung at oracle.com Fri Jan 28 01:54:19 2022 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 27 Jan 2022 17:54:19 -0800 Subject: Fix proposal for bug JDK-8221642 In-Reply-To: References: Message-ID: I see how NPE is thrown (from `AccessibleObject::setAccessible` and `trySetAccessible`).
The proper fix should follow the same rule as the access check: the accessible flag can be set only on public members of a public type that is exported unconditionally. The fix is straightforward but involves a spec change. I'll post a PR soon. Mandy On 1/27/22 8:45 AM, Mandy Chung wrote: > Hi Andreas, > > What methods are you calling that throw NPE? Do you have the stack > trace to share? > > The spec of AccessibleObject was updated by JDK-8221530 for the case where there is > no caller frame when calling from JNI: > > "The check when invoked by JNI code with no Java class on the stack > only succeeds if the member and the declaring class are public, and > the class is in a package that is exported to all modules." > > I think AccessibleObject::canAccess, setAccessible, trySetAccessible > should follow the same rule. > > Mandy > > On 1/27/22 2:19 AM, Andreas Rosenberg wrote: >> Hi, >> >> this is my first posting regarding JDK contribution, so this may be the wrong place to ask. >> Please point me in the right direction in this case. >> >> We are using Java rather heavily via JNI on a custom application. For a long time we stuck with JRE 1.8 >> for various reasons. My task is to plan an upgrade to a more recent JDK version and while doing some >> tests I encountered bugs related to this: JDK-8227491 (JNI - caller sensitive methods). >> >> We are parsing Java class files to auto-generate the JNI code for our application, and are also using reflection. >> The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. >> >> The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of >> a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL >> which leads to NPEs. >> >> My idea is this: create an internal proxy class inside "java.base" that reflects this case >> (e.g. "java.lang.NativeCall" or "java.lang.NativeCode").
>> This class is final and implements nothing. >> >> Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL >> in case of a JNI call, it should do this to answer the class proxy: >> >> return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); >> >> This would have the following advantages: >> - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class >> - it would be a more expressive way on the Java side to detect "the callee is native code" than checking for null >> - it would fit better into the framework >> >> I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us. >> >> As there are probably security considerations involved, advice from experts is required. >> But from my understanding the Java security model is designed for the main app being written in Java. >> In this case there are always Java stack frames available as parents for caller sensitive methods, so >> the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers >> NULL for the JNI case. This needs verification. >> >> If the main app is native code which uses JNI, the Java security model can only affect the Java part and >> as soon as an additional Java stack frame has been generated a regular Java class will be found and >> the "standard behavior" should apply again. >> >> Comments appreciated. >> >> If this fix looks reasonable, what are the steps to get it implemented and integrated into the official >> source tree?
>> >> Best regards, >> Andy >> >> From ysuenaga at openjdk.java.net Fri Jan 28 02:29:10 2022 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 28 Jan 2022 02:29:10 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jan 2022 15:10:11 GMT, Christian Hagedorn wrote: >> When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method: >> >> Stack: [0x00007f6e01739000,0x00007f6e0183a000], sp=0x00007f6e01838110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f >> >> This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method. 
>> >> This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see [JDK-8185712](https://bugs.openjdk.java.net/browse/JDK-8185712)): >> >> Stack: [0x00007f34fca18000,0x00007f34fcb19000], sp=0x00007f34fcb17110, free space=1020k >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x620d86] Compilation::~Compilation()+0x64 (c1_Compilation.cpp:607) >> V [libjvm.so+0x624b92] Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0xec (c1_Compiler.cpp:250) >> V [libjvm.so+0x8303ef] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x899 (compileBroker.cpp:2291) >> V [libjvm.so+0x82f067] CompileBroker::compiler_thread_loop()+0x3df (compileBroker.cpp:1966) >> V [libjvm.so+0x84f0d1] CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x69 (compilerThread.cpp:59) >> V [libjvm.so+0x1209329] JavaThread::thread_main_inner()+0x15d (thread.cpp:1297) >> V [libjvm.so+0x12091c9] JavaThread::run()+0x167 (thread.cpp:1280) >> V [libjvm.so+0x1206ada] Thread::call_run()+0x180 (thread.cpp:358) >> V [libjvm.so+0x1012e55] thread_native_entry(Thread*)+0x18f (os_linux.cpp:705) >> >> For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4. >> >> The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf >> I added references to the corresponding sections throughout the code. 
However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec. >> >> The comments at the `Dwarf` class in the `elf.hpp` file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout the `elf.hpp` file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability. >> >> Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures, `-Xlog:dwarf` can be used with `info`, `debug` or `trace` which provides logging messages throughout parsing. >> >> **Testing:** >> Apart from manual testing, I've added two kinds of tests: >> - A JTreg test: Spawns new VMs to let them crash in various ways. The test reads the created hs_err files to check if the DWARF parsing could correctly find the filename and line number. For normal HotSpot files, I could not check against hardcoded filenames and line numbers as they are subject to change (especially line number can quickly become different). I therefore just added some sanity checks in the form of "found a non-empty file" and "found a non-zero line number". On top of that, I added tests that let the VM crash in custom C files (which will not change). This enables an additional verification of hardcoded filenames and line numbers. 
>> - Gtests: Directly calling the `get_source()` method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename. >> >> On top of that, there are also existing JTreg tests that call `-XX:NativeMemoryTracking=detail` which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation. >> >> To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional `_JVM_DWARF_PATH` environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required some `make` changes). This is similar to what's done on Windows with `_NT_SYMBOL_PATH`. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number. >> >> I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user). >> >> Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for `make`! 
>> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with two additional commits since the last revision: > > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> > - Update test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java > > Co-authored-by: Erik Joelsson <37597443+erikj79 at users.noreply.github.com> I think this feature is very useful, thanks Christian! SA already has similar feature to gather call stacks with DWARF, so it would be nice to share DWARF parser between SA and HotSpot. P.S. I've proposed to use elfutils to parse DWARF in SA in [JDK-8245234](https://bugs.openjdk.java.net/browse/JDK-8245234). ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From dholmes at openjdk.java.net Fri Jan 28 05:12:09 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 28 Jan 2022 05:12:09 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 19:18:10 GMT, Harold Seigel wrote: > Please review this new attempt to resolve JDK-8214976. This fix adds Pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function. The fix includes changes to calls in non-shared code because it is cleaner than adding PRAGMAs and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE. This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions. Changes to Windows code is left for a future RFE. > > This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390. > > Thanks, Harold Hi Harold, Still have reservations about the awkwardness of this. Quite a few comments below. 
Shouldn't we generate a warning for all external functions for which there is an os:: replacement e.g. pread is called by read_at; gethostbyname is called by get_host_by_name; ... Thanks, David src/hotspot/os/aix/os_aix.cpp line 2499: > 2497: struct dirent *ptr; > 2498: > 2499: dir = os::opendir(path); Just to clarify, as we are in the scope of the os class both `opendir` and `os::opendir` are the same thing here - and similarly for other code in the os class - right? src/hotspot/share/runtime/os.hpp line 533: > 531: // platforms that support such things. This calls shutdown() and then aborts. > 532: static void abort(bool dump_core, void *siginfo, const void *context); > 533: static void abort(bool dump_core); I don't understand why the change to the default arg was needed. There should be no conflict between `os::abort()` and `::abort()`. src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 80: > 78: #define NULL 0 > 79: #endif > 80: #endif This is really ugly just because we include dirent.h so we can add the warning for a few functions; and even uglier because it is only needed for AIX, and even uglier still because based on the existing code we only compile AIX with xlc - no? Otherwise we would already need this hack for gcc. src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 97: > 95: FORBID_C_FUNCTION(FILE* fopen(const char*, const char*), "use os::fopen"); > 96: FORBID_C_FUNCTION(int fsync(int), "use os::fsync"); > 97: FORBID_C_FUNCTION(int ftruncate(int, off_t), "use os::ftruncate"); Shouldn't this be ftruncate for BSD and ftruncate64 for other Posix (not sure what Windows has)? src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 99: > 97: FORBID_C_FUNCTION(int ftruncate(int, off_t), "use os::ftruncate"); > 98: FORBID_C_FUNCTION(void funlockfile(FILE *), "use os::funlockfile"); > 99: FORBID_C_FUNCTION(off_t lseek(int, off_t, int), "use os::lseek"); Similarly there should be a lseek64 definition too. 
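On the scoping questions raised in the comments above (`opendir` versus `os::opendir` inside the os class, and `os::abort()` versus `::abort()`), a minimal sketch of how C++ name lookup treats the two. This is not HotSpot code: `os_like` and `open_thing` are made-up names used only for illustration.

```cpp
#include <string>

// Stand-in for a native C library function at global scope (hypothetical).
std::string open_thing() { return "global"; }

// Stand-in for a class such as os that wraps the native function (hypothetical).
class os_like {
 public:
  static std::string open_thing() { return "member"; }

  // Unqualified lookup inside a member function searches class scope first,
  // so this resolves to os_like::open_thing, not the global function.
  static std::string call_unqualified() { return open_thing(); }

  // The global function is still reachable, but only with explicit qualification.
  static std::string call_global() { return ::open_thing(); }
};
```

Under these assumptions, an unqualified call inside the class and an explicitly class-qualified call name the same function, and a member with the same name as a global function does not clash with it.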
src/hotspot/share/utilities/ostream.cpp line 615: > 613: > 614: PRAGMA_DIAG_PUSH > 615: PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(write); Why do we not call os::write here? ------------- PR: https://git.openjdk.java.net/jdk/pull/7248 From dholmes at openjdk.java.net Fri Jan 28 06:12:16 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 28 Jan 2022 06:12:16 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v5] In-Reply-To: References: Message-ID: <2Ml-XYdL_b1_2ba9qleL5KiydVzDvl7oZhK60r3lA4c=.070d461f-1b8a-440e-a244-800bc49248bb@github.com> On Thu, 27 Jan 2022 15:15:15 GMT, Denghui Dong wrote: >> Hi, >> >> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing. >> >> The following steps can quick reproduce the problem: >> >> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc) >> >> index 39e99bdd5ed..4fc768e94aa 100644 >> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp >> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() { >> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops >> __ store_klass(r0, r4); // store klass last >> >> +/** >> { >> SkipIfEqual skip(_masm, &DTraceAllocProbes, false); >> // Trigger dtrace event for fastpath >> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() { >> __ pop(atos); // restore the return value >> >> } >> +*/ >> __ b(done); >> } >> >> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp >> index 19530b7c57c..15b0509da4c 100644 >> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp >> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp >> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() { >> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg); >> __ store_klass(rax, rcx, 
tmp_store_klass); // klass >> >> +/** >> { >> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); >> // Trigger dtrace event for fastpath >> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() { >> CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), rax); >> __ pop(atos); >> } >> +*/ >> >> __ jmp(done); >> } >> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp >> index a5de65ea5ab..60b4bd3bcc8 100644 >> --- a/src/hotspot/share/runtime/sharedRuntime.cpp >> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp >> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) { >> * 6254741. Once that is fixed we can remove the dummy return value. >> */ >> int SharedRuntime::dtrace_object_alloc(oopDesc* o) { >> + *(int*)0 = 1; >> return dtrace_object_alloc(Thread::current(), o, o->size()); >> } >> >> >> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version` >> >> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect. >> >> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong. >> >> After some investigation, I found that this problem is related to the layout of the stack. >> >> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack. >> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame). 
>>
>>
>>     push %rbp
>>     mov %rsp,%rbp
>>
>>         _ _ _ _ _ _
>>        |           |
>>        |           |                |
>>        |_ _ _ _ _ _|                |
>>        |           |                |
>> caller |           | <- caller sp   |
>> _ _ _  |_ _ _ _ _ _|                | expand
>>        |           |                |
>>        | ret addr  |                | direction
>> callee |_ _ _ _ _ _|                |
>>        |           |                V
>>        | caller fp | <- fp
>>        |_ _ _ _ _ _|
>>
>>
>>
>> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
>> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).
>>
>> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>>
>>
>>     stp x29, x30, [sp, #-N]!
>>     mov x29, sp
>>
>>         _ _ _ _ _ _
>>        |           |
>>        |           |                        |
>>        |_ _ _ _ _ _|                        |
>>        |           |                        |
>> caller |           | <- caller sp           |
>> _ _ _  |_ _ _ _ _ _|                -       | expand
>>                                     |       |
>>          . . . . .                  |       | direction
>>         _ _ _ _ _ _                 |       |
>>        |           |                | N     |
>>        | ret addr  |                |       |
>> callee |_ _ _ _ _ _|                |       |
>>        |           |                -       V
>>        | caller fp | <- fp
>>        |_ _ _ _ _ _|
>>
>>
>>
>> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.
>>
>> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>>
>> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
>> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.
>>
>> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>>
>> Any input is appreciated.
>> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > add comment This seems to be causing some test failures - please see https://bugs.openjdk.java.net/browse/JDK-8280843 ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From ddong at openjdk.java.net Fri Jan 28 07:27:14 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 07:27:14 GMT Subject: RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v5] In-Reply-To: <2Ml-XYdL_b1_2ba9qleL5KiydVzDvl7oZhK60r3lA4c=.070d461f-1b8a-440e-a244-800bc49248bb@github.com> References: <2Ml-XYdL_b1_2ba9qleL5KiydVzDvl7oZhK60r3lA4c=.070d461f-1b8a-440e-a244-800bc49248bb@github.com> Message-ID: On Fri, 28 Jan 2022 06:08:44 GMT, David Holmes wrote: > This seems to be causing some test failures - please see https://bugs.openjdk.java.net/browse/JDK-8280843 I am looking at this. It seems to be caused by missing corresponding changes in thread_xxx_aarch64.cpp on other platforms. ------------- PR: https://git.openjdk.java.net/jdk/pull/6597 From ddong at openjdk.java.net Fri Jan 28 07:52:33 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 07:52:33 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 Message-ID: Hi, Could I have a review of this fix? JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. I'm trying to find an environment for windows/mac on aarch64 for testing. 
Thanks, Denghui ------------- Commit messages: - 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 Changes: https://git.openjdk.java.net/jdk/pull/7260/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280843 Stats: 8 lines in 2 files changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7260.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7260/head:pull/7260 PR: https://git.openjdk.java.net/jdk/pull/7260 From jiefu at openjdk.java.net Fri Jan 28 08:22:11 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 28 Jan 2022 08:22:11 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 07:41:21 GMT, Denghui Dong wrote: > Hi, > > Could I have a review of this fix? > > JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. > > I'm trying to find an environment for windows/mac on aarch64 for testing. > > Thanks, > Denghui src/hotspot/os_cpu/windows_aarch64/thread_windows_aarch64.cpp line 2: > 1: /* > 2: * Copyright (c) 2020, 2022 Microsoft Corporation. All rights reserved. If I remember it correctly, we'd better not modify the copyright line other than Oracle's. Maybe, you can add a new line here for Alibaba? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 08:40:45 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 08:40:45 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v2] In-Reply-To: References: Message-ID: <0IPg799Z7RIWxVz7TSRINlA12wegXcEMOrqYyUpZW3E=.ed59fc1d-0e8f-49d8-95af-3cf18978a9d6@github.com> On Fri, 28 Jan 2022 08:18:31 GMT, Jie Fu wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyright > > src/hotspot/os_cpu/windows_aarch64/thread_windows_aarch64.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2020, 2022 Microsoft Corporation. All rights reserved. > > If I remember it correctly, we'd better not modify the copyright line other than Oracle's. > Maybe, you can add a new line here for Alibaba? Thanks for the reminder. I added a new line for Alibaba for this file. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From jiefu at openjdk.java.net Fri Jan 28 08:45:08 2022 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 28 Jan 2022 08:45:08 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v2] In-Reply-To: <0IPg799Z7RIWxVz7TSRINlA12wegXcEMOrqYyUpZW3E=.ed59fc1d-0e8f-49d8-95af-3cf18978a9d6@github.com> References: <0IPg799Z7RIWxVz7TSRINlA12wegXcEMOrqYyUpZW3E=.ed59fc1d-0e8f-49d8-95af-3cf18978a9d6@github.com> Message-ID: On Fri, 28 Jan 2022 08:37:25 GMT, Denghui Dong wrote: > Thanks for the reminder. I added a new line for Alibaba for this file. Thanks for the update. Well, I would suggest keeping Microsoft Corporation the first line, then Alibaba since the file was created by Microsoft. What do you think? 
------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 08:40:43 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 08:40:43 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v2] In-Reply-To: References: Message-ID: > Hi, > > Could I have a review of this fix? > > JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. > > I'm trying to find an environment for windows/mac on aarch64 for testing. > > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: update copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7260/files - new: https://git.openjdk.java.net/jdk/pull/7260/files/c688840a..e40a4d47 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/7260.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7260/head:pull/7260 PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 08:50:49 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 08:50:49 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: > Hi, > > Could I have a review of this fix? > > JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. > > I'm trying to find an environment for windows/mac on aarch64 for testing. 
> > Thanks, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: update copyright ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7260/files - new: https://git.openjdk.java.net/jdk/pull/7260/files/e40a4d47..4e0ada13 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/7260.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7260/head:pull/7260 PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 08:50:51 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 08:50:51 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: <0IPg799Z7RIWxVz7TSRINlA12wegXcEMOrqYyUpZW3E=.ed59fc1d-0e8f-49d8-95af-3cf18978a9d6@github.com> Message-ID: On Fri, 28 Jan 2022 08:42:25 GMT, Jie Fu wrote: >> Thanks for the reminder. >> I added a new line for Alibaba for this file. > >> Thanks for the reminder. I added a new line for Alibaba for this file. > > Thanks for the update. > > Well, I would suggest keeping Microsoft Corporation the first line, then Alibaba since the file was created by Microsoft. > What do you think? Makes sense, updated. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From stefank at openjdk.java.net Fri Jan 28 09:09:15 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 28 Jan 2022 09:09:15 GMT Subject: Integrated: 8280784: VM_Cleanup unnecessarily processes all thread oops In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 14:35:08 GMT, Stefan Karlsson wrote: > While looking at ZGC latencies in a benchmark with >20000 Java threads, I noticed that the Cleanup VM operation could take up to 500 ms.
It turned out that the time was spent processing the oops in all Java threads. Since none of the safepoint cleanup tasks use the oops in the threads, I propose that we stop processing the oops in this VM Operation. This pull request has now been integrated. Changeset: 8a3cca09 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/8a3cca09ba427282f2712bec7298b85bbacf076b Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod 8280784: VM_Cleanup unnecessarily processes all thread oops Reviewed-by: eosterlund, shade ------------- PR: https://git.openjdk.java.net/jdk/pull/7246 From chagedorn at openjdk.java.net Fri Jan 28 09:22:14 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 28 Jan 2022 09:22:14 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 09:26:50 GMT, Thomas Stuefe wrote: > Hi Christian, this is very nice and useful! Thanks Thomas! > Two general remarks. One concern I have is that the new functionality should be super stable, since nothing is more annoying than to crash during stack dumping in hs-err file; I much rather have a call stack without bells and whistles than an abridged one. Maybe we could, in hs-err printing, if we got secondary crashes during callstack dumping, repeat the step with all optional features (also name demangling) disabled? This could also be done in a separate RFE. We'll know when this happens, we can react then. I absolutely agree - stability should be the primary concern. An incomplete hs-err file should be avoided at any cost. Doing an additional "catch and repeat without optional features" sounds interesting to get more safety. Would such a thing be easy to add? Yes, it might be better to do that in a separate RFE. > Another small concern, we parse the Elf file while dumping the stack, right? 
I remember having a lot of problems on Solaris when dumping callstacks, because there parsing the elf file was really slow. And that delayed call stack printing by a lot, so much that the ErrorCrashTimeout often kicked in and spoiled the crash logs for us. Yes, a pc for a frame is directly parsed when printing the corresponding frame. It takes some more time to do the additional parsing but not that much. These are the timestamps from a quick `-XX:CICrashAt=1` run with `-Xlog:dwarf=info` on my local machine on `Ubuntu 20.04` with a `fastdebug` build: [1.862s][info][dwarf] Open DWARF file: /home/christian/Downloads/test/jdk-19/fastdebug/lib/server/libjvm.debuginfo [1.867s][info][dwarf] pc: 0x00007ffa35c8a9cf, offset: 0x007749cf, filename: c1_Compiler.cpp, line: 250 [1.871s][info][dwarf] pc: 0x00007ffa35fbfb28, offset: 0x00aa9b28, filename: compileBroker.cpp, line: 2291 [1.876s][info][dwarf] pc: 0x00007ffa35fc08e8, offset: 0x00aaa8e8, filename: compileBroker.cpp, line: 1966 [1.881s][info][dwarf] pc: 0x00007ffa36e50cca, offset: 0x0193acca, filename: thread.cpp, line: 1297 [1.890s][info][dwarf] pc: 0x00007ffa36e59010, offset: 0x01943010, filename: thread.cpp, line: 358 [1.897s][info][dwarf] pc: 0x00007ffa36b3c524, offset: 0x01626524, filename: os_linux.cpp, line: 705 The parsing of a single pc takes a little less than 0.01s. Of course, this is not a great way to measure performance. It also highly depends on the source files themselves, the machine setup etc. Thus, this cannot be considered a valid performance test. But still, I think these numbers can give us some indication of the order of magnitude. Compared to the current `ErrorLogTimeout` default value of 2min this looks promising. 
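As a rough cross-check of the per-pc figure, the gaps between consecutive log timestamps can be computed mechanically. The following is only a sketch under stated assumptions: `parse_uptime_ms` and `step_deltas_ms` are hypothetical helper names (not part of the patch), and every line is assumed to start with the `[S.mmms]` uptime decoration with exactly three fractional digits, as in the log excerpt above. The gaps cover everything that happens between two log messages, not DWARF parsing alone.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Parses the leading "[S.mmms]" uptime decoration of a log line into
// milliseconds. Returns -1 if the line does not start with such a decoration.
// Assumption: the fraction always has exactly three digits.
long parse_uptime_ms(const std::string& line) {
  long sec = 0;
  long ms = 0;
  if (std::sscanf(line.c_str(), "[%ld.%3lds]", &sec, &ms) != 2) {
    return -1;
  }
  return sec * 1000 + ms;
}

// Returns the time gap in milliseconds between each pair of consecutive
// lines that carry a timestamp; lines without one are skipped.
std::vector<long> step_deltas_ms(const std::vector<std::string>& lines) {
  std::vector<long> deltas;
  long prev = -1;
  for (const std::string& line : lines) {
    const long now = parse_uptime_ms(line);
    if (now < 0) {
      continue;
    }
    if (prev >= 0) {
      deltas.push_back(now - prev);
    }
    prev = now;
  }
  return deltas;
}
```

Fed the seven log lines quoted above, this gives gaps of 5, 4, 5, 5, 9 and 7 ms (35 ms in total for six frames), which matches "a little less than 0.01s" per pc.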
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From stefank at openjdk.java.net Fri Jan 28 09:22:37 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 28 Jan 2022 09:22:37 GMT Subject: RFR: 8280817: Clean up and unify empty VM operations Message-ID: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> There are a number of VM operations that do nothing, except triggering a safepoint. I'd like to clean up and unify them a bit to: 1) Use one parent class which implements the empty doit function and turns off thread oop processing. 2) Remove unused VM_Operations 3) Don't reuse the VM_None type - there's bug/inconsistency here in that some subsystems report the name as None, while others report the name passed to the constructor 4) Remove unused enum values ------------- Commit messages: - 8280817: Clean up and unify empty VM operations Changes: https://git.openjdk.java.net/jdk/pull/7261/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7261&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280817 Stats: 59 lines in 4 files changed: 10 ins; 24 del; 25 mod Patch: https://git.openjdk.java.net/jdk/pull/7261.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7261/head:pull/7261 PR: https://git.openjdk.java.net/jdk/pull/7261 From aph at openjdk.java.net Fri Jan 28 09:35:10 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 09:35:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. 
>> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From chagedorn at openjdk.java.net Fri Jan 28 09:41:12 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 28 Jan 2022 09:41:12 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: <54FxD8Y6tYN9qIxG9kM1609F8U5qX1L2q5k36XCYnzs=.776977b4-80f0-41f9-99d3-6829f8e1d067@github.com> Message-ID: On Thu, 27 Jan 2022 13:43:00 GMT, Zhengyu Gu wrote: > > That's interesting. Is this implementation still around somewhere? I'm glad that some of the mentioned things are not a problem anymore. > > Not I know. IIRC, it was based on DWARF 2. Okay, thanks. > > > > * Different compiler (and different version of the same compiler) can generate DWARF with different version, may not be compatible with each other, as DWARF allows custom fields. > > > * Maintenance cost to catch up DWARF spec/compiler changes. > > > > > > That's indeed a problem of facing different DWARF versions. For this parser, I tried to support the current default of GCC 10.x which is DWARF 4. This standard was introduced in 2010 and is probably used by most compilers nowadays at least (if not already DWARF 5 which was introduced in 2017). However, even with GCC 10.x, it emitted DWARF 3 for one of the sections (I'm not sure why) which I also needed to support - thus you can never be sure. > > DWARF 5 is still experimental for GCC 10.x and had some issues when I tried that out back there - so I stayed away from implementing parsing steps for it. But now with GCC 11.x, DWARF 5 seems to have become the default. I might have to try out what's being emitted for HotSpot. 
But I think for now, it is better to only focus on DWARF 4 instead of trying to support various versions in one patch - we could still come back to that later if it becomes widely used. Even if DWARF 5 is emitted, GCC could be configured, for example, to emit DWARF 4 only which is probably an acceptable workaround for testing environments. > > I think maintenance and test could be major pain points. Based on build.html, we can use gcc version anywhere between 5.0 and 10.2, it could be a challenge to ensure all supported version work correctly. I agree, that wide range is a problem and older GCC versions emitting older DWARF version are not covered with this patch. If I parse a DWARF section header with an unsupported version I will just bail out and it falls back to the stack trace we are seeing today without source information. That's probably fine for the scope of this patch. We could still come back and add support for other missing versions. ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From chagedorn at openjdk.java.net Fri Jan 28 09:47:11 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 28 Jan 2022 09:47:11 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 02:26:21 GMT, Yasumasa Suenaga wrote: > I think this feature is very useful, thanks Christian! Thanks Yasumasa! > SA already has similar feature to gather call stacks with DWARF, so it would be nice to share DWARF parser between SA and HotSpot. I agree, that would be a good idea! > P.S. I've proposed to use elfutils to parse DWARF in SA in [JDK-8245234](https://bugs.openjdk.java.net/browse/JDK-8245234). Ah that's interesting, I've missed that. I had a look at SA when I've started to do some work on this patch back in 2020 and even took some things over. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From shade at openjdk.java.net Fri Jan 28 09:55:08 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 28 Jan 2022 09:55:08 GMT Subject: RFR: 8280817: Clean up and unify empty VM operations In-Reply-To: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> References: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> Message-ID: On Fri, 28 Jan 2022 09:15:23 GMT, Stefan Karlsson wrote: > There are a number of VM operations that do nothing, except triggering a safepoint. I'd like to clean up and unify them a bit to: > 1) Use one parent class which implements the empty doit function and turns off thread oop processing. > 2) Remove unused VM_Operations > 3) Don't reuse the VM_None type - there's bug/inconsistency here in that some subsystems report the name as None, while others report the name passed to the constructor > 4) Remove unused enum values Looks nice! ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7261 From aph at openjdk.java.net Fri Jan 28 09:56:16 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 09:56:16 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright It would be nice to know which tests failed on MacOS. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From stuefe at openjdk.java.net Fri Jan 28 10:02:15 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 28 Jan 2022 10:02:15 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 09:19:04 GMT, Christian Hagedorn wrote: > > Two general remarks. One concern I have is that the new functionality should be super stable, since nothing is more annoying than to crash during stack dumping in hs-err file; I much rather have a call stack without bells and whistles than an abridged one. Maybe we could, in hs-err printing, if we got secondary crashes during callstack dumping, repeat the step with all optional features (also name demangling) disabled? This could also be done in a separate RFE. We'll know when this happens, we can react then. > > I absolutely agree - stability should be the primary concern. An incomplete hs-err file should be avoided at any cost. Doing an additional "catch and repeat without optional features" sounds interesting to get more safety. Would such a thing be easy to add? Yes, it might be better to do that in a separate RFE. It is probably easy, but I also think this would be better in a separate RFE. And we already have a timeout per reporting step since JDK-8166944, so that long-running steps don't spoil error reporting for everyone. We can just add a second call stack print step if the first one failed. > > > Another small concern, we parse the Elf file while dumping the stack, right? I remember having a lot of problems on Solaris when dumping callstacks, because there parsing the elf file was really slow. And that delayed call stack printing by a lot, so much that the ErrorCrashTimeout often kicked in and spoiled the crash logs for us. > > Yes, a pc for a frame is directly parsed when printing the corresponding frame.
It takes some more time to do the additional parsing but not that much. These are the timestamps from a quick `-XX:CICrashAt=1` run with `-Xlog:dwarf=info` on my local machine on `Ubuntu 20.04` with a `fastdebug` build: > > ``` > [1.862s][info][dwarf] Open DWARF file: /home/christian/Downloads/test/jdk-19/fastdebug/lib/server/libjvm.debuginfo > [1.867s][info][dwarf] pc: 0x00007ffa35c8a9cf, offset: 0x007749cf, filename: c1_Compiler.cpp, line: 250 > [1.871s][info][dwarf] pc: 0x00007ffa35fbfb28, offset: 0x00aa9b28, filename: compileBroker.cpp, line: 2291 > [1.876s][info][dwarf] pc: 0x00007ffa35fc08e8, offset: 0x00aaa8e8, filename: compileBroker.cpp, line: 1966 > [1.881s][info][dwarf] pc: 0x00007ffa36e50cca, offset: 0x0193acca, filename: thread.cpp, line: 1297 > [1.890s][info][dwarf] pc: 0x00007ffa36e59010, offset: 0x01943010, filename: thread.cpp, line: 358 > [1.897s][info][dwarf] pc: 0x00007ffa36b3c524, offset: 0x01626524, filename: os_linux.cpp, line: 705 > ``` > > The parsing of a single pc takes a little less than 0.01s. Of course, this is not a great way to measure performance. It also highly depends on the source files themselves, the machine setup etc. Thus, this cannot be considered a valid performance test. But still, I think these numbers can give us some indication of the order of magnitude. Compared to the current `ErrorLogTimeout` default value of 2min this looks promising. Okay, this looks reasonable. In our case, I remember having a very slow file system and an overloaded machine. But this would be solved also by just repeating call stack printing if the first attempt times out. Cheers, and thanks for this patch! 
..Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From aph at openjdk.java.net Fri Jan 28 10:09:10 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 10:09:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 09:52:40 GMT, Andrew Haley wrote: > It would be nice to know which tests failed on MacOS. Found it in Bugzilla: Test: compiler/regalloc/TestC2IntPressure.java ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From aph at openjdk.java.net Fri Jan 28 10:12:16 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 10:12:16 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: <9rzYgoTPDChBRcBn2qTluv1h8q7QxZiBMJcwKmL2vJA=.51a5dedc-b4e6-4ff2-8df7-01786efba954@github.com> On Fri, 28 Jan 2022 10:05:08 GMT, Andrew Haley wrote: > > It would be nice to know which tests failed on MacOS. > > Found it in Bugzilla: Test: compiler/regalloc/TestC2IntPressure.java Passes on MacOS now. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From coleenp at openjdk.java.net Fri Jan 28 10:16:10 2022 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 28 Jan 2022 10:16:10 GMT Subject: RFR: 8280817: Clean up and unify empty VM operations In-Reply-To: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> References: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> Message-ID: <0NUmrpudUhj3LWPCQvcM5rIzbgFaPAJNOCvq-Aispe0=.baa06651-4c24-4a3f-8605-946ede13fca6@github.com> On Fri, 28 Jan 2022 09:15:23 GMT, Stefan Karlsson wrote: > There are a number of VM operations that do nothing, except triggering a safepoint. 
I'd like to clean up and unify them a bit to: > 1) Use one parent class which implements the empty doit function and turns off thread oop processing. > 2) Remove unused VM_Operations > 3) Don't reuse the VM_None type - there's bug/inconsistency here in that some subsystems report the name as None, while others report the name passed to the constructor > 4) Remove unused enum values Looks good. Nice to see unused operations cleaned up. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7261 From chagedorn at openjdk.java.net Fri Jan 28 10:50:11 2022 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 28 Jan 2022 10:50:11 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: Message-ID: <8-tXxWKUxKMA9z_uf2hTInUcmNfVT8nWXFFv1taQWP4=.d712ba08-ed79-4ee7-8930-bf75af97680d@github.com> On Fri, 28 Jan 2022 09:58:26 GMT, Thomas Stuefe wrote: > > > Two general remarks. One concern I have is that the new functionality should be super stable, since nothing is more annoying than to crash during stack dumping in hs-err file; I much rather have a call stack without bells and whistles than an abridged one. Maybe we could, in hs-err printing, if we got secondary crashes during callstack dumping, repeat the step with all optional features (also name demangling) disabled? This could also be done in a separate RFE. We'll know when this happens, we can react then. > > > > > > I absolutely agree - stability should be the primary concern. An incomplete hs-err file should be avoided at any cost. Doing an additional "catch and repeat without optional features" sounds interesting to get more safety. Would such a thing be easy to add? Yes, it might be better to do that in a separate RFE. > > It is probably easy, but I also thing this would be better in a separate RFE. 
And we already have a timeout per reporting step since JDK-8166944, so that long-running steps don't spoil error reporting for everyone. We can just add a second call stack print step if the first one failed. That sounds great. And good to know about JDK-8166944! > > The parsing of a single pc takes a little less than 0.01s. Of course, this is not a great way to measure performance. It also highly depends on the source files themselves, the machine setup etc. Thus, this cannot be considered a valid performance test. But still, I think these numbers can give us some indication of the order of magnitude. Compared to the current `ErrorLogTimeout` default value of 2min this looks promising. > > Okay, this looks reasonable. In our case, I remember having a very slow file system and an overloaded machine. But this would be solved also by just repeating call stack printing if the first attempt times out. Yes, I think so, too. Ah, I see, yes that would be an option in this case. > Cheers, and thanks for this patch! You're welcome! I'm glad this can help with the analysis of crashes in the future :-) Cheers, Christian ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From andreas.rosenberg at apis.de Fri Jan 28 11:15:39 2022 From: andreas.rosenberg at apis.de (Andreas Rosenberg) Date: Fri, 28 Jan 2022 11:15:39 +0000 Subject: Fix proposal for bug JDK-8221642 In-Reply-To: References: Message-ID: Hi Mandy, thanks for looking at my problem. Yes, "setAccessible" is one of the problems, but our main issue is related to "ResourceBundle". I've created a small example that shows the problem: https://github.com/anrose00/JniSensitiveCaller Any comments on my proposal would be great. Andreas From: Mandy Chung Sent: Freitag, 28. Januar 2022 02:54 To: Andreas Rosenberg Cc: hotspot-dev at openjdk.java.net; 'core-libs-dev' Subject: Re: Fix proposal for bug JDK-8221642 I see how NPE is thrown (from `AccessibleObject::setAccessible` and `trySetAccessible`). 
The proper fix should follow the rule as the access check that it can set the accessible flag only on public members of a public type that is exported unconditionally. The fix is straight forward but involves spec change. I'll post PR soon. Mandy On 1/27/22 8:45 AM, Mandy Chung wrote: Hi Andreas, What methods are you calling that throws NPE? Do you have the stack trace to share? The spec of AccessibleObject was updated for JDK-8221530 if there is no caller frame when calling from JNI: "The check when invoked by JNI code with no Java class on the stack only succeeds if the member and the declaring class are public, and the class is in a package that is exported to all modules." I think AccessibleObject::canAccess, setAccessible, trySetAccessible should follow the same rule. Mandy On 1/27/22 2:19 AM, Andreas Rosenberg wrote: Hi, this is my first posting regarding to JDK contribution, so this may be the wrong place to ask. Please point me in the right direction in this case. We are using Java rather heavily via JNI on a custom application. For a long time we did stick to JRE 1.8 for various reasons. My task is to plan an upgrade to a more recent JDK version and while doing some test I encountered bugs related to this: JDK-8227491 (JNI - caller sensitive methods). We are parsing Java class files to auto gen the JNI code for our application, and are also using reflection. The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL which leads to NPEs. My idea is this: create an internal proxy class inside "java.base" that reflects this case (e.g. "java.lang.NativeCall" or "java.lang.NativeCode"). This class is final and implements nothing. 
Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL in case of a JNI call, it should do this to answer the class proxy: return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); This would have the following advantages: - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class - it would be a more expressive way on the Java side to detect "the caller is native code" than checking for null - it would fit better into the framework I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us. As there are probably security considerations involved, advice from experts is required. But from my understanding the Java security model is designed for the main app being written in Java. In this case there are always Java stack frames available as parents for caller sensitive methods, so the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers NULL for the JNI case. This needs verification. If the main app is native code which uses JNI, the Java security model can only affect the Java part and as soon as an additional Java stack frame has been generated a regular Java class will be found and the "standard behavior" should apply again. Comments appreciated. If this fix looks reasonable, what are the steps to get it implemented and integrated into the official source tree? Best regards, Andy From dholmes at openjdk.java.net Fri Jan 28 12:56:16 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 28 Jan 2022 12:56:16 GMT Subject: RFR: 8280684: JfrRecorderService failes with guarantee(num_written > 0) when no space left on device. 
In-Reply-To: References: Message-ID: <_q4yOx0RetnKX6Q7MQoRUzg8u3_n7_tZTOsnJB25Hn8=.1d610ce7-21ce-42b7-a619-b3555ec7a0b9@github.com> On Wed, 26 Jan 2022 06:41:41 GMT, KIRIYAMA Takuya wrote: > I think JFR should report an error message and the jvm should shut down safely instead of hitting a guarantee failure. > > For instance, jdk.jfr.internal.Repository#newChunk() reports an appropriate message and stops the jvm as below > by using JfrJavaSupport::abort(). > > [0.673s][error][jfr] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp) > [0.673s][error][jfr,system] Could not create chunk in repository /tmp/2022_01_12_22_32_42_18030, class java.io.IOException: Unable to create JFR repository directory using base location (/tmp) > [0.673s][error][jfr,system] An irrecoverable error in Jfr. Shutting down VM... > > I modified StreamWriterHost not to call guarantee failure but to call JfrJavaSupport::abort(). > I added an argument to JfrJavaSupport::abort() which tells os::abort() not to dump core > because there is no space on device. > Could you please review the fix? The JFR team needs to review this. ------------- PR: https://git.openjdk.java.net/jdk/pull/7227 From dholmes at openjdk.java.net Fri Jan 28 13:12:09 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 28 Jan 2022 13:12:09 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: <-ZpLxEuxwZ_7vF9zWzAyanr_MVryFHmbET1l1uzqv2c=.67897f2a-327e-4b84-94b8-3423c80394bb@github.com> On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. 
>> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Thanks for the quick fix - this is causing a lot of noise in our CI. The changes are consistent with the Linux variant. Please ensure to test on all platforms prior to integrating in the future. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7260 From zgu at openjdk.java.net Fri Jan 28 13:48:13 2022 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Fri, 28 Jan 2022 13:48:13 GMT Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files [v2] In-Reply-To: References: <54FxD8Y6tYN9qIxG9kM1609F8U5qX1L2q5k36XCYnzs=.776977b4-80f0-41f9-99d3-6829f8e1d067@github.com> Message-ID: On Fri, 28 Jan 2022 09:38:14 GMT, Christian Hagedorn wrote: > > > That's interesting. Is this implementation still around somewhere? I'm glad that some of the mentioned things are not a problem anymore. > > > > > > Not I know. IIRC, it was based on DWARF 2. > > Okay, thanks. > > > > > * Different compiler (and different version of the same compiler) can generate DWARF with different version, may not be compatible with each other, as DWARF allows custom fields. > > > > * Maintenance cost to catch up DWARF spec/compiler changes. > > > > > > > > > That's indeed a problem of facing different DWARF versions. For this parser, I tried to support the current default of GCC 10.x which is DWARF 4. This standard was introduced in 2010 and is probably used by most compilers nowadays at least (if not already DWARF 5 which was introduced in 2017). However, even with GCC 10.x, it emitted DWARF 3 for one of the sections (I'm not sure why) which I also needed to support - thus you can never be sure. 
> > > DWARF 5 is still experimental for GCC 10.x and had some issues when I tried that out back there - so I stayed away from implementing parsing steps for it. But now with GCC 11.x, DWARF 5 seems to have become the default. I might have to try out what's being emitted for HotSpot. But I think for now, it is better to only focus on DWARF 4 instead of trying to support various versions in one patch - we could still come back to that later if it becomes widely used. Even if DWARF 5 is emitted, GCC could be configured, for example, to emit DWARF 4 only which is probably an acceptable workaround for testing environments. > > > > > > I think maintenance and test could be major pain points. Based on build.html, we can use gcc version anywhere between 5.0 and 10.2, it could be a challenge to ensure all supported version work correctly. > > I agree, that wide range is a problem and older GCC versions emitting older DWARF version are not covered with this patch. If I parse a DWARF section header with an unsupported version I will just bail out and it falls back to the stack trace we are seeing today without source information. That's probably fine for the scope of this patch. We could still come back and add support for other missing versions. I see, it makes sense. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/7126 From aph at openjdk.java.net Fri Jan 28 13:53:17 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 13:53:17 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: <-ZpLxEuxwZ_7vF9zWzAyanr_MVryFHmbET1l1uzqv2c=.67897f2a-327e-4b84-94b8-3423c80394bb@github.com> References: <-ZpLxEuxwZ_7vF9zWzAyanr_MVryFHmbET1l1uzqv2c=.67897f2a-327e-4b84-94b8-3423c80394bb@github.com> Message-ID: On Fri, 28 Jan 2022 13:09:10 GMT, David Holmes wrote: > Please ensure to test on all platforms prior to integrating in the future. Not everyone can. 
Shouldn't this have come up in the pre-submit tests? ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 14:42:12 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 14:42:12 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Thanks for the review. @DamonFool @theRealAph @dholmes-ora I'm still looking for windows aarch64 environment for testing and will integrate this patch if everything is ok. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 14:55:10 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 14:55:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: <8beM5H7JBVzLZw_jIbfwFjhKgBfUdwAorM7bKvYakmA=.138abf15-7cff-4b77-99bc-f7c3577ff343@github.com> On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. 
>> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright It appears that GHA only builds on macosx-aarch64 and does not execute tests on macosx-aarch64. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 15:00:17 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 15:00:17 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Hmm, it seems that there is no way to get an aarch64 windows VM on azure. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 15:27:24 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 15:27:24 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 14:56:40 GMT, Denghui Dong wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyright > > Hmm, it seems that there is no way to get an aarch64 windows VM on azure. @D-D-H - I'm kicking off a Mach5 Tier1 on your v02 bits. We don't have windows-aarch64 in our setup so I can't help with that issue. IMHO, I don't think you need to wait for windows-aarch64, but... 
------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 16:23:10 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 16:23:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Mach5 Tier1 passed with no failures. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 17:36:13 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 17:36:13 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: <8WbAbXKivPBdyMfrTqb-yI7jI0yR8DNyH6S3D7E1KRg=.44dd73af-de3f-4751-9246-71d698786a9d@github.com> On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Mach5 Tier3 passed with no failures. 
------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 17:42:12 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 17:42:12 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright We have reached 10 rows of failures in the Mach5 CI. I'm going to start setting up a ProblemListing of the failing test. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From minqi at openjdk.java.net Fri Jan 28 17:47:08 2022 From: minqi at openjdk.java.net (Yumin Qi) Date: Fri, 28 Jan 2022 17:47:08 GMT Subject: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call In-Reply-To: References: Message-ID: On Wed, 26 Jan 2022 08:59:49 GMT, Alan Bateman wrote: >> Please review, >> When jlink with --compress=2, zip is used to compress the files while doing copy. The user case failed to load zip.dll, since zip.dll is not set in PATH. This failure is after we get NULL from GetModuleHandle("zip.dll"), then do LoadLibrary("zip.dll") will have same result. >> The fix is calling load_zip_library of ClassLoader first --- if zip library already loaded just return the cached handle for following usage, if not, load zip library and cached the handle. >> >> Tests: tier1,4,7 in test >> Manually tested user case, and checked output of jimage list for jlinked files using --compress=2. 
>> >> Thanks >> Yumin > > I think this looks okay but I think @JimLaskey and/or @sundararajana should look at this because it creates a dependency on a JVM_* function. I'm trying to think if there are any interop issues when using jrtfs. Jim/Sundar can correct me but I think we are okay there because a tool on say JDK 8 (or 11 or 17) that accesses a JDK 19 run-time image will use the BasicImageReader and won't use libjimage in the target VM. Thanks to @AlanBateman, @JimLaskey or @sundararajana Could you have a look and comment? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/7206 From aph at openjdk.java.net Fri Jan 28 17:52:10 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 17:52:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 17:39:21 GMT, Daniel D. Daugherty wrote: > We have reached 10 rows of failures in the Mach5 CI. I'm going to start setting up a ProblemListing of the failing test. What? You just said it passed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 18:10:11 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 18:10:11 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright The fix hasn't been pushed yet so the CI is failing. 
And my Mach5 Tier3 *test job* of the v02 fix has passed. I would be fine if this fix pushed now. Especially since I've tested in Mach Tier[1-3] and it passes. So everywhere we've seen this test fail (so far) in Mach5 has been re-tested and now passes. Okay. I now have a review of the ProblemListing and we haven't heard from @D-D-H in ~3 hours so I'm going ahead with the ProblemListing. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From aph at openjdk.java.net Fri Jan 28 18:10:12 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 18:10:12 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright > The fix hasn't been pushed yet so the CI is failing. > > And my Mach5 Tier3 _test job_ of the v02 fix has passed. OK, I see. This can't be pushed because MacOS Pre-submit tests are still running. Although I think I'd just push it anyway. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 18:13:13 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 18:13:13 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? >> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. 
It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright This PR will need to be merged with jdk/jdk and the test UnProblemListed when work resumes on this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From aph at openjdk.java.net Fri Jan 28 18:16:11 2022 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 28 Jan 2022 18:16:11 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 18:10:04 GMT, Daniel D. Daugherty wrote: > This PR will need to be merged with jdk/jdk and the test UnProblemListed when work resumes on this PR. OK. I think there may be a problem with MacOS CI. Pre-submit tests are still running, but they seem to be marked as having failed. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From mandy.chung at oracle.com Fri Jan 28 19:07:58 2022 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 28 Jan 2022 11:07:58 -0800 Subject: Fix proposal for bug JDK-8221642 In-Reply-To: References: Message-ID: <56902d1a-2683-3097-436a-ded5838f4620@oracle.com> Your proposal is essentially for all JNI code with no caller frame to default to java.base, which gets all permissions. It means that it could break encapsulation to access any members. Arguably one could consider JNI to have superpowers. In addition, defaulting to java.base may not make sense for some Java APIs; ResourceBundle::getBundle(String bundlename) is one example. It uses the caller class's loader to load the resource bundle. Defaulting to java.base means it defaults to the bootstrap loader, which can't find the resource bundle on the class path, for example. 
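To make the loader point concrete, here is a minimal sketch (an illustration only, not part of any proposed fix). It uses `Class.forName` with an explicit loader as a stand-in for the bundle lookup, on the assumption that class-path visibility behaves the same way in both cases; `CallerLoaderSketch` and `visibleTo` are hypothetical names. A `null` loader denotes the bootstrap loader, which is what a caller in java.base would supply.

```java
final class CallerLoaderSketch {
    // Returns true if the given loader can resolve this sketch class by name.
    // Passing null asks Class.forName to use the bootstrap loader.
    static boolean visibleTo(ClassLoader loader) {
        try {
            Class.forName(CallerLoaderSketch.class.getName(), false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes in java.base are defined by the bootstrap loader,
        // which getClassLoader() reports as null.
        System.out.println("java.base loader: " + Object.class.getClassLoader());
        System.out.println("visible to bootstrap loader: " + visibleTo(null));
        System.out.println("visible to own loader: "
                + visibleTo(CallerLoaderSketch.class.getClassLoader()));
    }
}
```

Run from the class path, the sketch class should be invisible to the bootstrap loader and visible to its own defining loader, which is the same visibility gap a resource bundle on the class path would run into.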
For the ResourceBundle case, it seems that the unnamed module defined by the system class loader might be an appropriate default. The proper way is to examine each caller-sensitive method and investigate what makes sense when invoked by JNI code with no caller frame. JDK-8177155 is the RFE for that task. System::getLogger, Logger::getLogger, and the core reflection API have been looked at, but more remain to follow up. I created https://bugs.openjdk.java.net/browse/JDK-8280902 to follow up the ResourceBundle::getBundle issue. Mandy [1] https://bugs.openjdk.java.net/browse/JDK-8177155 On 1/28/22 3:15 AM, Andreas Rosenberg wrote: > > Hi Mandy, > > thanks for looking at my problem. Yes, "setAccessible" is one of the > problems, > > but our main issue is related to "ResourceBundle". > > I've created a small example that shows the > problem: https://github.com/anrose00/JniSensitiveCaller > > > Any comments on my proposal would be great. > > Andreas > > *From:*Mandy Chung > *Sent:* Friday, 28 January 2022 02:54 > *To:* Andreas Rosenberg > *Cc:* hotspot-dev at openjdk.java.net; 'core-libs-dev' > > *Subject:* Re: Fix proposal for bug JDK-8221642 > > I see how NPE is thrown (from `AccessibleObject::setAccessible` and > `trySetAccessible`). The proper fix should follow the rule as the > access check that it can set the accessible flag only on public > members of a public type that is exported unconditionally. > > The fix is straightforward but involves spec change. I'll post PR soon. > > Mandy > > On 1/27/22 8:45 AM, Mandy Chung wrote: > > Hi Andreas, > > What methods are you calling that throw NPE? Do you have the > stack trace to share? > > The spec of AccessibleObject was updated for JDK-8221530 if there > is no caller frame when calling from JNI: > > "The check when invoked by JNI code with no Java class on the > stack only succeeds if the member and the declaring class are > public, and the class is in a package that is exported to all > modules."
> > I think AccessibleObject::canAccess, setAccessible, > trySetAccessible should follow the same rule. > > Mandy > > On 1/27/22 2:19 AM, Andreas Rosenberg wrote: > > Hi, > > this is my first posting regarding to JDK contribution, so this may be the wrong place to ask. > > Please point me in the right direction in this case. > > We are using Java rather heavily via JNI on a custom application. For a long time we did stick to JRE 1.8 > > for various reasons. My task is to plan an upgrade to a more recent JDK version and while doing some > > test I encountered bugs related to this: JDK-8227491(JNI - caller sensitive methods). > > We are parsing Java class files to auto gen the JNI code for our application, and are also using reflection. > > The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. > > The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of > > a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL > > which leads to NPEs. > > My idea is this: create an internal proxy class inside "java.base" that reflects this case > > (e.g. "java.lang.NativeCall" or "java.lang.NativeCode"). > > This class is final and implements nothing. > > Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL > > in case of a JNI call, it should do this to answer the class proxy: > > return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); > > This would have the following advantages: > > - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class > > - it would be more a expressive way on the Java side to detect "the callee is native code" than checking for null > > - it would fit better into the framework > > I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us. 
> > As there are probably security considerations involved, advice from experts is required. > > But from my understanding the Java security model is designed for the main app being writing in Java. > > In this case there are always Java stacks frames available as parents for caller sensitive methods, so > > the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers > > NULL for the JNI case. This needs verification. > > If the main app is native code which uses JNI, the Java security model can only affect the Java part and > > as soon as an additional Java stack frame has been generated a regular Java class will be found and > > the "standard behavior" should apply again. > > Comments appreciated. > > It this fix looks reasonable, what are the steps to get it implemented and integrated into the official > > source tree? > > Best regards, > > Andy > From ddong at openjdk.java.net Fri Jan 28 22:11:10 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 22:11:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 18:13:05 GMT, Andrew Haley wrote: > This PR will need to be merged with jdk/jdk and the test UnProblemListed > when work resumes on this PR. Sorry for the late reply. Is it okay to merge the fix now? ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From dcubed at openjdk.java.net Fri Jan 28 22:37:10 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Fri, 28 Jan 2022 22:37:10 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v3] In-Reply-To: References: Message-ID: <6lspMOs4cS6rFjR9sKQ5O63742oPcvE0e153mhhQxuw=.1d3a7196-2950-4b12-b8fb-781a43ca5106@github.com> On Fri, 28 Jan 2022 08:50:49 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this fix? 
>> >> JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. >> >> I'm trying to find an environment for windows/mac on aarch64 for testing. >> >> Thanks, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > update copyright I'm okay if you integrate. Don't forget to merge with jdk/jdk and UnProblemList the test before you integrate. ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 22:55:39 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 22:55:39 GMT Subject: RFR: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 [v4] In-Reply-To: References: Message-ID: > Hi, > > Could I have a review of this fix? > > JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. > > I'm trying to find an environment for windows/mac on aarch64 for testing. > > Thanks, > Denghui Denghui Dong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Unproblemlist TestC2IntPressure on macosx-aarch64 - Merge branch 'master' into JDK-8280843 - update copyright - update copyright - 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/7260/files - new: https://git.openjdk.java.net/jdk/pull/7260/files/4e0ada13..526c7db7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7260&range=02-03 Stats: 216 lines in 17 files changed: 82 ins; 90 del; 44 mod Patch: https://git.openjdk.java.net/jdk/pull/7260.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7260/head:pull/7260 PR: https://git.openjdk.java.net/jdk/pull/7260 From ddong at openjdk.java.net Fri Jan 28 22:56:23 2022 From: ddong at openjdk.java.net (Denghui Dong) Date: Fri, 28 Jan 2022 22:56:23 GMT Subject: Integrated: 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 07:41:21 GMT, Denghui Dong wrote: > Hi, > > Could I have a review of this fix? > > JDK-8277948 missed the changes to thread_bsd_aarch64.cpp and thread_windows_aarch64.cpp. It caused some tests failures. > > I'm trying to find an environment for windows/mac on aarch64 for testing. > > Thanks, > Denghui This pull request has now been integrated. 
Changeset: 91391598 Author: Denghui Dong URL: https://git.openjdk.java.net/jdk/commit/91391598989c70c98b9400997df4f9177d3e576f Stats: 9 lines in 3 files changed: 5 ins; 1 del; 3 mod 8280843: macos-Aarch64 SEGV in frame::sender_for_compiled_frame after JDK-8277948 Reviewed-by: aph, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/7260 From kbarrett at openjdk.java.net Fri Jan 28 23:06:05 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 28 Jan 2022 23:06:05 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Thu, 27 Jan 2022 19:18:10 GMT, Harold Seigel wrote: > Please review this new attempt to resolve JDK-8214976. This fix adds Pragmas to generate compilation errors, when using gcc, if calling a native system function instead of the os:: version of the function. The fix includes changes to calls in non-shared code because it is cleaner than adding PRAGMAs and, for some cases, the os:: version of a function has added value, such as asserts and RESTARTABLE. This fix slightly changes the signature of os::abort() so it wouldn't conflict with native abort() functions. Changes to Windows code is left for a future RFE. > > This fix was tested with Mach5 tiers 1-2 on Linux, Mac OS, and Windows, Mach5 tiers 3-5 on Linux x64, and Mach5 builds of Zero, PPC, and s390. > > Thanks, Harold Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 109: > 107: FORBID_C_FUNCTION(ssize_t write(int, const void*, size_t ), "use os::write"); > 108: > 109: FORBID_C_FUNCTION(char* strtok(char*, const char*), "use strtok_r"); Some of these functions are portable and ought to be forbidden in a platform agnostic location, so the restriction also applies if/when we have real support on other platforms. I think almost none are gcc (or clang) specific, but are instead probably posix and not windows, so maybe should go in a different place as well. 
Basically I think the structure / placement considerations need some more work. src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 114: > 112: > 113: #define FORBID_C_FUNCTION(signature, alternative) > 114: #define PRAGMA_PERMIT_FORBIDDEN_C_FUNCTION(name) These aren't needed. The default empty definitions in compilerWarnings.hpp cover this case. ------------- PR: https://git.openjdk.java.net/jdk/pull/7248 From kbarrett at openjdk.java.net Fri Jan 28 23:06:06 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 28 Jan 2022 23:06:06 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 05:08:57 GMT, David Holmes wrote: > Shouldn't we generate a warning for all external functions for which there is an os:: replacement e.g. pread is called by read_at; gethostbyname is called by get_host_by_name; ... > > Thanks, David That seems like a good goal, but I don't think we have to get complete coverage in one PR. > src/hotspot/os/aix/os_aix.cpp line 2499: > >> 2497: struct dirent *ptr; >> 2498: >> 2499: dir = os::opendir(path); > > Just to clarify, as we are in the scope of the os class both `opendir` and `os::opendir` are the same thing here - and similarly for other code in the os class - right? Yes, that's correct. So an unqualified opendir here should not trigger a forbidden warning. > src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 80: > >> 78: #define NULL 0 >> 79: #endif >> 80: #endif > > This is really ugly just because we include dirent.h so we can add the warning for a few functions; and even uglier because it is only needed for AIX, and even uglier still because based on the existing code we only compile AIX with xlc - no? Otherwise we would already need this hack for gcc. We only compile AIX with xclang these days. 
I don't know how our "xlc" compiler platform mechanism interacts with our "gcc" (which is really both gcc and clang) compiler platform, or if it interacts, or if it should. But none of that matters for the dirent.h problem. The problem there is that it's a system header, irrespective of what compiler is being used, and it has this problem. So whether we need this NULL cruft here depends on whether AIX with xclang uses this file or not. One option would be to just not deal with the dirent stuff yet, saving that for a followup focused on that problem. ------------- PR: https://git.openjdk.java.net/jdk/pull/7248 From stuefe at openjdk.java.net Sat Jan 29 07:12:19 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 29 Jan 2022 07:12:19 GMT Subject: RFR: 8214976: Warn about uses of functions replaced for portability In-Reply-To: References: Message-ID: On Fri, 28 Jan 2022 22:49:48 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/compilerWarnings_gcc.hpp line 80: >> >>> 78: #define NULL 0 >>> 79: #endif >>> 80: #endif >> >> This is really ugly just because we include dirent.h so we can add the warning for a few functions; and even uglier because it is only needed for AIX, and even uglier still because based on the existing code we only compile AIX with xlc - no? Otherwise we would already need this hack for gcc. > > We only compile AIX with xclang these days. I don't know how our "xlc" compiler platform mechanism interacts with our "gcc" (which is really both gcc and clang) compiler platform, or if it interacts, or if it should. But none of that matters for the dirent.h problem. The problem there is that it's a system header, irrespective of what compiler is being used, and it has this problem. So whether we need this NULL cruft here depends on whether AIX with xclang uses this file or not. One option would be to just not deal with the dirent stuff yet, saving that for a followup focused on that problem. Sorry, I'm confused. We build AIX with xlc. 
I don't believe we even include this file on AIX. How does this help AIX? ------------- PR: https://git.openjdk.java.net/jdk/pull/7248 From kbarrett at openjdk.java.net Sun Jan 30 00:35:27 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 30 Jan 2022 00:35:27 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes Message-ID: Please review this change to the HotSpot Style Guide change process. The current process involves gathering consensus among the HotSpot Group Members. That's fine for changes of substance. But it seems overly weighty for editorial changes that don't affect the substance of the guide, but only its clarity or accuracy. The proposed change would permit the normal PR process to be used for such changes, but require the requisite reviewers to additionally be HotSpot Group Members. Note that there have already been a couple of changes that effectively followed the proposed new process. https://bugs.openjdk.java.net/browse/JDK-8274169 https://bugs.openjdk.java.net/browse/JDK-8280182 ------------- Commit messages: - update generated html - editorial change process Changes: https://git.openjdk.java.net/jdk/pull/7280/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7280&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280916 Stats: 13 lines in 2 files changed: 9 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7280.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7280/head:pull/7280 PR: https://git.openjdk.java.net/jdk/pull/7280 From kbarrett at openjdk.java.net Sun Jan 30 00:40:09 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 30 Jan 2022 00:40:09 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:28:59 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process.
> > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 I messed up the PR submission for this. Closing and will open a new one. ------------- PR: https://git.openjdk.java.net/jdk/pull/7280 From kbarrett at openjdk.java.net Sun Jan 30 00:40:09 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 30 Jan 2022 00:40:09 GMT Subject: Withdrawn: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:28:59 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.java.net/jdk/pull/7280 From kbarrett at openjdk.java.net Sun Jan 30 00:48:32 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 30 Jan 2022 00:48:32 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes Message-ID: Please review this change to the HotSpot Style Guide change process. The current process involves gathering consensus among the HotSpot Group Members. That's fine for changes of substance. But it seems overly weighty for editorial changes that don't affect the substance of the guide, but only its clarity or accuracy. The proposed change would permit the normal PR process to be used for such changes, but require the requisite reviewers to additionally be HotSpot Group Members. Note that there have already been a couple of changes that effectively followed the proposed new process. https://bugs.openjdk.java.net/browse/JDK-8274169 https://bugs.openjdk.java.net/browse/JDK-8280182 This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. A decision on this proposal will not be made before Monday 14-Feb-2022 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV.
------------- Commit messages: - update generated html - editorial change process Changes: https://git.openjdk.java.net/jdk/pull/7281/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7281&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280916 Stats: 13 lines in 2 files changed: 9 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/7281.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7281/head:pull/7281 PR: https://git.openjdk.java.net/jdk/pull/7281 From dcubed at openjdk.java.net Sun Jan 30 14:14:13 2022 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Sun, 30 Jan 2022 14:14:13 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Monday 14-Feb-2022 at 12h00 UTC.
> > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/7281 From dholmes at openjdk.java.net Sun Jan 30 21:59:06 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 30 Jan 2022 21:59:06 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Monday 14-Feb-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Marked as reviewed by dholmes (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/7281 From stuefe at openjdk.java.net Mon Jan 31 07:18:09 2022 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 31 Jan 2022 07:18:09 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Monday 14-Feb-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. This makes sense. About substantive changes: since those affect a lot of people, would it make sense to include a clause for a minimum time to collect answers on a proposed style change? Like the 24hrs clause for code reviews? Thanks, Thomas ------------- Marked as reviewed by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7281 From stefank at openjdk.java.net Mon Jan 31 08:58:11 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 31 Jan 2022 08:58:11 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Monday 14-Feb-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Marked as reviewed by stefank (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/7281 From andreas.rosenberg at apis.de Mon Jan 31 09:21:08 2022 From: andreas.rosenberg at apis.de (Andreas Rosenberg) Date: Mon, 31 Jan 2022 09:21:08 +0000 Subject: Fix proposal for bug JDK-8221642 In-Reply-To: <56902d1a-2683-3097-436a-ded5838f4620@oracle.com> References: <56902d1a-2683-3097-436a-ded5838f4620@oracle.com> Message-ID: Hi Mandy, thanks for your comments. Yes, the correct solution is to examine each caller-sensitive method. Surely my idea is not perfect, as the problem with the ResourceBundle and the bootstrap loader shows. I had the hope that cases like this could be solved by implementing a "proxy module object" that could define the correct behavior for such cases (e.g. the correct class loader). As far as I understood, the #getModule() call could be used for this. At least the class loader issue could probably be solved this way, just as an idea. I'm not very familiar with all the aspects of module usage, but this way you would at least have a kind of definition in Java of how native code should be seen regarding module usage. My search for "@CallerSensitive" gave me 149 hits in java files, so examining them all is quite a task. My fear is that we may run into another exception in a few months, the fixes for such problems will not arrive within a few days, and we will be facing the same problem again. So a global solution would be preferable. Of course you are worried about strange side effects or maybe even security/safety issues, but my hope was that somebody here had the expertise to give a good assessment of this. Regarding permissions: if you don't have any Java stack frames on the stack, that means a native application is using Java code as a kind of library (e.g. we use it to read/write MS Excel via Apache POI). In such cases the native app must take care of that. I could imagine use cases where the native app wants to limit permissions for a certain Java component (e.g.
a WebView that may load data from external sites). In such cases you must define permissions for the component, but this should work as soon as there is at least one additional Java stack frame on the stack. Right? Best regards, Andreas From: Mandy Chung Sent: Freitag, 28. Januar 2022 20:08 To: Andreas Rosenberg Cc: hotspot-dev at openjdk.java.net; 'core-libs-dev' Subject: Re: Fix proposal for bug JDK-8221642 Your proposal is essentially for all JNI code with no caller frame to default to java.base, which gets all permissions. It means that it could break encapsulation to access any members. Arguably one could consider JNI have superpower. In addition, default to java.base may not make sense for some Java APIs, ResouroceBundle::getBundle(String bundlename) is one example. It uses the caller class's loader to load the resource bundle. Default to java.base means it defaults to the bootstrap loader which can't find the resource bundle on the class path for example. For the ResourceBundle case, it seems that the unnamed module defined by the system class loader might be an appropriate default. The proper way is to examine each caller-sensitive method and investigate what makes sense when invoked by JNI code with no caller frame. JDK-8177155 is the RFE for such task. System::getLogger, Logger::getLogger, and core reflection API are looked at but more to follow up. I created https://bugs.openjdk.java.net/browse/JDK-8280902 to follow up the ResourceBundle::getBundle issue. Mandy [1] https://bugs.openjdk.java.net/browse/JDK-8177155 On 1/28/22 3:15 AM, Andreas Rosenberg wrote: Hi Mandy, thanks for looking at my problem. Yes, "setAccessible" is one of the problems, but our main issue is related to "ResourceBundle". I've created a small example that shows the problem: https://github.com/anrose00/JniSensitiveCaller Any comments on my proposal would be great. Andreas From: Mandy Chung Sent: Freitag, 28. 
Januar 2022 02:54 To: Andreas Rosenberg Cc: hotspot-dev at openjdk.java.net; 'core-libs-dev' Subject: Re: Fix proposal for bug JDK-8221642 I see how NPE is thrown (from `AccessibleObject::setAccessible` and `trySetAccessible`). The proper fix should follow the rule as the access check that it can set the accessible flag only on public members of a public type that is exported unconditionally. The fix is straight forward but involves spec change. I'll post PR soon. Mandy On 1/27/22 8:45 AM, Mandy Chung wrote: Hi Andreas, What methods are you calling that throws NPE? Do you have the stack trace to share? The spec of AccessibleObject was updated for JDK-8221530 if there is no caller frame when calling from JNI: "The check when invoked by JNI code with no Java class on the stack only succeeds if the member and the declaring class are public, and the class is in a package that is exported to all modules." I think AccessibleObject::canAccess, setAccessible, trySetAccessible should follow the same rule. Mandy On 1/27/22 2:19 AM, Andreas Rosenberg wrote: Hi, this is my first posting regarding to JDK contribution, so this may be the wrong place to ask. Please point me in the right direction in this case. We are using Java rather heavily via JNI on a custom application. For a long time we did stick to JRE 1.8 for various reasons. My task is to plan an upgrade to a more recent JDK version and while doing some test I encountered bugs related to this: JDK-8227491 (JNI - caller sensitive methods). We are parsing Java class files to auto gen the JNI code for our application, and are also using reflection. The workaround given is clumsy and needs manual intervention, so I was looking for a more elegant solution. The problem is: a caller sensitive method wants to determine the caller class for security checks. In case of a JNI call no Java stack frame exists, so the JVM function "jclass JVM_GetCallerClass(JNIEnv* env)" answers NULL which leads to NPEs. 
My idea is this: create an internal proxy class inside "java.base" that reflects this case (e.g. "java.lang.NativeCall" or "java.lang.NativeCode"). This class is final and implements nothing. Then "jclass JVM_GetCallerClass(JNIEnv* env)" (jvm.cpp) could be modified and instead of answering NULL in case of a JNI call, it should do this to answer the class proxy: return JVM_FindClassFromBootLoader(env, "java/lang/NativeCall"); This would have the following advantages: - JNI code could again simply call "caller sensitive methods" without the need to make an additional wrapper class - it would be more a expressive way on the Java side to detect "the callee is native code" than checking for null - it would fit better into the framework I already applied this fix on my own copy of the JDK 17 sources and it works pretty well for us. As there are probably security considerations involved, advice from experts is required. But from my understanding the Java security model is designed for the main app being writing in Java. In this case there are always Java stacks frames available as parents for caller sensitive methods, so the proposed fix would not affect the behavior. This assumes that "GetCallerClass" only answers NULL for the JNI case. This needs verification. If the main app is native code which uses JNI, the Java security model can only affect the Java part and as soon as an additional Java stack frame has been generated a regular Java class will be found and the "standard behavior" should apply again. Comments appreciated. It this fix looks reasonable, what are the steps to get it implemented and integrated into the official source tree? 
Best regards, Andy From stefank at openjdk.java.net Mon Jan 31 12:34:11 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 31 Jan 2022 12:34:11 GMT Subject: RFR: 8280817: Clean up and unify empty VM operations In-Reply-To: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> References: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> Message-ID: On Fri, 28 Jan 2022 09:15:23 GMT, Stefan Karlsson wrote: > There are a number of VM operations that do nothing, except triggering a safepoint. I'd like to clean up and unify them a bit to: > 1) Use one parent class which implements the empty doit function and turns off thread oop processing. > 2) Remove unused VM_Operations > 3) Don't reuse the VM_None type - there's bug/inconsistency here in that some subsystems report the name as None, while others report the name passed to the constructor > 4) Remove unused enum values Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/7261 From stefank at openjdk.java.net Mon Jan 31 12:34:12 2022 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 31 Jan 2022 12:34:12 GMT Subject: Integrated: 8280817: Clean up and unify empty VM operations In-Reply-To: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> References: <7V3RAOXnSv7NBTg_NjGTl7aASNsggHvDOmHivZtzjMo=.491a2dc9-18a0-4ba9-84c4-2497bd3bbe6d@github.com> Message-ID: On Fri, 28 Jan 2022 09:15:23 GMT, Stefan Karlsson wrote: > There are a number of VM operations that do nothing, except triggering a safepoint. I'd like to clean up and unify them a bit to: > 1) Use one parent class which implements the empty doit function and turns off thread oop processing. 
> 2) Remove unused VM_Operations > 3) Don't reuse the VM_None type - there's bug/inconsistency here in that some subsystems report the name as None, while others report the name passed to the constructor > 4) Remove unused enum values This pull request has now been integrated. Changeset: 61794c50 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/61794c503973a330278f0595e36be0bd686fe2b5 Stats: 59 lines in 4 files changed: 10 ins; 24 del; 25 mod 8280817: Clean up and unify empty VM operations Reviewed-by: shade, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/7261 From shade at openjdk.java.net Mon Jan 31 13:19:28 2022 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 31 Jan 2022 13:19:28 GMT Subject: RFR: 8280867: Cpuid1Ecx feature parsing is incorrect for AMD CPUs Message-ID: See discussion in the bug. AFAICS, the fix is to "just" shift the flags by one to match both Intel and AMD specs. I believe this is not a serious bug, because adjacent bits in AMD case are set on modern chips, and Intel detection code only uses `lzcnt` and `prefetchw` out of these flags, both with Intel-specific hacks that are dropped now. Additional testing: - [x] Linux x86_64 fastdebug on TR 3970X (Zen 2) - [x] Linux x86_64 fastdebug on i5-11500 (Rocket Lake) - [x] Eyeballing `-Xlog:os+cpu` on TR 3970X (Zen 2) -- no change in detected flags - [x] Eyeballing `-Xlog:os+cpu` on i5-11500 (Rocket Lake) -- no change in detected flags ------------- Commit messages: - ...and copyright dates. 
- Add braces, while we are at it - Fix Changes: https://git.openjdk.java.net/jdk/pull/7287/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7287&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8280867 Stats: 9 lines in 1 file changed: 0 ins; 1 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/7287.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/7287/head:pull/7287 PR: https://git.openjdk.java.net/jdk/pull/7287 From duke at openjdk.java.net Mon Jan 31 14:23:17 2022 From: duke at openjdk.java.net (Alan Hayward) Date: Mon, 31 Jan 2022 14:23:17 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: <2SkxCOa_V6nHyzep29ZEDo4QEOyuBIv2fhxNCgceLXc=.fb5a680c-74ec-4b13-9748-7b4d173fefe4@github.com> On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. 
>> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures Can anyone give the CSR a review please? It's blocked on having a hotspot engineer review it.
https://bugs.openjdk.java.net/browse/JDK-8277543 ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From kbarrett at openjdk.java.net Mon Jan 31 17:07:09 2022 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 31 Jan 2022 17:07:09 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: On Mon, 31 Jan 2022 07:15:25 GMT, Thomas Stuefe wrote: > About substantive changes: since those affect a lot of people, would it make sense to include a clause for a minimum time to collect answers on a proposed style change? Like the 24hrs clause for code reviews? Since ultimately a substantive change requires the Group Lead to decide that consensus has been reached, I don't think more detail is really necessary in this area. ------------- PR: https://git.openjdk.java.net/jdk/pull/7281 From david.holmes at oracle.com Mon Jan 31 22:10:43 2022 From: david.holmes at oracle.com (David Holmes) Date: Tue, 1 Feb 2022 08:10:43 +1000 Subject: RFR: 8242181: [Linux] Show source information when printing native stack traces in hs_err files In-Reply-To: <2vxIb1vN8LxdnG_zim6JI_RzovAOSLpJJMgxbgu1pnI=.f6cbbebc-2dc0-44ca-bd4b-4d6d3fc18b0f@github.com> References: <2vxIb1vN8LxdnG_zim6JI_RzovAOSLpJJMgxbgu1pnI=.f6cbbebc-2dc0-44ca-bd4b-4d6d3fc18b0f@github.com> Message-ID: Hi Christian, Sorry for the delay in coming back to this, I wanted to see what other feedback arose. On 25/01/2022 7:43 pm, Christian Hagedorn wrote: > Hi David > >> This will be really useful - thank you. :) > > I'm glad to hear that! :-) Thanks for your overall comments! > >> All build file changes need to be seen by the build team. > > Right, thanks for adding it again. > >> That said I have two general concerns both related to executing non-async-signal-safe code in the signal handler via the error reporting logic. 
Now I know we already do a ton of stuff in error reporting not guaranteed (in any way) to be safe to do in a signal handler, but whenever we add something new I have to ask the question: how likely is this additional code to lead to secondary failures (hangs or crashes)? > > That's a valid concern. I've also asked myself this question when I had initially started using some assertions. We should not crash again during error reporting. I've therefore tried to be as conservative as possible and added bailouts instead, also in loops when reading data. But of course, this is just a best effort and by no means a guarantee to be safe (especially in terms of crashes). What could be alternatives to make this better? If the parsing code turns out to be very problematic in a signal handling context, then we could disable it in that context. So we really want to try and do a lot of testing by throwing random signals at the VM and see what breaks. >> Secondly, on the same issue the use of unified logging within this code seems even more likely to be problematic - I'm not aware of us currently using UL during error reporting. It may work in basic usecases but if it triggers logfile rotation or other more complex actions what then? > > I haven't thought about this before. To be honest, I think UL printing of the `dwarf` tag is only useful during development when adding something new to the parser or when debugging. I don't see much value of these messages otherwise - even less for a Java user. As a first step, I could change the logs from `log_X()` to `log_develop_X()` but that just shifts the problem to non-product builds. Another option (or additional thing) could be to guard the log messages with a new develop flag that's disabled by default. By setting it for development, we accept that it might be unsafe which should be fine. I think changing the logging to develop only is a reasonable step. 
I don't see logging of crash handling / error reporting as generally useful for the end user. Thanks, David > Thanks, > Christian > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/7126 From kvn at openjdk.java.net Mon Jan 31 22:14:08 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 31 Jan 2022 22:14:08 GMT Subject: RFR: 8280867: Cpuid1Ecx feature parsing is incorrect for AMD CPUs In-Reply-To: References: Message-ID: On Mon, 31 Jan 2022 11:26:29 GMT, Aleksey Shipilev wrote: > See discussion in the bug. AFAICS, the fix is to "just" shift the flags by one to match both Intel and AMD specs. I believe this is not a serious bug, because adjacent bits in AMD case are set on modern chips, and Intel detection code only uses `lzcnt` and `prefetchw` out of these flags, both with Intel-specific hacks that are dropped now. > > Additional testing: > - [x] Linux x86_64 fastdebug on TR 3970X (Zen 2) > - [x] Linux x86_64 fastdebug on i5-11500 (Rocket Lake) > - [x] Eyeballing `-Xlog:os+cpu` on TR 3970X (Zen 2) -- no change in detected flags > - [x] Eyeballing `-Xlog:os+cpu` on i5-11500 (Rocket Lake) -- no change in detected flags Right. In Intel's manual: CPUID.(EAX=8000_0001H):ECX[bit 5]=1 indicates LZCNT is supported. CPUID.(EAX=8000_0001H):ECX[bit 8]=1 indicates PREFETCHW is supported. From AMD's: ECX[bit 5]=1 ABM: advanced bit manipulation. LZCNT instruction support. ECX[bit 6]=1 SSE4A: EXTRQ, INSERTQ, MOVNTSS, and MOVNTSD instruction support. ECX[bit 7]=1 MisAlignSse: misaligned SSE mode. ECX[bit 8]=1 3DNowPrefetch: PREFETCH and PREFETCHW instruction support. ------------- Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7287 From kvn at openjdk.java.net Mon Jan 31 22:18:11 2022 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 31 Jan 2022 22:18:11 GMT Subject: RFR: 8280916: Simplify HotSpot Style Guide editorial changes In-Reply-To: References: Message-ID: <-94XsVuSzzJ68iz1GZCqu4BXZOp9OVA6t9R7tvaUIU4=.108377db-1312-45de-aafc-88b980e41778@github.com> On Sun, 30 Jan 2022 00:39:20 GMT, Kim Barrett wrote: > Please review this change to the HotSpot Style Guide change process. > > The current process involves gathering consensus among the HotSpot Group > Members. That's fine for changes of substance. But it seems overly weighty > for editorial changes that don't affect the substance of the guide, but only > its clarity or accuracy. > > The proposed change would permit the normal PR process to be used for such > changes, but require the requisite reviewers to additionally be HotSpot Group > Members. > > Note that there have already been a couple of changes that effectively > followed the proposed new process. > https://bugs.openjdk.java.net/browse/JDK-8274169 > https://bugs.openjdk.java.net/browse/JDK-8280182 > > This is a modification of the Style Guide, so rough consensus among the > HotSpot Group members is required to make this change. Only Group members > should vote for approval (via the github PR), though reasoned objections or > comments from anyone will be considered. A decision on this proposal will not > be made before Monday 14-Feb-2022 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review process > to approve (click on Review Changes > Approve), rather than sending a "vote: > yes" email reply that would be normal for a CFV. Approved. ------------- Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7281 From dholmes at openjdk.java.net Mon Jan 31 22:39:18 2022 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 31 Jan 2022 22:39:18 GMT Subject: RFR: 8277204: Implementation of JEP 8264130: PAC-RET protection for Linux/AArch64 [v14] In-Reply-To: References: Message-ID: On Mon, 24 Jan 2022 15:56:06 GMT, Alan Hayward wrote: >> PAC is an optional feature in AArch64 8.3 and is compulsory in v9. One >> of its uses is to protect against ROP based attacks. This is done by >> signing the Link Register whenever it is stored on the stack, and >> authenticating the value when it is loaded back from the stack. If an >> attacker were to try to change control flow by editing the stack then >> the authentication check of the Link Register will fail, causing a >> segfault when the function returns. >> >> On a system with PAC enabled, it is expected that all applications will >> be compiled with ROP protection. Fedora 33 and upwards already provide >> this. By compiling for ARMv8.0, GCC and LLVM will only use the set of >> PAC instructions that exist in the NOP space - on hardware without PAC, >> these instructions act as NOPs, allowing backward compatibility for >> negligible performance cost (2 NOPs per non-leaf function). >> >> Hardware is currently limited to the Apple M1 MacBooks. All testing has >> been done within a Fedora Docker image. A run of SpecJVM showed no >> difference to that of noise - which was surprising. >> >> The most important part of this patch is simply compiling using branch >> protection provided by GCC/LLVM. This protects all C++ code from being >> used in ROP attacks, removing all static ROP gadgets from use. >> >> The remainder of the patch adds ROP protection to runtime generated >> code, in both stubs and compiled Java code. Attacks here are much harder >> as ROP gadgets must be found dynamically at runtime. 
If/when AOT >> compilation is added to JDK, then all stubs and compiled Java will be >> susceptible to ROP gadgets being found by static analysis and therefore >> potentially as vulnerable as C++ code. >> >> There are a number of places where the VM changes control flow by >> rewriting the stack or otherwise. I've done some analysis as to how >> these could also be used for attacks (which I didn't want to post here). >> These areas can be protected by ensuring the pointers to various stubs and >> entry points are stored in memory as signed pointers. These changes are >> simple to make (they can be reduced to a type change in common code and >> a few additional sign/auth calls in the backend), but there are a lot of them >> and the total code change is fairly large. I'm happy to provide a few >> work in progress patches. >> >> In order to match the security benefits of the Apple Arm64e ABI across >> the whole of JDK, then all the changes mentioned above would be >> required. > > Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: > > Fix popframe failures A few stylistic nits below. And one query. Thanks, David src/hotspot/cpu/aarch64/globals_aarch64.hpp line 123: > 121: range(1, 99) \ > 122: product(ccstr, UseBranchProtection, "none", \ > 123: "Branch Protection to use: none,standard,pac-ret") \ Nit: spaces after commas please. src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5286: > 5284: > 5285: void MacroAssembler::enter() > 5286: { Style nit: opening braces go on the same line as the function declaration. src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 417: > 415: // Enable PAC if this code has been built with branch-protection and the CPU/OS supports it.
> 416: #ifdef __ARM_FEATURE_PAC_DEFAULT > 417: if (_features & CPU_PACA) { Style nit: no implicit booleans - expand as "if ((A & B) != 0)" src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 429: > 427: #else > 428: warning("UseROPProtection specified, but not supported in the VM."); > 429: #endif If we issue these warnings should `_rop_protection` still be set to true? src/hotspot/share/gc/shared/barrierSetNMethod.cpp line 58: > 56: > 57: address return_address = *return_address_ptr; > 58: AARCH64_ONLY(return_address=pauth_strip_pointer(return_address)); Style nit: spaces around binary operators please. There are a couple of occurrences. src/hotspot/share/runtime/frame.cpp line 1115: > 1113: AARCH64_ONLY(if (!pauth_ptr_is_raw(x)) { > 1114: return false; > 1115: }) Style nit: Use ifdef for multi-line code blocks. ------------- PR: https://git.openjdk.java.net/jdk/pull/6334 From dlong at openjdk.java.net Mon Jan 31 23:15:09 2022 From: dlong at openjdk.java.net (Dean Long) Date: Mon, 31 Jan 2022 23:15:09 GMT Subject: RFR: 8280867: Cpuid1Ecx feature parsing is incorrect for AMD CPUs In-Reply-To: References: Message-ID: On Mon, 31 Jan 2022 11:26:29 GMT, Aleksey Shipilev wrote: > See discussion in the bug. AFAICS, the fix is to "just" shift the flags by one to match both Intel and AMD specs. I believe this is not a serious bug, because adjacent bits in AMD case are set on modern chips, and Intel detection code only uses `lzcnt` and `prefetchw` out of these flags, both with Intel-specific hacks that are dropped now. > > Additional testing: > - [x] Linux x86_64 fastdebug on TR 3970X (Zen 2) > - [x] Linux x86_64 fastdebug on i5-11500 (Rocket Lake) > - [x] Eyeballing `-Xlog:os+cpu` on TR 3970X (Zen 2) -- no change in detected flags > - [x] Eyeballing `-Xlog:os+cpu` on i5-11500 (Rocket Lake) -- no change in detected flags Marked as reviewed by dlong (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/7287
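For reference, the bit positions of CPUID.(EAX=8000_0001H):ECX quoted from the Intel and AMD manuals earlier in this thread can be sketched as a small C++ helper. This is only an illustration: the constant names and `has_bit` are hypothetical, not HotSpot's actual identifiers, and the helper tests a raw ECX value passed in by the caller rather than executing CPUID itself.

```cpp
#include <cstdint>

// Bit positions in CPUID.(EAX=8000_0001H):ECX, as listed in this thread.
// Names are hypothetical, chosen for this sketch only.
constexpr std::uint32_t kLzcntBit       = 5;  // Intel: LZCNT; AMD: ABM (LZCNT)
constexpr std::uint32_t kSse4aBit       = 6;  // AMD: SSE4A
constexpr std::uint32_t kMisalignSseBit = 7;  // AMD: MisAlignSse
constexpr std::uint32_t kPrefetchwBit   = 8;  // Intel: PREFETCHW; AMD: 3DNowPrefetch

// Returns true when the given bit is set in the raw ECX value.
// The comparison is explicit and parenthesized: != binds tighter than &,
// so "ecx & mask != 0" would parse as "ecx & (mask != 0)".
constexpr bool has_bit(std::uint32_t ecx, std::uint32_t bit) {
  return (ecx & (std::uint32_t{1} << bit)) != 0;
}
```

For example, an ECX value of `0x120` has bits 5 and 8 set, so `has_bit(0x120u, kLzcntBit)` and `has_bit(0x120u, kPrefetchwBit)` are true while `has_bit(0x120u, kSse4aBit)` is false. A table that is off by one position would misread exactly these neighbouring bits.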