From lmesnik at openjdk.java.net Tue Jun 1 02:59:19 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 1 Jun 2021 02:59:19 GMT
Subject: RFR: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
In-Reply-To:
References:
Message-ID:

On Thu, 27 May 2021 04:01:17 GMT, Leonid Mesnik wrote:

> 8265148: StackWatermarkSet being updated during AsyncGetCallTrace

I verified that async-profiler works with ZGC and the data look reasonable.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4217

From lmesnik at openjdk.java.net Tue Jun 1 03:03:22 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 1 Jun 2021 03:03:22 GMT
Subject: RFR: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
In-Reply-To:
References:
Message-ID:

On Thu, 27 May 2021 23:07:02 GMT, David Holmes wrote:

>> 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
>
> src/hotspot/share/prims/forte.cpp line 326:
>
>> 324:     int loop_count;
>> 325:     int loop_max = MaxJavaStackTraceDepth * 2;
>> 326:     RegisterMap map(thread, false, false);
>
> Can we add some comments as to what the false parameters mean please.
>
>     RegisterMap map(thread, false /* no update */, false /*no stackwatermark frame processing */);
>
> Though it may be that a more elaborate block comment is needed to explain why we don't want stackwatermark frame processing.

Let me check with Erik if it makes sense to put more generic comments about the usage of stackwatermark frame processing in RegisterMap, frames etc. They can't be updated in an arbitrary thread state. It makes sense to describe this info in stackwatermarking.
-------------

PR: https://git.openjdk.java.net/jdk/pull/4217

From eosterlund at openjdk.java.net Tue Jun 1 04:12:20 2021
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 1 Jun 2021 04:12:20 GMT
Subject: RFR: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
In-Reply-To:
References:
Message-ID: <0OCkOjLl2iUz5Knd6Y246N_HB9rJGHTOInAjqro49S4=.e88b2d28-4a5f-42e3-885d-3304e3ba096d@github.com>

On Tue, 1 Jun 2021 03:00:23 GMT, Leonid Mesnik wrote:

> Let me check with Erik if it makes sense to put more generic comments about the usage of stackwatermark frame processing in RegisterMap, frames etc. They can't be updated in an arbitrary thread state. It makes sense to describe this info in stackwatermarking.

You could say something generic like "StackWatermark can only be used at points where the stack can be parsed by the GC", or something like that.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4217

From lmesnik at openjdk.java.net Tue Jun 1 04:26:16 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 1 Jun 2021 04:26:16 GMT
Subject: RFR: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
In-Reply-To: <0OCkOjLl2iUz5Knd6Y246N_HB9rJGHTOInAjqro49S4=.e88b2d28-4a5f-42e3-885d-3304e3ba096d@github.com>
References: <0OCkOjLl2iUz5Knd6Y246N_HB9rJGHTOInAjqro49S4=.e88b2d28-4a5f-42e3-885d-3304e3ba096d@github.com>
Message-ID:

On Tue, 1 Jun 2021 04:09:13 GMT, Erik Österlund wrote:

>> Let me check with Erik if it makes sense to put more generic comments about the usage of stackwatermark frame processing in RegisterMap, frames etc. They can't be updated in an arbitrary thread state. It makes sense to describe this info in stackwatermarking.
>
>> Let me check with Erik if it makes sense to put more generic comments about the usage of stackwatermark frame processing in RegisterMap, frames etc. They can't be updated in an arbitrary thread state. It makes sense to describe this info in stackwatermarking.
>
> You could say something generic like "StackWatermark can only be used at points where the stack can be parsed by the GC", or something like that.

Thank you for your prompt response. I meant that it makes sense to update the comments for RegisterMap to mention this. Currently, the doc says how to use it and how to disable updates:

"Updating of the RegisterMap can be turned off by instantiating the
// register map as: RegisterMap map(thread, false);"

But it says nothing about why and how process_frames should be set.

It might make sense to put this info there so anyone could easily find and read it. I think it is better to put it there rather than in forte.cpp.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4217

From eosterlund at openjdk.java.net Tue Jun 1 05:05:24 2021
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 1 Jun 2021 05:05:24 GMT
Subject: RFR: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace
In-Reply-To:
References: <0OCkOjLl2iUz5Knd6Y246N_HB9rJGHTOInAjqro49S4=.e88b2d28-4a5f-42e3-885d-3304e3ba096d@github.com>
Message-ID: <5lUwO4aIv9dQqDNbZ-T_EGzr9ui-Lwj33dbaQuj5lyw=.54a7475e-c352-4463-b2a1-cb92cf600e48@github.com>

On Tue, 1 Jun 2021 04:23:43 GMT, Leonid Mesnik wrote:

> Thank you for your prompt response. I meant that it makes sense to update the comments for RegisterMap to mention this. Currently, the doc says how to use it and how to disable updates:
>
> "Updating of the RegisterMap can be turned off by instantiating the
>
> // register map as: RegisterMap map(thread, false);"
>
> But it says nothing about why and how process_frames should be set.
>
> It might make sense to put this info there so anyone could easily find and read it. I think it is better to put it there rather than in forte.cpp.

Yes I agree - that does make sense.
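As an illustration of the outcome of this exchange (documenting both flags next to the declaration, plus inline comments at the call site as David suggested), here is a minimal sketch. It is a mock, not HotSpot code: `JavaThread` is an empty stand-in, the `RegisterMap` class only mirrors the three-argument constructor quoted from forte.cpp, the default values are an assumption, and `make_profiler_map` is a hypothetical helper invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's JavaThread; only here so the sketch compiles.
struct JavaThread {};

// Simplified mock of RegisterMap. It is not the real class; only the
// (thread, update_map, process_frames) constructor shape follows the review.
class RegisterMap {
 public:
  // update_map:     when false, updating of the register map is turned off.
  // process_frames: when false, no stack watermark frame processing is done.
  //                 Stack watermarks can only be used at points where the
  //                 stack can be parsed by the GC, so code that may run in an
  //                 arbitrary thread state (e.g. AsyncGetCallTrace) passes false.
  // The default values below are an assumption made for this sketch.
  explicit RegisterMap(JavaThread* thread,
                       bool update_map = true,
                       bool process_frames = true)
      : _thread(thread),
        _update_map(update_map),
        _process_frames(process_frames) {
    assert(thread != nullptr);
  }

  JavaThread* thread() const { return _thread; }
  bool update_map() const { return _update_map; }
  bool process_frames() const { return _process_frames; }

 private:
  JavaThread* _thread;
  bool _update_map;
  bool _process_frames;
};

// Hypothetical helper showing the annotated call-site style from the review.
inline RegisterMap make_profiler_map(JavaThread* thread) {
  return RegisterMap(thread,
                     false /* no update */,
                     false /* no stackwatermark frame processing */);
}
```

With the explanation living at the declaration, a call site such as the one in forte.cpp only needs the short inline comments, and readers can find the reasoning in one place.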
-------------

PR: https://git.openjdk.java.net/jdk/pull/4217

From tschatzl at openjdk.java.net Tue Jun 1 08:23:21 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 08:23:21 GMT
Subject: RFR: 8267914: Remove DeferredObjectToKlass workaround [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 31 May 2021 09:10:33 GMT, Stefan Karlsson wrote:

>> In JDK-8229839 we fixed a circular dependency problem between oop.inline.hpp and markWord.inline.hpp. When JDK-8267464:
>> 'Circular-dependency resilient inline headers' gets integrated, this workaround isn't needed anymore.
>
> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
>
>  - Merge remote-tracking branch 'origin/master' into 8267914_remove_DeferredObjectToKlass
>  - 8267914: Remove DeferredObjectToKlass workaround

Lgtm.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4242

From tschatzl at openjdk.java.net Tue Jun 1 09:47:21 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 09:47:21 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v4]
In-Reply-To:
References: <8nhr5kPP_J7KSB8L0ppDo3SL3BfyzYrNPawOUx6r24A=.2d469775-2431-4515-8f44-21da441be68e@github.com>
Message-ID:

On Mon, 31 May 2021 10:02:18 GMT, Stefan Johansson wrote:

>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   More cleanup after sjohanss comments
>
> src/hotspot/share/gc/g1/g1ServiceThread.cpp line 29:
>
>> 27: #include "gc/g1/heapRegion.inline.hpp"
>> 28: #include "gc/g1/heapRegionRemSet.inline.hpp"
>> 29: #include "gc/shared/suspendibleThreadSet.hpp"
>
> I think all changes to g1ServiceThread.* could be reverted now when you added the state to the new task instead. Adding `is_enqueued()` to not check `next() == NULL` is a good change, but it should be made on its own.

I agree.
Note that I think that these changes fix real issues (no virtual destructor with virtual methods) too.

> src/hotspot/share/gc/g1/heapRegionRemSet.hpp line 126:
>
>> 124:       // This correction is necessary because the above includes the second
>> 125:       // part.
>> 126:       + (sizeof(HeapRegionRemSet) - sizeof(G1CardSet))
>
> Pre-existing, but what do you think about moving/including the comment for this part of the calculation in the function comment.

I think the comment should stay where it is - it is confusing otherwise, and establishing the context for this seems hard. I can remove the comment if you want though.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From tschatzl at openjdk.java.net Tue Jun 1 09:51:20 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 09:51:20 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v4]
In-Reply-To:
References: <8nhr5kPP_J7KSB8L0ppDo3SL3BfyzYrNPawOUx6r24A=.2d469775-2431-4515-8f44-21da441be68e@github.com>
Message-ID:

On Mon, 31 May 2021 10:22:01 GMT, Stefan Johansson wrote:

> A general question about the testing? Have you done any testing with `VerifyRememberedSets` turned on?

Yes, we did whenever there were changes to the code path adding remembered set entries. I'll do some reruns with that to be current, just in case. We have not seen issues with missing remembered set entries for a long time, but re-checking is always good.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From stefank at openjdk.java.net Tue Jun 1 10:26:24 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Tue, 1 Jun 2021 10:26:24 GMT
Subject: RFR: 8267914: Remove DeferredObjectToKlass workaround [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 31 May 2021 09:10:33 GMT, Stefan Karlsson wrote:

>> In JDK-8229839 we fixed a circular dependency problem between oop.inline.hpp and markWord.inline.hpp.
>> When JDK-8267464:
>> 'Circular-dependency resilient inline headers' gets integrated, this workaround isn't needed anymore.
>
> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
>
>  - Merge remote-tracking branch 'origin/master' into 8267914_remove_DeferredObjectToKlass
>  - 8267914: Remove DeferredObjectToKlass workaround

Thanks for reviewing!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4242

From stefank at openjdk.java.net Tue Jun 1 10:26:25 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Tue, 1 Jun 2021 10:26:25 GMT
Subject: Integrated: 8267914: Remove DeferredObjectToKlass workaround
In-Reply-To:
References:
Message-ID:

On Fri, 28 May 2021 11:13:44 GMT, Stefan Karlsson wrote:

> In JDK-8229839 we fixed a circular dependency problem between oop.inline.hpp and markWord.inline.hpp. When JDK-8267464:
> 'Circular-dependency resilient inline headers' gets integrated, this workaround isn't needed anymore.

This pull request has now been integrated.

Changeset: 6149b9ad
Author: Stefan Karlsson
URL: https://git.openjdk.java.net/jdk/commit/6149b9ad7569ce1711201353fd644b6a739d5a5b
Stats: 31 lines in 3 files changed: 1 ins; 22 del; 8 mod

8267914: Remove DeferredObjectToKlass workaround

Reviewed-by: eosterlund, tschatzl

-------------

PR: https://git.openjdk.java.net/jdk/pull/4242

From tschatzl at openjdk.java.net Tue Jun 1 12:03:49 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 12:03:49 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v6]
In-Reply-To:
References:
Message-ID: <1Cdqg_uI8c0U_vAaBqLV_A8pBEq8FJ1hrMl9Tm7LA7Q=.a7aaa2f1-8ed3-44b4-978a-3c639128c98c@github.com>

> Hi all,
>
> can I have reviews for this change that significantly refactors the remembered set for more scalability.
>
> The current G1 remembered set implementation was designed for use cases, Java heaps and applications from 20 years ago.
>
> Over time many problems with performance and in particular memory usage have been observed:
>
> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. the amount of cards from the refinement buffers to be merged into the card table and then scanned during GC).
>
> * there is a substantial memory overhead for managing the data structures: examples are
>   * using separate (hash) tables for the three different types of card containers
>   * there is significant unnecessary preallocation of memory for some of the card set containers
>   * containers store redundant information
>
> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that.
>   Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator.
>
> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers, it is in practice impossible to implement these.
>
> * (partial) inability to give back memory to the OS.
>   While some of the containers use the C heap allocator, and so in some way give back memory, the implementation and handling differ for every container.
>
> * the existing granularity of containers is unbalanced: currently there exist three tiers: "sparse", "fine" and "full". "Sparse" is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and "full" is a bit indicating that that region should be scanned completely during GC.
>
>   The problem is that there is nothing between "no card at all" and "sparse", and in particular that the capacity difference between "sparse" and "fine" is large. I.e. the memory usage difference when going from a full "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to "fine", which is able to hold 65k entries using 8kB, is significant.
>   For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times.
>
> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108).
>
> This change is effectively a rewrite of the Java heap card based part of a region's remembered set.
>
> This initial fully working change can be roughly described with the following properties:
>
> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259.
>
> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application.
>   Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management.
>
> * there are now four different container types and one meta-container type. These four actual containers are:
>   * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory.
>   * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste.
>   * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory
>   * full: same as "full", indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory.
>   * howl: the Howl container subdivides a given memory range into subranges, for each of which any of the other containers describing that sub-range of the heap may be stored. This is somewhat similar to the idea suggested in JDK-8048504.
>
> * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specifying them. They have been designed with future enhancements in mind.
>
> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long as before, or shorter; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
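To make the coarsening idea and the size figures in the description concrete, here is a toy, single-threaded sketch of one chain: array of cards, then bitmap, then full. It is not the code from the PR. `ToyCardSet`, its limits and all method names are hypothetical, and it leaves out the `ConcurrentHashTable`, the lock-free replacement, inline pointers and the Howl container. It assumes 512-byte cards and 32M regions (65536 cards per region) and 16-bit card indices, which reproduces the quoted numbers: 128 array entries take 256 bytes and a full-region bitmap takes 8kB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumptions for illustration only: 512-byte cards and 32M regions, so a
// region covers 65536 cards and a card index fits into 16 bits.
constexpr std::size_t kCardsPerRegion = (32u * 1024u * 1024u) / 512u;

// Hypothetical toy class; this is not the G1CardSet code from the PR.
class ToyCardSet {
 public:
  enum class Kind { Array, Bitmap, Full };

  // array_limit:  max entries before the array is coarsened to a bitmap.
  // bitmap_limit: max set bits before the bitmap is coarsened to "full".
  ToyCardSet(std::size_t array_limit, std::size_t bitmap_limit)
      : _array_limit(array_limit), _bitmap_limit(bitmap_limit) {}

  void add(std::uint16_t card) {
    if (_kind == Kind::Full) return;  // "full" already covers every card
    if (_kind == Kind::Array) {
      if (std::find(_cards.begin(), _cards.end(), card) != _cards.end()) return;
      if (_cards.size() < _array_limit) {
        _cards.push_back(card);
        return;
      }
      coarsen_to_bitmap();  // array is full: move its entries into a bitmap
    }
    set_bit(card);
    if (_bits_set > _bitmap_limit) coarsen_to_full();
  }

  bool contains(std::uint16_t card) const {
    switch (_kind) {
      case Kind::Array:
        return std::find(_cards.begin(), _cards.end(), card) != _cards.end();
      case Kind::Bitmap:
        return ((_bitmap[card / 64] >> (card % 64)) & 1u) != 0;
      case Kind::Full:
        return true;
    }
    return false;
  }

  Kind kind() const { return _kind; }

  // Payload bytes of the current container (bookkeeping fields excluded).
  std::size_t payload_bytes() const {
    return _cards.size() * sizeof(std::uint16_t) +
           _bitmap.size() * sizeof(std::uint64_t);
  }

 private:
  void set_bit(std::uint16_t card) {
    const std::uint64_t mask = std::uint64_t{1} << (card % 64);
    if ((_bitmap[card / 64] & mask) == 0) {
      _bitmap[card / 64] |= mask;
      ++_bits_set;
    }
  }

  void coarsen_to_bitmap() {
    assert(_kind == Kind::Array);
    _bitmap.assign(kCardsPerRegion / 64, 0);
    for (std::uint16_t c : _cards) set_bit(c);
    _cards.clear();
    _kind = Kind::Bitmap;
  }

  void coarsen_to_full() {
    _bitmap.clear();  // "full" needs no payload at all
    _kind = Kind::Full;
  }

  Kind _kind = Kind::Array;
  std::size_t _array_limit;
  std::size_t _bitmap_limit;
  std::size_t _bits_set = 0;
  std::vector<std::uint16_t> _cards;
  std::vector<std::uint64_t> _bitmap;
};
```

In this toy, a set built with `ToyCardSet(128, 60000)` reports 256 payload bytes after 128 distinct cards and 8192 bytes as soon as the 129th card forces the bitmap, which is the jump the description calls significant. The Howl container in the PR addresses that gap by applying such containers to sub-ranges instead of the whole region.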
>
> Testing: tier1-8 many times, manual and automated perf testing

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  sjohanss-review 3

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4116/files
  - new: https://git.openjdk.java.net/jdk/pull/4116/files/346247fe..e2c3c300

Webrevs:
  - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=05
  - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=04-05

Stats: 183 lines in 13 files changed: 44 ins; 54 del; 85 mod
Patch: https://git.openjdk.java.net/jdk/pull/4116.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116

PR: https://git.openjdk.java.net/jdk/pull/4116

From alanb at openjdk.java.net Tue Jun 1 12:50:29 2021
From: alanb at openjdk.java.net (Alan Bateman)
Date: Tue, 1 Jun 2021 12:50:29 GMT
Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v6]
In-Reply-To: <2BckdZPCGmN4MBYST7T7jVz08y-vJwSqRc3pcPEBTWA=.c47e0c65-f091-4c87-89d5-893f662ff892@github.com>
References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> <2BckdZPCGmN4MBYST7T7jVz08y-vJwSqRc3pcPEBTWA=.c47e0c65-f091-4c87-89d5-893f662ff892@github.com>
Message-ID:

On Mon, 31 May 2021 15:02:57 GMT, Weijun Wang wrote:

>> Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411).
>>
>> The code change is divided into 3 commits. Please review them one by one.
>>
>> 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecate` annotations and spec change. It also updates the default value of the `java.security.manager` system property to "disallow", and makes the necessary test changes following this update.
>> 2.
>>    https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programmatically.
>> 3. https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal
>>
>> The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev.
>>
>> Due to the size of this PR, no attempt is made to update copyright years for any file to minimize unnecessary merge conflict.
>>
>> Furthermore, since the default value of `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to be launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071.
>>
>> Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. Rest are 2d, awt, beans, sound, and swing.
>
> Weijun Wang has updated the pull request incrementally with one additional commit since the last revision:
>
>   default behavior reverted to allow

System.setSecurityManagerDirect looks a bit ugly now. Can this be renamed to implSetSecurityManager and avoid the line break in the middle of the declaration?

The use of System.err in setSecurityManager also needs to be re-examined, as this can run arbitrary code because System.err can be changed. Fixing this will require capturing the stream at startup (as was done with the illegal access logger).
It's okay to integrate with what you have for the first push and we can fix this issue with System.err when the warning message is changed to the intended message.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4073

From sjohanss at openjdk.java.net Tue Jun 1 13:15:23 2021
From: sjohanss at openjdk.java.net (Stefan Johansson)
Date: Tue, 1 Jun 2021 13:15:23 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v4]
In-Reply-To:
References: <8nhr5kPP_J7KSB8L0ppDo3SL3BfyzYrNPawOUx6r24A=.2d469775-2431-4515-8f44-21da441be68e@github.com>
Message-ID:

On Tue, 1 Jun 2021 09:44:14 GMT, Thomas Schatzl wrote:

>> src/hotspot/share/gc/g1/heapRegionRemSet.hpp line 126:
>>
>>> 124:       // This correction is necessary because the above includes the second
>>> 125:       // part.
>>> 126:       + (sizeof(HeapRegionRemSet) - sizeof(G1CardSet))
>>
>> Pre-existing, but what do you think about moving/including the comment for this part of the calculation in the function comment.
>
> I think the comment should stay where it is - it is confusing otherwise, and establishing the context for this seems hard. I can remove the comment if you want though.

Removing sounds good to me :) Or fit it on one line like this:

    + (sizeof(HeapRegionRemSet) - sizeof(G1CardSet)) // Avoid double counting G1CardSet

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From tschatzl at openjdk.java.net Tue Jun 1 13:31:53 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 13:31:53 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v7]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> can I have reviews for this change that significantly refactors the remembered set for more scalability.
>
> The current G1 remembered set implementation was designed for use cases, Java heaps and applications from 20 years ago.
>
> [...]
>
> Testing: tier1-8 many times, manual and automated perf testing

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4116/files
  - new: https://git.openjdk.java.net/jdk/pull/4116/files/e2c3c300..f87b398f

Webrevs:
  - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=06
  - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=05-06

Stats: 40 lines in 7 files changed: 0 ins; 0 del; 40 mod
Patch: https://git.openjdk.java.net/jdk/pull/4116.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116

PR: https://git.openjdk.java.net/jdk/pull/4116

From tschatzl at openjdk.java.net Tue Jun 1 13:46:46 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 1 Jun 2021 13:46:46 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v8]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> can I have reviews for this change that significantly refactors the remembered set for more scalability.
>
> The current G1 remembered set implementation was designed for use cases, Java heaps and applications from 20 years ago.
>
> Over time many problems with performance and in particular memory usage have been observed:
>
> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. the amount of cards from the refinement buffers to be merged into the card table and then scanned during GC).
> > * there is a substantial memory overhead for managing the data structures: examples are > * using separate (hash) tables for the three different types of card containers > * there is significant unnecessary preallocation of memory for some of the card set containers > * Containers store redundant information > > * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. > Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. > > * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. > > * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container. > > * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. > > The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. 
memory usage difference when exceeding a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to fine that is able to hold 65k entries using 8kB is significant. > For these reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage. With extremely bad consequences in pause times. > > Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). > > This change is effectively a rewrite of the Java heap card based part of a region's remembered set. > > This initial fully working change can be roughly described with the following properties: > > * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. > > * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. > > * there are now four different container types and one meta-container type. These four actual containers are: > * inline pointer: the change store a few (3-5) cards in the CHT node directly and uses no extra memory. > * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. 
>   * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory
>   * full: same as "full", indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory.
>   * howl: the Howl container subdivides a given memory range into subranges, each of which may be described by any of the other containers. This is somewhat similar to the idea suggested in JDK-8048504.
>
> * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specifying them. They have been designed with future enhancements in mind.
>
> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long as before, or shorter; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
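
The coarsening chain described above (inline pointer, then array of cards, then bitmap, then full) can be sketched roughly as follows. This is only a single-threaded Java model for illustration, not the lock-free C++ implementation in the PR: the class `CoarseningCardSet`, its `Kind` enum and all three thresholds are hypothetical, and the Howl meta-container is not modelled at all.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Objects;

/** Hypothetical model of one card container that coarsens as more cards are added. */
final class CoarseningCardSet {
    enum Kind { INLINE, ARRAY, BITMAP, FULL }

    private final int cardsInRange; // number of cards covered by this container
    private final int inlineLimit;  // assumed capacity of the inline form
    private final int arrayLimit;   // assumed capacity of the array form
    private final int bitmapLimit;  // assumed population at which the bitmap gives way to FULL

    private Kind kind = Kind.INLINE;
    private int[] cards;            // used by INLINE and ARRAY
    private BitSet bitmap;          // used by BITMAP
    private int size;               // number of distinct cards recorded so far

    CoarseningCardSet(int cardsInRange, int inlineLimit, int arrayLimit, int bitmapLimit) {
        if (!(0 < inlineLimit && inlineLimit < arrayLimit
                && arrayLimit < bitmapLimit && bitmapLimit < cardsInRange)) {
            throw new IllegalArgumentException("limits must be strictly increasing");
        }
        this.cardsInRange = cardsInRange;
        this.inlineLimit = inlineLimit;
        this.arrayLimit = arrayLimit;
        this.bitmapLimit = bitmapLimit;
        this.cards = new int[inlineLimit];
    }

    Kind kind() {
        return kind;
    }

    boolean contains(int card) {
        Objects.checkIndex(card, cardsInRange);
        switch (kind) {
            case FULL:
                return true; // every card in the range counts as present
            case BITMAP:
                return bitmap.get(card);
            default:
                for (int i = 0; i < size; i++) {
                    if (cards[i] == card) {
                        return true;
                    }
                }
                return false;
        }
    }

    /** Adds a card, coarsening the representation when the current one is full. */
    boolean add(int card) {
        if (contains(card)) {
            return false; // already recorded (always the case once FULL)
        }
        if (kind == Kind.INLINE && size == inlineLimit) {
            cards = Arrays.copyOf(cards, arrayLimit);
            kind = Kind.ARRAY;
        } else if (kind == Kind.ARRAY && size == arrayLimit) {
            bitmap = new BitSet(cardsInRange);
            for (int c : cards) {
                bitmap.set(c);
            }
            cards = null;
            kind = Kind.BITMAP;
        } else if (kind == Kind.BITMAP && size == bitmapLimit) {
            bitmap = null; // FULL needs no per-card storage
            kind = Kind.FULL;
            return true;
        }
        if (kind == Kind.BITMAP) {
            bitmap.set(card);
        } else {
            cards[size] = card;
        }
        size++;
        return true;
    }

    public static void main(String[] args) {
        CoarseningCardSet set = new CoarseningCardSet(64, 2, 4, 8);
        for (int card = 0; card < 10; card++) {
            set.add(card);
            System.out.println("after card " + card + ": " + set.kind());
        }
    }
}
```

The point of the sketch is only that each step trades precision for a bounded footprint, so a region with few incoming references never pays for a full bitmap.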
> > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Improve comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4116/files - new: https://git.openjdk.java.net/jdk/pull/4116/files/f87b398f..97e63605 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=06-07 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From coleenp at openjdk.java.net Tue Jun 1 14:02:22 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 1 Jun 2021 14:02:22 GMT Subject: RFR: 8267920: Create separate Events buffer for VMOperations [v3] In-Reply-To: References: Message-ID: On Mon, 31 May 2021 11:17:44 GMT, Stefan Karlsson wrote: >> The Events classes collect events in a circular buffer that gets dumped into the hs_err files. There are different sections to sort out different types of events. See: >> >> // A log for internal exception related messages, like internal >> // throws and implicit exceptions. >> static ExceptionsEventLog* _exceptions; >> >> // Deoptization related messages >> static StringEventLog* _deopt_messages; >> >> // Redefinition related messages >> static StringEventLog* _redefinitions; >> >> // Class unloading events >> static UnloadingEventLog* _class_unloading; >> >> There's also a buffer for non-categorized events: >> >> // A log for generic messages that aren't well categorized. >> static StringEventLog* _messages; >> >> I propose that we create a separate buffer for VMOperations. This will make it easier to debug GC related bugs. >> >> With the proposed patch, the hs_err files will now have a section that looks like this. 
>>
>> VM Operations (20 events):
>> Event: 0,186 Executing VM operation: HandshakeAllThreads
>> Event: 0,186 Executing VM operation: HandshakeAllThreads done
>> Event: 0,230 Executing VM operation: ZMarkStart
>> Event: 0,230 Executing VM operation: ZMarkStart done
>> Event: 0,232 Executing VM operation: HandshakeAllThreads
>> Event: 0,232 Executing VM operation: HandshakeAllThreads done
>> Event: 0,232 Executing VM operation: HandshakeAllThreads
>> Event: 0,232 Executing VM operation: HandshakeAllThreads done
>> Event: 0,232 Executing VM operation: HandshakeAllThreads
>> Event: 0,233 Executing VM operation: HandshakeAllThreads done
>> Event: 0,233 Executing VM operation: ZMarkEnd
>> Event: 0,233 Executing VM operation: ZMarkEnd done
>> Event: 0,234 Executing VM operation: HandshakeAllThreads
>> Event: 0,234 Executing VM operation: HandshakeAllThreads done
>> Event: 0,234 Executing VM operation: ZVerify
>> Event: 0,234 Executing VM operation: ZVerify done
>> Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces
>> Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces done
>> Event: 0,235 Executing VM operation: ZRelocateStart
>> Event: 0,235 Executing VM operation: ZRelocateStart done
>
> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
> - Review tschatzl
> - Merge remote-tracking branch 'origin/master' into 8267914_events_vmoperations
> - Review coleenp
> - 8267920: Create separate Events buffer for VMOperations

I think the way Stefan did it would make it easier to add Class Loading specific EventMark.

classfile/classLoader.cpp: EventMark m("loading class %s", class_name);

Although I don't see any other EventMarks except for shenandoah GC. So there could be future refactoring here.
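
For readers unfamiliar with the Events logs, the idea of giving each category its own fixed-size circular buffer can be sketched as below. This is a minimal, single-threaded Java illustration and not the HotSpot C++ code; the class `EventRing` and its methods are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fixed-capacity ring buffer that retains only the most recent messages. */
final class EventRing {
    private final String[] slots;
    private long count; // total number of messages ever logged

    EventRing(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new String[capacity];
    }

    /** Records a message, overwriting the oldest one once the buffer is full. */
    void log(String message) {
        slots[(int) (count % slots.length)] = message;
        count++;
    }

    /** Returns the retained messages, oldest first. */
    List<String> snapshot() {
        int retained = (int) Math.min(count, slots.length);
        List<String> out = new ArrayList<>(retained);
        for (long i = count - retained; i < count; i++) {
            out.add(slots[(int) (i % slots.length)]);
        }
        return out;
    }

    public static void main(String[] args) {
        // One ring per category, so frequent generic messages cannot evict VM operations.
        EventRing vmOperations = new EventRing(2);
        EventRing messages = new EventRing(2);
        vmOperations.log("Executing VM operation: ZMarkStart");
        for (int i = 0; i < 5; i++) {
            messages.log("generic message " + i);
        }
        System.out.println(vmOperations.snapshot());
        System.out.println(messages.snapshot());
    }
}
```

With a single shared ring, the five generic messages in `main` would have pushed the VM operation out of the buffer; with separate rings it is still there when the log is dumped.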
------------- PR: https://git.openjdk.java.net/jdk/pull/4243 From alanb at openjdk.java.net Tue Jun 1 14:27:23 2021 From: alanb at openjdk.java.net (Alan Bateman) Date: Tue, 1 Jun 2021 14:27:23 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v2] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> Message-ID: On Fri, 28 May 2021 05:47:11 GMT, David Holmes wrote: > The core-libs folks have the experience/expertise with these character encoding issues so I will defer to them. Naoto has agreed to look at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From tschatzl at openjdk.java.net Tue Jun 1 14:41:43 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 1 Jun 2021 14:41:43 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v9] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that significantly refactors the remembered set for more scalability. > > The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. > > Over time many problems with performance and in particular memory usage have been observed: > > * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). 
> > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request incrementally with two additional commits since the last revision: - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) - Improved documentation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4116/files - new: https://git.openjdk.java.net/jdk/pull/4116/files/97e63605..ace8172e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=07-08 Stats: 28 lines in 3 files changed: 18 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From weijun at openjdk.java.net Tue Jun 1 15:06:41 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Tue, 1 Jun 2021 15:06:41 GMT Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v7] In-Reply-To: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: > Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). > > The code change is divided into 3 commits. Please review them one by one. > > 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecate` annotations and spec change. It also update the default value of the `java.security.manager` system property to "disallow", and necessary test change following this update. > 2. https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programatically. > 3. 
https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal
>
> The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev.
>
> Due to the size of this PR, no attempt is made to update copyright years for any file, to minimize unnecessary merge conflicts.
>
> Furthermore, since the default value of the `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to be launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071.
>
> Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. The rest are 2d, awt, beans, sound, and swing.
Weijun Wang has updated the pull request incrementally with one additional commit since the last revision: rename setSecurityManagerDirect to implSetSecurityManager ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4073/files - new: https://git.openjdk.java.net/jdk/pull/4073/files/8fd09c39..926e4b9a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4073&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4073&range=05-06 Stats: 5 lines in 1 file changed: 0 ins; 1 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/4073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4073/head:pull/4073 PR: https://git.openjdk.java.net/jdk/pull/4073 From weijun at openjdk.java.net Tue Jun 1 15:21:33 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Tue, 1 Jun 2021 15:21:33 GMT Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v8] In-Reply-To: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: > Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). > > The code change is divided into 3 commits. Please review them one by one. > > 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecate` annotations and spec change. It also update the default value of the `java.security.manager` system property to "disallow", and necessary test change following this update. > 2. https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programatically. > 3. 
https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal > > The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev. > > Due to the size of this PR, no attempt is made to update copyright years for any file to minimize unnecessary merge conflict. > > Furthermore, since the default value of `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071. > > Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. Rest are 2d, awt, beans, sound, and swing. Weijun Wang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - merge from master - rename setSecurityManagerDirect to implSetSecurityManager - default behavior reverted to allow - move one annotation to new method - merge from master, two files removed, one needs merge - keep only one systemProperty tag - fixing awt/datatransfer/DataFlavor/DataFlavorRemoteTest.java - feedback from Sean, Phil and Alan - add supresswarnings annotations automatically - manual change before automatic annotating - ... 
and 1 more: https://git.openjdk.java.net/jdk/compare/74b70a56...ea2c4b48 ------------- Changes: https://git.openjdk.java.net/jdk/pull/4073/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4073&range=07 Stats: 2132 lines in 826 files changed: 1997 ins; 20 del; 115 mod Patch: https://git.openjdk.java.net/jdk/pull/4073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4073/head:pull/4073 PR: https://git.openjdk.java.net/jdk/pull/4073 From alanb at openjdk.java.net Tue Jun 1 16:05:29 2021 From: alanb at openjdk.java.net (Alan Bateman) Date: Tue, 1 Jun 2021 16:05:29 GMT Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v8] In-Reply-To: References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: On Tue, 1 Jun 2021 15:21:33 GMT, Weijun Wang wrote: >> Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). >> >> The code change is divided into 3 commits. Please review them one by one. >> >> 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecate` annotations and spec change. It also update the default value of the `java.security.manager` system property to "disallow", and necessary test change following this update. >> 2. https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programatically. >> 3. https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal >> >> The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. 
If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev. >> >> Due to the size of this PR, no attempt is made to update copyright years for any file to minimize unnecessary merge conflict. >> >> Furthermore, since the default value of `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071. >> >> Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. Rest are 2d, awt, beans, sound, and swing. > > Weijun Wang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - merge from master > - rename setSecurityManagerDirect to implSetSecurityManager > - default behavior reverted to allow > - move one annotation to new method > - merge from master, two files removed, one needs merge > - keep only one systemProperty tag > - fixing awt/datatransfer/DataFlavor/DataFlavorRemoteTest.java > - feedback from Sean, Phil and Alan > - add supresswarnings annotations automatically > - manual change before automatic annotating > - ... and 1 more: https://git.openjdk.java.net/jdk/compare/74b70a56...ea2c4b48 Marked as reviewed by alanb (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/4073 From joehw at openjdk.java.net Tue Jun 1 16:28:27 2021 From: joehw at openjdk.java.net (Joe Wang) Date: Tue, 1 Jun 2021 16:28:27 GMT Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v8] In-Reply-To: References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: On Tue, 1 Jun 2021 15:21:33 GMT, Weijun Wang wrote: >> Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). >> >> The code change is divided into 3 commits. Please review them one by one. >> >> 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecate` annotations and spec change. It also update the default value of the `java.security.manager` system property to "disallow", and necessary test change following this update. >> 2. https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programatically. >> 3. https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal >> >> The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev. >> >> Due to the size of this PR, no attempt is made to update copyright years for any file to minimize unnecessary merge conflict. >> >> Furthermore, since the default value of `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to launched with `-Djava.security.manager=allow`. 
This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071. >> >> Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. Rest are 2d, awt, beans, sound, and swing. > > Weijun Wang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: > > - merge from master > - rename setSecurityManagerDirect to implSetSecurityManager > - default behavior reverted to allow > - move one annotation to new method > - merge from master, two files removed, one needs merge > - keep only one systemProperty tag > - fixing awt/datatransfer/DataFlavor/DataFlavorRemoteTest.java > - feedback from Sean, Phil and Alan > - add supresswarnings annotations automatically > - manual change before automatic annotating > - ... and 1 more: https://git.openjdk.java.net/jdk/compare/74b70a56...ea2c4b48 Marked as reviewed by joehw (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4073 From mcimadamore at openjdk.java.net Tue Jun 1 17:03:59 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Tue, 1 Jun 2021 17:03:59 GMT Subject: RFR: 8264774: Implementation of Foreign Function and Memory API (Incubator) [v27] In-Reply-To: References: Message-ID: <2Ea2q8-xz5mHKcvpHDlC9SqoArgaW0Iw32iG4HpQuVs=.c4df006f-e9fe-4438-ada1-abab7e7bd2a8@github.com> > This PR contains the API and implementation changes for JEP-412 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/412 Maurizio Cimadamore has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 38 commits: - Merge branch 'master' into JEP-412 - Merge branch 'master' into JEP-412 - * Add missing `final` in some static fields * Downgrade native methods in ProgrammableUpcallHandler to package-private - Add sealed modifiers in morally sealed API interfaces - Merge branch 'master' into JEP-412 - Fix VaList test Remove unused code in Utils - Merge pull request #11 from JornVernee/JEP-412-MXCSR Add MXCSR save and restore to upcall stubs for non-windows platforms - Add MXCSR save and restore to upcall stubs for non-windows platforms - Merge branch 'master' into JEP-412 - Fix issue with bounded arena allocator - ... and 28 more: https://git.openjdk.java.net/jdk/compare/36dc268a...10767bc0 ------------- Changes: https://git.openjdk.java.net/jdk/pull/3699/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3699&range=26 Stats: 14501 lines in 219 files changed: 8847 ins; 3642 del; 2012 mod Patch: https://git.openjdk.java.net/jdk/pull/3699.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3699/head:pull/3699 PR: https://git.openjdk.java.net/jdk/pull/3699 From iveresov at openjdk.java.net Tue Jun 1 17:46:25 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 1 Jun 2021 17:46:25 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v6] In-Reply-To: References: Message-ID: <8RxYJZEePGo31FEahqK76i4h-q02mTtmfAYBg_3sjAY=.f6ff6a94-31b6-4149-8236-75b65dc82cf8@github.com> On Tue, 25 May 2021 02:44:41 GMT, Yi Yang wrote: >> Looks like now the test fails in the pre-submit tests? > > Thank you @veresov! > > I'm glad to have more comments from hotspot-compiler group. > > Later, I'd like to integrate it if there are no more comments/objections. > > Thanks! > Yang @kelthuzadx Sorry about the delay. Could you please rebase this to the current master and I'll push it. Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From psandoz at openjdk.java.net Tue Jun 1 18:07:47 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Tue, 1 Jun 2021 18:07:47 GMT Subject: RFR: 8266317: Vector API enhancements [v4] In-Reply-To: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> References: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> Message-ID: > This PR contains API and implementation changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Enhancements are made to the API for the support of operations on characters, such as for UTF-8 character decoding. Specifically, methods for loading/storing a `short` vector from/to a `char[]` array, and new vector comparison operators for unsigned comparisons with integral vectors. The x64 implementation is enhanced to supported unsigned comparisons. > > Enhancements are made to the API for loading/storing a `byte` vector from/to a `boolean[]` array. > > The testing of loads/stores can be expanded for scatter/gather, but before doing that i think some refactoring of the tests is required to reposition tests in the right classes. I would like to do that work after integration of the PR. Paul Sandoz has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge remote-tracking branch 'upstream/master' into JDK-8266317-vector-api-enhancements - JavaDoc refs for unsigned operators. - Rename method. - Minor clarifications to the specification. 
- 8266317: Vector API enhancements ------------- Changes: https://git.openjdk.java.net/jdk/pull/3803/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3803&range=03 Stats: 10016 lines in 121 files changed: 9084 ins; 190 del; 742 mod Patch: https://git.openjdk.java.net/jdk/pull/3803.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3803/head:pull/3803 PR: https://git.openjdk.java.net/jdk/pull/3803 From lmesnik at openjdk.java.net Tue Jun 1 18:09:28 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Tue, 1 Jun 2021 18:09:28 GMT Subject: Integrated: 8265148: StackWatermarkSet being updated during AsyncGetCallTrace In-Reply-To: References: Message-ID: <6wsQ5S6b38LKJaHo7R6aZrmjWgXDVILskv-sRMTT-vc=.3461dbbb-9dda-4d42-8349-c4108c29026c@github.com> On Thu, 27 May 2021 04:01:17 GMT, Leonid Mesnik wrote: > 8265148: StackWatermarkSet being updated during AsyncGetCallTrace This pull request has now been integrated. Changeset: 2b338355 Author: Leonid Mesnik URL: https://git.openjdk.java.net/jdk/commit/2b3383557f71ede15d00bd87742a277c0c764d20 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8265148: StackWatermarkSet being updated during AsyncGetCallTrace Reviewed-by: stefank, eosterlund ------------- PR: https://git.openjdk.java.net/jdk/pull/4217 From naoto at openjdk.java.net Tue Jun 1 18:45:22 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Tue, 1 Jun 2021 18:45:22 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> Message-ID: On Thu, 27 May 2021 16:14:38 GMT, Maxim Kartashev wrote: >> Character strings within JVM are produced and consumed in several formats. 
Strings come from/to Java in the UTF8 format, and POSIX APIs (like fprintf() or dlopen()) also consume strings in UTF8. On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can be represented depends on the current "code page".
>>
>> This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code.
>>
>> Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug).
>>
>> The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or MacOS.
>>
>> Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (`jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly run on those platforms as well.
>> >> Results from Linux: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg:tier1 1784 1784 0 0 >> ============================== >> TEST SUCCESS >> >> >> Building target 'run-test-only' in configuration 'linux-x86_64-server-release' >> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 >> >> >> Results from Windows 10: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/runtime 746 746 0 0 >> ============================== >> TEST SUCCESS >> Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' >> >> >> Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' >> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 > > Maxim Kartashev has updated the pull request incrementally with two additional commits since the last revision: > > - Coding style-related corrections. > - Corrected the test to use Platform.sharedLibraryExt() src/hotspot/os/windows/os_windows.cpp line 1491: > 1489: static errno_t convert_UTF8_to_UTF16(char const* utf8_str, LPWSTR* utf16_str) { > 1490: return convert_to_UTF16(utf8_str, CP_UTF8, utf16_str); > 1491: } IIUC, `utf8_str` is the "modified" UTF-8 string in JNI. Using it as the standard UTF-8 (I believe Windows' encoding `CP_UTF8` is the one) may end up in incorrect conversions in some corner cases, e.g., for supplementary characters. 
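To make the corner case in the comment above concrete, here is a minimal sketch (not part of the PR) that compares standard UTF-8 with the modified UTF-8 produced by `DataOutputStream.writeUTF()`, which uses the same surrogate-pair encoding that JNI strings use. The class name `ModifiedUtf8Demo` and its helper methods are hypothetical and exist only for this illustration; U+1F600 is an arbitrary supplementary character chosen as the sample.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the JDK sources or the PR.
public class ModifiedUtf8Demo {

    // Standard UTF-8: a supplementary character becomes one 4-byte sequence.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF(): each half of
    // the surrogate pair is encoded separately as a 3-byte sequence.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] withLength = buf.toByteArray();
        // writeUTF() prepends a 2-byte length; drop it to keep only the payload.
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x1F600));
        System.out.println("standard: " + hex(standardUtf8(s)));
        System.out.println("modified: " + hex(modifiedUtf8(s)));
    }
}
```

For this sample the standard encoding is 4 bytes and the modified encoding is 6 bytes, so a decoder that expects standard UTF-8 (as `CP_UTF8` presumably does) would not see a well-formed sequence for such a character.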
test/hotspot/jtreg/runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java line 42: > 40: String nativePathSetting = "-Dtest.nativepath=" + getSystemProperty("test.nativepath"); > 41: ProcessBuilder pb = ProcessTools.createTestJvm(nativePathSetting, LoadLibraryUnicode.class.getName()); > 42: pb.environment().put("LC_ALL", "en_US.UTF-8"); Some environments/user configs may not have `UTF-8` codeset on the platform. May need to gracefully exit in such a case. ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From lmesnik at openjdk.java.net Wed Jun 2 00:24:04 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 00:24:04 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v2] In-Reply-To: References: Message-ID: > EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. 
Leonid Mesnik has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'efh' of https://github.com/lmesnik/jdk into efh - updated after comments from Igor ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4234/files - new: https://git.openjdk.java.net/jdk/pull/4234/files/68fd69d9..cb1eb944 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=00-01 Stats: 47 lines in 7 files changed: 9 ins; 31 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/4234.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4234/head:pull/4234 PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 00:30:54 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 00:30:54 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v3] In-Reply-To: References: Message-ID: > EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: README updated. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4234/files - new: https://git.openjdk.java.net/jdk/pull/4234/files/cb1eb944..c48542b5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4234.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4234/head:pull/4234 PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 00:44:05 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 00:44:05 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v4] In-Reply-To: References: Message-ID: > EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: remove unused code. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4234/files - new: https://git.openjdk.java.net/jdk/pull/4234/files/c48542b5..57d76163 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=02-03 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4234.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4234/head:pull/4234 PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 01:00:53 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 01:00:53 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: > EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: spaces updated. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4234/files - new: https://git.openjdk.java.net/jdk/pull/4234/files/57d76163..e70518bc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4234.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4234/head:pull/4234 PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 01:00:54 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 01:00:54 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: On Fri, 28 May 2021 02:20:04 GMT, Igor Ignatyev wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> spaces updated. > > test/failure_handler/Makefile line 41: > >> 39: CONF_DIR = src/share/conf >> 40: >> 41: JAVA_RELEASE = 7 > > could you please update `DEPENDENCES` section in `test/failure_handler/README`? done ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 01:11:29 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 01:11:29 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes In-Reply-To: References: Message-ID: <4A_3o38kwqyRGJCTG3TigvK26WhJmdyUFLRR5BqwGWU=.99ef5094-0710-4014-91ab-0adf26b03263@github.com> On Fri, 28 May 2021 00:54:21 GMT, Leonid Mesnik wrote: >> Hi Leonid, >> >> What is EFH? Please update the bug and PR to explain. >> >> Thanks, >> David > >> Hi Leonid, >> >> What is EFH? Please update the bug and PR to explain. 
>> >> Thanks, >> David > > Updated summary to "Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes". > @lmesnik , how has this been tested? I used it in the loom for several weeks. ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From lmesnik at openjdk.java.net Wed Jun 2 01:21:32 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 01:21:32 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: On Fri, 28 May 2021 02:25:59 GMT, Igor Ignatyev wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> spaces updated. > > @lmesnik , how has this been tested? @iignatev, thank you for your comments. I updated all of them. ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From coleenp at openjdk.java.net Wed Jun 2 02:01:30 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Jun 2021 02:01:30 GMT Subject: RFR: 8266967: debug.cpp utility find() should print Java Object fields. [v2] In-Reply-To: <3q3pcFTsL_lG-lh78-zZSkTomOw0vPLLAoPm6ez-TAM=.45e3cf48-009b-4899-ac3d-851338987959@github.com> References: <3q3pcFTsL_lG-lh78-zZSkTomOw0vPLLAoPm6ez-TAM=.45e3cf48-009b-4899-ac3d-851338987959@github.com> Message-ID: On Thu, 13 May 2021 13:24:17 GMT, Kevin Walls wrote: >> This change enables debug.cpp's find() utility to print Java Objects with their fields. >> >> find() calls os::print_location, and Java heap objects are printed with instanceKlass oop_print_on. >> Removing the ifdef for defining oop_print_on for instanceKlass, and also on methods in FieldPrinter and FieldDescriptor, make this work. 
>> >> >> Checking other uses of os::print_location this might affect: >> >> macroAssembler_x86.cpp has MacroAssembler::print_state32 and MacroAssembler::print_state64 >> which use os::print_location to print register contents and print words at top of stack. >> These will be more verbose, as it already is in non-PRODUCT builds. >> >> vmError uses os::print_location when showing the stack, i.e. this output: >> >> Stack slot to memory mapping: >> stack at sp + 0 slots: 0x0000000000000002 is an unknown value >> ..etc... >> >> ...will be more verbose when Java object references are found (for the 8 stack slots it tries to show). >> >> >> Shenandoah uses os::print_location once, but for non-Java heap objects so nothing changes. >> >> >> Manual testing on Linux-x64 and Windows: old behaviour shows these two lines only: >> >> "Executing find" >> 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader >> {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' >> >> ...then with the change the full info: >> >> "Executing find" >> 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader >> {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' >> - ---- fields (total size 13 words): >> - private 'defaultAssertionStatus' 'Z' @12 false >> - private final 'parent' 'Ljava/lang/ClassLoader;' @24 a 'jdk/internal/loader/ClassLoaders$PlatformClassLoader'{0x00000000ff0a0a >> 40} (ff0a0a40) >> - private final 'name' 'Ljava/lang/String;' @28 "app"{0x00000000ff0d0060} (ff0d0060) >> - private final 'unnamedModule' 'Ljava/lang/Module;' @32 a 'java/lang/Module'{0x00000000ff0a0448} (ff0a0448) >> ...etc... > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ifdef correction This looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4011 From yyang at openjdk.java.net Wed Jun 2 02:19:56 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 2 Jun 2021 02:19:56 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v10] In-Reply-To: References: Message-ID: > The JDK codebase has re-created many variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and has only a corresponding C1 intrinsic version. > > In fact, there is a utility method, `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex), that behaves the same as these variants of checkIndex. We can replace the re-created variants with Objects.checkIndex; this would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot. > > The problem is that HotSpot currently implements only the C2 version of Preconditions.checkIndex. To reuse it throughout the JDK code, I think we can first implement its C1 counterpart. There are also a few things we can do later: > > 1. Replace all variants of checkIndex with Objects.checkIndex in the whole JDK codebase. > 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag > > Testing: cds, compiler and jdk Yi Yang has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains ten commits: - x86_32 fails - build failed - cmp clobbers its left argument on x86_32 - better check1-4 - AssertionError when expected exception was not thrown - remove extra newline - remove InlineNIOCheckIndex flag - remove java_nio_Buffer in javaClasses.hpp - consolidate ------------- Changes: https://git.openjdk.java.net/jdk/pull/3615/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=09 Stats: 338 lines in 11 files changed: 242 ins; 78 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/3615.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3615/head:pull/3615 PR: https://git.openjdk.java.net/jdk/pull/3615 From yyang at openjdk.java.net Wed Jun 2 02:32:54 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 2 Jun 2021 02:32:54 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v11] In-Reply-To: References: Message-ID: > The JDK codebase has re-created many variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and has only a corresponding C1 intrinsic version. > > In fact, there is a utility method, `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex), that behaves the same as these variants of checkIndex. We can replace the re-created variants with Objects.checkIndex; this would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot. > > The problem is that HotSpot currently implements only the C2 version of Preconditions.checkIndex. To reuse it throughout the JDK code, I think we can first implement its C1 counterpart. There are also a few things we can do later: > > 1. Replace all variants of checkIndex with Objects.checkIndex in the whole JDK codebase. > 2.
Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag > > Testing: cds, compiler and jdk Yi Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - x86_32 fails - build failed - cmp clobbers its left argument on x86_32 - better check1-4 - AssertionError when expected exception was not thrown - remove InlineNIOCheckIndex flag - remove java_nio_Buffer in javaClasses.hpp - consolidate ------------- Changes: https://git.openjdk.java.net/jdk/pull/3615/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=10 Stats: 338 lines in 11 files changed: 242 ins; 78 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/3615.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3615/head:pull/3615 PR: https://git.openjdk.java.net/jdk/pull/3615 From stefank at openjdk.java.net Wed Jun 2 06:38:27 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 06:38:27 GMT Subject: RFR: 8267920: Create separate Events buffer for VMOperations [v2] In-Reply-To: References: Message-ID: On Mon, 31 May 2021 08:56:48 GMT, Stefan Karlsson wrote: > > Hi Stefan, > > Adding the new event buffer itself seems fine. > > Unlike Coleen I can't figure out why you added LogFunction and the new template classes. :) I can't tell if this was necessary or you just preferred to re-do the specialization mechanism. > > Thanks, > > David > > It was mostly done to avoid duplicating the code of EventMark when adding new EventMarkX classes. This can probably be done in different ways. I'm open to suggestions. @dholmes-ora Have you had time to think about this? 
------------- PR: https://git.openjdk.java.net/jdk/pull/4243 From github.com+4146708+a74nh at openjdk.java.net Wed Jun 2 09:49:04 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 2 Jun 2021 09:49:04 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> > On PAC systems, native code may sign return addresses before saving > them to the stack. We must ensure we strip any signed bits in > order to walk the stack. > Add extra asserts in places where we do not expect saved return > addresses to be signed. > > On non-PAC systems, all PAC instructions are treated as NOPs. > > On Apple, use the provided ptrauth interface instead of asm > as the compiler may optimise further. > > Fedora 33 compiles all distro packages using PAC.
Running the distro > provided OpenJDK-latest in GDB on a PAC system: > > Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () > from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so > (gdb) call (int)pns($sp, $fp, $pc) > > "Executing pns" > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xe26fe4] init_globals()+0x10 > C 0x006ffffff74750c4 > C 0x0042fffff6a7f84c > C 0x0037fffff7fa0954 > C 0x0030fffff7fa4540 > C 0x0078fffff7d980c8 > > OpenJDK with this patch at the same breakpoint: > > (gdb) call (int)pns($sp, $fp, $pc) > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 > > OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: > > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 > C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 > J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] > j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base > j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base > j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base > j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base > j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base > j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base > j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base > j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base > j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base > j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V 
[libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 > V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 > V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc > V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc > V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 > V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 > V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c > V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc > V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 > V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 > j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base > j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base > j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base > j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base > j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base > j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 java.base > j 
jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base > j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base > j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base > j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base > j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base > j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base > j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base > j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base > j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base > j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base > j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base > j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base > j java.lang.System.initPhase2(ZZ)I+0 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: Remove asserts / 
fix build CustomizedGitHooks: yes Change-Id: I6b634b90e81cf8f6e4cd1cdc63f6926eaa7025f6 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4029/files - new: https://git.openjdk.java.net/jdk/pull/4029/files/70d13e7a..406eeed5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=04-05 Stats: 9 lines in 4 files changed: 1 ins; 6 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/4029.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4029/head:pull/4029 PR: https://git.openjdk.java.net/jdk/pull/4029 From github.com+4146708+a74nh at openjdk.java.net Wed Jun 2 09:49:10 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 2 Jun 2021 09:49:10 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v5] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: On Thu, 27 May 2021 18:04:33 GMT, Gerard Ziemski wrote: >> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: >> >> Add sender_pc_raw() >> >> Change-Id: I865170d4462c2ff3a67cbee992eadad4810efebf >> CustomizedGitHooks: yes > > src/hotspot/os_cpu/linux_aarch64/os_linux_aarch64.cpp line 145: > >> 143: - NativeInstruction::instruction_size); >> 144: // Compiled code should not have signed the return address. >> 145: assert(pauth_ptr_is_raw(pc), "cannot be signed"); > > Do we need this, now that we have the assert in frame constructor ? Removed as suggested. Also fixed some build failures. 
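As a rough illustration of why the unpatched `pns` output in this thread shows frames such as `0x006ffffff74750c4`: the upper bits of a signed return address carry the authentication code, and the stack walker only recovers a usable address once they are cleared. The sketch below is a plain bit mask under the assumption of 48-bit user-space virtual addresses; it is not how the patch does it (the description mentions PAC instructions and, on Apple, the ptrauth interface), and `PacStripDemo`, `VA_BITS` and `stripUserPac` are hypothetical names.

```java
// Hypothetical demo class; a model of PAC stripping, not HotSpot code.
public class PacStripDemo {

    // Assumption: 48-bit virtual addresses, so bits 48 and above of a
    // user-space pointer are not part of the address itself.
    static final int VA_BITS = 48;

    // Clears everything above the assumed address width. This only models
    // user-space pointers, whose unsigned upper bits are all zero.
    static long stripUserPac(long signedAddress) {
        long mask = (1L << VA_BITS) - 1;
        return signedAddress & mask;
    }

    public static void main(String[] args) {
        // Value copied from the unpatched backtrace quoted in this thread.
        long signed = 0x006ffffff74750c4L;
        System.out.printf("0x%016x -> 0x%016x%n", signed, stripUserPac(signed));
    }
}
```

With that assumption the sample value maps to `0x0000fffff74750c4`, which falls in the same range as the `libjvm.so` addresses shown in the GDB session.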
------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From aph at openjdk.java.net Wed Jun 2 10:05:32 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 2 Jun 2021 10:05:32 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> Message-ID: On Wed, 2 Jun 2021 09:49:04 GMT, Alan Hayward wrote: >> On PAC systems, native code may sign return addresses before saving >> them to the stack. We must ensure we strip any signed bits in >> order to walk the stack. >> Add extra asserts in places where we do not expect saved return >> addresses to be signed. >> >> On non-PAC systems, all PAC instructions are treated as NOPs. >> >> On Apple, use the provided ptrauth interface instead of asm >> as the compiler may optimise further. >> >> Fedora 33 compiles all distro packages using PAC.
Running the distro >> provided OpenJDK-latest in GDB on a PAC system: >> >> Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () >> from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so >> (gdb) call (int)pns($sp, $fp, $pc) >> >> "Executing pns" >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0xe26fe4] init_globals()+0x10 >> C 0x006ffffff74750c4 >> C 0x0042fffff6a7f84c >> C 0x0037fffff7fa0954 >> C 0x0030fffff7fa4540 >> C 0x0078fffff7d980c8 >> >> OpenJDK with this patch at the same breakpoint: >> >> (gdb) call (int)pns($sp, $fp, $pc) >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 >> >> OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: >> >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 >> C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 >> J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] >> j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base >> j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base >> j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base >> j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base >> j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base >> j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base >> j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base >> j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base >> v ~StubRoutines::call_stub >> V 
[libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 >> V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 >> V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc >> V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc >> V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 >> V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 >> V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c >> V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc >> V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 >> V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 >> j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base >> j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base >> j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base >> j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base >> j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base >> j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 
java.base >> j jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base >> j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base >> j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base >> j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base >> j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base >> j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base >> j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base >> j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base >> j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base >> j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base >> j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base >> j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base >> j java.lang.System.initPhase2(ZZ)I+0 java.base >> v ~StubRoutines::call_stub >> V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 > > Alan Hayward has updated the pull request incrementally with one additional commit 
since the last revision: > > Remove asserts / fix build > > CustomizedGitHooks: yes > Change-Id: I6b634b90e81cf8f6e4cd1cdc63f6926eaa7025f6 src/hotspot/cpu/aarch64/pauth_aarch64.hpp line 31: > 29: > 30: inline bool pauth_ptr_is_raw(address ptr) { > 31: // Confirm none of the high bits are set in the pointer. This predicate seems to me to be misnamed: it's checking for unsigned/stripped, not for raw. The raw value is whatever gets saved in LR/pushed onto the stack. ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From david.holmes at oracle.com Wed Jun 2 10:42:35 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 2 Jun 2021 20:42:35 +1000 Subject: RFR: 8267920: Create separate Events buffer for VMOperations [v2] In-Reply-To: References: Message-ID: On 2/06/2021 4:38 pm, Stefan Karlsson wrote: > On Mon, 31 May 2021 08:56:48 GMT, Stefan Karlsson wrote: > >>> Hi Stefan, >>> Adding the new event buffer itself seems fine. >>> Unlike Coleen I can't figure out why you added LogFunction and the new template classes. :) I can't tell if this was necessary or you just preferred to re-do the specialization mechanism. >>> Thanks, >>> David >> >> It was mostly done to not duplicate the code of EventMark when adding new EventMarkX classes. This can probably be done in different ways. I'm open to suggestions. > > @dholmes-ora Have you had time to think about this? No, sorry. Fine to proceed. Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4243 > From tschatzl at openjdk.java.net Wed Jun 2 10:44:54 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 2 Jun 2021 10:44:54 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v10] In-Reply-To: References: Message-ID: > Hi all, > > Can I have reviews for this change that significantly refactors the remembered set for more scalability. > > The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago.
> > Over time, many problems with performance, and in particular memory usage, have been observed: > > * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. the number of cards from the refinement buffers to be merged into the card table and then scanned during gc). > > * there is a substantial memory overhead for managing the data structures: examples are > * using separate (hash) tables for the three different types of card containers > * there is significant unnecessary preallocation of memory for some of the card set containers > * containers store redundant information > > * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. > Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. > > * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers, it is in practice impossible to implement these. > > * (partial) inability to give back memory to the OS.
While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and their handling are different for every container. > > * the existing granularity of containers is unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. > > The problem is that there is nothing between "no card at all" and "sparse", and in particular there is a large difference between the capability to hold entries of "sparse" and "fine". I.e. the memory usage difference when going from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to "fine", which is able to hold 65k entries using 8kB, is significant. > For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times. > > Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). > > This change is effectively a rewrite of the Java heap card based part of a region's remembered set. > > This initial fully working change can be roughly described with the following properties: > > * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. > > * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently with the application.
Memory is still managed using the C heap memory manager, but this is abstracted away and could be replaced by manual page memory management. > > * there are now four different container types and one meta-container type. These four actual containers are: > * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory. > * array of cards: similar to the "sparse" container, an array of cards with a configurable number of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. > * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory > * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. > * howl: the Howl container subdivides a given memory range into subranges, in each of which any of the other containers describing that sub-range of the heap may be stored. This is somewhat similar to the idea suggested in JDK-8048504. > > * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specifying them. They have been designed with future enhancements in mind. > > In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now. > > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 15 commits: - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) - Improved documentation - Improve comment - Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand - sjohanss-review 3 - Merge branch 'master' of gh:openjdk/jdk into 8017163-refactor-remembered-set - More cleanup after sjohanss comments - Rename FOUND - Remove prefetching of log buffers - ... and 5 more: https://git.openjdk.java.net/jdk/compare/de6472c4...39a41f64 ------------- Changes: https://git.openjdk.java.net/jdk/pull/4116/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=09 Stats: 6131 lines in 64 files changed: 4557 ins; 1315 del; 259 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From kartik.kalaghatagi at gmail.com Wed Jun 2 10:54:42 2021 From: kartik.kalaghatagi at gmail.com (Kartik Kalaghatagi) Date: Wed, 2 Jun 2021 16:24:42 +0530 Subject: Feature: Add memory allocation information to thread model. Message-ID: Hi JDK Team, I would like to add a new feature to the native C++ thread implementation located at 'hotspot\share\runtime\thread.cpp'. Before I explain what the changes are, I want to share what problem it solves. Q. What is the current challenge/problem we have? A. Today Java is being used by many companies and has been running in production for decades. And since many cloud-based companies are evolving and supporting multiple tenants, there is a very big challenge in terms of memory. Currently, we have many tools, like VisualVM and JVMTI agents, which profile the memory (heap) and give us information on how much memory is being used. And this profiling adds significant cost when repeated regularly.
We need to take the heap dump, analyze it and then work through the GC roots to figure out how much memory the thread was consuming and how it caused Out-of-memory and impacted other customers. Also, there is no isolation between tenants: currently, we can't say a standard customer is allowed to process only 100MB of memory to do the work while a premium customer gets to process 1GB, because nowhere do we get information on how much memory the thread is using during runtime. Q. What is the solution? A. We introduce a variable that will hold the information about the amount of memory allocated since the start of the thread. At any given point in time, the application now has context on how much memory the thread is using. With Aspect-oriented programming evolving, we can intercept the 'start' and 'end' points where the thread will do its work, and reset the allocation value between them, OR the same can be achieved by writing an agent using JVMTI. *Implementation overview.* I am proposing 2 ways we can get this information. 1. We can add 2 functions to the thread object, "getAllocatedBytes()" and "resetAllocatedBytes()", and in the Java application we can control the threads based on memory usage and prevent the system from going down because of Out-of-memory. 2. We can add a JVMTI function to extract this information, and spawn an agent thread to monitor the memory usage of all other threads and take appropriate action. *Performance and corner cases.* I guess there won't be any performance impact since we are not waiting for the JVM to go to a safe state, as happens while taking a heap dump. Also, there won't be any overflow since the variable is of type 'long' whose value is 2**64, so the thread would have to allocate 18446.744073709553049 petabytes of memory, and in an ideal case the JVM will be restarted before this; even if overflow occurs the value will reset to zero. I would appreciate any comments on this, and please correct me if I am wrong or missed anything.
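For comparison with the proposed "getAllocatedBytes()", HotSpot already exposes a cumulative per-thread allocation counter to Java code through `com.sun.management.ThreadMXBean`. The sketch below is a minimal illustration, assuming a HotSpot-based JDK where that extension interface is present; the class `AllocationProbe` and its method `allocatedBy` are hypothetical names, and the reset step from the proposal is only approximated by subtracting two readings, not implemented.

```java
import java.lang.management.ManagementFactory;

public final class AllocationProbe {
    private AllocationProbe() {}

    /**
     * Returns the approximate number of bytes the current thread allocated
     * while running {@code work}, or -1 if the JVM does not expose the counter.
     */
    public static long allocatedBy(Runnable work) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        // The allocation counter lives on the com.sun.management extension
        // interface, which not every JVM implements (assumption: HotSpot does).
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1L;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1L;
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        work.run();
        long after = bean.getThreadAllocatedBytes(id);
        // The counter is cumulative, so the difference plays the role of a "reset".
        return after - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBy(() -> {
            byte[] chunk = new byte[1 << 20]; // allocate roughly 1 MiB
            chunk[0] = 1;
        });
        System.out.println("allocated (approx. bytes): " + bytes);
    }
}
```

The reading is an approximation of bytes allocated, not of live memory retained by the thread, so it answers "how much has this thread allocated" rather than "how much heap does this thread hold".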
Regards, Kartik From mcimadamore at openjdk.java.net Wed Jun 2 10:55:38 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 2 Jun 2021 10:55:38 GMT Subject: Integrated: 8264774: Implementation of Foreign Function and Memory API (Incubator) In-Reply-To: References: Message-ID: On Mon, 26 Apr 2021 17:10:13 GMT, Maurizio Cimadamore wrote: > This PR contains the API and implementation changes for JEP-412 [1]. A more detailed description of such changes, to avoid repetitions during the review process, is included as a separate comment. > > [1] - https://openjdk.java.net/jeps/412 This pull request has now been integrated. Changeset: a223189b Author: Maurizio Cimadamore URL: https://git.openjdk.java.net/jdk/commit/a223189b069a7cfe49511d49b5b09e7107cb3cab Stats: 14500 lines in 219 files changed: 8847 ins; 3642 del; 2011 mod 8264774: Implementation of Foreign Function and Memory API (Incubator) Co-authored-by: Paul Sandoz Co-authored-by: Jorn Vernee Co-authored-by: Vladimir Ivanov Co-authored-by: Athijegannathan Sundararajan Co-authored-by: Chris Hegarty Reviewed-by: psandoz, chegar, mchung, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3699 From stefank at openjdk.java.net Wed Jun 2 11:01:06 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 11:01:06 GMT Subject: RFR: 8267920: Create separate Events buffer for VMOperations [v4] In-Reply-To: References: Message-ID: <9pEJLjwpPfVWqcgnj2uCIakweyPXBXUNWSM4N66TEGs=.f50b218a-86d6-46b6-9c3f-8300823783d3@github.com> > The Events classes collect events in a circular buffer that gets dumped into the hs_err files. There are different sections to sort out different types of events. See: > > // A log for internal exception related messages, like internal > // throws and implicit exceptions. 
> static ExceptionsEventLog* _exceptions; > > // Deoptization related messages > static StringEventLog* _deopt_messages; > > // Redefinition related messages > static StringEventLog* _redefinitions; > > // Class unloading events > static UnloadingEventLog* _class_unloading; > > There's also a buffer for non-categorized events: > > // A log for generic messages that aren't well categorized. > static StringEventLog* _messages; > > I propose that we create a separate buffer for VMOperations. This will make it easier to debug GC related bugs. > > With the proposed patch, the hs_err files will now have a section that looks like this. > > VM Operations (20 events): > Event: 0,186 Executing VM operation: HandshakeAllThreads > Event: 0,186 Executing VM operation: HandshakeAllThreads done > Event: 0,230 Executing VM operation: ZMarkStart > Event: 0,230 Executing VM operation: ZMarkStart done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,232 Executing VM operation: HandshakeAllThreads done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,232 Executing VM operation: HandshakeAllThreads done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,233 Executing VM operation: HandshakeAllThreads done > Event: 0,233 Executing VM operation: ZMarkEnd > Event: 0,233 Executing VM operation: ZMarkEnd done > Event: 0,234 Executing VM operation: HandshakeAllThreads > Event: 0,234 Executing VM operation: HandshakeAllThreads done > Event: 0,234 Executing VM operation: ZVerify > Event: 0,234 Executing VM operation: ZVerify done > Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces > Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces done > Event: 0,235 Executing VM operation: ZRelocateStart > Event: 0,235 Executing VM operation: ZRelocateStart done Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into 8267914_events_vmoperations - Review tschatzl - Merge remote-tracking branch 'origin/master' into 8267914_events_vmoperations - Review coleenp - 8267920: Create separate Events buffer for VMOperations ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4243/files - new: https://git.openjdk.java.net/jdk/pull/4243/files/601310ab..553760cd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4243&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4243&range=02-03 Stats: 40030 lines in 1138 files changed: 13525 ins; 22872 del; 3633 mod Patch: https://git.openjdk.java.net/jdk/pull/4243.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4243/head:pull/4243 PR: https://git.openjdk.java.net/jdk/pull/4243 From github.com+4146708+a74nh at openjdk.java.net Wed Jun 2 11:09:31 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 2 Jun 2021 11:09:31 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> Message-ID: On Wed, 2 Jun 2021 10:02:45 GMT, Andrew Haley wrote: >> Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove asserts / fix build >> >> CustomizedGitHooks: yes >> Change-Id: I6b634b90e81cf8f6e4cd1cdc63f6926eaa7025f6 > > src/hotspot/cpu/aarch64/pauth_aarch64.hpp line 31: > >> 29: >> 30: inline bool pauth_ptr_is_raw(address ptr) { >> 31: // Confirm none of the high bits are set in the pointer. 
> > This predicate seems to me to be misnamed: it's checking for unsigned/stripped, not for raw. The raw value is whatever gets saved in LR/pushed onto the stack. Ah, yes, because the name pauth_ptr_is_raw() now clashes with sender_pc_raw(). I'll fix up, one way or the other. Don't really want to call it pauth_ptr_is_unsigned() or pauth_ptr_is_authenticated(), because they seem to imply different things at first glance. ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From mcimadamore at openjdk.java.net Wed Jun 2 11:34:41 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 2 Jun 2021 11:34:41 GMT Subject: RFR: 8266257: Fix foreign linker build issues for ppc and s390 Message-ID: This patch addresses some build issues introduced by integration of JEP-412. The support for JEP-412 turns some static functions (e.g. `float_move`, `long_move`) in sharedRuntime into proper member functions, as they need to be referenced by the new support for Panama upcall handlers. Sadly, not all Hotspot ports agree on the number of parameters these functions take - most notably, ppc and s390 have incompatible signatures, and, because of that, fail to build. A simpler solution is to move these functions to the x86 macro assembler - after all, these functions are specific to a given platform, and excessive sharing should be avoided. This patch does that - and fixes other remaining issues with non-standard hotspot builds (e.g. by adding stub implementations for some unimplemented Panama support).
------------- Commit messages: - Add newline - Fix build issues Changes: https://git.openjdk.java.net/jdk/pull/4303/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4303&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8266257 Stats: 532 lines in 15 files changed: 289 ins; 221 del; 22 mod Patch: https://git.openjdk.java.net/jdk/pull/4303.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4303/head:pull/4303 PR: https://git.openjdk.java.net/jdk/pull/4303 From jvernee at openjdk.java.net Wed Jun 2 11:45:28 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Wed, 2 Jun 2021 11:45:28 GMT Subject: RFR: 8266257: Fix foreign linker build issues for ppc and s390 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 11:27:28 GMT, Maurizio Cimadamore wrote: > This patch addresses some build issues introduced by integration of JEP-412. > The support for JEP-412 turns some static functions (e.g. `float_move`, `long_move`) in sharedRuntime into proper member functions, as they need to be referenced by the new support for Panama upcall handlers. Sadly, not all Hotspot ports agree on the number of parameters these functions take - most notably, ppc and s390 have incompatible signatures, and, because of that, fail to build. > > A simpler solution is to move these functions to the x86 macro assembler - after all, these functions are specific to a given platform, and excessive sharing should be avoided. This patch does that - and fixes other remaining issues with non-standard hotspot builds (e.g. by adding stub implementations for some unimplemented Panama support). LGTM! ------------- Marked as reviewed by jvernee (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/4303 From vlivanov at openjdk.java.net Wed Jun 2 11:45:28 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 2 Jun 2021 11:45:28 GMT Subject: RFR: 8266257: Fix foreign linker build issues for ppc and s390 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 11:27:28 GMT, Maurizio Cimadamore wrote: > This patch addresses some build issues introduced by integration of JEP-412. > The support for JEP-412 turns some static functions (e.g. `float_move`, `long_move`) in sharedRuntime into proper member functions, as they need to be referenced by the new support for Panama upcall handlers. Sadly, not all Hotspot ports agree on the number of parameters these functions take - most notably, ppc and s390 have incompatible signatures, and, because of that, fail to build. > > A simpler solution is to move these functions to the x86 macro assembler - after all, these functions are specific to a given platform, and excessive sharing should be avoided. This patch does that - and fixes other remaining issues with non-standard hotspot builds (e.g. by adding stub implementations for some unimplemented Panama support). Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4303 From weijun at openjdk.java.net Wed Jun 2 12:01:30 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Wed, 2 Jun 2021 12:01:30 GMT Subject: RFR: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal [v9] In-Reply-To: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: > Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). > > The code change is divided into 3 commits. Please review them one by one. > > 1.
https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecated` annotations and spec change. It also updates the default value of the `java.security.manager` system property to "disallow", and makes the necessary test changes following this update. > 2. https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programmatically. > 3. https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal > > The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev. > > Due to the size of this PR, no attempt is made to update copyright years for any file, to minimize unnecessary merge conflicts. > > Furthermore, since the default value of the `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to be launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071. > > Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. The rest are 2d, awt, beans, sound, and swing. Weijun Wang has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 14 commits: - copyright years - merge from master, resolve one conflict - Merge branch 'master' - merge from master - rename setSecurityManagerDirect to implSetSecurityManager - default behavior reverted to allow - move one annotation to new method - merge from master, two files removed, one needs merge - keep only one systemProperty tag - fixing awt/datatransfer/DataFlavor/DataFlavorRemoteTest.java - ... and 4 more: https://git.openjdk.java.net/jdk/compare/19450b99...331389b5 ------------- Changes: https://git.openjdk.java.net/jdk/pull/4073/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4073&range=08 Stats: 2755 lines in 826 files changed: 1997 ins; 20 del; 738 mod Patch: https://git.openjdk.java.net/jdk/pull/4073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4073/head:pull/4073 PR: https://git.openjdk.java.net/jdk/pull/4073 From weijun at openjdk.java.net Wed Jun 2 12:01:33 2021 From: weijun at openjdk.java.net (Weijun Wang) Date: Wed, 2 Jun 2021 12:01:33 GMT Subject: Integrated: 8266459: Implement JEP 411: Deprecate the Security Manager for Removal In-Reply-To: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> References: <_lD3kOiYR1ulm4m7HivqnnFQBD7WxWWLBz56oP6EMVU=.1723b50b-4390-457c-878d-f726cb1ce170@github.com> Message-ID: On Mon, 17 May 2021 18:23:41 GMT, Weijun Wang wrote: > Please review this implementation of [JEP 411](https://openjdk.java.net/jeps/411). > > The code change is divided into 3 commits. Please review them one by one. > > 1. https://github.com/openjdk/jdk/commit/576161d15423f58281e384174d28c9f9be7941a1 The essential change for this JEP, including the `@Deprecated` annotations and spec change. It also updates the default value of the `java.security.manager` system property to "disallow", and makes the necessary test changes following this update. > 2.
https://github.com/openjdk/jdk/commit/26a54a835e9f84aa528740a7c5c35d07355a8a66 Manual changes to several files so that the next commit can be generated programmatically. > 3. https://github.com/openjdk/jdk/commit/eb6c566ff9207974a03a53335e0e697cffcf0950 Automatic changes to other source files to avoid javac warnings on deprecation for removal > > The 1st and 2nd commits should be reviewed carefully. The 3rd one is generated programmatically, see the comment below for more details. If you are only interested in a portion of the 3rd commit and would like to review it as a separate file, please comment here and I'll generate an individual webrev. > > Due to the size of this PR, no attempt is made to update copyright years for any file, to minimize unnecessary merge conflicts. > > Furthermore, since the default value of the `java.security.manager` system property is now "disallow", most of the tests calling `System.setSecurityManager()` need to be launched with `-Djava.security.manager=allow`. This is covered in a different PR at https://github.com/openjdk/jdk/pull/4071. > > Update: the deprecation annotations and javadoc tags, build, compiler, core-libs, hotspot, i18n, jmx, net, nio, security, and serviceability are reviewed. The rest are 2d, awt, beans, sound, and swing. This pull request has now been integrated.
Changeset: 6765f902 Author: Weijun Wang URL: https://git.openjdk.java.net/jdk/commit/6765f902505fbdd02f25b599f942437cd805cad1 Stats: 2755 lines in 826 files changed: 1997 ins; 20 del; 738 mod 8266459: Implement JEP 411: Deprecate the Security Manager for Removal Co-authored-by: Sean Mullan Co-authored-by: Lance Andersen Co-authored-by: Weijun Wang Reviewed-by: erikj, darcy, chegar, naoto, joehw, alanb, mchung, kcr, prr, lancea ------------- PR: https://git.openjdk.java.net/jdk/pull/4073 From stefank at openjdk.java.net Wed Jun 2 13:36:40 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 13:36:40 GMT Subject: RFR: 8267920: Create separate Events buffer for VMOperations [v4] In-Reply-To: <9pEJLjwpPfVWqcgnj2uCIakweyPXBXUNWSM4N66TEGs=.f50b218a-86d6-46b6-9c3f-8300823783d3@github.com> References: <9pEJLjwpPfVWqcgnj2uCIakweyPXBXUNWSM4N66TEGs=.f50b218a-86d6-46b6-9c3f-8300823783d3@github.com> Message-ID: On Wed, 2 Jun 2021 11:01:06 GMT, Stefan Karlsson wrote: >> The Events classes collect events in a circular buffer that gets dumped into the hs_err files. There are different sections to sort out different types of events. See: >> >> // A log for internal exception related messages, like internal >> // throws and implicit exceptions. >> static ExceptionsEventLog* _exceptions; >> >> // Deoptization related messages >> static StringEventLog* _deopt_messages; >> >> // Redefinition related messages >> static StringEventLog* _redefinitions; >> >> // Class unloading events >> static UnloadingEventLog* _class_unloading; >> >> There's also a buffer for non-categorized events: >> >> // A log for generic messages that aren't well categorized. >> static StringEventLog* _messages; >> >> I propose that we create a separate buffer for VMOperations. This will make it easier to debug GC related bugs. >> >> With the proposed patch, the hs_err files will now have a section that looks like this. 
>> >> VM Operations (20 events): >> Event: 0,186 Executing VM operation: HandshakeAllThreads >> Event: 0,186 Executing VM operation: HandshakeAllThreads done >> Event: 0,230 Executing VM operation: ZMarkStart >> Event: 0,230 Executing VM operation: ZMarkStart done >> Event: 0,232 Executing VM operation: HandshakeAllThreads >> Event: 0,232 Executing VM operation: HandshakeAllThreads done >> Event: 0,232 Executing VM operation: HandshakeAllThreads >> Event: 0,232 Executing VM operation: HandshakeAllThreads done >> Event: 0,232 Executing VM operation: HandshakeAllThreads >> Event: 0,233 Executing VM operation: HandshakeAllThreads done >> Event: 0,233 Executing VM operation: ZMarkEnd >> Event: 0,233 Executing VM operation: ZMarkEnd done >> Event: 0,234 Executing VM operation: HandshakeAllThreads >> Event: 0,234 Executing VM operation: HandshakeAllThreads done >> Event: 0,234 Executing VM operation: ZVerify >> Event: 0,234 Executing VM operation: ZVerify done >> Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces >> Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces done >> Event: 0,235 Executing VM operation: ZRelocateStart >> Event: 0,235 Executing VM operation: ZRelocateStart done > > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge remote-tracking branch 'origin/master' into 8267914_events_vmoperations > - Review tschatzl > - Merge remote-tracking branch 'origin/master' into 8267914_events_vmoperations > - Review coleenp > - 8267920: Create separate Events buffer for VMOperations Thanks for the reviews! 
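For readers unfamiliar with the per-category buffers discussed above, here is a minimal sketch of the idea. It is not HotSpot's `Events` code: `RingLog`, `EventLogs`, and the capacity of 4 are hypothetical names and values chosen for illustration only. It shows why a dedicated buffer helps: a busy category can only overwrite its own oldest entries.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical fixed-capacity log: once full, the oldest entry is overwritten.
template <std::size_t N>
class RingLog {
  static_assert(N > 0, "capacity must be positive");

 public:
  void log(const std::string& msg) {
    _records[_next] = msg;
    _next = (_next + 1) % N;
    if (_count < N) ++_count;
  }

  // Returns the retained messages, oldest first.
  std::vector<std::string> snapshot() const {
    std::vector<std::string> out;
    const std::size_t start = (_count < N) ? 0 : _next;
    for (std::size_t i = 0; i < _count; ++i) {
      out.push_back(_records[(start + i) % N]);
    }
    return out;
  }

  std::size_t count() const { return _count; }

 private:
  std::array<std::string, N> _records{};
  std::size_t _next = 0;   // slot that the next log() call writes to
  std::size_t _count = 0;  // number of valid entries, at most N
};

// One buffer per category (hypothetical layout): generic messages cannot
// evict VM operation entries, and vice versa.
struct EventLogs {
  RingLog<4> messages;
  RingLog<4> vm_operations;
};

// Small self-check: flood one category and confirm the other is untouched.
inline void demo_separate_buffers() {
  EventLogs logs;
  logs.vm_operations.log("Executing VM operation: ZMarkStart");
  for (int i = 0; i < 100; ++i) {
    logs.messages.log("generic message " + std::to_string(i));
  }
  assert(logs.messages.count() == 4);
  assert(logs.vm_operations.count() == 1);
}
```

With a single shared buffer, the 100 generic messages in `demo_separate_buffers` would have pushed the VM operation entry out before a crash report could print it.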
------------- PR: https://git.openjdk.java.net/jdk/pull/4243 From stefank at openjdk.java.net Wed Jun 2 13:36:41 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 13:36:41 GMT Subject: Integrated: 8267920: Create separate Events buffer for VMOperations In-Reply-To: References: Message-ID: On Fri, 28 May 2021 11:22:49 GMT, Stefan Karlsson wrote: > The Events classes collect events in a circular buffer that gets dumped into the hs_err files. There are different sections to sort out different types of events. See: > > // A log for internal exception related messages, like internal > // throws and implicit exceptions. > static ExceptionsEventLog* _exceptions; > > // Deoptization related messages > static StringEventLog* _deopt_messages; > > // Redefinition related messages > static StringEventLog* _redefinitions; > > // Class unloading events > static UnloadingEventLog* _class_unloading; > > There's also a buffer for non-categorized events: > > // A log for generic messages that aren't well categorized. > static StringEventLog* _messages; > > I propose that we create a separate buffer for VMOperations. This will make it easier to debug GC related bugs. > > With the proposed patch, the hs_err files will now have a section that looks like this. 
> > VM Operations (20 events): > Event: 0,186 Executing VM operation: HandshakeAllThreads > Event: 0,186 Executing VM operation: HandshakeAllThreads done > Event: 0,230 Executing VM operation: ZMarkStart > Event: 0,230 Executing VM operation: ZMarkStart done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,232 Executing VM operation: HandshakeAllThreads done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,232 Executing VM operation: HandshakeAllThreads done > Event: 0,232 Executing VM operation: HandshakeAllThreads > Event: 0,233 Executing VM operation: HandshakeAllThreads done > Event: 0,233 Executing VM operation: ZMarkEnd > Event: 0,233 Executing VM operation: ZMarkEnd done > Event: 0,234 Executing VM operation: HandshakeAllThreads > Event: 0,234 Executing VM operation: HandshakeAllThreads done > Event: 0,234 Executing VM operation: ZVerify > Event: 0,234 Executing VM operation: ZVerify done > Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces > Event: 0,234 Executing VM operation: CleanClassLoaderDataMetaspaces done > Event: 0,235 Executing VM operation: ZRelocateStart > Event: 0,235 Executing VM operation: ZRelocateStart done This pull request has now been integrated. Changeset: 47677580 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/476775808f82a4b0d42ac58fdb801812b54e01a1 Stats: 69 lines in 3 files changed: 49 ins; 3 del; 17 mod 8267920: Create separate Events buffer for VMOperations Reviewed-by: coleenp, dholmes, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/4243 From ayang at openjdk.java.net Wed Jun 2 14:17:58 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 2 Jun 2021 14:17:58 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: > Simple refactoring around `as_CompilerThread`. 
A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: cast ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4240/files - new: https://git.openjdk.java.net/jdk/pull/4240/files/cc3141b2..c919ae36 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4240&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4240&range=00-01 Stats: 66 lines in 14 files changed: 9 ins; 52 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4240.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4240/head:pull/4240 PR: https://git.openjdk.java.net/jdk/pull/4240 From ayang at openjdk.java.net Wed Jun 2 14:29:30 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Wed, 2 Jun 2021 14:29:30 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:17:58 GMT, Albert Mingkun Yang wrote: >> Simple refactoring around `as_CompilerThread`. A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > cast I encountered some circular dependencies after moving the definition to `thread.inline.hpp`, so I replaced it with a static `cast` method, as Kim instructed from an offline discussion. Note now there's a discrepancy: `CompilerThread::cast(t)` vs `t->as_Worker_thread()` (or `t->as_Java_thread()`). I can do the same for other `as_*_thread` methods as well (in this PR or another one) if people feel like this approach. 
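The two call styles mentioned above can be sketched as follows. This is a simplified illustration with stand-in classes, not the actual HotSpot `Thread` hierarchy; the member `compile_id` is hypothetical. The point is that a static `cast` lives entirely in the derived class, so the base class header needs no knowledge of `CompilerThread`.

```cpp
#include <cassert>

// Stand-in for the base thread class; only the type query is modeled here.
class Thread {
 public:
  virtual ~Thread() = default;
  virtual bool is_Compiler_thread() const { return false; }
};

// Stand-in for the derived class. The checked downcast is a static member,
// so nothing in Thread's definition has to mention CompilerThread.
class CompilerThread : public Thread {
 public:
  bool is_Compiler_thread() const override { return true; }

  static CompilerThread* cast(Thread* t) {
    assert(t != nullptr && t->is_Compiler_thread() && "incorrect cast to CompilerThread");
    return static_cast<CompilerThread*>(t);
  }

  int compile_id() const { return _compile_id; }  // hypothetical member

 private:
  int _compile_id = 42;
};

// Call site in the static style: CompilerThread::cast(t) rather than
// t->as_CompilerThread().
inline int current_compile_id(Thread* t) {
  return CompilerThread::cast(t)->compile_id();
}
```

By contrast, an inline `Thread::as_CompilerThread()` member has to be defined where the full `CompilerThread` definition is visible, which is the kind of header dependency the message describes.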
------------- PR: https://git.openjdk.java.net/jdk/pull/4240 From stefank at openjdk.java.net Wed Jun 2 14:34:11 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 14:34:11 GMT Subject: RFR: 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp Message-ID: Today we transitively include the `bytes_os_cpu.inline.hpp` files from `bytes.hpp`. This goes against the HotSpot Style Guide, which states: > .inline.hpp files should only be included in .cpp or .inline.hpp files. The `bytes_os_cpu.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `bytes_os_cpu.hpp`. ------------- Commit messages: - 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp Changes: https://git.openjdk.java.net/jdk/pull/4310/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4310&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268118 Stats: 540 lines in 24 files changed: 258 ins; 258 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/4310.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4310/head:pull/4310 PR: https://git.openjdk.java.net/jdk/pull/4310 From stefank at openjdk.java.net Wed Jun 2 14:35:06 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 14:35:06 GMT Subject: RFR: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp Message-ID: Today we transitively include the `copy_os_cpu.inline.hpp` files from `copy.hpp`. This goes against the HotSpot Style Guide, which states: > .inline.hpp files should only be included in .cpp or .inline.hpp files. The `copy_os_cpu.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `copy_os_cpu.hpp`.
------------- Commit messages: - 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp Changes: https://git.openjdk.java.net/jdk/pull/4311/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4311&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268119 Stats: 36 lines in 10 files changed: 0 ins; 12 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/4311.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4311/head:pull/4311 PR: https://git.openjdk.java.net/jdk/pull/4311 From kbarrett at openjdk.java.net Wed Jun 2 14:43:30 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Jun 2021 14:43:30 GMT Subject: RFR: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:28:57 GMT, Stefan Karlsson wrote: > Today we transitively include the `copy_os_cpu.inline.hpp` files from `copy.hpp`. This goes against the HotSpot Style Guide, which states: > >> .inline.hpp files should only be included in .cpp or .inline.hpp files. > > The `copy_os_cpu.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `copy_os_cpu.hpp`. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4311 From kbarrett at openjdk.java.net Wed Jun 2 14:46:31 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Jun 2021 14:46:31 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:17:58 GMT, Albert Mingkun Yang wrote: >> Simple refactoring around `as_CompilerThread`. A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > cast Looks good. ------------- Marked as reviewed by kbarrett (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/4240 From github.com+4146708+a74nh at openjdk.java.net Wed Jun 2 14:49:32 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Wed, 2 Jun 2021 14:49:32 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> Message-ID: On Wed, 2 Jun 2021 11:06:42 GMT, Alan Hayward wrote: >> src/hotspot/cpu/aarch64/pauth_aarch64.hpp line 31: >> >>> 29: >>> 30: inline bool pauth_ptr_is_raw(address ptr) { >>> 31: // Confirm none of the high bits are set in the pointer. >> >> This predicate seems to me to be be misnamed: it's checking for unsigned/stripped, not for raw. The raw value is whatever gets saved in LR/pushed onto the stack. > > Ah, yes, because the name pauth_ptr_is_raw() now clashes with sender_pc_raw(). > > I'll fix up, one way or the other. Don't really want to call it pauth_ptr_is_unsigned() or pauth_ptr_is_authenticated(), because they seem to imply different things at first glance. raw is generally accepted as the correct name for a pointer that has not been signed (or has been authenticated/stripped). For example see: https://github.com/apple/llvm-project/blob/a63a81bd9911f87a0b5dcd5bdd7ccdda7124af87/clang/docs/PointerAuthentication.rst#basic-concepts So, pauth_ptr_is_raw() is the correct name. But, now what to rename sender_pc_raw() to ? :) The PAC coder would say: inline address frame::sender_pc() const { return *sender_pc_addr(); } inline address frame::sender_pc_authenticated() const { return pauth_strip_pointer(sender_pc()); } But that's not right here for a common interface. 
How about: inline address frame::stored_sender_pc() const { return *sender_pc_addr(); } inline address frame::sender_pc() const { return pauth_strip_pointer(stored_sender_pc()); } ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From kbarrett at openjdk.java.net Wed Jun 2 14:53:30 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 2 Jun 2021 14:53:30 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:26:07 GMT, Albert Mingkun Yang wrote: > I encountered some circular dependencies after moving the definition to `thread.inline.hpp`, so I replaced it with a static `cast` method, as Kim instructed from an offline discussion. Note now there's a discrepancy: `CompilerThread::cast(t)` vs `t->as_Worker_thread()` (or `t->as_Java_thread()`). I can do the same for other `as_*_thread` methods as well (in this PR or another one) if people feel like this approach. as_Worker_thread => WorkerThread::cast could pretty easily be added to this change. There's only one call. There are lots of as_Java_thread calls. Maybe check with runtime team how they feel about the inconsistency vs code churn. ------------- PR: https://git.openjdk.java.net/jdk/pull/4240 From chagedorn at openjdk.java.net Wed Jun 2 15:26:10 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Wed, 2 Jun 2021 15:26:10 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v11] In-Reply-To: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: > This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. 
The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. > > The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. > > A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. > > To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. > > Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): > There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. 
Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. > > Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): > > - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. > - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions > - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) > - which leaves 4382 lines of code inserted > > Big thanks to: > - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. > - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. > - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. > - and others who provided valuable feedback. 
> > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: Add more whitelisted flags ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3508/files - new: https://git.openjdk.java.net/jdk/pull/3508/files/c35c658c..7a316de0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=09-10 Stats: 9 lines in 1 file changed: 8 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3508.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3508/head:pull/3508 PR: https://git.openjdk.java.net/jdk/pull/3508 From aph at openjdk.java.net Wed Jun 2 15:55:37 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 2 Jun 2021 15:55:37 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> Message-ID: On Wed, 2 Jun 2021 14:46:26 GMT, Alan Hayward wrote: >> Ah, yes, because the name pauth_ptr_is_raw() now clashes with sender_pc_raw(). >> >> I'll fix up, one way or the other. Don't really want to call it pauth_ptr_is_unsigned() or pauth_ptr_is_authenticated(), because they seem to imply different things at first glance. > > raw is generally accepted as the correct name for a pointer that has not been signed (or has been authenticated/stripped). > For example see: https://github.com/apple/llvm-project/blob/a63a81bd9911f87a0b5dcd5bdd7ccdda7124af87/clang/docs/PointerAuthentication.rst#basic-concepts > > So, pauth_ptr_is_raw() is the correct name. > > But, now what to rename sender_pc_raw() to ? 
:) > The PAC coder would say: > inline address frame::sender_pc() const { return *sender_pc_addr(); } > inline address frame::sender_pc_authenticated() const { return pauth_strip_pointer(sender_pc()); } > But that's not right here for a common interface. > > How about: > inline address frame::stored_sender_pc() const { return *sender_pc_addr(); } > inline address frame::sender_pc() const { return pauth_strip_pointer(stored_sender_pc()); } Off the top of my head, I would have thought that it makes the most sense to use Arm's terminology, which is "signed pointer". So, that'd be `frame::signed_sender_pc()` or `frame::maybe_signed_sender_pc()`. ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From hohensee at amazon.com Wed Jun 2 16:08:28 2021 From: hohensee at amazon.com (Hohensee, Paul) Date: Wed, 2 Jun 2021 16:08:28 +0000 Subject: Feature: Add memory allocation information to thread model. Message-ID: Hi, Kartik, The info you want is available from com.sun.management.ThreadMXBean.getThreadAllocatedBytes(). See, e.g., https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/ThreadMXBean.html I don't think a resetAllocatedBytes method is needed. Your application can just save the current allocated-bytes value, do work, and then subtract the saved value from the new current value. Thanks, Paul -----Original Message----- From: hotspot-dev on behalf of Kartik Kalaghatagi Date: Wednesday, June 2, 2021 at 4:00 AM To: "hotspot-dev at openjdk.java.net" Subject: Feature: Add memory allocation information to thread model. Hi JDK Team, I would like to add a new feature to the native C++ thread implementation located at 'hotspot\share\runtime\thread.cpp'. Before I explain what the changes are, I want to share what problem it solves. Q. What is the current challenge/problem we have? A. Today Java is being used by many companies and has been running in production for decades.
And since many cloud-based companies are evolving and supporting multiple tenants, memory is a very big challenge. Currently, we have many tools, such as VisualVM and JVMTI agents, which profile the memory (heap) and give us information on how much memory is being used. This profiling adds significant cost when repeated regularly. We need to take a heap dump, analyze it, and then work from the GC roots to figure out how much memory the thread was consuming and how it caused an out-of-memory condition that impacted other customers. Also, there is no isolation between tenants: currently, we can't say that a standard customer is allowed to process only 100MB of memory to do the work while a premium customer gets to process 1GB, because nowhere do we get information on how much memory the thread is using during runtime. Q. What is the solution? A. We could introduce a variable that holds the amount of memory allocated since the start of the thread. At any given point in time, the application then has context on how much memory the thread is using. With aspect-oriented programming evolving, we can intercept the 'start' and 'end' points between which the thread does its work and reset the allocation value there, OR the same can be achieved by writing an agent using JVMTI. *Implementation overview.* I am proposing 2 ways we can get this information. 1. We can add 2 functions to the thread object, "getAllocatedBytes()" and "resetAllocatedBytes()", and in the Java application we can control the threads based on memory usage and prevent the system from going down because of out-of-memory. 2. We can add a JVMTI function to extract this information, and spawn an agent thread to monitor the memory usage of all other threads and take appropriate action. *Performance and corner cases.* I guess there won't be any performance impact, since we are not waiting for the JVM to reach a safe state as happens while taking a heap dump.
Also, there won't be any overflow in practice, since the variable is of type 'long' (64 bits): a thread would have to allocate about 18446.74 petabytes (2**64 bytes) of memory to overflow it, and in an ideal case the JVM will be restarted long before that. Even if overflow occurs, the value will reset to zero. I would appreciate any comments on this; please also correct me if I am wrong or have missed anything. Regards, Kartik From jjg at openjdk.java.net Wed Jun 2 16:25:41 2021 From: jjg at openjdk.java.net (Jonathan Gibbons) Date: Wed, 2 Jun 2021 16:25:41 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 Message-ID: Please review the change to update to using jtreg 6. The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. ------------- Commit messages: - JDK-8266254: Update to use jtreg 6 Changes: https://git.openjdk.java.net/jdk/pull/4315/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4315&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8266254 Stats: 17 lines in 11 files changed: 0 ins; 1 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/4315.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4315/head:pull/4315 PR: https://git.openjdk.java.net/jdk/pull/4315 From lancea at openjdk.java.net Wed Jun 2 16:35:30 2021 From: lancea at openjdk.java.net (Lance Andersen) Date: Wed, 2 Jun 2021 16:35:30 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6.
> > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. Marked as reviewed by lancea (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From erikj at openjdk.java.net Wed Jun 2 16:52:27 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Wed, 2 Jun 2021 16:52:27 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6.
> > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. Marked as reviewed by erikj (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From mchung at openjdk.java.net Wed Jun 2 16:56:30 2021 From: mchung at openjdk.java.net (Mandy Chung) Date: Wed, 2 Jun 2021 16:56:30 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6.
> > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. Looks good, I had expected we would have more tests depending on the automatic module. ------------- Marked as reviewed by alanb (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4315 From iklam at openjdk.java.net Wed Jun 2 17:22:31 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Jun 2021 17:22:31 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: <5C8r28tpPvktY6wfjJjfv1OlXtbRDojEQgkuzgHwyao=.dee6d745-94b9-4b83-af13-8b611da5000a@github.com> On Wed, 2 Jun 2021 14:50:03 GMT, Kim Barrett wrote: > > I encountered some circular dependencies after moving the definition to `thread.inline.hpp`, so I replaced it with a static `cast` method, as Kim instructed from an offline discussion. Note now there's a discrepancy: `CompilerThread::cast(t)` vs `t->as_Worker_thread()` (or `t->as_Java_thread()`). I can do the same for other `as_*_thread` methods as well (in this PR or another one) if people feel like this approach. > > as_Worker_thread => WorkerThread::cast could pretty easily be added to this change. There's only one call. > > There are lots of as_Java_thread calls. Maybe check with runtime team how they feel about the inconsistency vs code churn. I prefer `JavaThread::cast()` over `Thread::as_Java_thread()`. The former style is consistent with other functions such as `InstanceKlass::cast()`.
As far as code churn, [JDK-8252685](https://bugs.openjdk.java.net/browse/JDK-8252685) already removed a bunch of `as_Java_thread()` calls (we have about 110 of them now, vs about 180 in JDK 16), so we may as well fix the rest for a more consistent style, which is also better with header dependencies. @dholmes-ora what do you think? In any case, we should do that in a different PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/4240 From sviswanathan at openjdk.java.net Wed Jun 2 17:23:08 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 2 Jun 2021 17:23:08 GMT Subject: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v16] In-Reply-To: References: Message-ID: > This PR contains Short Vector Math Library support related changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Intel Short Vector Math Library (SVML) based intrinsics in native x86 assembly provide optimized implementation for Vector API transcendental and trigonometric methods. > These methods are built into a separate library instead of being part of libjvm.so or jvm.dll. > > The following changes are made: > The source for these methods is placed in the jdk.incubator.vector module under src/jdk.incubator.vector/linux/native/libsvml and src/jdk.incubator.vector/windows/native/libsvml. > The assembly source files are named as "*.S" and include files are named as "*.S.inc". > The corresponding build script is placed at make/modules/jdk.incubator.vector/Lib.gmk. > Changes are made to build system to support dependency tracking for assembly files with includes. > The built native libraries (libsvml.so/svml.dll) are placed in bin directory of JDK on Windows and lib directory of JDK on Linux. > The C2 JIT uses the dll_load and dll_lookup to get the addresses of optimized methods from this library.
> > Build system changes and module library build scripts are contributed by Magnus (magnus.ihse.bursie at oracle.com). > > Looking forward to your review and feedback. > > Performance:
> Micro benchmark         Base    Optimized  Unit    Gain(Optimized/Base)
> Double128Vector.ACOS    45.91   87.34      ops/ms  1.90
> Double128Vector.ASIN    45.06   92.36      ops/ms  2.05
> Double128Vector.ATAN    19.92   118.36     ops/ms  5.94
> Double128Vector.ATAN2   15.24   88.17      ops/ms  5.79
> Double128Vector.CBRT    45.77   208.36     ops/ms  4.55
> Double128Vector.COS     49.94   245.89     ops/ms  4.92
> Double128Vector.COSH    26.91   126.00     ops/ms  4.68
> Double128Vector.EXP     71.64   379.65     ops/ms  5.30
> Double128Vector.EXPM1   35.95   150.37     ops/ms  4.18
> Double128Vector.HYPOT   50.67   174.10     ops/ms  3.44
> Double128Vector.LOG     61.95   279.84     ops/ms  4.52
> Double128Vector.LOG10   59.34   239.05     ops/ms  4.03
> Double128Vector.LOG1P   18.56   200.32     ops/ms  10.79
> Double128Vector.SIN     49.36   240.79     ops/ms  4.88
> Double128Vector.SINH    26.59   103.75     ops/ms  3.90
> Double128Vector.TAN     41.05   152.39     ops/ms  3.71
> Double128Vector.TANH    45.29   169.53     ops/ms  3.74
> Double256Vector.ACOS    54.21   106.39     ops/ms  1.96
> Double256Vector.ASIN    53.60   107.99     ops/ms  2.01
> Double256Vector.ATAN    21.53   189.11     ops/ms  8.78
> Double256Vector.ATAN2   16.67   140.76     ops/ms  8.44
> Double256Vector.CBRT    56.45   397.13     ops/ms  7.04
> Double256Vector.COS     58.26   389.77     ops/ms  6.69
> Double256Vector.COSH    29.44   151.11     ops/ms  5.13
> Double256Vector.EXP     86.67   564.68     ops/ms  6.52
> Double256Vector.EXPM1   41.96   201.28     ops/ms  4.80
> Double256Vector.HYPOT   66.18   305.74     ops/ms  4.62
> Double256Vector.LOG     71.52   394.90     ops/ms  5.52
> Double256Vector.LOG10   65.43   362.32     ops/ms  5.54
> Double256Vector.LOG1P   19.99   300.88     ops/ms  15.05
> Double256Vector.SIN     57.06   380.98     ops/ms  6.68
> Double256Vector.SINH    29.40   117.37     ops/ms  3.99
> Double256Vector.TAN     44.90   279.90     ops/ms  6.23
> Double256Vector.TANH    54.08   274.71     ops/ms  5.08
> Double512Vector.ACOS    55.65   687.54     ops/ms  12.35
> Double512Vector.ASIN    57.31   777.72     ops/ms  13.57
> Double512Vector.ATAN    21.42   729.21     ops/ms  34.04
> Double512Vector.ATAN2   16.37   414.33     ops/ms  25.32
> Double512Vector.CBRT    56.78   834.38     ops/ms  14.69
> Double512Vector.COS     59.88   837.04     ops/ms  13.98
> Double512Vector.COSH    30.34   172.76     ops/ms  5.70
> Double512Vector.EXP     99.66   1608.12    ops/ms  16.14
> Double512Vector.EXPM1   43.39   318.61     ops/ms  7.34
> Double512Vector.HYPOT   73.87   1502.72    ops/ms  20.34
> Double512Vector.LOG     74.84   996.00     ops/ms  13.31
> Double512Vector.LOG10   71.12   1046.52    ops/ms  14.72
> Double512Vector.LOG1P   19.75   776.87     ops/ms  39.34
> Double512Vector.POW     37.42   384.13     ops/ms  10.26
> Double512Vector.SIN     59.74   728.45     ops/ms  12.19
> Double512Vector.SINH    29.47   143.38     ops/ms  4.87
> Double512Vector.TAN     46.20   587.21     ops/ms  12.71
> Double512Vector.TANH    57.36   495.42     ops/ms  8.64
> Double64Vector.ACOS     24.04   73.67      ops/ms  3.06
> Double64Vector.ASIN     23.78   75.11      ops/ms  3.16
> Double64Vector.ATAN     14.14   62.81      ops/ms  4.44
> Double64Vector.ATAN2    10.38   44.43      ops/ms  4.28
> Double64Vector.CBRT     16.47   107.50     ops/ms  6.53
> Double64Vector.COS      23.42   152.01     ops/ms  6.49
> Double64Vector.COSH     17.34   113.34     ops/ms  6.54
> Double64Vector.EXP      27.08   203.53     ops/ms  7.52
> Double64Vector.EXPM1    18.77   96.73      ops/ms  5.15
> Double64Vector.HYPOT    18.54   103.62     ops/ms  5.59
> Double64Vector.LOG      26.75   142.63     ops/ms  5.33
> Double64Vector.LOG10    25.85   139.71     ops/ms  5.40
> Double64Vector.LOG1P    13.26   97.94      ops/ms  7.38
> Double64Vector.SIN      23.28   146.91     ops/ms  6.31
> Double64Vector.SINH     17.62   88.59      ops/ms  5.03
> Double64Vector.TAN      21.00   86.43      ops/ms  4.12
> Double64Vector.TANH     23.75   111.35     ops/ms  4.69
> Float128Vector.ACOS     57.52   110.65     ops/ms  1.92
> Float128Vector.ASIN     57.15   117.95     ops/ms  2.06
> Float128Vector.ATAN     22.52   318.74     ops/ms  14.15
> Float128Vector.ATAN2    17.06   246.07     ops/ms  14.42
> Float128Vector.CBRT     29.72   443.74     ops/ms  14.93
> Float128Vector.COS      42.82   803.02     ops/ms  18.75
> Float128Vector.COSH     31.44   118.34     ops/ms  3.76
> Float128Vector.EXP      72.43   855.33     ops/ms  11.81
> Float128Vector.EXPM1    37.82   127.85     ops/ms  3.38
> Float128Vector.HYPOT    53.20   591.68     ops/ms  11.12
> Float128Vector.LOG      52.95   877.94     ops/ms  16.58
> Float128Vector.LOG10    49.26   603.72     ops/ms  12.26
> Float128Vector.LOG1P    20.89   430.59     ops/ms  20.61
> Float128Vector.SIN      43.38   745.31     ops/ms  17.18
> Float128Vector.SINH     31.11   112.91     ops/ms  3.63
> Float128Vector.TAN      37.25   332.13     ops/ms  8.92
> Float128Vector.TANH     57.63   453.77     ops/ms  7.87
> Float256Vector.ACOS     65.23   123.73     ops/ms  1.90
> Float256Vector.ASIN     63.41   132.86     ops/ms  2.10
> Float256Vector.ATAN     23.51   649.02     ops/ms  27.61
> Float256Vector.ATAN2    18.19   455.95     ops/ms  25.07
> Float256Vector.CBRT     45.99   594.81     ops/ms  12.93
> Float256Vector.COS      43.75   926.69     ops/ms  21.18
> Float256Vector.COSH     33.52   130.46     ops/ms  3.89
> Float256Vector.EXP      75.70   1366.72    ops/ms  18.05
> Float256Vector.EXPM1    39.00   149.72     ops/ms  3.84
> Float256Vector.HYPOT    52.91   1023.18    ops/ms  19.34
> Float256Vector.LOG      53.31   1545.77    ops/ms  29.00
> Float256Vector.LOG10    50.31   863.80     ops/ms  17.17
> Float256Vector.LOG1P    21.51   616.59     ops/ms  28.66
> Float256Vector.SIN      44.07   911.04     ops/ms  20.67
> Float256Vector.SINH     33.16   122.50     ops/ms  3.69
> Float256Vector.TAN      37.85   497.75     ops/ms  13.15
> Float256Vector.TANH     64.27   537.20     ops/ms  8.36
> Float512Vector.ACOS     67.33   1718.00    ops/ms  25.52
> Float512Vector.ASIN     66.12   1780.85    ops/ms  26.93
> Float512Vector.ATAN     22.63   1780.31    ops/ms  78.69
> Float512Vector.ATAN2    17.52   1113.93    ops/ms  63.57
> Float512Vector.CBRT     54.78   2087.58    ops/ms  38.11
> Float512Vector.COS      40.92   1567.93    ops/ms  38.32
> Float512Vector.COSH     33.42   138.36     ops/ms  4.14
> Float512Vector.EXP      70.51   3835.97    ops/ms  54.41
> Float512Vector.EXPM1    38.06   279.80     ops/ms  7.35
> Float512Vector.HYPOT    50.99   3287.55    ops/ms  64.47
> Float512Vector.LOG      49.61   3156.99    ops/ms  63.64
> Float512Vector.LOG10    46.94   2489.16    ops/ms  53.02
> Float512Vector.LOG1P    20.66   1689.86    ops/ms  81.81
> Float512Vector.POW      32.73   1015.85    ops/ms  31.04
> Float512Vector.SIN      41.17   1587.71    ops/ms  38.56
> Float512Vector.SINH     33.05   129.39     ops/ms  3.91
> Float512Vector.TAN      35.60   1336.11    ops/ms  37.53
> Float512Vector.TANH     65.77   2295.28    ops/ms  34.90
> Float64Vector.ACOS      48.41   89.34      ops/ms  1.85
> Float64Vector.ASIN      47.30   95.72      ops/ms  2.02
> Float64Vector.ATAN      20.62   49.45      ops/ms  2.40
> Float64Vector.ATAN2     15.95   112.35     ops/ms  7.04
> Float64Vector.CBRT      24.03   134.57     ops/ms  5.60
> Float64Vector.COS       44.28   394.33     ops/ms  8.91
> Float64Vector.COSH      28.35   95.27      ops/ms  3.36
> Float64Vector.EXP       65.80   486.37     ops/ms  7.39
> Float64Vector.EXPM1     34.61   85.99      ops/ms  2.48
> Float64Vector.HYPOT     50.40   147.82     ops/ms  2.93
> Float64Vector.LOG       51.93   163.25     ops/ms  3.14
> Float64Vector.LOG10     49.53   147.98     ops/ms  2.99
> Float64Vector.LOG1P     19.20   206.81     ops/ms  10.77
> Float64Vector.SIN       44.41   382.09     ops/ms  8.60
> Float64Vector.SINH      28.20   90.68      ops/ms  3.22
> Float64Vector.TAN       36.29   160.89     ops/ms  4.43
> Float64Vector.TANH      47.65   214.04     ops/ms  4.49
Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision: update javadoc ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3638/files - new: https://git.openjdk.java.net/jdk/pull/3638/files/e5208a18..b229e8b4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3638&range=15 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3638&range=14-15 Stats: 18 lines in 1 file changed: 0 ins; 0 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/3638.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3638/head:pull/3638 PR: https://git.openjdk.java.net/jdk/pull/3638 From coleenp at openjdk.java.net Wed Jun 2 17:28:34 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Jun 2021 17:28:34 GMT Subject: RFR: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:28:57 GMT, Stefan Karlsson wrote: > Today we transitively include the `copy_<os>_<cpu>.inline.hpp` files from `copy.hpp`.
This goes against the HotSpot Style Guide that states: > >> .inline.hpp files should only be included in .cpp or .inline.hpp files. > > The `copy_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `copy_<os>_<cpu>.hpp`. Seems fine and trivial if you've compiled it on all these platforms (we have cross compilations for ppc and s390). ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4311 From mdoerr at openjdk.java.net Wed Jun 2 17:28:34 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 2 Jun 2021 17:28:34 GMT Subject: RFR: 8266257: Fix foreign linker build issues for ppc and s390 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 11:27:28 GMT, Maurizio Cimadamore wrote: > This patch addresses some build issues introduced by integration of JEP-412. > The support for JEP-412 turns some static functions (e.g. `float_move`, `long_move`) in sharedRuntime into proper member functions, as they need to be referenced by the new support for Panama upcall handlers. Sadly, not all Hotspot ports agree on the number of parameters these functions take - most notably, ppc and s390 have incompatible signatures, and, because of that, fail to build. > > A simpler solution is to move these functions to the x86 macro assembler - after all, these functions are specific to a given platform, and excessive sharing should be avoided. This patch does that - and fixes other remaining issues with non-standard hotspot builds (e.g. by adding stub implementations for some unimplemented Panama support). I just verified that it builds on real machines. Thanks for fixing!
------------- PR: https://git.openjdk.java.net/jdk/pull/4303 From coleenp at openjdk.java.net Wed Jun 2 17:29:33 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 2 Jun 2021 17:29:33 GMT Subject: RFR: 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:27:43 GMT, Stefan Karlsson wrote: > Today we transitively include the `bytes_<os>_<cpu>.inline.hpp` files from `bytes.hpp`. This goes against the HotSpot Style Guide that states: > >> .inline.hpp files should only be included in .cpp or .inline.hpp files. > > The `bytes_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `bytes_<os>_<cpu>.hpp`. Seems fine and trivial if you've compiled it on all these platforms (we have cross compilations for ppc and s390). I guess we picked .inline.hpp for the file names since they define inline functions, but it's better to avoid breaking the .inline.hpp rule. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4310 From lmesnik at openjdk.java.net Wed Jun 2 17:30:33 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 2 Jun 2021 17:30:33 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: On Fri, 28 May 2021 02:14:45 GMT, Igor Ignatyev wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> spaces updated. > > test/failure_handler/src/share/classes/jdk/test/failurehandler/action/PatternAction.java line 63: > >> 61: } >> 62: for (int i = 0, n = args.length; i < n; ++i) { >> 63: args[i] = args[i].replace("%java", helper.findApp("java").getAbsolutePath()); > > are we sure that `java` from `PATH` (which is what `findApp` returns) is the right `java`? The paths contain testJdk and compileJdk before PATH.
We use it to find any of our tools. ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From iklam at openjdk.java.net Wed Jun 2 17:32:31 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 2 Jun 2021 17:32:31 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:17:58 GMT, Albert Mingkun Yang wrote: >> Simple refactoring around `as_CompilerThread`. A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > cast This looks good to me. If we decide to use `CompilerThread::cast`, the RFE title should be updated before integration. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4240 From iris at openjdk.java.net Wed Jun 2 17:51:29 2021 From: iris at openjdk.java.net (Iris Clark) Date: Wed, 2 Jun 2021 17:51:29 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6. > > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. Marked as reviewed by iris (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From stefank at openjdk.java.net Wed Jun 2 17:52:32 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 2 Jun 2021 17:52:32 GMT Subject: RFR: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 17:25:24 GMT, Coleen Phillimore wrote: > Seems fine and trivial if you've compiled it on all these platforms (we have cross compilations for ppc and s390). Thanks @coleenp. There's currently an unrelated build problem with many of those platforms. When that fix gets pushed, I'll pull them into this branch. ------------- PR: https://git.openjdk.java.net/jdk/pull/4311 From psandoz at openjdk.java.net Wed Jun 2 17:55:52 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Wed, 2 Jun 2021 17:55:52 GMT Subject: RFR: 8266317: Vector API enhancements [v5] In-Reply-To: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> References: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> Message-ID: <0DmgZbt9Pi-lIZPNgJ5iHx8wDv6YOTZC3GmPqXbhqtA=.f76153cd-1b1d-4deb-8ff2-f4a90bfd3d23@github.com> > This PR contains API and implementation changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Enhancements are made to the API for the support of operations on characters, such as for UTF-8 character decoding. Specifically, methods for loading/storing a `short` vector from/to a `char[]` array, and new vector comparison operators for unsigned comparisons with integral vectors. The x64 implementation is enhanced to support unsigned comparisons. > > Enhancements are made to the API for loading/storing a `byte` vector from/to a `boolean[]` array.
> > The testing of loads/stores can be expanded for scatter/gather, but before doing that I think some refactoring of the tests is required to reposition tests in the right classes. I would like to do that work after integration of the PR. Paul Sandoz has updated the pull request incrementally with one additional commit since the last revision: Check vlen in bytes for unsigned support ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3803/files - new: https://git.openjdk.java.net/jdk/pull/3803/files/12b23f62..20a9c9ce Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3803&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3803&range=03-04 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3803.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3803/head:pull/3803 PR: https://git.openjdk.java.net/jdk/pull/3803 From chegar at openjdk.java.net Wed Jun 2 18:25:29 2021 From: chegar at openjdk.java.net (Chris Hegarty) Date: Wed, 2 Jun 2021 18:25:29 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6. > > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. Marked as reviewed by chegar (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From mseledtsov at openjdk.java.net Wed Jun 2 19:58:39 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Wed, 2 Jun 2021 19:58:39 GMT Subject: RFR: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor Marked as reviewed by mseledtsov (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4241 From jjg at openjdk.java.net Wed Jun 2 20:15:55 2021 From: jjg at openjdk.java.net (Jonathan Gibbons) Date: Wed, 2 Jun 2021 20:15:55 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 [v2] In-Reply-To: References: Message-ID: > Please review the change to update to using jtreg 6. > > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`.
Jonathan Gibbons has updated the pull request incrementally with one additional commit since the last revision: Update copyright years ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4315/files - new: https://git.openjdk.java.net/jdk/pull/4315/files/0d1e554a..4ef5614f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4315&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4315&range=00-01 Stats: 6 lines in 6 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/4315.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4315/head:pull/4315 PR: https://git.openjdk.java.net/jdk/pull/4315 From naoto at openjdk.java.net Wed Jun 2 20:21:38 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Wed, 2 Jun 2021 20:21:38 GMT Subject: RFR: JDK-8266254: Update to use jtreg 6 [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 20:15:55 GMT, Jonathan Gibbons wrote: >> Please review the change to update to using jtreg 6. >> >> The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. >> >> All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. > > Jonathan Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright years Marked as reviewed by naoto (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From duke at openjdk.java.net Wed Jun 2 21:23:40 2021 From: duke at openjdk.java.net (duke) Date: Wed, 2 Jun 2021 21:23:40 GMT Subject: Withdrawn: JDK-8266254: Update to use jtreg 6 In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 16:13:48 GMT, Jonathan Gibbons wrote: > Please review the change to update to using jtreg 6. > > The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files. > > All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From dholmes at openjdk.java.net Wed Jun 2 21:52:37 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Jun 2021 21:52:37 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:17:58 GMT, Albert Mingkun Yang wrote: >> Simple refactoring around `as_CompilerThread`. A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > cast The change to CompilerThread::cast doesn't seem necessary to address any circular header dependencies - what am I missing? I personally prefer the as_X() style for thread casts (though X is spelt incorrectly in most places), but `cast` is a more traditional name and used for other type hierarchies in the VM. So we can change this for other thread types in a separate RFE. 
As Ioi notes, this RFE needs a new synopsis: "Adopt cast notation for CompilerThread conversions" or something like that. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4240 From dholmes at openjdk.java.net Wed Jun 2 22:41:36 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 2 Jun 2021 22:41:36 GMT Subject: RFR: JDK-8267916: Make as_CompilerThread consistent with other as_*_thread methods [v2] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 21:49:28 GMT, David Holmes wrote: > The change to CompilerThread::cast doesn't seem necessary to address any circular header dependencies - what am I missing? Doh! I was missing the fact the cast method goes in compilerThread.hpp not thread.hpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/4240 From yyang at openjdk.java.net Thu Jun 3 02:56:35 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Thu, 3 Jun 2021 02:56:35 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v6] In-Reply-To: <8RxYJZEePGo31FEahqK76i4h-q02mTtmfAYBg_3sjAY=.f6ff6a94-31b6-4149-8236-75b65dc82cf8@github.com> References: <8RxYJZEePGo31FEahqK76i4h-q02mTtmfAYBg_3sjAY=.f6ff6a94-31b6-4149-8236-75b65dc82cf8@github.com> Message-ID: On Tue, 1 Jun 2021 17:43:45 GMT, Igor Veresov wrote: >> Thank you @veresov! >> >> I'm glad to have more comments from hotspot-compiler group. >> >> Later, I'd like to integrate it if there are no more comments/objections. >> >> Thanks! >> Yang > > @kelthuzadx Sorry about the delay. Could you please rebase this to the current master and I'll push it. Thanks! @veresov I've rebased to the latest commit and resolved all conflicts. Please take a look at this. Thank you! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From dholmes at openjdk.java.net Thu Jun 3 03:03:38 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Jun 2021 03:03:38 GMT Subject: RFR: 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 14:27:43 GMT, Stefan Karlsson wrote: > Today we transitively include the `bytes_<os>_<cpu>.inline.hpp` files from `bytes.hpp`. This goes against the HotSpot Style Guide that states: > >> .inline.hpp files should only be included in .cpp or .inline.hpp files. > > The `bytes_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `bytes_<os>_<cpu>.hpp`. Seems okay. As the inline definitions are trivial they don't need to be in an inline.hpp file. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4310 From iveresov at openjdk.java.net Thu Jun 3 04:41:41 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Thu, 3 Jun 2021 04:41:41 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v11] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 02:32:54 GMT, Yi Yang wrote: >> The JDK codebase contains many re-created variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and only has a corresponding C1 intrinsic version. >> >> In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex. We can replace these re-created variants with Objects.checkIndex; this would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot.
>> >> The problem is that HotSpot currently implements only the C2 version of Preconditions.checkIndex. To reuse it throughout JDK code, I think we can first implement its C1 counterpart. There are also a few things we can do later: >> >> 1. Replace all variants of checkIndex with Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate the InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - x86_32 fails > - build failed > - cmp clobbers its left argument on x86_32 > - better check1-4 > - AssertionError when expected exception was not thrown > - remove InlineNIOCheckIndex flag > - remove java_nio_Buffer in javaClasses.hpp > - consolidate `test/jdk/java/util/Objects/CheckIndex.java` and `test/jdk/java/util/Objects/CheckLongIndex.java` started failing. Please take a look. test CheckIndex.testCheckIndex(0, -2147483647, false): failure java.lang.AssertionError: Index 0 is out of bounds of [0, -2147483647), but was reported to be within bounds at org.testng.Assert.fail(Assert.java:94) at CheckIndex.lambda$testCheckIndex$3(CheckIndex.java:103) at CheckIndex.testCheckIndex(CheckIndex.java:117) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85) at org.testng.internal.Invoker.invokeMethod(Invoker.java:639) at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:821) at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1131) at
org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:124) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108) at org.testng.TestRunner.privateRun(TestRunner.java:773) at org.testng.TestRunner.run(TestRunner.java:623) at org.testng.SuiteRunner.runTest(SuiteRunner.java:357) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:352) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:310) at org.testng.SuiteRunner.run(SuiteRunner.java:259) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1185) at org.testng.TestNG.runSuitesLocally(TestNG.java:1110) at org.testng.TestNG.run(TestNG.java:1018) at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:94) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at java.base/java.lang.Thread.run(Thread.java:831) test CheckIndex.testCheckIndex(0, -2147483648, false): failure java.lang.AssertionError: Index 0 is out of bounds of [0, -2147483648), but was reported to be within bounds at org.testng.Assert.fail(Assert.java:94) at CheckIndex.lambda$testCheckIndex$3(CheckIndex.java:103) at CheckIndex.testCheckIndex(CheckIndex.java:120) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85) at org.testng.internal.Invoker.invokeMethod(Invoker.java:639) at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:821) at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1131) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:124) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108) at org.testng.TestRunner.privateRun(TestRunner.java:773) at org.testng.TestRunner.run(TestRunner.java:623) at org.testng.SuiteRunner.runTest(SuiteRunner.java:357) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:352) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:310) at org.testng.SuiteRunner.run(SuiteRunner.java:259) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1185) at org.testng.TestNG.runSuitesLocally(TestNG.java:1110) at org.testng.TestNG.run(TestNG.java:1018) at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:94) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at java.base/java.lang.Thread.run(Thread.java:831) test CheckIndex.testCheckIndex(1, -1, false): failure java.lang.AssertionError: Index 1 is out of bounds of [0, -1), but was reported to be within bounds at org.testng.Assert.fail(Assert.java:94) at CheckIndex.lambda$testCheckIndex$3(CheckIndex.java:103) at 
CheckIndex.testCheckIndex(CheckIndex.java:130) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85) at org.testng.internal.Invoker.invokeMethod(Invoker.java:639) at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:821) at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1131) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:124) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108) at org.testng.TestRunner.privateRun(TestRunner.java:773) at org.testng.TestRunner.run(TestRunner.java:623) at org.testng.SuiteRunner.runTest(SuiteRunner.java:357) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:352) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:310) at org.testng.SuiteRunner.run(SuiteRunner.java:259) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1185) at org.testng.TestNG.runSuitesLocally(TestNG.java:1110) at org.testng.TestNG.run(TestNG.java:1018) at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:94) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) at 
java.base/java.lang.Thread.run(Thread.java:831) ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From lmesnik at openjdk.java.net Thu Jun 3 06:22:53 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 3 Jun 2021 06:22:53 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash Message-ID: Fixed a race condition between posting and enabling DynamicCodeGenerated event. ------------- Commit messages: - 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash Changes: https://git.openjdk.java.net/jdk/pull/4331/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4331&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8212155 Stats: 122 lines in 3 files changed: 116 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/4331.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4331/head:pull/4331 PR: https://git.openjdk.java.net/jdk/pull/4331 From github.com+28651297+mkartashev at openjdk.java.net Thu Jun 3 06:59:09 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Thu, 3 Jun 2021 06:59:09 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v4] In-Reply-To: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> Message-ID: <2y1L0zS1OMMm2QZAzbU1T146d_p84L_TSJDh0NRxw5U=.1eeb8ff0-31df-43f2-babf-a2745b96b9c4@github.com> > Character strings within JVM are produced and consumed in several formats. Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() or dlopen()) consume strings also in UTF8. 
On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can depends on the current "code page".
>
> This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code.
>
> Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug).
>
> The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or macOS.
>
> Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (`jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly run on those platforms as well.
>
> Results from Linux:
>
> Test summary
> ==============================
>    TEST                              TOTAL  PASS  FAIL ERROR
>    jtreg:test/hotspot/jtreg:tier1     1784  1784     0     0
> ==============================
> TEST SUCCESS
>
>
> Building target 'run-test-only' in configuration 'linux-x86_64-server-release'
> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>
> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
> Test results: passed: 1
>
>
> Results from Windows 10:
>
> Test summary
> ==============================
>    TEST                                TOTAL  PASS  FAIL ERROR
>    jtreg:test/hotspot/jtreg/runtime      746   746     0     0
> ==============================
> TEST SUCCESS
> Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
>
>
> Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>
> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
> Test results: passed: 1

Maxim Kartashev has updated the pull request incrementally with one additional commit since the last revision:

  Updated Java_jdk_internal_loader_NativeLibraries_load() to use a new helper function to get the library name encoded as true UTF-8 (not in "modified UTF-8" form). Also updated the test to automatically pass if "UTF-8" is not supported by NIO on the platform.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4169/files - new: https://git.openjdk.java.net/jdk/pull/4169/files/c8ef8b64..97c918ca Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4169&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4169&range=02-03 Stats: 71 lines in 4 files changed: 57 ins; 6 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/4169.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4169/head:pull/4169 PR: https://git.openjdk.java.net/jdk/pull/4169 From github.com+28651297+mkartashev at openjdk.java.net Thu Jun 3 06:59:14 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Thu, 3 Jun 2021 06:59:14 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> Message-ID: On Tue, 1 Jun 2021 18:24:05 GMT, Naoto Sato wrote: >> Maxim Kartashev has updated the pull request incrementally with two additional commits since the last revision: >> >> - Coding style-related corrections. >> - Corrected the test to use Platform.sharedLibraryExt() > > src/hotspot/os/windows/os_windows.cpp line 1491: > >> 1489: static errno_t convert_UTF8_to_UTF16(char const* utf8_str, LPWSTR* utf16_str) { >> 1490: return convert_to_UTF16(utf8_str, CP_UTF8, utf16_str); >> 1491: } > > IIUC, `utf8_str` is the "modified" UTF-8 string in JNI. Using it as the standard UTF-8 (I believe Windows' encoding `CP_UTF8` is the one) may end up in incorrect conversions in some corner cases, e.g., for supplementary characters. Right; I changed the code in NativeLibraries.c to pass down true UTF-8 instead of "modified UTF-8". Please, take a look. 
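To illustrate the corner case raised in the review comment: for supplementary characters (and for U+0000), JNI's "modified UTF-8" and standard UTF-8 produce different byte sequences, so bytes in one form cannot safely be handed to a conversion that expects the other, such as one using `CP_UTF8`. The following is only a minimal sketch of the two encodings, not code from the PR; the function names `encode_utf8` and `encode_modified_utf8` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Encode one code point with the generic UTF-8 bit layout (1 to 4 bytes).
// Illustrative helper only: it does not reject surrogate values, which lets
// the modified-UTF-8 encoder below reuse the three-byte branch.
std::string encode_utf8(char32_t cp) {
  assert(cp <= 0x10FFFF && "not a Unicode code point");
  std::string out;
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  return out;
}

// Encode one code point as "modified UTF-8":
//  - U+0000 becomes the two-byte sequence C0 80;
//  - a supplementary code point is split into a UTF-16 surrogate pair and
//    each surrogate is written as three bytes (six bytes in total).
std::string encode_modified_utf8(char32_t cp) {
  if (cp == 0) return std::string("\xC0\x80", 2);
  if (cp < 0x10000) return encode_utf8(cp);
  char32_t v = cp - 0x10000;
  char32_t high = 0xD800 | (v >> 10);
  char32_t low = 0xDC00 | (v & 0x3FF);
  return encode_utf8(high) + encode_utf8(low);
}
```

For U+1F600, `encode_utf8` yields the four bytes F0 9F 98 80, while `encode_modified_utf8` yields the six bytes ED A0 BD ED B8 80. A decoder that expects standard UTF-8 may reject or mangle the latter, which is why the library name has to be passed down as true UTF-8.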
------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From github.com+28651297+mkartashev at openjdk.java.net Thu Jun 3 07:01:45 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Thu, 3 Jun 2021 07:01:45 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> Message-ID: On Tue, 1 Jun 2021 18:42:34 GMT, Naoto Sato wrote: >> Maxim Kartashev has updated the pull request incrementally with two additional commits since the last revision: >> >> - Coding style-related corrections. >> - Corrected the test to use Platform.sharedLibraryExt() > > test/hotspot/jtreg/runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java line 42: > >> 40: String nativePathSetting = "-Dtest.nativepath=" + getSystemProperty("test.nativepath"); >> 41: ProcessBuilder pb = ProcessTools.createTestJvm(nativePathSetting, LoadLibraryUnicode.class.getName()); >> 42: pb.environment().put("LC_ALL", "en_US.UTF-8"); > > Some environments/user configs may not have `UTF-8` codeset on the platform. May need to gracefully exit in such a case. I added `java.nio.charset.Charset.isSupported("UTF-8")` check to the test. Hope that's enough for the environments without `UTF-8`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From ayang at openjdk.java.net Thu Jun 3 08:25:41 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 08:25:41 GMT Subject: RFR: JDK-8267916: Adopt cast notation for CompilerThread conversions [v2] In-Reply-To: <5C8r28tpPvktY6wfjJjfv1OlXtbRDojEQgkuzgHwyao=.dee6d745-94b9-4b83-af13-8b611da5000a@github.com> References: <5C8r28tpPvktY6wfjJjfv1OlXtbRDojEQgkuzgHwyao=.dee6d745-94b9-4b83-af13-8b611da5000a@github.com> Message-ID: On Wed, 2 Jun 2021 17:19:20 GMT, Ioi Lam wrote: > as_Worker_thread => WorkerThread::cast could pretty easily be added to this change. There's only one call. I will do that in another PR to keep this one more focused. > As Ioi notes, this RFE needs a new synopsis: "Adopt cast notation for CompilerThread conversions" or something like that. Updated. Thank you all for the suggestions and reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/4240 From ayang at openjdk.java.net Thu Jun 3 08:25:41 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 08:25:41 GMT Subject: Integrated: JDK-8267916: Adopt cast notation for CompilerThread conversions In-Reply-To: References: Message-ID: On Fri, 28 May 2021 08:09:36 GMT, Albert Mingkun Yang wrote: > Simple refactoring around `as_CompilerThread`. A new file, `compilerThread.inline.hpp`, is created to get around the circular dependency. This pull request has now been integrated. 
Changeset: a52a08d2
Author:    Albert Mingkun Yang
URL:       https://git.openjdk.java.net/jdk/commit/a52a08d20be13721fcde65cad3567bbfa04f45cd
Stats:     24 lines in 6 files changed: 9 ins; 11 del; 4 mod

8267916: Adopt cast notation for CompilerThread conversions

Reviewed-by: kbarrett, iklam, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/4240

From stefank at openjdk.java.net Thu Jun 3 09:06:41 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 09:06:41 GMT
Subject: RFR: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp
In-Reply-To:
References:
Message-ID:

On Wed, 2 Jun 2021 14:28:57 GMT, Stefan Karlsson wrote:

> Today we transitively include the `copy_<os>_<cpu>.inline.hpp` files from `copy.hpp`. This goes against the HotSpot Style Guide that states:
>
>> .inline.hpp files should only be included in .cpp or .inline.hpp files.
>
> The `copy_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `copy_<os>_<cpu>.hpp`.

I pulled in #4303 locally and cross-compiled to verify that this didn't break the build.

Thanks @kimbarrett and @coleenp for reviewing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4311

From stefank at openjdk.java.net Thu Jun 3 09:06:42 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 09:06:42 GMT
Subject: Integrated: 8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp
In-Reply-To:
References:
Message-ID:

On Wed, 2 Jun 2021 14:28:57 GMT, Stefan Karlsson wrote:

> Today we transitively include the `copy_<os>_<cpu>.inline.hpp` files from `copy.hpp`. This goes against the HotSpot Style Guide that states:
>
>> .inline.hpp files should only be included in .cpp or .inline.hpp files.
>
> The `copy_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `copy_<os>_<cpu>.hpp`.

This pull request has now been integrated.
Changeset: 1296a6c4
Author:    Stefan Karlsson
URL:       https://git.openjdk.java.net/jdk/commit/1296a6c425e22e0fdc748a996b886923c602ab3f
Stats:     36 lines in 10 files changed: 0 ins; 12 del; 24 mod

8268119: Rename copy_os_cpu.inline.hpp files to copy_os_cpu.hpp

Reviewed-by: kbarrett, coleenp

-------------

PR: https://git.openjdk.java.net/jdk/pull/4311

From stefank at openjdk.java.net Thu Jun 3 09:07:43 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 09:07:43 GMT
Subject: RFR: 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp
In-Reply-To:
References:
Message-ID: <_5rLP0AhRhzIBEwZ0piZQYm15oQ3opHk1k3xhP0Jvk4=.606b27c0-b0cc-4927-9cc7-1608c9d74938@github.com>

On Wed, 2 Jun 2021 14:27:43 GMT, Stefan Karlsson wrote:

> Today we transitively include the `bytes_<os>_<cpu>.inline.hpp` files from `bytes.hpp`. This goes against the HotSpot Style Guide that states:
>
>> .inline.hpp files should only be included in .cpp or .inline.hpp files.
>
> The `bytes_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `bytes_<os>_<cpu>.hpp`.

Thanks for reviewing!

I pulled in #4303 locally and cross-compiled to verify that this didn't break the build.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4310

From stefank at openjdk.java.net Thu Jun 3 09:07:43 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 09:07:43 GMT
Subject: Integrated: 8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp
In-Reply-To:
References:
Message-ID: <-Sp8Me8K8PtgTaPiR3MFvYctEJW6lk76gc9RInEGPTg=.f4a6c773-2010-4fe6-a554-30ba585f12c7@github.com>

On Wed, 2 Jun 2021 14:27:43 GMT, Stefan Karlsson wrote:

> Today we transitively include the `bytes_<os>_<cpu>.inline.hpp` files from `bytes.hpp`. This goes against the HotSpot Style Guide that states:
>
>> .inline.hpp files should only be included in .cpp or .inline.hpp files.
>
> The `bytes_<os>_<cpu>.inline.hpp` files don't include any other HotSpot files, so I propose that we simply rename them to `bytes_<os>_<cpu>.hpp`.

This pull request has now been integrated.

Changeset: c8f4c02b
Author:    Stefan Karlsson
URL:       https://git.openjdk.java.net/jdk/commit/c8f4c02bf005ee1531193535632a5ece768916d0
Stats:     540 lines in 24 files changed: 258 ins; 258 del; 24 mod

8268118: Rename bytes_os_cpu.inline.hpp files to bytes_os_cpu.hpp

Reviewed-by: coleenp, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/4310

From mcimadamore at openjdk.java.net Thu Jun 3 09:44:42 2021
From: mcimadamore at openjdk.java.net (Maurizio Cimadamore)
Date: Thu, 3 Jun 2021 09:44:42 GMT
Subject: Integrated: 8266257: Fix foreign linker build issues for ppc and s390
In-Reply-To:
References:
Message-ID:

On Wed, 2 Jun 2021 11:27:28 GMT, Maurizio Cimadamore wrote:

> This patch addresses some build issues introduced by integration of JEP-412.
> The support for JEP-412 turns some static functions (e.g. `float_move`, `long_move`) in sharedRuntime into proper member functions, as they need to be referenced by the new support for Panama upcall handlers. Sadly, not all Hotspot ports agree on the number of parameters these functions take - most notably, ppc and s390 have incompatible signatures, and, because of that, fail to build.
>
> A simpler solution is to move these functions to the x86 macro assembler - after all, these functions are specific to a given platform, and excessive sharing should be avoided. This patch does that - and fixes other remaining issues with non-standard hotspot builds (e.g. by adding stub implementations for some unimplemented Panama support).

This pull request has now been integrated.
Changeset: 29ab1628 Author: Maurizio Cimadamore URL: https://git.openjdk.java.net/jdk/commit/29ab16284a4f1ac7ed691fd12cb622b0440c04be Stats: 532 lines in 15 files changed: 289 ins; 221 del; 22 mod 8266257: Fix foreign linker build issues for ppc and s390 Reviewed-by: jvernee, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/4303 From ayang at openjdk.java.net Thu Jun 3 11:29:48 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 11:29:48 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions Message-ID: Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. ------------- Commit messages: - worker_thread Changes: https://git.openjdk.java.net/jdk/pull/4334/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4334&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268164 Stats: 14 lines in 4 files changed: 5 ins; 8 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4334.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4334/head:pull/4334 PR: https://git.openjdk.java.net/jdk/pull/4334 From stefank at openjdk.java.net Thu Jun 3 11:40:41 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 3 Jun 2021 11:40:41 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 11:21:11 GMT, Albert Mingkun Yang wrote: > Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. 

I wonder if it wouldn't be nicer to be able to write:

    WorkerThread::current()->id()

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From ayang at openjdk.java.net Thu Jun 3 12:02:36 2021
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 3 Jun 2021 12:02:36 GMT
Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions
In-Reply-To:
References:
Message-ID:

On Thu, 3 Jun 2021 11:38:07 GMT, Stefan Karlsson wrote:

> I wonder if it wouldn't be nicer to be able to write:
>
> ```
> WorkerThread::current()->id()
> ```

That will cause a compile error:

    src/hotspot/share/gc/shared/referenceProcessor.cpp:933:35: error: no member named 'id' in 'Thread'
        id = WorkerThread::current()->id();
             ~~~~~~~~~~~~~~~~~~~~~~~  ^
    1 error generated.

`WorkerThread` doesn't have a static `current()` method, so it will get the one from `Thread`, which returns a `Thread*`. However, `id()` is only defined for `WorkerThread`, not `Thread`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From stefank at openjdk.java.net Thu Jun 3 12:29:36 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 12:29:36 GMT
Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions
In-Reply-To:
References:
Message-ID:

On Thu, 3 Jun 2021 11:59:35 GMT, Albert Mingkun Yang wrote:

> > I wonder if it wouldn't be nicer to be able to write:
> > ```
> > WorkerThread::current()->id()
> > ```
>
> That will cause a compile error:
>
> ```
> src/hotspot/share/gc/shared/referenceProcessor.cpp:933:35: error: no member named 'id' in 'Thread'
>     id = WorkerThread::current()->id();
>          ~~~~~~~~~~~~~~~~~~~~~~~  ^
> 1 error generated.
> ```
>
> `WorkerThread` doesn't have a static `current()` method, so it will get the one from `Thread`, which returns a `Thread*`. However, `id()` is only defined for `WorkerThread`, not `Thread`.

Looks like you just called Thread::current() via WorkerThread.
I mean that it would be nice if you could create a function that looked something like this:

    static WorkerThread* WorkerThread::current() {
      return WorkerThread::cast(Thread::current());
    }

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From stefank at openjdk.java.net Thu Jun 3 12:35:39 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Thu, 3 Jun 2021 12:35:39 GMT
Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions
In-Reply-To:
References:
Message-ID:

On Thu, 3 Jun 2021 11:21:11 GMT, Albert Mingkun Yang wrote:

> Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`.

I now see that they actually have this for the CompilerThread:
https://github.com/openjdk/jdk/pull/4240/files#diff-2dcac7a3550ebd199e8cffdeaabdba67643e794c6a6f9b0042a8cfe7dbaa9066

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From erikj at openjdk.java.net Thu Jun 3 12:42:40 2021
From: erikj at openjdk.java.net (Erik Joelsson)
Date: Thu, 3 Jun 2021 12:42:40 GMT
Subject: RFR: JDK-8266254: Update to use jtreg 6 [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 2 Jun 2021 20:15:55 GMT, Jonathan Gibbons wrote:

>> Please review the change to update to using jtreg 6.
>>
>> The primary change is to the jib-profiles.js file, which specifies the version of jtreg to use, for those systems that rely on this file. In addition, the `requiredVersion` has been updated in the various `TEST.ROOT` files.
>>
>> All the tests that could be updated ahead of time have been updated. There are a few tests remaining that need to be done at this time, because of the change in the module name for TestNG 7.3. It changed from a default of `testng` to an explicit `org.testng`.
>
> Jonathan Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update copyright years

This PR wasn't handled properly when integrated.
The change looks like it was pushed correctly and the bug was updated, so we are only missing some book keeping in the PR. I've filed https://bugs.openjdk.java.net/browse/SKARA-1069. ------------- PR: https://git.openjdk.java.net/jdk/pull/4315 From ayang at openjdk.java.net Thu Jun 3 12:47:02 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 12:47:02 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: > Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4334/files - new: https://git.openjdk.java.net/jdk/pull/4334/files/aa2769f4..daf5990f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4334&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4334&range=00-01 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4334.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4334/head:pull/4334 PR: https://git.openjdk.java.net/jdk/pull/4334 From ayang at openjdk.java.net Thu Jun 3 12:47:02 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 12:47:02 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 12:26:59 GMT, Stefan Karlsson wrote: > I mean that it would be nice if you could create a function that looked something like this... I see. Updated. 
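
The name-lookup issue and the suggested fix from this thread can be sketched as follows. This is a minimal illustration using simplified stand-in classes, not the real HotSpot `Thread` and `WorkerThread`; the `set_current` hook and the thread-local storage are assumptions made only so the example is self-contained.

```cpp
#include <cassert>

// Simplified stand-in for HotSpot's Thread (hypothetical, for illustration).
class Thread {
 public:
  virtual ~Thread() = default;
  virtual bool is_Worker_thread() const { return false; }

  // Returns the current thread as a plain Thread*.
  static Thread* current() { return _current; }
  // Hypothetical hook so the example can install a "current" thread.
  static void set_current(Thread* t) { _current = t; }

 private:
  static thread_local Thread* _current;
};
thread_local Thread* Thread::_current = nullptr;

// Simplified stand-in for HotSpot's WorkerThread.
class WorkerThread : public Thread {
 public:
  explicit WorkerThread(unsigned id) : _id(id) {}
  bool is_Worker_thread() const override { return true; }
  unsigned id() const { return _id; }

  // Checked downcast in the "cast notation" style.
  static WorkerThread* cast(Thread* t) {
    assert(t->is_Worker_thread() && "incorrect cast to WorkerThread");
    return static_cast<WorkerThread*>(t);
  }

  // Without this declaration, WorkerThread::current() would resolve to the
  // inherited Thread::current() and return a Thread*, so ->id() would not
  // compile. Declaring it here hides the base version.
  static WorkerThread* current() { return cast(Thread::current()); }

 private:
  unsigned _id;
};
```

With the extra `current()` in place, `WorkerThread::current()->id()` compiles and returns the id of the installed worker thread.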
------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From stefank at openjdk.java.net Thu Jun 3 12:55:38 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 3 Jun 2021 12:55:38 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 12:47:02 GMT, Albert Mingkun Yang wrote: >> Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by stefank (Reviewer). src/hotspot/share/runtime/nonJavaThread.hpp line 111: > 109: > 110: static WorkerThread* cast(Thread* t) { > 111: assert(t->is_Worker_thread(), "incorrect cast to const WorkerThread"); It should probably not say *const* here. ------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From dholmes at openjdk.java.net Thu Jun 3 13:05:38 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 3 Jun 2021 13:05:38 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 12:51:01 GMT, Stefan Karlsson wrote: >> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > src/hotspot/share/runtime/nonJavaThread.hpp line 111: > >> 109: >> 110: static WorkerThread* cast(Thread* t) { >> 111: assert(t->is_Worker_thread(), "incorrect cast to const WorkerThread"); > > It should probably not say *const* here. `as_Worker_thread()` used const so I wonder whether the new cast function should too? 
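
One possible way to answer the const question, shown here purely as a sketch and not as what the PR ended up doing, is to provide both a const and a non-const overload of the cast so that constness of the argument is preserved. The classes below are simplified stand-ins for the HotSpot ones.

```cpp
#include <cassert>

// Simplified stand-ins for the HotSpot classes (hypothetical, for illustration).
class Thread {
 public:
  virtual ~Thread() = default;
  virtual bool is_Worker_thread() const { return false; }
};

class WorkerThread : public Thread {
 public:
  explicit WorkerThread(unsigned id) : _id(id) {}
  bool is_Worker_thread() const override { return true; }
  unsigned id() const { return _id; }

  // Non-const overload: a mutable Thread* yields a mutable WorkerThread*.
  static WorkerThread* cast(Thread* t) {
    assert(t->is_Worker_thread() && "incorrect cast to WorkerThread");
    return static_cast<WorkerThread*>(t);
  }

  // Const overload: a const Thread* yields a const WorkerThread*, so the
  // cast never silently removes constness.
  static const WorkerThread* cast(const Thread* t) {
    assert(t->is_Worker_thread() && "incorrect cast to const WorkerThread");
    return static_cast<const WorkerThread*>(t);
  }

 private:
  unsigned _id;
};
```

Overload resolution picks the variant that matches the pointer passed in, so callers holding a `const Thread*` can still reach const members such as `id()`.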

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From ayang at openjdk.java.net Thu Jun 3 13:14:54 2021
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Thu, 3 Jun 2021 13:14:54 GMT
Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v3]
In-Reply-To:
References:
Message-ID:

> Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`.

Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:

  review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4334/files
  - new: https://git.openjdk.java.net/jdk/pull/4334/files/daf5990f..c032002b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4334&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4334&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4334.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4334/head:pull/4334

PR: https://git.openjdk.java.net/jdk/pull/4334

From luhenry at openjdk.java.net Thu Jun 3 14:30:38 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Thu, 3 Jun 2021 14:30:38 GMT
Subject: RFR: JDK-8267985: Allow AsyncGetCallTrace and JFR to walk a stub frame
In-Reply-To:
References:
Message-ID:

On Mon, 31 May 2021 16:06:10 GMT, Ludovic Henry wrote:

> When the signal sent for AsyncGetCallTrace or JFR would land on a stub (like arraycopy), it wouldn't be able to detect the sender (caller) frame because `_cb->frame_size() == 0`.
>
> Because we fully control how the prolog and epilog of stub code are generated, we know there are two cases:
> 1. A stack frame is allocated via macroAssembler->enter(), and consists of `push rbp; mov rsp, rbp;`.
> 2. No stack frame is allocated: rbp is left unchanged, and rsp is decremented only by the `call` instruction, which pushes the return `pc` onto the stack.
>
> For case 1, we can easily find the sender frame by simply looking at rbp, especially since we know that all stubs preserve the frame pointer (on x86 at least).
>
> For case 2, we end up returning the sender's sender, but that already gives us more information than what we have today.

Depends on https://github.com/openjdk/jdk/pull/4337

-------------

PR: https://git.openjdk.java.net/jdk/pull/4274

From luhenry at openjdk.java.net Thu Jun 3 14:35:05 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Thu, 3 Jun 2021 14:35:05 GMT
Subject: RFR: 8268178: Extract sender frame parsing to CodeBlock::FrameParser
Message-ID:

Whether and how a frame is set up is controlled by the code generator for the specific CodeBlock. The CodeBlock is then in the best place to know how to parse the sender's frame from the current frame in the given CodeBlock.

This refactoring proposes to extract this parsing out of `frame` and into a `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant subclasses of CodeBlock.

This change is largely to facilitate adding new supported cases for JDK-8252417 [1] like runtime stubs.
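
The idea of letting each kind of code provide its own sender-parsing rule can be sketched on a simulated stack. This is a toy model, not the HotSpot implementation: the `Frame`, `FrameParser`, and `FramePointerParser` types are hypothetical, stack slots are indices into a vector rather than real addresses, and only the frame-pointer rule for stubs that save rbp on entry is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack: the index plays the role of a word address, and a higher
// index means an older (caller-side) slot.
using Stack = std::vector<std::uintptr_t>;

struct Frame {
  std::size_t sp;     // stack pointer (index into the simulated stack)
  std::size_t fp;     // frame pointer (index into the simulated stack)
  std::uintptr_t pc;  // program counter
};

// Hypothetical interface: each kind of generated code supplies its own way
// of locating the sender of a frame.
struct FrameParser {
  virtual ~FrameParser() = default;
  virtual Frame sender(const Stack& stack, const Frame& current) const = 0;
};

// Rule for code that pushed the caller's frame pointer and then copied the
// stack pointer into the frame pointer on entry:
//   stack[fp]     holds the caller's saved frame pointer
//   stack[fp + 1] holds the return pc into the caller
//   fp + 2        is the caller's stack pointer after the return
struct FramePointerParser : FrameParser {
  Frame sender(const Stack& stack, const Frame& current) const override {
    assert(current.fp + 1 < stack.size());
    Frame s;
    s.fp = static_cast<std::size_t>(stack[current.fp]);
    s.pc = stack[current.fp + 1];
    s.sp = current.fp + 2;
    return s;
  }
};
```

If the stub did not set up a frame, the frame pointer still belongs to the caller, so applying the same rule yields the caller's caller. That matches the "sender's sender" behaviour described for case 2 in JDK-8267985.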

[1] https://bugs.openjdk.java.net/browse/JDK-8252417

-------------

Commit messages:
 - 8268178: Extract sender frame parsing to CodeBlock::FrameParser

Changes: https://git.openjdk.java.net/jdk/pull/4337/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8268178
  Stats: 584 lines in 17 files changed: 472 ins; 86 del; 26 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4337.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4337/head:pull/4337

PR: https://git.openjdk.java.net/jdk/pull/4337

From github.com+4146708+a74nh at openjdk.java.net Thu Jun 3 15:14:16 2021
From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward)
Date: Thu, 3 Jun 2021 15:14:16 GMT
Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v7]
In-Reply-To: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com>
References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com>
Message-ID:

> On PAC systems, native code may sign return addresses before saving them to the stack. We must ensure we strip any signed bits in order to walk the stack.
> Add extra asserts in places where we do not expect saved return addresses to be signed.
>
> On non-PAC systems, all PAC instructions are treated as NOPs.
>
> On Apple, use the provided ptrauth interface instead of asm as the compiler may optimise further.
>
> Fedora 33 compiles all distro packages using PAC.
Running the distro > provided OpenJDK-latest in GDB on a PAC system: > > Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () > from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so > (gdb) call (int)pns($sp, $fp, $pc) > > "Executing pns" > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xe26fe4] init_globals()+0x10 > C 0x006ffffff74750c4 > C 0x0042fffff6a7f84c > C 0x0037fffff7fa0954 > C 0x0030fffff7fa4540 > C 0x0078fffff7d980c8 > > OpenJDK with this patch at the same breakpoint: > > (gdb) call (int)pns($sp, $fp, $pc) > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 > > OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: > > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 > C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 > J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] > j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base > j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base > j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base > j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base > j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base > j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base > j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base > j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base > j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base > j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V 
[libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 > V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 > V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc > V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc > V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 > V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 > V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c > V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc > V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 > V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 > j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base > j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base > j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base > j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base > j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base > j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 java.base > j 
jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base > j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base > j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base > j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base > j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base > j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base > j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base > j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base > j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base > j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base > j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base > j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base > j java.lang.System.initPhase2(ZZ)I+0 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: sender_pc_raw -> 
sender_pc_maybe_signed Change-Id: I5d9b83f42d05a8773b341708579970d9c449ced2 CustomizedGitHooks: yes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4029/files - new: https://git.openjdk.java.net/jdk/pull/4029/files/406eeed5..ce2ed307 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=05-06 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4029.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4029/head:pull/4029 PR: https://git.openjdk.java.net/jdk/pull/4029 From gziemski at openjdk.java.net Thu Jun 3 15:14:19 2021 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 3 Jun 2021 15:14:19 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v6] In-Reply-To: <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> <6XgtFHEGXqVgiGLcWadXJzSSUtMR_sBNT-TO1Z9WPvY=.22753345-69fe-4d63-83e9-8df1b695c717@github.com> Message-ID: On Wed, 2 Jun 2021 09:49:04 GMT, Alan Hayward wrote: >> On PAC systems, native code may sign return addresses before saving >> them to the stack. We must ensure we strip the any signed bits in >> order to walk the stack. >> Add extra asserts in places where we do not expect saved return >> addresses to be signed. >> >> On non-PAC systems, all PAC instructions are treated as NOPs. >> >> On Apple, use the provided ptrauth interface instead of asm >> as the compiler may optimise further. >> >> Fedora 33 compiles all distro packages using PAC. 
Running the distro >> provided OpenJDK-latest in GDB on a PAC system: >> >> Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () >> from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so >> (gdb) call (int)pns($sp, $fp, $pc) >> >> "Executing pns" >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0xe26fe4] init_globals()+0x10 >> C 0x006ffffff74750c4 >> C 0x0042fffff6a7f84c >> C 0x0037fffff7fa0954 >> C 0x0030fffff7fa4540 >> C 0x0078fffff7d980c8 >> >> OpenJDK with this patch at the same breakpoint: >> >> (gdb) call (int)pns($sp, $fp, $pc) >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 >> >> OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: >> >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 >> C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 >> J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] >> j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base >> j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base >> j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base >> j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base >> j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base >> j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base >> j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base >> j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base >> v ~StubRoutines::call_stub >> V 
[libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 >> V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 >> V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc >> V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc >> V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 >> V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 >> V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c >> V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc >> V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 >> V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 >> j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base >> j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base >> j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base >> j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base >> j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base >> j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 
java.base >> j jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base >> j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base >> j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base >> j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base >> j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base >> j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base >> j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base >> j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base >> j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base >> j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base >> j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base >> j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base >> j java.lang.System.initPhase2(ZZ)I+0 java.base >> v ~StubRoutines::call_stub >> V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 > > Alan Hayward has updated the pull request incrementally with one additional commit 
since the last revision: > > Remove asserts / fix build > > CustomizedGitHooks: yes > Change-Id: I6b634b90e81cf8f6e4cd1cdc63f6926eaa7025f6 Marked as reviewed by gziemski (Committer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From psandoz at openjdk.java.net Thu Jun 3 15:21:44 2021 From: psandoz at openjdk.java.net (Paul Sandoz) Date: Thu, 3 Jun 2021 15:21:44 GMT Subject: Integrated: 8266317: Vector API enhancements In-Reply-To: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> References: <9SrHXt3Om3jkyg0A4_X-L3YSMFM4Ib7Y6blBVFcs_Ik=.3122d7fc-e762-4982-ac8c-46cb43b5606a@github.com> Message-ID: On Thu, 29 Apr 2021 21:13:38 GMT, Paul Sandoz wrote: > This PR contains API and implementation changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Enhancements are made to the API for the support of operations on characters, such as for UTF-8 character decoding. Specifically, methods for loading/storing a `short` vector from/to a `char[]` array, and new vector comparison operators for unsigned comparisons with integral vectors. The x64 implementation is enhanced to supported unsigned comparisons. > > Enhancements are made to the API for loading/storing a `byte` vector from/to a `boolean[]` array. > > The testing of loads/stores can be expanded for scatter/gather, but before doing that i think some refactoring of the tests is required to reposition tests in the right classes. I would like to do that work after integration of the PR. This pull request has now been integrated. 
Changeset: 5982cfc8 Author: Paul Sandoz URL: https://git.openjdk.java.net/jdk/commit/5982cfc85602862608fae56adb6041794e8c0d59 Stats: 10017 lines in 121 files changed: 9085 ins; 190 del; 742 mod 8266317: Vector API enhancements Co-authored-by: Paul Sandoz Co-authored-by: Sandhya Viswanathan Reviewed-by: jbhateja, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/3803 From chagedorn at openjdk.java.net Thu Jun 3 15:40:12 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 3 Jun 2021 15:40:12 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v12] In-Reply-To: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: > This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. > > The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. 
> > A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. > > To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. > > Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): > There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. 
> > Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): > > - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. > - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions > - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) > - which leaves 4382 lines of code inserted > > Big thanks to: > - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. > - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. > - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. > - and others who provided valuable feedback. 
> > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: Fix failing internal tests on Windows and add missing flag description in README ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3508/files - new: https://git.openjdk.java.net/jdk/pull/3508/files/7a316de0..df7576f7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=10-11 Stats: 18 lines in 5 files changed: 9 ins; 0 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/3508.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3508/head:pull/3508 PR: https://git.openjdk.java.net/jdk/pull/3508 From iignatyev at openjdk.java.net Thu Jun 3 16:39:50 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 3 Jun 2021 16:39:50 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v12] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 3 Jun 2021 15:40:12 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. 
>> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. >> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. 
>> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. >> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > Fix failing internal tests on Windows and add missing flag description in README LGTM, modulo two nits in README file and the location of tests and examples: I'd prefer to place them into `/test/hotspot/jtreg/testlibrary_tests/ir_framework` and add explicit reference to that location in README file. test/hotspot/jtreg/compiler/lib/ir_framework/README.md line 34: > 32: > 33: ## 2. 
Features > 34: The framework offers various annotations and flags to control how your test code should be invoked and being checked. This section gives an overview over all these features. redundant space at the beginning of the line test/hotspot/jtreg/compiler/lib/ir_framework/README.md line 135: > 133: > 134: ## 6. Summary > 135: The initial design and feature set was kept simple and straight forward and serves well for small to medium sized tests. There are a lot of possibilities to further enhance the framework and make it more powerful. This can be tackled in additional RFEs. A few ideas can be found as subtasks of the [initial RFE](https://bugs.openjdk.java.net/browse/JDK-8254129) for this framework. redundant space: Suggestion: The initial design and feature set was kept simple and straight forward and serves well for small to medium sized tests. There are a lot of possibilities to further enhance the framework and make it more powerful. This can be tackled in additional RFEs. A few ideas can be found as subtasks of the [initial RFE](https://bugs.openjdk.java.net/browse/JDK-8254129) for this framework. ------------- Marked as reviewed by iignatyev (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3508 From sspitsyn at openjdk.java.net Thu Jun 3 16:53:38 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Thu, 3 Jun 2021 16:53:38 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash In-Reply-To: References: Message-ID: <-6_SJRedB1131iS7_XYwro6x3ANkwoGyo_oG9pDftKA=.3658fe2f-66a3-4424-a679-5d75ce2182d3@github.com> On Thu, 3 Jun 2021 06:15:28 GMT, Leonid Mesnik wrote: > Fixed a race condition between posting and enabling DynamicCodeGenerated event. Hi Leonid, The fix looks good to me. Thanks, Serguei src/hotspot/share/prims/jvmtiExport.cpp line 2293: > 2291: // jvmti thread state. 
> 2292: // The collector and/or state might be NULL if JvmtiDynamicCodeEventCollector has been initialized > 2293: // while JVMTI_EVENT_DYNAMIC_CODE_GENERATED was disabled Could you rebalance this comment a little bit and add a dot at the end? ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4331 From lmesnik at openjdk.java.net Thu Jun 3 17:24:07 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 3 Jun 2021 17:24:07 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: > Fixed a race condition between posting and enabling DynamicCodeGenerated event. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: fixed comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4331/files - new: https://git.openjdk.java.net/jdk/pull/4331/files/4a998eeb..c650f446 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4331&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4331&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/4331.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4331/head:pull/4331 PR: https://git.openjdk.java.net/jdk/pull/4331 From iignatyev at openjdk.java.net Thu Jun 3 17:29:36 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 3 Jun 2021 17:29:36 GMT Subject: RFR: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container 
tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor Thanks, Misha! can I get a second Review? -- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/4241 From naoto at openjdk.java.net Thu Jun 3 17:51:40 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Thu, 3 Jun 2021 17:51:40 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> Message-ID: <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> On Thu, 3 Jun 2021 06:55:26 GMT, Maxim Kartashev wrote: >> src/hotspot/os/windows/os_windows.cpp line 1491: >> >>> 1489: static errno_t convert_UTF8_to_UTF16(char const* utf8_str, LPWSTR* utf16_str) { >>> 1490: return convert_to_UTF16(utf8_str, CP_UTF8, utf16_str); >>> 1491: } >> >> IIUC, `utf8_str` is the "modified" UTF-8 string in JNI. Using it as the standard UTF-8 (I believe Windows' encoding `CP_UTF8` is the one) may end up in incorrect conversions in some corner cases, e.g., for supplementary characters. > > Right; I changed the code in NativeLibraries.c to pass down true UTF-8 instead of "modified UTF-8". Please, take a look. I am not sure we can pass non `modified UTF-8` through `JVM_LoadLibrary()`. Probably some VM folks can enlighten here? 
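To make the corner case above concrete, here is a minimal sketch (not part of the patch) comparing standard UTF-8 with modified UTF-8 for a supplementary character. It assumes that the modified UTF-8 produced by `DataOutputStream.writeUTF` has the same shape as the modified UTF-8 JNI hands to native code; the class `ModifiedUtf8Demo` and its helper methods are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the PR under review.
public class ModifiedUtf8Demo {

    // Standard UTF-8: a supplementary code point becomes one 4-byte sequence.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as written by DataOutputStream.writeUTF, with the
    // 2-byte length prefix stripped: each surrogate is encoded separately
    // as 3 bytes, and U+0000 is encoded as the two bytes C0 80.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it is a surrogate pair in a Java String.
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(standardUtf8(supplementary).length); // 4
        System.out.println(modifiedUtf8(supplementary).length); // 6
    }
}
```

For plain ASCII and BMP-only paths the two encodings produce the same bytes (apart from U+0000), which is why the mismatch only shows up in corner cases such as supplementary characters.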
------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From sviswanathan at openjdk.java.net Thu Jun 3 17:52:12 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Thu, 3 Jun 2021 17:52:12 GMT Subject: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v17] In-Reply-To: References: Message-ID: > This PR contains Short Vector Math Library support related changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Intel Short Vector Math Library (SVML) based intrinsics in native x86 assembly provide optimized implementation for Vector API transcendental and trigonometric methods. > These methods are built into a separate library instead of being part of libjvm.so or jvm.dll. > > The following changes are made: > The source for these methods is placed in the jdk.incubator.vector module under src/jdk.incubator.vector/linux/native/libsvml and src/jdk.incubator.vector/windows/native/libsvml. > The assembly source files are named as "*.S" and include files are named as "*.S.inc". > The corresponding build script is placed at make/modules/jdk.incubator.vector/Lib.gmk. > Changes are made to build system to support dependency tracking for assembly files with includes. > The built native libraries (libsvml.so/svml.dll) are placed in bin directory of JDK on Windows and lib directory of JDK on Linux. > The C2 JIT uses the dll_load and dll_lookup to get the addresses of optimized methods from this library. > > Build system changes and module library build scripts are contributed by Magnus (magnus.ihse.bursie at oracle.com). > > Looking forward to your review and feedback. 
>
> Performance:
> Micro benchmark        Base   Optimized  Unit    Gain(Optimized/Base)
> Double128Vector.ACOS   45.91  87.34      ops/ms  1.90
> Double128Vector.ASIN   45.06  92.36      ops/ms  2.05
> Double128Vector.ATAN   19.92  118.36     ops/ms  5.94
> Double128Vector.ATAN2  15.24  88.17      ops/ms  5.79
> Double128Vector.CBRT   45.77  208.36     ops/ms  4.55
> Double128Vector.COS    49.94  245.89     ops/ms  4.92
> Double128Vector.COSH   26.91  126.00     ops/ms  4.68
> Double128Vector.EXP    71.64  379.65     ops/ms  5.30
> Double128Vector.EXPM1  35.95  150.37     ops/ms  4.18
> Double128Vector.HYPOT  50.67  174.10     ops/ms  3.44
> Double128Vector.LOG    61.95  279.84     ops/ms  4.52
> Double128Vector.LOG10  59.34  239.05     ops/ms  4.03
> Double128Vector.LOG1P  18.56  200.32     ops/ms  10.79
> Double128Vector.SIN    49.36  240.79     ops/ms  4.88
> Double128Vector.SINH   26.59  103.75     ops/ms  3.90
> Double128Vector.TAN    41.05  152.39     ops/ms  3.71
> Double128Vector.TANH   45.29  169.53     ops/ms  3.74
> Double256Vector.ACOS   54.21  106.39     ops/ms  1.96
> Double256Vector.ASIN   53.60  107.99     ops/ms  2.01
> Double256Vector.ATAN   21.53  189.11     ops/ms  8.78
> Double256Vector.ATAN2  16.67  140.76     ops/ms  8.44
> Double256Vector.CBRT   56.45  397.13     ops/ms  7.04
> Double256Vector.COS    58.26  389.77     ops/ms  6.69
> Double256Vector.COSH   29.44  151.11     ops/ms  5.13
> Double256Vector.EXP    86.67  564.68     ops/ms  6.52
> Double256Vector.EXPM1  41.96  201.28     ops/ms  4.80
> Double256Vector.HYPOT  66.18  305.74     ops/ms  4.62
> Double256Vector.LOG    71.52  394.90     ops/ms  5.52
> Double256Vector.LOG10  65.43  362.32     ops/ms  5.54
> Double256Vector.LOG1P  19.99  300.88     ops/ms  15.05
> Double256Vector.SIN    57.06  380.98     ops/ms  6.68
> Double256Vector.SINH   29.40  117.37     ops/ms  3.99
> Double256Vector.TAN    44.90  279.90     ops/ms  6.23
> Double256Vector.TANH   54.08  274.71     ops/ms  5.08
> Double512Vector.ACOS   55.65  687.54     ops/ms  12.35
> Double512Vector.ASIN   57.31  777.72     ops/ms  13.57
> Double512Vector.ATAN   21.42  729.21     ops/ms  34.04
> Double512Vector.ATAN2  16.37  414.33     ops/ms  25.32
> Double512Vector.CBRT   56.78  834.38     ops/ms  14.69
> Double512Vector.COS    59.88  837.04     ops/ms  13.98
> Double512Vector.COSH   30.34  172.76     ops/ms  5.70
> Double512Vector.EXP    99.66  1608.12    ops/ms  16.14
> Double512Vector.EXPM1  43.39  318.61     ops/ms  7.34
> Double512Vector.HYPOT  73.87  1502.72    ops/ms  20.34
> Double512Vector.LOG    74.84  996.00     ops/ms  13.31
> Double512Vector.LOG10  71.12  1046.52    ops/ms  14.72
> Double512Vector.LOG1P  19.75  776.87     ops/ms  39.34
> Double512Vector.POW    37.42  384.13     ops/ms  10.26
> Double512Vector.SIN    59.74  728.45     ops/ms  12.19
> Double512Vector.SINH   29.47  143.38     ops/ms  4.87
> Double512Vector.TAN    46.20  587.21     ops/ms  12.71
> Double512Vector.TANH   57.36  495.42     ops/ms  8.64
> Double64Vector.ACOS    24.04  73.67      ops/ms  3.06
> Double64Vector.ASIN    23.78  75.11      ops/ms  3.16
> Double64Vector.ATAN    14.14  62.81      ops/ms  4.44
> Double64Vector.ATAN2   10.38  44.43      ops/ms  4.28
> Double64Vector.CBRT    16.47  107.50     ops/ms  6.53
> Double64Vector.COS     23.42  152.01     ops/ms  6.49
> Double64Vector.COSH    17.34  113.34     ops/ms  6.54
> Double64Vector.EXP     27.08  203.53     ops/ms  7.52
> Double64Vector.EXPM1   18.77  96.73      ops/ms  5.15
> Double64Vector.HYPOT   18.54  103.62     ops/ms  5.59
> Double64Vector.LOG     26.75  142.63     ops/ms  5.33
> Double64Vector.LOG10   25.85  139.71     ops/ms  5.40
> Double64Vector.LOG1P   13.26  97.94      ops/ms  7.38
> Double64Vector.SIN     23.28  146.91     ops/ms  6.31
> Double64Vector.SINH    17.62  88.59      ops/ms  5.03
> Double64Vector.TAN     21.00  86.43      ops/ms  4.12
> Double64Vector.TANH    23.75  111.35     ops/ms  4.69
> Float128Vector.ACOS    57.52  110.65     ops/ms  1.92
> Float128Vector.ASIN    57.15  117.95     ops/ms  2.06
> Float128Vector.ATAN    22.52  318.74     ops/ms  14.15
> Float128Vector.ATAN2   17.06  246.07     ops/ms  14.42
> Float128Vector.CBRT    29.72  443.74     ops/ms  14.93
> Float128Vector.COS     42.82  803.02     ops/ms  18.75
> Float128Vector.COSH    31.44  118.34     ops/ms  3.76
> Float128Vector.EXP     72.43  855.33     ops/ms  11.81
> Float128Vector.EXPM1   37.82  127.85     ops/ms  3.38
> Float128Vector.HYPOT   53.20  591.68     ops/ms  11.12
> Float128Vector.LOG     52.95  877.94     ops/ms  16.58
> Float128Vector.LOG10   49.26  603.72     ops/ms  12.26
> Float128Vector.LOG1P   20.89  430.59     ops/ms  20.61
> Float128Vector.SIN     43.38  745.31     ops/ms  17.18
> Float128Vector.SINH    31.11  112.91     ops/ms  3.63
> Float128Vector.TAN     37.25  332.13     ops/ms  8.92
> Float128Vector.TANH    57.63  453.77     ops/ms  7.87
> Float256Vector.ACOS    65.23  123.73     ops/ms  1.90
> Float256Vector.ASIN    63.41  132.86     ops/ms  2.10
> Float256Vector.ATAN    23.51  649.02     ops/ms  27.61
> Float256Vector.ATAN2   18.19  455.95     ops/ms  25.07
> Float256Vector.CBRT    45.99  594.81     ops/ms  12.93
> Float256Vector.COS     43.75  926.69     ops/ms  21.18
> Float256Vector.COSH    33.52  130.46     ops/ms  3.89
> Float256Vector.EXP     75.70  1366.72    ops/ms  18.05
> Float256Vector.EXPM1   39.00  149.72     ops/ms  3.84
> Float256Vector.HYPOT   52.91  1023.18    ops/ms  19.34
> Float256Vector.LOG     53.31  1545.77    ops/ms  29.00
> Float256Vector.LOG10   50.31  863.80     ops/ms  17.17
> Float256Vector.LOG1P   21.51  616.59     ops/ms  28.66
> Float256Vector.SIN     44.07  911.04     ops/ms  20.67
> Float256Vector.SINH    33.16  122.50     ops/ms  3.69
> Float256Vector.TAN     37.85  497.75     ops/ms  13.15
> Float256Vector.TANH    64.27  537.20     ops/ms  8.36
> Float512Vector.ACOS    67.33  1718.00    ops/ms  25.52
> Float512Vector.ASIN    66.12  1780.85    ops/ms  26.93
> Float512Vector.ATAN    22.63  1780.31    ops/ms  78.69
> Float512Vector.ATAN2   17.52  1113.93    ops/ms  63.57
> Float512Vector.CBRT    54.78  2087.58    ops/ms  38.11
> Float512Vector.COS     40.92  1567.93    ops/ms  38.32
> Float512Vector.COSH    33.42  138.36     ops/ms  4.14
> Float512Vector.EXP     70.51  3835.97    ops/ms  54.41
> Float512Vector.EXPM1   38.06  279.80     ops/ms  7.35
> Float512Vector.HYPOT   50.99  3287.55    ops/ms  64.47
> Float512Vector.LOG     49.61  3156.99    ops/ms  63.64
> Float512Vector.LOG10   46.94  2489.16    ops/ms  53.02
> Float512Vector.LOG1P   20.66  1689.86    ops/ms  81.81
> Float512Vector.POW     32.73  1015.85    ops/ms  31.04
> Float512Vector.SIN     41.17  1587.71    ops/ms  38.56
> Float512Vector.SINH    33.05  129.39     ops/ms  3.91
> Float512Vector.TAN     35.60  1336.11    ops/ms  37.53
> Float512Vector.TANH    65.77  2295.28    ops/ms  34.90
> Float64Vector.ACOS     48.41  89.34      ops/ms  1.85
> Float64Vector.ASIN     47.30  95.72      ops/ms  2.02
> Float64Vector.ATAN     20.62  49.45      ops/ms  2.40
> Float64Vector.ATAN2    15.95  112.35     ops/ms  7.04
> Float64Vector.CBRT     24.03  134.57     ops/ms  5.60
> Float64Vector.COS      44.28  394.33     ops/ms  8.91
> Float64Vector.COSH     28.35  95.27      ops/ms  3.36
> Float64Vector.EXP      65.80  486.37     ops/ms  7.39
> Float64Vector.EXPM1    34.61  85.99      ops/ms  2.48
> Float64Vector.HYPOT    50.40  147.82     ops/ms  2.93
> Float64Vector.LOG      51.93  163.25     ops/ms  3.14
> Float64Vector.LOG10    49.53  147.98     ops/ms  2.99
> Float64Vector.LOG1P    19.20  206.81     ops/ms  10.77
> Float64Vector.SIN      44.41  382.09     ops/ms  8.60
> Float64Vector.SINH     28.20  90.68      ops/ms  3.22
> Float64Vector.TAN      36.29  160.89     ops/ms  4.43
> Float64Vector.TANH     47.65  214.04     ops/ms  4.49

Sandhya Viswanathan has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits:

 - Merge master
 - update javadoc
 - correct javadoc
 - Javadoc changes
 - correct ppc.ad
 - Merge master
 - Commit missing changes
 - Implement Vladimir Ivanov and Paul Sandoz review comments
 - fix 32-bit build
 - Add comments explaining naming convention
 - ...
and 11 more: https://git.openjdk.java.net/jdk/compare/52d8215a...03ac3197 ------------- Changes: https://git.openjdk.java.net/jdk/pull/3638/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3638&range=16 Stats: 416073 lines in 119 files changed: 415886 ins; 124 del; 63 mod Patch: https://git.openjdk.java.net/jdk/pull/3638.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3638/head:pull/3638 PR: https://git.openjdk.java.net/jdk/pull/3638 From naoto at openjdk.java.net Thu Jun 3 17:54:38 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Thu, 3 Jun 2021 17:54:38 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> Message-ID: On Thu, 3 Jun 2021 06:59:01 GMT, Maxim Kartashev wrote: >> test/hotspot/jtreg/runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java line 42: >> >>> 40: String nativePathSetting = "-Dtest.nativepath=" + getSystemProperty("test.nativepath"); >>> 41: ProcessBuilder pb = ProcessTools.createTestJvm(nativePathSetting, LoadLibraryUnicode.class.getName()); >>> 42: pb.environment().put("LC_ALL", "en_US.UTF-8"); >> >> Some environments/user configs may not have the `UTF-8` codeset on the platform. May need to gracefully exit in such a case. > > I added a `java.nio.charset.Charset.isSupported("UTF-8")` check to the test. Hope that's enough for the environments without `UTF-8`. `Charset.isSupported()` reports whether Java has an encoder/decoder for the charset, not whether the platform has the codeset. I think we can simply limit the test to Windows with an `@requires` tag in the test. Also, I would like to see the test case use some supplementary characters.
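As a rough illustration of the distinction drawn above, the following Java sketch (not part of the test under review) shows that `Charset.isSupported("UTF-8")` only answers a question about the Java runtime. The class name `CharsetCheck` is hypothetical, and `sun.jnu.encoding` is an implementation-specific OpenJDK property printed purely for comparison, not a supported API.

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
final class CharsetCheck {
    /**
     * True if this Java runtime has an encoder/decoder for the named charset.
     * This says nothing about which locales or codesets the OS has installed.
     */
    static boolean javaSupports(String charsetName) {
        return Charset.isSupported(charsetName);
    }

    public static void main(String[] args) {
        // UTF-8 is one of the standard charsets every Java platform must provide,
        // so this check passes even where the OS has no UTF-8 locale.
        System.out.println("Java supports UTF-8: " + javaSupports("UTF-8"));
        // What the runtime derived from the environment; may well differ from UTF-8.
        System.out.println("default charset:     " + Charset.defaultCharset());
        // Implementation-specific property (assumption: OpenJDK-based runtime).
        System.out.println("sun.jnu.encoding:    " + System.getProperty("sun.jnu.encoding"));
    }
}
```

Since the first check succeeds on every conforming runtime, it cannot be used to skip the test on platforms that lack a UTF-8 locale.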
------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From hseigel at openjdk.java.net Thu Jun 3 19:00:59 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 3 Jun 2021 19:00:59 GMT Subject: RFR: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor Would this change prevent these tests from being run in many cases, such as when -Xint or -Xcomp is specified for the test run? Were tests failing because of the external flags? ------------- PR: https://git.openjdk.java.net/jdk/pull/4241 From hseigel at openjdk.java.net Thu Jun 3 19:25:02 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 3 Jun 2021 19:25:02 GMT Subject: RFR: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor LGTM. Thanks for doing this. Harold ------------- Marked as reviewed by hseigel (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/4241 From sspitsyn at openjdk.java.net Thu Jun 3 19:33:05 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Thu, 3 Jun 2021 19:33:05 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 17:24:07 GMT, Leonid Mesnik wrote: >> Fixed a race condition between posting and enabling DynamicCodeGenerated event. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fixed comment Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4331 From sviswanathan at openjdk.java.net Thu Jun 3 20:07:09 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Thu, 3 Jun 2021 20:07:09 GMT Subject: Integrated: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics In-Reply-To: References: Message-ID: On Thu, 22 Apr 2021 19:07:28 GMT, Sandhya Viswanathan wrote: > This PR contains Short Vector Math Library support related changes for [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for when targeted. > > Intel Short Vector Math Library (SVML) based intrinsics in native x86 assembly provide optimized implementation for Vector API transcendental and trigonometric methods. > These methods are built into a separate library instead of being part of libjvm.so or jvm.dll. > > The following changes are made: > The source for these methods is placed in the jdk.incubator.vector module under src/jdk.incubator.vector/linux/native/libsvml and src/jdk.incubator.vector/windows/native/libsvml. > The assembly source files are named as "*.S" and include files are named as "*.S.inc". > The corresponding build script is placed at make/modules/jdk.incubator.vector/Lib.gmk.
> Changes are made to build system to support dependency tracking for assembly files with includes. > The built native libraries (libsvml.so/svml.dll) are placed in bin directory of JDK on Windows and lib directory of JDK on Linux. > The C2 JIT uses the dll_load and dll_lookup to get the addresses of optimized methods from this library. > > Build system changes and module library build scripts are contributed by Magnus (magnus.ihse.bursie at oracle.com). > > Looking forward to your review and feedback. This pull request has now been integrated.
Changeset: 9f05c411 Author: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/commit/9f05c411e6d6bdf612cf0cf8b9fe4ca9ecde50d1 Stats: 416073 lines in 119 files changed: 415886 ins; 124 del; 63 mod 8265783: Create a separate library for x86 Intel SVML assembly intrinsics Co-authored-by: Sandhya Viswanathan Co-authored-by: Rahul Kandu Co-authored-by: Razvan Lupusoru Co-authored-by: Magnus Ihse Bursie Co-authored-by: Jie Fu Co-authored-by: Ahmet Akkas Co-authored-by: Marius Cornea Reviewed-by: erikj, kvn, psandoz ------------- PR: https://git.openjdk.java.net/jdk/pull/3638 From ayang at openjdk.java.net Thu Jun 3 20:56:00 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Thu, 3 Jun 2021 20:56:00 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 13:02:56 GMT, David Holmes wrote: >> src/hotspot/share/runtime/nonJavaThread.hpp line 111: >> >>> 109: >>> 110: static WorkerThread* cast(Thread* t) { >>> 111: assert(t->is_Worker_thread(), "incorrect cast to const WorkerThread"); >> >> It should probably not say *const* here. > > `as_Worker_thread()` used const so I wonder whether the new cast function should too? You mean `static const WorkerThread* cast(const Thread* t) { ... }`? As the implementation of this method is nothing but a `static_cast` (and it will probably stay that way in the future), having `const Thread*` won't really catch anything interesting, IMO. ------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From dcubed at openjdk.java.net Thu Jun 3 21:19:59 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 3 Jun 2021 21:19:59 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 17:24:07 GMT, Leonid Mesnik wrote: >> Fixed a race condition between posting and enabling DynamicCodeGenerated event. 
> > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fixed comment Thumbs up. I just have minor comments. Just curious: what are the execution times for the new test on the Mach5 platforms? test/hotspot/jtreg/serviceability/jvmti/DynamicCodeGenerated/libDynamicCodeGenerated.cpp line 27: > 25: #include > 26: > 27: jvmtiEnv* jvmti; Should this be 'static'? test/hotspot/jtreg/serviceability/jvmti/DynamicCodeGenerated/libDynamicCodeGenerated.cpp line 35: > 33: JNIEXPORT > 34: void JNICALL Java_DynamicCodeGeneratedTest_changeEventNotificationMode(JNIEnv* jni, jclass cls) { > 35: while(true) { nit - please add a space before '(' ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4331 From dcubed at openjdk.java.net Thu Jun 3 21:24:57 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 3 Jun 2021 21:24:57 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 17:24:07 GMT, Leonid Mesnik wrote: >> Fixed a race condition between posting and enabling DynamicCodeGenerated event. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fixed comment test/hotspot/jtreg/serviceability/jvmti/DynamicCodeGenerated/DynamicCodeGeneratedTest.java line 55: > 53: Reference.reachabilityFence(result); > 54: }).start(); > 55: } I just noticed no `join()` calls to clean up these threads. Does this mean we'll have 10,000 thread objects waiting around until the end of the program?
------------- PR: https://git.openjdk.java.net/jdk/pull/4331 From lmesnik at openjdk.java.net Thu Jun 3 21:33:29 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 3 Jun 2021 21:33:29 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v3] In-Reply-To: References: Message-ID: > Fixed a race condition between posting and enabling DynamicCodeGenerated event. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: updated test. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4331/files - new: https://git.openjdk.java.net/jdk/pull/4331/files/c650f446..7482d580 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4331&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4331&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/4331.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4331/head:pull/4331 PR: https://git.openjdk.java.net/jdk/pull/4331 From dcubed at openjdk.java.net Thu Jun 3 21:57:55 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 3 Jun 2021 21:57:55 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v3] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 21:33:29 GMT, Leonid Mesnik wrote: >> Fixed a race condition between posting and enabling DynamicCodeGenerated event. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > updated test. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4331 From lmesnik at openjdk.java.net Thu Jun 3 22:26:58 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 3 Jun 2021 22:26:58 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v3] In-Reply-To: References: Message-ID: <-7O2hk6F8PtOxkmPc2vO6pp5qIrhi9MzY42vAJCirrQ=.19624775-87f9-453c-92cd-8096aff3dd8d@github.com> On Thu, 3 Jun 2021 21:33:29 GMT, Leonid Mesnik wrote: >> Fixed a race condition between posting and enabling DynamicCodeGenerated event. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > updated test. > Just curious: what are the execution times for the new test on the Mach5 platforms? It takes ~3 seconds to complete the test in Mach5. ------------- PR: https://git.openjdk.java.net/jdk/pull/4331 From lmesnik at openjdk.java.net Thu Jun 3 22:26:59 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 3 Jun 2021 22:26:59 GMT Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 21:21:45 GMT, Daniel D. Daugherty wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed comment > > test/hotspot/jtreg/serviceability/jvmti/DynamicCodeGenerated/DynamicCodeGeneratedTest.java line 55: > >> 53: Reference.reachabilityFence(result); >> 54: }).start(); >> 55: } > > I just noticed no `join()` calls to clean up these threads. > Does this mean we'll have 10,000 thread objects waiting around > until the end of the program? Yes, we don't care about thread completion. We just start new threads while the earlier ones are completing. I reduced the number of threads to 2,000, which is still enough to reproduce the crash, and 2,000 threads should not harm any system. I checked this in Mach5.
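For reference, a minimal sketch of what the `join()` variant asked about above could look like. This is not the test's actual code: the class `JoinSketch`, the method `runAndJoin`, and the counter are invented for illustration, and the thread body is a trivial stand-in for the real workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, for illustration only.
final class JoinSketch {
    /** Starts n short-lived threads, waits for all of them, and returns how many ran. */
    static int runAndJoin(int n) {
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            // Stand-in body; the real test does its own work here.
            Thread t = new Thread(completed::incrementAndGet);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                // join() does not "clean up" the thread; it only guarantees
                // the thread has terminated before the caller moves on.
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("threads completed: " + runAndJoin(2000));
    }
}
```

Keeping the `Thread` references in a list is the cost of this variant. The fire-and-forget loop in the test avoids that, at the price of not knowing when the last thread has finished.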
------------- PR: https://git.openjdk.java.net/jdk/pull/4331 From david.holmes at oracle.com Thu Jun 3 22:31:47 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Jun 2021 08:31:47 +1000 Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v2] In-Reply-To: References: Message-ID: <12339cc3-e136-502c-51b4-ce4a666c8df4@oracle.com> On 4/06/2021 6:56 am, Albert Mingkun Yang wrote: > On Thu, 3 Jun 2021 13:02:56 GMT, David Holmes wrote: > >>> src/hotspot/share/runtime/nonJavaThread.hpp line 111: >>> >>>> 109: >>>> 110: static WorkerThread* cast(Thread* t) { >>>> 111: assert(t->is_Worker_thread(), "incorrect cast to const WorkerThread"); >>> >>> It should probably not say *const* here. >> >> `as_Worker_thread()` used const so I wonder whether the new cast function should too? > > You mean `static const WorkerThread* cast(const Thread* t) { ... }`? As the implementation of this method is nothing but a `static_cast` (and it will probably stay that way in the future), having `const Thread*` won't really catch anything interesting, IMO. Okay - just asking the question. :) For JavaThread we have a const and non-const version (not sure why though). 
David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4334 > From dholmes at openjdk.java.net Fri Jun 4 02:16:04 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 4 Jun 2021 02:16:04 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> Message-ID: On Thu, 3 Jun 2021 17:48:59 GMT, Naoto Sato wrote: >> Right; I changed the code in NativeLibraries.c to pass down true UTF-8 instead of "modified UTF-8". Please take a look. > > I am not sure we can pass anything other than `modified UTF-8` through `JVM_LoadLibrary()`. Perhaps some VM folks can enlighten us here? Not an expert, but my understanding is that the VM only deals with modified UTF-8, as does JNI. So the incoming string should be modified UTF-8 IMO and then converted to UTF-16. That said, this is shared code being modified on the JDK side, so you can't just change the type of string being passed in without updating all the implementations of os::dll_load to support that! ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From david.holmes at oracle.com Fri Jun 4 02:21:10 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Jun 2021 12:21:10 +1000 Subject: RFR: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash [v2] In-Reply-To: References: Message-ID: Hi Dan, On 4/06/2021 7:24 am, Daniel D.Daugherty wrote: > On Thu, 3 Jun 2021 17:24:07 GMT, Leonid Mesnik wrote: > >>> Fixed a race condition between posting and enabling DynamicCodeGenerated event.
>> >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed comment > > test/hotspot/jtreg/serviceability/jvmti/DynamicCodeGenerated/DynamicCodeGeneratedTest.java line 55: > >> 53: Reference.reachabilityFence(result); >> 54: }).start(); >> 55: } > > I just noticed no `join()` calls to clean up these threads. Java doesn't need a join() to "cleanup threads". The main reason to join() threads in a test is to ensure they have terminated before we hand control back to jtreg; and if not daemons to ensure we terminate more predictably. Cheers, David ----- > Does this mean we'll have 10,000 thread objects waiting around > until the end of the program? > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4331 > From iignatyev at openjdk.java.net Fri Jun 4 02:25:01 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 4 Jun 2021 02:25:01 GMT Subject: RFR: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor Thanks, Harold. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4241 From iignatyev at openjdk.java.net Fri Jun 4 02:25:02 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 4 Jun 2021 02:25:02 GMT Subject: Integrated: 8267917: mark hotspot containers tests which ignore external VM flags In-Reply-To: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> References: <-2xStypSXn36rLxpzahRaczKC34Oy1QvW9oZISxyNTI=.e018fe94-7a4d-4e78-9212-349546a01908@github.com> Message-ID: On Fri, 28 May 2021 08:31:49 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this tiny and trivial patch that adds `@requires vm.flagless` to two container tests that, supposedly, ignore external VM flags? > > attn: @mseledts > > Cheers, > -- Igor This pull request has now been integrated. Changeset: edca245d Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/edca245d5a5f4b43ac853b0c27551a8da2c20309 Stats: 3 lines in 2 files changed: 2 ins; 0 del; 1 mod 8267917: mark hotspot containers tests which ignore external VM flags Reviewed-by: mseledtsov, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/4241 From ysuenaga at openjdk.java.net Fri Jun 4 04:53:07 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 4 Jun 2021 04:53:07 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 01:56:50 GMT, Yasumasa Suenaga wrote: > I ran the JVM on a Ryzen 3300X and got the following `jdk.CPUTimeStampCounter` event. > > > jdk.CPUTimeStampCounter { > startTime = 10:41:14.993 > fastTimeEnabled = false > fastTimeAutoEnabled = true > osFrequency = 1000000000 > fastTimeFrequency = 1000000000 > } > > > I confirmed that the 3300X supports invariant TSC (so `fastTimeAutoEnabled` is set to `true`); however, it does not seem to be used (`fastTimeEnabled` is `false`). > > The frequency is taken from the CPUID brand string (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However, AMD processors (Ryzen at least) do not include it ("AMD Ryzen 3 3300X 4-Core Processor"). > Fortunately, rdtsc_x86.cpp can calculate the frequency in a bogomips-like way. We should fall back to it when invariant TSC is supported but we cannot get the frequency. > > After this change, I got the following `jdk.CPUTimeStampCounter` event. The base clock of the Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. > > > jdk.CPUTimeStampCounter { > startTime = 10:33:52.884 > fastTimeEnabled = true > fastTimeAutoEnabled = true > osFrequency = 10000000 Hz > fastTimeFrequency = 3792929124 Hz > } > > > This problem is not limited to JFR. I confirmed that the `Rdtsc` class is used in ticks.cpp, and that relates to GC code at least. Currently the CPU frequency is detected in `VM_Version_Ext::max_qualified_cpu_freq_from_brand_string`, which parses the brand string from CPUID. However, we can get the base frequency from CPUID directly if we pass EAX = 16H (Processor Frequency Information Leaf; please see the Software Developer's Manual). I think that is simpler. Unfortunately, AMD does not have it, so we need to depend on a bogomips-like solution. The two failures on GHA (Windows aarch64 build and Linux x86 Tier1) do not seem to be caused by this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From ysuenaga at openjdk.java.net Fri Jun 4 04:53:07 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 4 Jun 2021 04:53:07 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor Message-ID: I ran the JVM on a Ryzen 3300X and got the following `jdk.CPUTimeStampCounter` event. jdk.CPUTimeStampCounter { startTime = 10:41:14.993 fastTimeEnabled = false fastTimeAutoEnabled = true osFrequency = 1000000000 fastTimeFrequency = 1000000000 } I confirmed that the 3300X supports invariant TSC (so `fastTimeAutoEnabled` is set to `true`); however, it does not seem to be used (`fastTimeEnabled` is `false`).
Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. jdk.CPUTimeStampCounter { startTime = 10:33:52.884 fastTimeEnabled = true fastTimeAutoEnabled = true osFrequency = 10000000 Hz fastTimeFrequency = 3792929124 Hz } This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. ------------- Commit messages: - 8268228: TSC is not used on AMD processor Changes: https://git.openjdk.java.net/jdk/pull/4350/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4350&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268228 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/4350.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4350/head:pull/4350 PR: https://git.openjdk.java.net/jdk/pull/4350 From dholmes at openjdk.java.net Fri Jun 4 05:11:54 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 4 Jun 2021 05:11:54 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor In-Reply-To: References: Message-ID: <6H5s4qMss7uZrgxPDXbSgljZxvGQlXqgytB6FrjMX38=.d1d32685-162f-478f-a769-d0e8d454e925@github.com> On Fri, 4 Jun 2021 01:56:50 GMT, Yasumasa Suenaga wrote: > I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. 
>
>
> jdk.CPUTimeStampCounter {
>     startTime = 10:41:14.993
>     fastTimeEnabled = false
>     fastTimeAutoEnabled = true
>     osFrequency = 1000000000
>     fastTimeFrequency = 1000000000
> }
>
>
> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`).
>
> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor").
> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported.
>
> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good.
>
>
> jdk.CPUTimeStampCounter {
>     startTime = 10:33:52.884
>     fastTimeEnabled = true
>     fastTimeAutoEnabled = true
>     osFrequency = 10000000 Hz
>     fastTimeFrequency = 3792929124 Hz
> }
>
>
> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least.

Hi Yasumasa,

This seems reasonable to address the observed problem, though I have to wonder why a system with invariant TSC is not properly reporting a maximum frequency? Makes me wonder how many chips may actually be impacted here?

Thanks,
David

src/hotspot/cpu/x86/rdtsc_x86.cpp line 109:

> 107:     os_to_tsc_conv_factor = tsc_freq / os_freq;
> 108:   }
> 109:   if (tsc_freq == 0.0f) {

tsc_freq is a double so use 0.0 not 0.0f

Probably worth a comment explaining this - something like:

// If no invariant TSC, or the system failed to report a useful maximum frequency, use an estimation.
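
For readers following the thread, here is a rough sketch of the fallback being discussed, written as standalone C++ rather than HotSpot code. The names `freq_from_brand_string` and `select_tsc_frequency` are hypothetical and do not exist in rdtsc_x86.cpp; the estimator is passed in as a callback because the real estimation needs RDTSC and a timed interval. It only illustrates the idea: take the frequency from the brand string when it carries one, otherwise (0.0) fall back to an estimate.

```cpp
#include <cctype>
#include <cstdlib>
#include <functional>
#include <string>

// Hypothetical helper (not the HotSpot implementation): extract a frequency
// in Hz from a CPUID brand string such as
// "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz".
// Returns 0.0 when the string carries no "<number>GHz" or "<number>MHz".
double freq_from_brand_string(const std::string& brand) {
  const char* units[] = {"GHz", "MHz"};
  const double scale[] = {1e9, 1e6};
  for (int i = 0; i < 2; i++) {
    std::string::size_type pos = brand.rfind(units[i]);
    if (pos == std::string::npos) {
      continue;
    }
    // Walk back over the digits and decimal point that precede the unit.
    std::string::size_type start = pos;
    while (start > 0 &&
           (std::isdigit(static_cast<unsigned char>(brand[start - 1])) ||
            brand[start - 1] == '.')) {
      start--;
    }
    if (start == pos) {
      continue;  // unit found, but no number in front of it
    }
    return std::strtod(brand.substr(start, pos - start).c_str(), nullptr) * scale[i];
  }
  return 0.0;
}

// Hypothetical selection logic: prefer the reported frequency, and use the
// supplied estimator only when no useful frequency was reported.
double select_tsc_frequency(const std::string& brand,
                            const std::function<double()>& estimate) {
  double tsc_freq = freq_from_brand_string(brand);
  if (tsc_freq == 0.0) {  // double literal, as suggested in the review
    tsc_freq = estimate();
  }
  return tsc_freq;
}
```

With the two brand strings quoted in the description, the Intel string yields a frequency directly, while the Ryzen string yields 0.0 and so ends up on the estimation path.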
------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From ysuenaga at openjdk.java.net Fri Jun 4 05:24:15 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 4 Jun 2021 05:24:15 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: > I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. > > > jdk.CPUTimeStampCounter { > startTime = 10:41:14.993 > fastTimeEnabled = false > fastTimeAutoEnabled = true > osFrequency = 1000000000 > fastTimeFrequency = 1000000000 > } > > > I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). > > Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). > Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. > > After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. > > > jdk.CPUTimeStampCounter { > startTime = 10:33:52.884 > fastTimeEnabled = true > fastTimeAutoEnabled = true > osFrequency = 10000000 Hz > fastTimeFrequency = 3792929124 Hz > } > > > This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. 
Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:

  Fix comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4350/files
  - new: https://git.openjdk.java.net/jdk/pull/4350/files/1bb8226e..768ac458

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4350&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4350&range=00-01

  Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4350.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4350/head:pull/4350

PR: https://git.openjdk.java.net/jdk/pull/4350

From ysuenaga at openjdk.java.net Fri Jun 4 05:42:57 2021
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Fri, 4 Jun 2021 05:42:57 GMT
Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2]
In-Reply-To: <6H5s4qMss7uZrgxPDXbSgljZxvGQlXqgytB6FrjMX38=.d1d32685-162f-478f-a769-d0e8d454e925@github.com>
References: <6H5s4qMss7uZrgxPDXbSgljZxvGQlXqgytB6FrjMX38=.d1d32685-162f-478f-a769-d0e8d454e925@github.com>
Message-ID:

On Fri, 4 Jun 2021 05:08:33 GMT, David Holmes wrote:

>> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Fix comments
>
> Hi Yasumasa,
>
> This seems reasonable to address the observed problem, though I have to wonder why a system with invariant TSC is not properly reporting a maximum frequency? Makes me wonder how many chips may actually be impacted here?
>
> Thanks,
> David

Thanks @dholmes-ora for your review! I pushed a new commit. Could you review it again?

> This seems reasonable to address the observed problem, though I have to wonder why a system with invariant TSC is not properly reporting a maximum frequency?

I guess the brand string is not required to contain the frequency, but I'm not sure.
On AMD processors, it seems we can get the effective frequency from an MSR, but that requires kernel mode, so we can't use it.

> Makes me wonder how many chips may actually be impacted here?

I guess AMD processors at least are affected. For example, Opteron and Athlon do not seem to have the frequency in their brand strings.

https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=73494
https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=74038

-------------

PR: https://git.openjdk.java.net/jdk/pull/4350

From ayang at openjdk.java.net Fri Jun 4 08:34:58 2021
From: ayang at openjdk.java.net (Albert Mingkun Yang)
Date: Fri, 4 Jun 2021 08:34:58 GMT
Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions
In-Reply-To: <12339cc3-e136-502c-51b4-ce4a666c8df4@oracle.com>
References: <12339cc3-e136-502c-51b4-ce4a666c8df4@oracle.com>
Message-ID:

On Thu, 3 Jun 2021 22:33:32 GMT, David Holmes wrote:

> For JavaThread we have a const and non-const version (not sure why though).

`as_Java_thread` is called with a `const Thread*` in two places, listed below. Whether a cast method needs a `const Thread*` version depends on the caller: whether it has a `const Thread*` and/or wants a `const XThread*` in return. `as_Java_thread` is called both with and without `const`, while `XThread* cast(Thread* t)` is currently enough for the others.

traceid JfrThreadId::id(const Thread* t);
const char* JfrThreadName::name(const Thread* t);

-------------

PR: https://git.openjdk.java.net/jdk/pull/4334

From luhenry at openjdk.java.net Fri Jun 4 08:55:26 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Fri, 4 Jun 2021 08:55:26 GMT
Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v2]
In-Reply-To:
References:
Message-ID: <9YeuztKcJZgcFWWnPg8QD8ujOHvRkK7K0WibcnIoH1g=.47f3221a-5167-4bfb-a105-d2c56aaae21a@github.com>

> Whether and how a frame is setup is controlled by the code generator
> for the specific CodeBlock.
The CodeBlock is then in the best place to know how > to parse the sender's frame from the current frame in the given CodeBlock. > > This refactoring proposes to extract this parsing out of `frame` and into a > `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant > inherited children of CodeBlock. > > This change is to largely facilitate adding new supported cases for JDK-8252417 [1] > like runtime stubs. > > [1] https://bugs.openjdk.java.net/browse/JDK-8252417 Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Fix type casting for Windows - Merge branch 'master' of https://github.com/openjdk/jdk into codeblob-frameparser - 8268178: Extract sender frame parsing to CodeBlock::FrameParser Whether and how a frame is setup is controlled by the code generator for the specific CodeBlock. The CodeBlock is then in the best place to know how to parse the sender's frame from the current frame in the given CodeBlock. This refactoring proposes to extract this parsing out of `frame` and into a `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant inherited children of CodeBlock. This change is to largely facilitate adding new supported cases for JDK-8252417 like runtime stubs. 
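
To make the shape of the proposed refactoring easier to picture, here is a minimal standalone sketch. It is not the code from the pull request: `Frame`, `FixedSizeBlob`, `FramePointerBlob` and `sender` are hypothetical, heavily simplified stand-ins, and the stack layouts (return address directly below the sender's sp, saved fp below that) are assumptions chosen for illustration. The point is only that each kind of blob supplies its own parser for finding the sender frame, so the generic walking code no longer needs to know how a given blob set up its frame.

```cpp
#include <cstdint>

// Hypothetical, simplified frame: stack pointer, frame pointer, program counter.
struct Frame {
  const intptr_t* sp;
  const intptr_t* fp;
  intptr_t pc;
};

class CodeBlob {
 public:
  // Each blob type provides a parser that knows its own frame layout.
  class FrameParser {
   public:
    virtual ~FrameParser() = default;
    virtual Frame sender_frame(const Frame& current) const = 0;
  };
  virtual ~CodeBlob() = default;
  virtual const FrameParser& frame_parser() const = 0;
};

// Blob whose frames have a fixed size known to the code generator.
class FixedSizeBlob : public CodeBlob {
 public:
  explicit FixedSizeBlob(int frame_size_words) : _parser(frame_size_words) {}
  const FrameParser& frame_parser() const override { return _parser; }

 private:
  class Parser : public FrameParser {
   public:
    explicit Parser(int words) : _words(words) {}
    Frame sender_frame(const Frame& f) const override {
      const intptr_t* sender_sp = f.sp + _words;
      // Assumed layout: return address at sender_sp[-1], saved fp at sender_sp[-2].
      return Frame{sender_sp,
                   reinterpret_cast<const intptr_t*>(sender_sp[-2]),
                   sender_sp[-1]};
    }
   private:
    int _words;
  };
  Parser _parser;
};

// Blob whose frames are linked through the frame pointer.
class FramePointerBlob : public CodeBlob {
 public:
  const FrameParser& frame_parser() const override { return _parser; }

 private:
  class Parser : public FrameParser {
   public:
    Frame sender_frame(const Frame& f) const override {
      // Assumed layout: fp[0] holds the saved fp, fp[1] the return address.
      return Frame{f.fp + 2,
                   reinterpret_cast<const intptr_t*>(f.fp[0]),
                   f.fp[1]};
    }
  };
  Parser _parser;
};

// Generic stack-walking step: delegate to whichever blob owns the frame.
Frame sender(const CodeBlob& blob, const Frame& current) {
  return blob.frame_parser().sender_frame(current);
}
```

Under this arrangement, supporting a new kind of blob (for instance runtime stubs) would mean adding one more parser rather than another special case in the walker.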
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4337/files - new: https://git.openjdk.java.net/jdk/pull/4337/files/63e89d87..714ec5fb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=00-01 Stats: 426534 lines in 279 files changed: 425213 ins; 394 del; 927 mod Patch: https://git.openjdk.java.net/jdk/pull/4337.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4337/head:pull/4337 PR: https://git.openjdk.java.net/jdk/pull/4337 From aph at openjdk.java.net Fri Jun 4 09:49:00 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 4 Jun 2021 09:49:00 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v7] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: On Thu, 3 Jun 2021 15:14:16 GMT, Alan Hayward wrote: >> On PAC systems, native code may sign return addresses before saving >> them to the stack. We must ensure we strip the any signed bits in >> order to walk the stack. >> Add extra asserts in places where we do not expect saved return >> addresses to be signed. >> >> On non-PAC systems, all PAC instructions are treated as NOPs. >> >> On Apple, use the provided ptrauth interface instead of asm >> as the compiler may optimise further. >> >> Fedora 33 compiles all distro packages using PAC. 
Running the distro >> provided OpenJDK-latest in GDB on a PAC system: >> >> Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () >> from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so >> (gdb) call (int)pns($sp, $fp, $pc) >> >> "Executing pns" >> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0xe26fe4] init_globals()+0x10 >> C 0x006ffffff74750c4 >> C 0x0042fffff6a7f84c >> C 0x0037fffff7fa0954 >> C 0x0030fffff7fa4540 >> C 0x0078fffff7d980c8 >> >> OpenJDK with this patch at the same breakpoint: >> >> (gdb) call (int)pns($sp, $fp, $pc) >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 >> >> OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: >> >> "Executing pns" >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 >> C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 >> J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] >> j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base >> j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base >> j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base >> j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base >> j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base >> j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base >> j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base >> j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base >> j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base >> j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base >> j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base >> v ~StubRoutines::call_stub >> V 
[libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 >> V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 >> V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc >> V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc >> V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 >> V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 >> V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c >> V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc >> V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 >> V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 >> j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base >> j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base >> j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base >> j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base >> j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base >> j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 
java.base >> j jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base >> j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base >> j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base >> j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base >> j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base >> j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base >> j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base >> j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base >> j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base >> j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base >> j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base >> j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base >> j java.lang.System.initPhase2(ZZ)I+0 java.base >> v ~StubRoutines::call_stub >> V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 >> V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 >> V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc >> V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 >> C [libjli.so+0x3860] JavaMain+0x7c >> C [libjli.so+0x732c] ThreadJavaMain+0xc >> C [libpthread.so.0+0x80c8] start_thread+0xd8 > > Alan Hayward has updated the pull request incrementally with one additional commit 
since the last revision: > > sender_pc_raw -> sender_pc_maybe_signed > > Change-Id: I5d9b83f42d05a8773b341708579970d9c449ced2 > CustomizedGitHooks: yes src/hotspot/os_cpu/bsd_aarch64/pauth_bsd_aarch64.inline.hpp line 46: > 44: asm ("mov x30, %0;" > 45: XPACLRI > 46: "mov %0, x30;" : "+r"(ptr) : : "x30"); One minor nit: I wonder if it's necessary to do those two MOVs. We could put ptr into x30 thusly: register uint64_t result __asm__("x30"); result = ptr; asm (XPACLRI : "+r"(result)); return result; This would generate fewer MOVs in most cases, but as MOV instructions don't even issue in M1 it's perhaps not worth bothering. Up to you. ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From luhenry at openjdk.java.net Fri Jun 4 09:50:26 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Fri, 4 Jun 2021 09:50:26 GMT Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v3] In-Reply-To: References: Message-ID: > Whether and how a frame is setup is controlled by the code generator > for the specific CodeBlock. The CodeBlock is then in the best place to know how > to parse the sender's frame from the current frame in the given CodeBlock. > > This refactoring proposes to extract this parsing out of `frame` and into a > `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant > inherited children of CodeBlock. > > This change is to largely facilitate adding new supported cases for JDK-8252417 [1] > like runtime stubs. 
> > [1] https://bugs.openjdk.java.net/browse/JDK-8252417 Ludovic Henry has updated the pull request incrementally with two additional commits since the last revision: - Fix Windows build - Fix zero build ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4337/files - new: https://git.openjdk.java.net/jdk/pull/4337/files/714ec5fb..53c879d9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=01-02 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4337.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4337/head:pull/4337 PR: https://git.openjdk.java.net/jdk/pull/4337 From jbachorik at openjdk.java.net Fri Jun 4 10:24:04 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Fri, 4 Jun 2021 10:24:04 GMT Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v3] In-Reply-To: References: Message-ID: <05AmPf-u0i0lTMtAdsIf1Pmg-tqEQwO_MBYWJnUNNNA=.e17e1778-dc56-455c-a67d-234c7f1f6982@github.com> On Fri, 4 Jun 2021 09:50:26 GMT, Ludovic Henry wrote: >> Whether and how a frame is setup is controlled by the code generator >> for the specific CodeBlock. The CodeBlock is then in the best place to know how >> to parse the sender's frame from the current frame in the given CodeBlock. >> >> This refactoring proposes to extract this parsing out of `frame` and into a >> `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant >> inherited children of CodeBlock. >> >> This change is to largely facilitate adding new supported cases for JDK-8252417 [1] >> like runtime stubs. 
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8252417
>
> Ludovic Henry has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Fix Windows build
>  - Fix zero build

src/hotspot/share/code/icBuffer.cpp line 146:

> 144:   BufferBlob* blob = BufferBlob::create("InlineCacheBuffer", code_size);
> 145:   if (blob == NULL) {
> 146:     vm_exit_out_of_memory(code_size, OOM_MALLOC_ERROR, "CodeCache: no room for InlineCacheBuffer");

Is VM exit here appropriate?
AFAICS, previously the failure to allocate InlineCacheBuffer was caught in assertion only.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4337

From luhenry at openjdk.java.net Fri Jun 4 10:48:59 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Fri, 4 Jun 2021 10:48:59 GMT
Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v3]
In-Reply-To: <05AmPf-u0i0lTMtAdsIf1Pmg-tqEQwO_MBYWJnUNNNA=.e17e1778-dc56-455c-a67d-234c7f1f6982@github.com>
References: <05AmPf-u0i0lTMtAdsIf1Pmg-tqEQwO_MBYWJnUNNNA=.e17e1778-dc56-455c-a67d-234c7f1f6982@github.com>
Message-ID:

On Fri, 4 Jun 2021 10:21:26 GMT, Jaroslav Bachorik wrote:

>> Ludovic Henry has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Fix Windows build
>>  - Fix zero build
>
> src/hotspot/share/code/icBuffer.cpp line 146:
>
>> 144:   BufferBlob* blob = BufferBlob::create("InlineCacheBuffer", code_size);
>> 145:   if (blob == NULL) {
>> 146:     vm_exit_out_of_memory(code_size, OOM_MALLOC_ERROR, "CodeCache: no room for InlineCacheBuffer");
>
> Is VM exit here appropriate?
> AFAICS, previously the failure to allocate InlineCacheBuffer was caught in assertion only.

This is extracted from https://github.com/openjdk/jdk/pull/4337/files#diff-044ffb3850314d23822b79de889820f6432757df5431eac3748a9eb60b1b6899L69-L73.
------------- PR: https://git.openjdk.java.net/jdk/pull/4337 From neliasso at openjdk.java.net Fri Jun 4 12:30:11 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Fri, 4 Jun 2021 12:30:11 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub Message-ID: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Hi, This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. Please review, Best regards, Nils Eliasson ------------- Commit messages: - Fix elem check - clone_at_expansion Changes: https://git.openjdk.java.net/jdk/pull/4359/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268125 Stats: 51 lines in 2 files changed: 46 ins; 0 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From chagedorn at openjdk.java.net Fri Jun 4 12:41:44 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 4 Jun 2021 12:41:44 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v13] In-Reply-To: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: > This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and 
`-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. > > The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. > > A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. > > To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. > > Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): > There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. 
Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. > > Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): > > - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. > - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions > - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) > - which leaves 4382 lines of code inserted > > Big thanks to: > - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. > - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. > - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. > - and others who provided valuable feedback. 
> 
> Thanks,
> Christian

Christian Hagedorn has updated the pull request incrementally with three additional commits since the last revision:

 - Update test and example package names, README files and fix some tests to let them pass in higher tiers
 - Move tests and examples #2
 - Move tests and examples #1

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3508/files
  - new: https://git.openjdk.java.net/jdk/pull/3508/files/df7576f7..7b77370b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=12
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3508&range=11-12

  Stats: 439 lines in 23 files changed: 199 ins; 174 del; 66 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3508.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3508/head:pull/3508

PR: https://git.openjdk.java.net/jdk/pull/3508

From chagedorn at openjdk.java.net Fri Jun 4 12:41:47 2021
From: chagedorn at openjdk.java.net (Christian Hagedorn)
Date: Fri, 4 Jun 2021 12:41:47 GMT
Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v12]
In-Reply-To:
References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com>
Message-ID:

On Thu, 3 Jun 2021 15:40:12 GMT, Christian Hagedorn wrote:

>> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. [...]
> 
> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix failing internal tests on Windows and add missing flag description in README

Thanks a lot Igor for your review(s)! I pushed an update accordingly together with some last test fixes. I'm running some more testing over the weekend, also together with the converted and updated Valhalla tests. If everything looks fine, I will integrate it on Monday.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3508

From mhorie at openjdk.java.net Fri Jun 4 13:09:01 2021
From: mhorie at openjdk.java.net (Michihiro Horie)
Date: Fri, 4 Jun 2021 13:09:01 GMT
Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10
In-Reply-To:
References:
Message-ID:

On Mon, 31 May 2021 05:39:25 GMT, Kazunori Ogata wrote:

> The POWER10 processor supports prefixed load and addi instructions that have a larger displacement field of up to 34 bits. We can reduce instruction cycles to load a constant from the TOC and to load an immediate value to a register.
> 
> Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change.
> 
> I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional failures caused by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 311:

> 309:     return inv_d1_field(inst1);
> 310:   } else if (PowerArchitecturePPC64 >= 10 && is_pld_prefix(inst1)) {
> 311:     return (get_imm18(inst1_addr, 0) << 16) + (get_imm(inst1_addr, 1) & 0xffff);

I couldn't quickly understand this part. It would help to explain the reason in a comment, like the one at is_load_const_from_method_toc_at: "CallDynamicJavaDirectSched_ExNode and CallLeaf(NoFP)Direct_ExNode are the nodes that require relocation and use pld".

src/hotspot/cpu/ppc/ppc.ad line 2894:

> 2892:     if (loadConLNodes._small) nodes->push(loadConLNodes._small);
> 2893:     if (loadConLNodes._large_hi) nodes->push(loadConLNodes._large_hi);
> 2894:     if (loadConLNodes._large_lo) nodes->push(loadConLNodes._large_lo);

Is removing the _last check necessary? If it is, the code related to _last should also be removed, for example in loadConLNodesTuple_create. Also, it would be better to use an if-else condition, because _small and _large_hi cannot both be non-null.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4267

From github.com+28651297+mkartashev at openjdk.java.net Fri Jun 4 13:36:27 2021
From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev)
Date: Fri, 4 Jun 2021 13:36:27 GMT
Subject: RFR: 8195129: System.load() fails to load from unicode paths [v5]
In-Reply-To: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
Message-ID:

> Character strings within JVM are produced and consumed in several formats. Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() or dlopen()) consume strings also in UTF8. On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can depends on the current "code page".
> 
> This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code.
> 
> Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug).
> 
> The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or MacOS.
> 
> Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (`jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly run on those platforms as well.
> 
> Results from Linux:
> 
> Test summary
> ==============================
>    TEST                                              TOTAL  PASS  FAIL ERROR
>    jtreg:test/hotspot/jtreg:tier1                     1784  1784     0     0
> ==============================
> TEST SUCCESS
> 
> 
> Building target 'run-test-only' in configuration 'linux-x86_64-server-release'
> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
> 
> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
> Test results: passed: 1
> 
> 
> Results from Windows 10:
> 
> Test summary
> ==============================
>    TEST                                              TOTAL  PASS  FAIL ERROR
>    jtreg:test/hotspot/jtreg/runtime                    746   746     0     0
> ==============================
> TEST SUCCESS
> Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
> 
> 
> Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
> 
> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
> Test results: passed: 1

Maxim Kartashev has updated the pull request incrementally with one additional commit since the last revision:

  Updated the test to run on Windows only and to use a character from the supplementary plane in the path name.
-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4169/files
  - new: https://git.openjdk.java.net/jdk/pull/4169/files/97c918ca..a5d45dca

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4169&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4169&range=03-04

  Stats: 8 lines in 2 files changed: 1 ins; 5 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4169.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4169/head:pull/4169

PR: https://git.openjdk.java.net/jdk/pull/4169

From github.com+28651297+mkartashev at openjdk.java.net Fri Jun 4 13:36:28 2021
From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev)
Date: Fri, 4 Jun 2021 13:36:28 GMT
Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3]
In-Reply-To:
References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
 <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com>
Message-ID:

On Thu, 3 Jun 2021 17:51:54 GMT, Naoto Sato wrote:

> I think we can simply limit the test to Windows with the @requires tag in the test. Also, I would like to see the test case use some supplementary characters.

Done.
-------------

PR: https://git.openjdk.java.net/jdk/pull/4169

From github.com+28651297+mkartashev at openjdk.java.net Fri Jun 4 14:03:05 2021
From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev)
Date: Fri, 4 Jun 2021 14:03:05 GMT
Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3]
In-Reply-To:
References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
 <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com>
 <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com>
Message-ID:

On Fri, 4 Jun 2021 02:12:42 GMT, David Holmes wrote:

>> I am not sure we can pass non `modified UTF-8` through `JVM_LoadLibrary()`. Probably some VM folks can enlighten here?
> 
> Not an expert, but my understanding is that the VM only deals with modified UTF-8, as does JNI. So the incoming string should be modified-UTF8 IMO and then converted to UTF16.
> 
> That said, this is shared code being modified on the JDK side so you can't just change the type of string being passed in without updating all the implementations of os::dll_load to support that!

I think we need to establish some common ground before proceeding further with this fix. It's a bit of a long read; please, bear with me.

The path name starts its life as a `jstring` in `Java_jdk_internal_loader_NativeLibraries_load()`; its encoding is irrelevant at this point.

Next, the name has to be passed down to `JVM_LoadLibrary()` that takes `char*`. So we need to convert from `jstring` to `char*` (point (a)). Following that, `os::dll_load()` that actually performs loading in a platform-specific manner also receives `char*`. All platform implementations of `os::dll_load()` pass the path name down to their respective platform's APIs unmodified, but I think that's just incidental and here we have another possible point of conversion (point (b)). Other consumers of the path name are exception (c) and logging (d) messages; they also take `char*`, but potentially of a different encoding.

Let me try to enumerate all conceivably valid conversions for `JVM_LoadLibrary()` consumption (point (a)):
1. jstring -> platform-specific encoding (status quo meaning possibly lossy encoding on Windows and UTF-8 elsewhere AFAICT),
2. jstring -> modified UTF-8,
3. jstring -> UTF-8.

This bug [8195129](https://bugs.openjdk.java.net/browse/JDK-8195129) occurs because conversion (1) may lose information on Windows if the platform encoding happens to be NOT UTF-8 (which it often - or even always - is). So that's a no-go and we are left with either (2) or (3).

On MacOS and Linux, "platform" encoding already is UTF-8 and since all the platform APIs happily consume UTF-8, no further conversion is necessary (neither for actual library loading, nor for log or exception messages; the latter have to convert to UTF-16, but do that under the hood).

On Windows, we require at least these variants of the path name:
1. UTF16 for library loading (Unicode Windows API),
2. "platform" encoding for logging (yes, losing information here, but that's tolerable),
3. "platform" (lossy) or UTF8 (lossless) encoding for exception messages (prefer lossless).

This is what's behind my choice of UTF-8 for the path name encoding as it gets passed down to `JVM_LoadLibrary()`. We can go with modified UTF-8, of course, in which case all platforms - not just Windows - will have to do the conversion on their own, losing the benefit of the knowledge about the original string encoding (the String.coder field of jstring).
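
To make the difference between conversions (2) and (3) more concrete, here is a minimal C++ sketch of decoding modified UTF-8 into UTF-16 code units. It is not code from the patch: the function name `mutf8_to_utf16` is hypothetical, and the sketch assumes mostly well-formed input (anything it cannot decode becomes U+FFFD). In modified UTF-8, as described in the JNI specification, U+0000 is written as the two bytes `C0 80`, and a supplementary character is written as two three-byte sequences, one per surrogate, so every byte sequence maps to exactly one UTF-16 code unit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an illustration, not part of the JDK sources):
// decodes modified UTF-8 into UTF-16 code units. Each 1-, 2- or 3-byte
// sequence yields exactly one code unit; surrogates are passed through.
std::u16string mutf8_to_utf16(const unsigned char* s, std::size_t len) {
  std::u16string out;
  std::size_t i = 0;
  while (i < len) {
    const unsigned char b0 = s[i];
    if (b0 < 0x80) {
      // 0xxxxxxx: plain ASCII
      out.push_back(static_cast<char16_t>(b0));
      i += 1;
    } else if ((b0 & 0xE0) == 0xC0 && i + 1 < len) {
      // 110xxxxx 10xxxxxx: U+0080..U+07FF, and U+0000 encoded as C0 80
      out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) |
                                          (s[i + 1] & 0x3F)));
      i += 2;
    } else if ((b0 & 0xF0) == 0xE0 && i + 2 < len) {
      // 1110xxxx 10xxxxxx 10xxxxxx: U+0800..U+FFFF, including surrogates
      out.push_back(static_cast<char16_t>(((b0 & 0x0F) << 12) |
                                          ((s[i + 1] & 0x3F) << 6) |
                                          (s[i + 2] & 0x3F)));
      i += 3;
    } else {
      // Truncated or unexpected byte: emit the replacement character.
      out.push_back(static_cast<char16_t>(0xFFFD));
      i += 1;
    }
  }
  return out;
}
```

A standard UTF-8 decoder would instead have to accept four-byte sequences and split them into surrogate pairs, which is the extra work a platform takes on if the interface carries standard UTF-8.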
-------------

PR: https://git.openjdk.java.net/jdk/pull/4169

From lmesnik at openjdk.java.net Fri Jun 4 17:25:07 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Fri, 4 Jun 2021 17:25:07 GMT
Subject: Integrated: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash
In-Reply-To:
References:
Message-ID:

On Thu, 3 Jun 2021 06:15:28 GMT, Leonid Mesnik wrote:

> Fixed a race condition between posting and enabling DynamicCodeGenerated event.

This pull request has now been integrated.

Changeset: 64ec8b3e
Author: Leonid Mesnik
URL: https://git.openjdk.java.net/jdk/commit/64ec8b3e5c8a8d44c92591710d73b833f13c1500
Stats: 122 lines in 3 files changed: 116 ins; 0 del; 6 mod

8212155: Race condition when posting dynamic_code_generated event leads to JVM crash

Reviewed-by: sspitsyn, dcubed

-------------

PR: https://git.openjdk.java.net/jdk/pull/4331

From naoto at openjdk.java.net Fri Jun 4 17:35:58 2021
From: naoto at openjdk.java.net (Naoto Sato)
Date: Fri, 4 Jun 2021 17:35:58 GMT
Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3]
In-Reply-To:
References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
 <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com>
 <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com>
Message-ID: <7KY_RquUMSy0puIyhxlDQ6NRkJyr2-rQCVmcxbff2m8=.4f423fd2-5aea-43eb-b539-af28e5f5084e@github.com>

On Fri, 4 Jun 2021 14:00:25 GMT, Maxim Kartashev wrote:

> I think we need to establish some common ground before proceeding further with this fix. [...]

I think I am hesitant to change the JVM interface from modified UTF-8 to standard UTF-8, as it would be the only location in JNI/JVM interface that uses the standard UTF-8. Instead, I would implement `convert_UTF8_to_UTF16` or rather `convert_mUTF8_to_UTF16` with a fairly simple arithmetic logic.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4169

From naoto at openjdk.java.net Fri Jun 4 17:35:59 2021
From: naoto at openjdk.java.net (Naoto Sato)
Date: Fri, 4 Jun 2021 17:35:59 GMT
Subject: RFR: 8195129: System.load() fails to load from unicode paths [v5]
In-Reply-To:
References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com>
Message-ID:

On Fri, 4 Jun 2021 13:36:27 GMT, Maxim Kartashev wrote:

>> Character strings within JVM are produced and consumed in several formats. [...]
> 
> Maxim Kartashev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Updated the test to run on Windows only and to use a character from the
>   supplementary plane in the path name.

src/java.base/share/native/libjava/jni_util.c line 680:

> 678:                                          utf8Encoding).z;
> 679:     return isUTF8EncodingSupported;
> 680: }

It would be moot if we choose not to modify the JVM_ interface to standard UTF-8, but this function is not necessary. UTF-8 encoding is guaranteed in every Java implementation.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4169

From kvn at openjdk.java.net Fri Jun 4 20:31:59 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 4 Jun 2021 20:31:59 GMT
Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub
In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com>
References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com>
Message-ID:

On Fri, 4 Jun 2021 11:43:26 GMT, Nils Eliasson wrote:

> Hi,
> 
> This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub.
> 
> In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors.
> 
> Please review,
> Best regards,
> Nils Eliasson

Update copyright years in files.

test/hotspot/jtreg/compiler/arraycopy/TestObjectArrayClone.java line 33:

> 31:  *                   compiler.arraycopy.TestObjectArrayClone
> 32:  *
> 33:  * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UseZGC

I suggest cloning it into a separate `@test` block, because it needs `@requires vm.gc.Z`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4359

From svkamath at openjdk.java.net Fri Jun 4 23:49:31 2021
From: svkamath at openjdk.java.net (Smita Kamath)
Date: Fri, 4 Jun 2021 23:49:31 GMT
Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2]
In-Reply-To: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com>
References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com>
Message-ID:

> I would like to submit AES-GCM optimization for x86_64 architectures supporting AVX3+VAES (Evex encoded AES). This optimization interleaves AES and GHASH operations.
> Performance gain of ~1.5x - 2x for message sizes 8k and above.

Smita Kamath has updated the pull request incrementally with one additional commit since the last revision:

  8267125:Updated intrinsic signature to remove copies of counter, state and subkeyHtbl

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4019/files
  - new: https://git.openjdk.java.net/jdk/pull/4019/files/433176d1..2ac209e3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4019&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4019&range=00-01

  Stats: 58 lines in 10 files changed: 3 ins; 28 del; 27 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4019.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4019/head:pull/4019

PR: https://git.openjdk.java.net/jdk/pull/4019

From kbarrett at openjdk.java.net Sat Jun 5 03:38:11 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Sat, 5 Jun 2021 03:38:11 GMT
Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier
Message-ID: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com>

Please review this change to PSPromotionManager::copy_to_survivor_space (ParallelGC) to remove some redundant work, and to add some missing memory barriers.

There are two callers of copy_to_survivor_space, both of which wrap that call with the idiom

  if obj->is_forwarded() then
    new_obj = obj->forwardee()
  else
    new_obj = copy_to_survivor_space(obj)
  endif

There are problems with this.

(1) The first thing copy_to_survivor_space does is check whether the object is already forwarded, and if so then return obj->forwardee_acquire(). The idiom used by the callers is a redundant check, and the redundancy can't be optimized away. It is also missing the acquire barrier that was added by JDK-8154736 after long discussion.

(2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient after all. The "if is_forwarded() then use forwardee()" idiom hides, under its abstractions, the fact that we're doing two relaxed atomic loads of the mark word, and there is nothing here to prevent the second from reading a value older than that read by the first, with bad consequences. This possibility came up in the discussion of JDK-8154736, but seems to have been either lost or discounted. If you think loads from the same location can't do that, see JDK-8229169 for a counter example.

Part of this change involves removing the conditionalization of the calls to copy_to_survivor_space; just call it directly. However, it turns out that some compilers don't inline copy_to_survivor_space because of its size. So we refactored it into two functions, one doing the already marked check and then calling the other to do most of the work. This is enough for the check to be inlined into callers, so we've effectively removed the redundant inner check. Note: This part of the change introduces a large block of whitespace differences due to removal of an if-else and outdenting the body; I recommend using a view that suppresses those when reviewing.

The second part of the change involves adding or moving some acquire barriers.

(a) For the initial check whether the object is already marked, if it is then add an acquire fence before returning the forwardee. We could instead use a load-acquire to obtain the mark word, but that would be an unneeded acquire barrier on the much more common unmarked case. Also removed forwardee_acquire(), which is no longer used.

(b) If the cmpxchg race is lost, add an acquire fence before fetching and returning the forwardee. The failed release-cmpxchg effectively behaves like a relaxed-load, which must precede the forwardee access and any reads from it.

I've also changed to only log copying when actually copied, not when already copied and forwarded. Also changed a guarantee to an assert.
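
The barrier placement in (a) and (b) can be sketched with standard C++ atomics. This is only an illustration under stated assumptions, not the HotSpot code: `Obj`, `forward_or_copy` and the use of a plain pointer as a stand-in for the mark word are hypothetical, the copy is preallocated by the caller for simplicity, and HotSpot uses its own Atomic/OrderAccess API rather than `std::atomic`.

```cpp
#include <atomic>

// Hypothetical stand-in for an object header: a null forwardee means
// "not forwarded yet".
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};
  int payload = 0;
};

// Returns the winning copy of obj. `copy` is a candidate copy supplied by
// the caller; it is used only if this thread wins the race.
Obj* forward_or_copy(Obj* obj, Obj* copy) {
  // One relaxed load serves both the "is forwarded?" check and the value,
  // so there is no second load that could observe an older mark.
  Obj* fwd = obj->forwardee.load(std::memory_order_relaxed);
  if (fwd != nullptr) {
    // (a) Acquire fence only on the less common already-forwarded path,
    // before the forwardee's contents may be read.
    std::atomic_thread_fence(std::memory_order_acquire);
    return fwd;
  }

  copy->payload = obj->payload;  // "copy" the object

  Obj* expected = nullptr;
  if (obj->forwardee.compare_exchange_strong(expected, copy,
                                             std::memory_order_release,
                                             std::memory_order_relaxed)) {
    return copy;  // we installed the forwarding pointer
  }

  // (b) Lost the race: the failed cmpxchg acted as a relaxed load that
  // left the winner's pointer in `expected`; fence before using it.
  std::atomic_thread_fence(std::memory_order_acquire);
  return expected;
}
```

The design point mirrored here is that the unforwarded fast path pays for no acquire barrier at all; the fences sit only on the two paths that go on to read memory published by another thread.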

I looked at all uses of forwardee() in light of problem (2), and did not find any additional problems. (That doesn't mean there aren't any, just that I didn't spot any. This is low-level atomics, after all.)

Testing:
mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done).
Performance testing showed no significant change.

-------------

Commit messages:
 - more barriers, remove redundent work

Changes: https://git.openjdk.java.net/jdk/pull/4371/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263107
  Stats: 224 lines in 6 files changed: 65 ins; 73 del; 86 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4371.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4371/head:pull/4371

PR: https://git.openjdk.java.net/jdk/pull/4371

From github.com+6704669+asgibbons at openjdk.java.net Sat Jun 5 14:12:11 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Sat, 5 Jun 2021 14:12:11 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
Message-ID:

Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.

A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
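
As a rough scalar illustration of that contract (consume trailing pad characters and report how many bytes were actually produced, which need not be a multiple of 3), here is a small C++ sketch. It is neither the intrinsic nor Java's DecodeBlock: the names `b64_value` and `decode_block` are hypothetical, only the standard alphabet is handled, and the choice to stop at the first invalid character and return what was written so far is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Maps one character of the standard Base64 alphabet to its 6-bit value,
// or returns -1 for any other character.
static int b64_value(unsigned char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical sketch: decodes src[0..len) into dst and returns the number
// of bytes written. Stops at the first invalid character; a '=' ends the
// input and flushes the 1 or 2 bytes held in a partial group.
std::size_t decode_block(const unsigned char* src, std::size_t len,
                         unsigned char* dst) {
  std::size_t written = 0;
  std::uint32_t bits = 0;
  int count = 0;  // 6-bit groups accumulated in `bits`
  for (std::size_t i = 0; i < len; i++) {
    if (src[i] == '=') break;       // padding: go flush the partial group
    const int v = b64_value(src[i]);
    if (v < 0) return written;      // invalid character: report progress
    bits = (bits << 6) | static_cast<std::uint32_t>(v);
    if (++count == 4) {             // 4 characters -> 3 bytes
      dst[written++] = static_cast<unsigned char>((bits >> 16) & 0xFF);
      dst[written++] = static_cast<unsigned char>((bits >> 8) & 0xFF);
      dst[written++] = static_cast<unsigned char>(bits & 0xFF);
      bits = 0;
      count = 0;
    }
  }
  if (count == 2) {                 // "xx==" -> 1 byte
    dst[written++] = static_cast<unsigned char>((bits >> 4) & 0xFF);
  } else if (count == 3) {          // "xxx=" -> 2 bytes
    dst[written++] = static_cast<unsigned char>((bits >> 10) & 0xFF);
    dst[written++] = static_cast<unsigned char>((bits >> 2) & 0xFF);
  }
  return written;
}
```

A caller would compare the returned count with what it expected and handle the remaining input (for example, MIME line separators) on a slower path.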
The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.

Benchmark Name | Base Score | Optimized Score | Gain
-- | -- | -- | --
testBase64Decode size 1 | 15.36 | 15.32 | 1.00
testBase64Decode size 3 | 17.00 | 16.72 | 1.02
testBase64Decode size 7 | 20.60 | 18.82 | 1.09
testBase64Decode size 32 | 34.21 | 26.77 | 1.28
testBase64Decode size 64 | 54.43 | 38.35 | 1.42
testBase64Decode size 80 | 66.40 | 48.34 | 1.37
testBase64Decode size 96 | 73.16 | 52.90 | 1.38
testBase64Decode size 112 | 84.93 | 51.82 | 1.64
testBase64Decode size 512 | 288.81 | 32.04 | 9.01
testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26

------------- Commit messages: - Merge remote-tracking branch 'origin/base64_length_restrict' into base64_decode - Condition decode intrinsic within generator instead of outside to allow non-AVX acceleration - Adding MIME to signature. - Adding MIME to signature. - Adding MIME to signature. - Initialize vector before loop - Initialize vector before loop - Wrong register lengths. - Wrong register lengths. - writing in wrong order - ... and 418 more: https://git.openjdk.java.net/jdk/compare/48dc72b7...e527557a Changes: https://git.openjdk.java.net/jdk/pull/4368/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268276 Stats: 743 lines in 10 files changed: 736 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/4368.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368 PR: https://git.openjdk.java.net/jdk/pull/4368 From kbarrett at openjdk.java.net Sun Jun 6 16:26:19 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 6 Jun 2021 16:26:19 GMT Subject: RFR: 8268290: Improve LockFreeQueue<> utility Message-ID: Please review this change to the LockFreeQueue utility class.
The LockFreeQueue originated as an implementation detail of G1DirtyCardQueueSet, and was recently refactored into a public utility class. In that refactoring it retained some limitations that were acceptable in its original context, but may be problematic as a general utility. In particular, under some conditions a thread was not able to pop the last element in the queue, due to interference by a concurrent operation. And this state will persist, so retrying the pop operation won't help until the interfering thread has made sufficient progress. This was mitigated by making the API more complex to provide notice to the client that the queue may be in this state. But it turns out we can do somewhat better, eliminating one of the limitations, which is the point of this change. We introduce a pseudo-object used as an end of queue marker. We can use the transition of the last element's next value from the end marker to NULL by a pop operation as a claim on the element, allowing the losing thread to recognize, retry, and make progress. This queue still has the limitation that an in-progress push/append may prevent popping elements. Because of this, the class is being renamed to NonblockingQueue. The old name suggests stronger guarantees than actually provided. The PR has two commits, the first for the functional changes, the second for the renaming. The github diffs don't seem to be recognizing the renaming of the source files as a rename, instead treating the old files as deleted and the new files as added. The first commit by itself is probably more useful for reviewing the functional changes.
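The claim step described above, where a pop wins the last element by moving its next value from the end marker to NULL, can be illustrated in isolation. The following is a minimal sketch using `std::atomic`, not the NonblockingQueue implementation: `Node`, `end_marker`, `mark_as_last` and `try_claim_last` are hypothetical names, and the head/tail fix-up that a real pop must perform afterwards is deliberately left out.

```cpp
#include <atomic>

// Hypothetical intrusive queue node.
struct Node {
  std::atomic<Node*> next{nullptr};
  int value{0};
};

// Pseudo-object used as the end-of-queue marker. Only its address
// matters; it is never treated as a real element.
inline Node* end_marker() {
  static Node marker;
  return &marker;
}

// A node that becomes the last element has its next set to the marker.
inline void mark_as_last(Node& n) {
  n.next.store(end_marker(), std::memory_order_release);
}

// A popping thread claims the last element by changing its next value
// from the end marker to nullptr. Exactly one thread can succeed; a
// loser sees the exchange fail and knows it must retry.
inline bool try_claim_last(Node& last) {
  Node* expected = end_marker();
  return last.next.compare_exchange_strong(expected, nullptr,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire);
}
```

Because the claim is a single compare-and-exchange on the element itself, a losing thread gets a definite answer rather than an ambiguous state that persists until another thread makes progress.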
Testing: mach5 tier1-5 ------------- Commit messages: - rename - use end marker to improve pop Changes: https://git.openjdk.java.net/jdk/pull/4379/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4379&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268290 Stats: 1229 lines in 8 files changed: 619 ins; 601 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/4379.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4379/head:pull/4379 PR: https://git.openjdk.java.net/jdk/pull/4379 From ogatak at openjdk.java.net Sun Jun 6 20:12:04 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Sun, 6 Jun 2021 20:12:04 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 06:54:05 GMT, Michihiro Horie wrote: >> The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. >> >> Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. >> >> I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. > > src/hotspot/cpu/ppc/ppc.ad line 2894: > >> 2892: if (loadConLNodes._small) nodes->push(loadConLNodes._small); >> 2893: if (loadConLNodes._large_hi) nodes->push(loadConLNodes._large_hi); >> 2894: if (loadConLNodes._large_lo) nodes->push(loadConLNodes._large_lo); > > Is removing the _last checking needed? lf it's needed, code related to _last should be removed such as in loadConLNodesTuple_create. 
Also, it would be better to use an if-else condition because it cannot happen both _small and _large_hi are non null. loadConLNodesTuple_create initializes loadConLNodes._last to point to the node that is the same as either loadConLNodes._small or loadConLNodes._large_lo. So load_ConLNode is added twice if we don't remove the _last checking. (I actually made this bug and spent a few days fixing it...) The correct code here should be either: 1) use the same code as before the change, i.e., don't add _small and _large_lo, and only use _last (and _large_hi), or 2) use _small and _large_lo, and remove _last, as I modified. I chose option 2 to avoid confusion and to make the change symmetrical to the change at [L.3459](https://github.com/openjdk/jdk/pull/4267/commits/403b789cc068ea74a0768406852bf79149b23e32#diff-d21a64a4949f298476bf91083d3b956face9a6393a08a706b071068898533082R3459), where adding _small is mandatory and _last points to another node. If you (or other reviewers) think option 1 is better, I can revert this change and add comments as a caveat. ------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From ogatak at openjdk.java.net Sun Jun 6 20:28:27 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Sun, 6 Jun 2021 20:28:27 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v2] In-Reply-To: References: Message-ID: > The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. > > Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. > > I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change.
I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: Improve comments in macroAssembler_ppc.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4267/files - new: https://git.openjdk.java.net/jdk/pull/4267/files/b87cb294..ea87e2c0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=00-01 Stats: 6 lines in 1 file changed: 4 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/4267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4267/head:pull/4267 PR: https://git.openjdk.java.net/jdk/pull/4267 From ogatak at openjdk.java.net Sun Jun 6 20:28:28 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Sun, 6 Jun 2021 20:28:28 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 07:04:58 GMT, Michihiro Horie wrote: >> Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comments in macroAssembler_ppc.cpp > > src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 311: > >> 309: return inv_d1_field(inst1); >> 310: } else if (PowerArchitecturePPC64 >= 10 && is_pld_prefix(inst1)) { >> 311: return (get_imm18(inst1_addr, 0) << 16) + (get_imm(inst1_addr, 1) & 0xffff); > > I couldn't understand here quickly. Maybe it would be nice to tell the reason by adding some comments like is_load_const_from_method_toc_at's one "CallDynamicJavaDirectSched_ExNode and CallLeaf(NoFP)Direct_ExNode are the nodes that require relocation and use pld". Thank you for your suggestion. I added the comment (and made some wordsmith in both is_load_const_from_method_toc_at and get_offset_of_load_const_from_method_toc_at). 
------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From dholmes at openjdk.java.net Sun Jun 6 22:29:02 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 6 Jun 2021 22:29:02 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> Message-ID: <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> On Fri, 4 Jun 2021 14:00:25 GMT, Maxim Kartashev wrote: >> Not an expert by my understanding is that the VM only deals with modified UTF-8, as does JNI. So the incoming string should be modified-UTF8 IMO and then converted to UTF16. >> >> That said, this is shared code being modified on the JDK side so you can't just change the type of string being passed in without updating all the implementations of os::dll_load to support that! > > I think we need to establish some common ground before proceeding further with this fix. It's a bit of a long read; please, bear with me. > > The path name starts its life as a `jstring` in `Java_jdk_internal_loader_NativeLibraries_load()`, its encoding is irrelevant at this point. > > Next, the name has to be passed down to `JVM_LoadLibrary()` that takes `char*`. So we need to convert form `jstring` to `char*` (point (a)). Following that, `os::dll_load()` that actually performs loading in a platform-specific manner also receives `char*`. All platform implementations of `os::dll_load()` pass the path name down to their respective platform's APIs unmodified, but I think that's just incidental and here we have another possible point of conversion (point (b)). 
Other consumers of the path name are exception(c) and logging(d) messages; they also take `char*`, but potentially of a different encoding. > > Let me try to enumerate all conceivably valid conversions for `JVM_LoadLibrary()` consumption (point (a)): > 1. jstring -> platform-specific encoding (status quo meaning possibly lossy encoding on Windows and UTF-8 elsewhere AFAICT), > 2. jstring -> modified UTF-8, > 3. jstring -> UTF-8. > > This bug [8195129](https://bugs.openjdk.java.net/browse/JDK-8195129) occurs because conversion (1) may loose information on Windows if the platform encoding happens to be NOT UTF-8 (which it often - or even always - is). So that's a no-go and we are left with either (2) or (3). > > On MacOS and Linux, "platform" encoding already is UTF-8 and since all the platform APIs happily consume UTF-8, no further conversion is necessary (neither for actual library loading, nor for log or exception messages; the latter have to convert to UTF-16, but do that under the hood). > > On Windows, we require at least these variants of the path name: > 1. UTF16 for library loading (Unicode Windows API), > 2. "platform" encoding for logging (yes, loosing information here, but that's tolerable), > 3. "platform" (lossy) or UTF8 (lossless) encoding for exception messages (prefer lossless). > > This is what's behind my choice of UTF-8 for the path name encoding as it gets passed down to `JVM_LoadLibrary()`. We can go with modified UTF-8, of course, in which case all platforms - not just Windows - will have to do the conversion on their own, loosing the benefit of the knowledge about the original string encoding (the String.coder field of jstring). @mkartashev thank you for the detailed explanation. It is not clear to me that the JDK's conformance to being a Unicode application has significantly changed since the evaluation of JDK-8017274 - @naotoj can you comment on that and related discussion from the CCC for JDK-4958170 ? 
In particular I'm not sure that using the platform encoding is wrong, nor how we can have a path that cannot be represented by the platform encoding? Not being an expert in this area I cannot evaluate the effects of these shared code changes on other platforms, and so am reluctant to introduce any change that affects any non-Windows platforms. Also the JVM and JNI work with modified-UTF8 so I do not think we should diverge from that. I would hate to see Windows-specific code introduced into the JDK or JVM's shared code for these APIs, but that may be the only choice to avoid potential disruption to other platforms. Though perhaps we could push the initial conversion down into the JVM? ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From david.holmes at oracle.com Sun Jun 6 23:27:34 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Jun 2021 09:27:34 +1000 Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: <6H5s4qMss7uZrgxPDXbSgljZxvGQlXqgytB6FrjMX38=.d1d32685-162f-478f-a769-d0e8d454e925@github.com> Message-ID: Hi Yasumasa, On 4/06/2021 3:42 pm, Yasumasa Suenaga wrote: > On Fri, 4 Jun 2021 05:08:33 GMT, David Holmes wrote: > >>> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Fix comments >> >> Hi Yasumasa, >> >> This seems reasonable to address the observed problem, though I have to wonder why a system with invariant TSC is not properly reporting a maximum frequency? Makes me wonder how many chips may actually be impacted here? >> >> Thanks, >> David > > Thanks @dholmes-ora for your review! > I pushed new commit. Could you review again? > >> This seems reasonable to address the observed problem, though I have to wonder why a system with invariant TSC is not properly reporting a maximum frequency? > > I guess it is not specified to contain the frequency in brand string, I'm not sure.
> In AMD processor, we can seem to get effective frequency from MSR, but it requires kernel mode, so we can't use it. > >> Makes me wonder how many chips may actually be impacted here? > > I guess AMD processors are affected it at least. For example, Opteron and Athlon do not seem to have its frequency in their brand string. > > https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=73494 > https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=74038 As per my update to the bug report I'm less inclined to support this change as the affects could be more far reaching than I first thought. Suddenly enabling the TSC usage on systems that currently do not use it may be disruptive, especially if those systems do not infact have a reliable invariant-TSC to use. David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4350 > From ysuenaga at openjdk.java.net Mon Jun 7 00:45:01 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 7 Jun 2021 00:45:01 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: >> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:41:14.993 >> fastTimeEnabled = false >> fastTimeAutoEnabled = true >> osFrequency = 1000000000 >> fastTimeFrequency = 1000000000 >> } >> >> >> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >> >> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. 
>> >> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:33:52.884 >> fastTimeEnabled = true >> fastTimeAutoEnabled = true >> osFrequency = 10000000 Hz >> fastTimeFrequency = 3792929124 Hz >> } >> >> >> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments I can't see discussion in JDK-8211240 (it seems to be security issue), but I think this PR is not disruptive. According to [CPUID specification from AMD](https://www.amd.com/system/files/TechDocs/25481.pdf), invariant TSC flag means we can use TSC as a clocksource - it is same with intel processors. Did you mean we can't believe invariant TSC flag other than intel processors? Indeed Linux kernel hasn't treat it in case of AMD processors because [a patch for Linux kernel](https://patchwork.kernel.org/project/linux-pm/patch/f1643565587e70dd2fe2714e9afa566689211a9a.1433050960.git.len.brown at intel.com/) was focused in intel processors. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From david.holmes at oracle.com Mon Jun 7 02:02:29 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Jun 2021 12:02:29 +1000 Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: <63b90dbe-54a1-2103-65a4-cff776fef296@oracle.com> On 7/06/2021 10:45 am, Yasumasa Suenaga wrote: > On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: > >>> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. 
>>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:41:14.993 >>> fastTimeEnabled = false >>> fastTimeAutoEnabled = true >>> osFrequency = 1000000000 >>> fastTimeFrequency = 1000000000 >>> } >>> >>> >>> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >>> >>> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >>> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. >>> >>> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:33:52.884 >>> fastTimeEnabled = true >>> fastTimeAutoEnabled = true >>> osFrequency = 10000000 Hz >>> fastTimeFrequency = 3792929124 Hz >>> } >>> >>> >>> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. >> >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > I can't see discussion in JDK-8211240 (it seems to be security issue), but I think this PR is not disruptive. Sorry I hadn't noticed the issue was non-public. Any PR that suddenly changes how a large set of systems will behave is disruptive in my opinion. The ramifications of this PR are not clear to me. > According to [CPUID specification from AMD](https://www.amd.com/system/files/TechDocs/25481.pdf), invariant TSC flag means we can use TSC as a clocksource - it is same with intel processors. > > Did you mean we can't believe invariant TSC flag other than intel processors? We don't trust the invariant TSC flag in general. 
We are aware of Intel machines that claim to have invariant TSC and yet the Linux kernel does not consider the TSC to be a reliable clocksource. We generally prefer to trust the OS in these areas rather than having to deal with a wide range of hardware idiosyncracies in the VM - it is neither scalable nor maintainable. The rdtsc usage arose because JFR (at one point in time) needed lower latency timers for events than were available via the OS, and then we had to change other logging code (ie GC counters) to be consistent with the JFR code. JDK-8211240 is a RFE to investigate current latencies and if they are adequate then revert to only using OS timers/counters again. > Indeed Linux kernel hasn't treat it in case of AMD processors because [a patch for Linux kernel](https://patchwork.kernel.org/project/linux-pm/patch/f1643565587e70dd2fe2714e9afa566689211a9a.1433050960.git.len.brown at intel.com/) was focused in intel processors. A great example of how broken the TSC can be on machines that claim it is fine! Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4350 > From ysuenaga at openjdk.java.net Mon Jun 7 03:05:05 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 7 Jun 2021 03:05:05 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: >> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:41:14.993 >> fastTimeEnabled = false >> fastTimeAutoEnabled = true >> osFrequency = 1000000000 >> fastTimeFrequency = 1000000000 >> } >> >> >> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >> >> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). 
However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. >> >> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:33:52.884 >> fastTimeEnabled = true >> fastTimeAutoEnabled = true >> osFrequency = 10000000 Hz >> fastTimeFrequency = 3792929124 Hz >> } >> >> >> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments Hi David, > The rdtsc usage arose because JFR (at one point in time) needed lower > latency timers for events than were available via the OS, and then we > had to change other logging code (ie GC counters) to be consistent with > the JFR code. JDK-8211240 is a RFE to investigate current latencies and > if they are adequate then revert to only using OS timers/counters again. I saw [the report](https://jjy.nict.go.jp/tsp/research/labo3/gettime.html) which describes processing time between systemcall (`gettimeofday()`) and `RDTSC` (it is in Japanese :) It reports `RDTSC` is much lesser than systemcall - it makes sense because RDTSC does not need to context switch to kernel mode. > > Indeed Linux kernel hasn't treat it in case of AMD processors because [a patch for Linux kernel](https://patchwork.kernel.org/project/linux-pm/patch/f1643565587e70dd2fe2714e9afa566689211a9a.1433050960.git.len.brown at intel.com/) was focused in intel processors. > > A great example of how broken the TSC can be on machines that claim it > is fine! 
I guess it is not a proof of the break the TSC - it is just for Intel processor because the discussion in it does not mentioned other processors. I understood we generally prefer to trust the OS, but I wonder why it is enabled on Intel processor only, and it may use on other x86 processors if there is merit in TSC - The difference is only CPU brand string! ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From kbarrett at openjdk.java.net Mon Jun 7 04:59:01 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 7 Jun 2021 04:59:01 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: >> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:41:14.993 >> fastTimeEnabled = false >> fastTimeAutoEnabled = true >> osFrequency = 1000000000 >> fastTimeFrequency = 1000000000 >> } >> >> >> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >> >> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. >> >> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:33:52.884 >> fastTimeEnabled = true >> fastTimeAutoEnabled = true >> osFrequency = 10000000 Hz >> fastTimeFrequency = 3792929124 Hz >> } >> >> >> This problem is not only for JFR. 
I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments I think JFR is the only VM subsystem that currently uses the "fast" time that is based on TSC, with a fallback to OS time facilities if "fast" time is not enabled. (There has been discussion (under JDK-8211240) about eliminating that distinction and just always using OS time facilities, but it hasn't received much attention.) GC (and maybe other places?) uses the dual time mechanism because we want reliable time but also send JFR events. So some some of the major VM clients for time information are currently paying some cost for having both implementations. Currently the TSC frequency is always obtained from the CPUID brand string, with the bogomips style estimate in initialize_frequency never being used. Rdtsc::is_supported() is true iff VM_Version_Ext::supports_tscinv_ext(). And initialize_frequency() uses the brand string if supports_tscinv_ext(). I think the current implementation of the bogomips calculation can intermittently produce catastrophically wrong results. Descheduling at the wrong place(s) in the loop can badly mess things up. I thought there was a bug for this, but can't find one. Also, there are things like the Intel erratum referenced here: http://lkml.iu.edu/hypermail/linux/kernel/1511.1/01048.html that make things even more fun. I think that detecting a "good" TSC and it's properties (like frequency) is pretty hard, and we should not try to duplicate the OS detection or second guess it. I also think that using the "fast" time when the TSC is not "good" is a mistake, but I have so far not convinced the JFR folks. So I'm not in favor of this change. I think we should be moving away from direct TSC access rather than trying to use it in more cases. 
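To make the brand-string dependence discussed in this thread concrete, here is a minimal sketch of extracting a frequency from a CPUID brand string. It is not the HotSpot implementation: the function name `brand_string_frequency_hz` and the parsing rules (a decimal number immediately followed by `GHz` or `MHz`) are assumptions for illustration only. The two brand strings in the usage note are the ones quoted earlier in the thread.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the frequency in Hz advertised in a
// CPUID brand string, or 0 when the string carries no frequency.
inline std::uint64_t brand_string_frequency_hz(const std::string& brand) {
  static const struct { const char* suffix; double scale; } units[] = {
    {"GHz", 1e9},
    {"MHz", 1e6},
  };
  for (const auto& u : units) {
    std::size_t pos = brand.rfind(u.suffix);
    if (pos == std::string::npos) {
      continue;  // this unit does not appear in the string
    }
    // Walk backwards over the digits and decimal point before the unit.
    std::size_t start = pos;
    while (start > 0 &&
           (std::isdigit(static_cast<unsigned char>(brand[start - 1])) ||
            brand[start - 1] == '.')) {
      --start;
    }
    if (start == pos) {
      continue;  // unit text without a number in front of it
    }
    double value =
        std::strtod(brand.substr(start, pos - start).c_str(), nullptr);
    return static_cast<std::uint64_t>(value * u.scale + 0.5);
  }
  return 0;  // no frequency advertised
}
```

With this sketch, "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz" yields 2100000000, while "AMD Ryzen 3 3300X 4-Core Processor" yields 0, which is the situation in which a caller has no frequency unless it estimates one some other way.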
------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From yyang at openjdk.java.net Mon Jun 7 05:01:59 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 7 Jun 2021 05:01:59 GMT Subject: RFR: 8267657: Add missing PrintC1Statistics before incrementing counters In-Reply-To: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com> References: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com> Message-ID: On Tue, 25 May 2021 03:07:15 GMT, Yi Yang wrote: > Trivial change to add missing PrintC1Statistics before incrementing counters. PING?Can I have a review of this trivial change?Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/4178 From dholmes at openjdk.java.net Mon Jun 7 05:27:03 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 7 Jun 2021 05:27:03 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v3] In-Reply-To: References: Message-ID: <3j5kGexprwh9EynKMLaia49KIic36z6Rn-FSaH0UfrQ=.b7c12871-8f20-4a32-821d-2126bd94a7cc@github.com> On Thu, 3 Jun 2021 13:14:54 GMT, Albert Mingkun Yang wrote: >> Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by dholmes (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From github.com+4146708+a74nh at openjdk.java.net Mon Jun 7 08:14:46 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 7 Jun 2021 08:14:46 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v8] In-Reply-To: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: > On PAC systems, native code may sign return addresses before saving > them to the stack. We must ensure we strip any signed bits in > order to walk the stack. > Add extra asserts in places where we do not expect saved return > addresses to be signed. > > On non-PAC systems, all PAC instructions are treated as NOPs. > > On Apple, use the provided ptrauth interface instead of asm > as the compiler may optimise further. > > Fedora 33 compiles all distro packages using PAC.
Running the distro > provided OpenJDK-latest in GDB on a PAC system: > > Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () > from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so > (gdb) call (int)pns($sp, $fp, $pc) > > "Executing pns" > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xe26fe4] init_globals()+0x10 > C 0x006ffffff74750c4 > C 0x0042fffff6a7f84c > C 0x0037fffff7fa0954 > C 0x0030fffff7fa4540 > C 0x0078fffff7d980c8 > > OpenJDK with this patch at the same breakpoint: > > (gdb) call (int)pns($sp, $fp, $pc) > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 > > OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: > > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 > C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 > J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] > j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base > j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base > j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base > j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base > j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base > j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base > j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base > j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base > j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base > j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V 
[libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 > V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 > V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc > V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc > V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 > V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 > V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c > V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc > V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 > V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 > j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base > j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base > j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base > j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base > j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base > j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 java.base > j 
jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base > j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base > j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base > j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base > j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base > j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base > j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base > j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base > j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base > j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base > j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base > j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base > j java.lang.System.initPhase2(ZZ)I+0 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 Alan Hayward has updated the pull request incrementally with one additional commit since the last revision: Remove movs from 
asm CustomizedGitHooks: yes Change-Id: I49215be1acaf39b56ce739a8b37d303f672f3a80 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4029/files - new: https://git.openjdk.java.net/jdk/pull/4029/files/ce2ed307..47805b96 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4029&range=06-07 Stats: 8 lines in 2 files changed: 0 ins; 2 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/4029.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4029/head:pull/4029 PR: https://git.openjdk.java.net/jdk/pull/4029 From github.com+4146708+a74nh at openjdk.java.net Mon Jun 7 08:14:47 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Mon, 7 Jun 2021 08:14:47 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v7] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: <77UrlEvTvnXqJqQymjUwfKb_KLaQFTdFlWPPvn6PrYA=.42d28cf8-f457-4997-94f5-41b5db26a0a4@github.com> On Fri, 4 Jun 2021 09:46:03 GMT, Andrew Haley wrote: >One minor nit: I wonder if it's necessary to do those two MOVs. We could put ptr into x30 thusly: Looks much better like this. Updated. ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From ayang at openjdk.java.net Mon Jun 7 08:24:08 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 7 Jun 2021 08:24:08 GMT Subject: RFR: 8268164: Adopt cast notation for WorkerThread conversions [v3] In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 13:14:54 GMT, Albert Mingkun Yang wrote: >> Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for the review. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From ayang at openjdk.java.net Mon Jun 7 08:24:09 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 7 Jun 2021 08:24:09 GMT Subject: Integrated: 8268164: Adopt cast notation for WorkerThread conversions In-Reply-To: References: Message-ID: On Thu, 3 Jun 2021 11:21:11 GMT, Albert Mingkun Yang wrote: > Followup of JDK-8267916 (#4240), the same refactoring for `WorkerThread`. This pull request has now been integrated. Changeset: 58bdabcd Author: Albert Mingkun Yang URL: https://git.openjdk.java.net/jdk/commit/58bdabcd40cc8895d5fd829ad3515ab418245c16 Stats: 18 lines in 4 files changed: 9 ins; 8 del; 1 mod 8268164: Adopt cast notation for WorkerThread conversions Reviewed-by: stefank, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/4334 From aph at openjdk.java.net Mon Jun 7 08:38:04 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 7 Jun 2021 08:38:04 GMT Subject: RFR: 8266749: AArch64: Backtracing broken on PAC enabled systems [v8] In-Reply-To: References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: <0McC4rd9taoWuPLfJQ8elbc9aqsDKfKIrT_uC91dQHs=.61577f41-87d0-4e96-be39-dd8df9c3442a@github.com> On Mon, 7 Jun 2021 08:14:46 GMT, Alan Hayward wrote: >> On PAC systems, native code may sign return addresses before saving >> them to the stack. We must ensure we strip any signed bits in >> order to walk the stack. >> Add extra asserts in places where we do not expect saved return >> addresses to be signed. >> >> On non-PAC systems, all PAC instructions are treated as NOPs. >> >> On Apple, use the provided ptrauth interface instead of asm >> as the compiler may optimise further. >> >> Fedora 33 compiles all distro packages using PAC.
Running the distro >> provided OpenJDK-latest in GDB on a PAC system: >> >> [...] > > Alan Hayward has updated the pull request incrementally with one additional commit
since the last revision: > > Remove movs from asm > > CustomizedGitHooks: yes > Change-Id: I49215be1acaf39b56ce739a8b37d303f672f3a80 I don't think anyone can improve on that. Thanks for your patience. ------------- Marked as reviewed by aph (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4029 From ysuenaga at openjdk.java.net Mon Jun 7 08:52:02 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 7 Jun 2021 08:52:02 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Mon, 7 Jun 2021 04:55:47 GMT, Kim Barrett wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > I think JFR is the only VM subsystem that currently uses the "fast" time > that is based on TSC, with a fallback to OS time facilities if "fast" time > is not enabled. (There has been discussion (under JDK-8211240) about > eliminating that distinction and just always using OS time facilities, but > it hasn't received much attention.) GC (and maybe other places?) uses the > dual time mechanism because we want reliable time but also send JFR events. > So some of the major VM clients for time information are currently > paying some cost for having both implementations. > > Currently the TSC frequency is always obtained from the CPUID brand string, > with the bogomips style estimate in initialize_frequency never being used. > Rdtsc::is_supported() is true iff VM_Version_Ext::supports_tscinv_ext(). > And initialize_frequency() uses the brand string if supports_tscinv_ext(). > > I think the current implementation of the bogomips calculation can > intermittently produce catastrophically wrong results. Descheduling at the > wrong place(s) in the loop can badly mess things up. I thought there was a > bug for this, but can't find one.
> > Also, there are things like the Intel erratum referenced here: > http://lkml.iu.edu/hypermail/linux/kernel/1511.1/01048.html > that make things even more fun. > > I think that detecting a "good" TSC and its properties (like frequency) is > pretty hard, and we should not try to duplicate the OS detection or second > guess it. I also think that using the "fast" time when the TSC is not > "good" is a mistake, but I have so far not convinced the JFR folks. > > So I'm not in favor of this change. I think we should be moving away from > direct TSC access rather than trying to use it in more cases. @kimbarrett Did you mean we cannot detect a "good" TSC from the invariant TSC flag? If so, I have to withdraw this PR. TSC support for Intel processors should then also be removed ASAP (will that happen in JDK-8211240?). > Currently the TSC frequency is always obtained from the CPUID brand string, > with the bogomips style estimate in initialize_frequency never being used. If TSC support remains, the frequency should be obtained from CPUID with EAX = 16H; however, that is not available on AMD processors, as I said before, so we need the bogomips style calculation. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From iwalulya at openjdk.java.net Mon Jun 7 09:08:01 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Mon, 7 Jun 2021 09:08:01 GMT Subject: RFR: 8268290: Improve LockFreeQueue<> utility In-Reply-To: References: Message-ID: On Sun, 6 Jun 2021 16:17:40 GMT, Kim Barrett wrote: > Please review this change to the LockFreeQueue utility class. > > The LockFreeQueue originated as an implementation detail of > G1DirtyCardQueueSet, and was recently refactored into a public utility > class. In that refactoring it retained some limitations that were > acceptable in its original context, but may be problematic as a general > utility.
> > In particular, under some conditions a thread may not be able to pop the > last element in the queue, due to interference by a concurrent operation. > And this state will persist, so retrying the pop operation won't help until > the interfering thread has made sufficient progress. This was mitigated by > making the API more complex to provide notice to the client that the queue > may be in this state. > > But it turns out we can do somewhat better, eliminating one of the > limitations, which is the point of this change. We introduce a > pseudo-object used as an end of queue marker. We can use the transition of > the last element's next value from the end marker to NULL by a pop operation > as a claim on the element, allowing the losing thread to recognize, retry, > and make progress. > > This queue still has the limitation that an in-progress push/append may > prevent popping elements. Because of this, the class is being renamed to > NonblockingQueue. The old name suggests stronger guarantees than actually > provided. > > The PR has two commits, the first for the functional changes, the second for > the renaming. The github diffs don't seem to be recognizing the renaming of > the source files as a rename, instead treating the old files as deleted and > the new files as added. The first commit by itself is probably more useful > for reviewing the functional changes. > > Testing: > mach5 tier1-5 lgtm! ------------- Marked as reviewed by iwalulya (Committer).
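
The claim step described in the quoted text can be sketched on its own. This is not the NonblockingQueue code: it uses std::atomic rather than HotSpot's Atomic, and `Node`, `end_marker` and `try_claim_last` are hypothetical names chosen for illustration. It only shows how a compare-and-swap of the last element's next field from the end marker to NULL lets exactly one thread win, while a losing thread sees the failure and can retry.

```cpp
#include <atomic>

// Hypothetical queue element with an intrusive next pointer.
struct Node {
  std::atomic<Node*> next;
  int value;
  Node(int v, Node* n) : next(n), value(v) {}
};

// Pseudo-object used only for its address: the last element's next field
// points here instead of being null.
inline Node* end_marker() {
  static Node marker(0, nullptr);
  return &marker;
}

// Attempts to claim 'candidate' as the last element of the queue.
// Succeeds only if its next field still holds the end marker, in which case
// the field is atomically changed to nullptr. A concurrent claimer, or a
// concurrent push that already linked a successor, makes the CAS fail.
inline bool try_claim_last(Node* candidate) {
  Node* expected = end_marker();
  return candidate->next.compare_exchange_strong(expected, nullptr);
}
```

A full queue would still need to update its head and tail after a successful claim, and to handle an in-progress push; those parts are deliberately left out of this sketch.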
PR: https://git.openjdk.java.net/jdk/pull/4379 From chagedorn at openjdk.java.net Mon Jun 7 09:08:14 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 7 Jun 2021 09:08:14 GMT Subject: RFR: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests [v13] In-Reply-To: References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Fri, 4 Jun 2021 12:41:44 GMT, Christian Hagedorn wrote: >> This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags `-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. >> >> The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. >> >> A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. >> >> To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. 
Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. >> >> Testing (also described in "Internal Framework Tests" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): >> There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. Additional testing was performed by converting all compiler Inline Types tests of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. >> >> Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): >> >> - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc): 60 changed files, 13212 insertions, 0 deletions.
>> - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions >> - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) >> - which leaves 4382 lines of code inserted >> >> Big thanks to: >> - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. >> - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. >> - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. >> - and others who provided valuable feedback. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with three additional commits since the last revision: > > - Update test and example package names, README files and fix some tests to let them pass in higher tiers > - Move tests and examples #2 > - Move tests and examples #1 Testing looked good! I'm doing some last checks before integrating it later today. Thanks to all for carefully reviewing it and providing valuable feedback. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From github.com+28651297+mkartashev at openjdk.java.net Mon Jun 7 11:03:04 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Mon, 7 Jun 2021 11:03:04 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> Message-ID: On Sun, 6 Jun 2021 22:25:44 GMT, David Holmes wrote: >> I think we need to establish some common ground before proceeding further with this fix. It's a bit of a long read; please, bear with me. >> >> The path name starts its life as a `jstring` in `Java_jdk_internal_loader_NativeLibraries_load()`, its encoding is irrelevant at this point. >> >> Next, the name has to be passed down to `JVM_LoadLibrary()` that takes `char*`. So we need to convert from `jstring` to `char*` (point (a)). Following that, `os::dll_load()` that actually performs loading in a platform-specific manner also receives `char*`. All platform implementations of `os::dll_load()` pass the path name down to their respective platform's APIs unmodified, but I think that's just incidental and here we have another possible point of conversion (point (b)). Other consumers of the path name are exception (c) and logging (d) messages; they also take `char*`, but potentially of a different encoding. >> >> Let me try to enumerate all conceivably valid conversions for `JVM_LoadLibrary()` consumption (point (a)): >> 1.
jstring -> platform-specific encoding (status quo meaning possibly lossy encoding on Windows and UTF-8 elsewhere AFAICT),
>> 2. jstring -> modified UTF-8,
>> 3. jstring -> UTF-8.
>>
>> This bug [8195129](https://bugs.openjdk.java.net/browse/JDK-8195129) occurs because conversion (1) may lose information on Windows if the platform encoding happens to be NOT UTF-8 (which it often - or even always - is). So that's a no-go and we are left with either (2) or (3).
>>
>> On macOS and Linux, "platform" encoding already is UTF-8 and since all the platform APIs happily consume UTF-8, no further conversion is necessary (neither for actual library loading, nor for log or exception messages; the latter have to convert to UTF-16, but do that under the hood).
>>
>> On Windows, we require at least these variants of the path name:
>> 1. UTF-16 for library loading (Unicode Windows API),
>> 2. "platform" encoding for logging (yes, losing information here, but that's tolerable),
>> 3. "platform" (lossy) or UTF-8 (lossless) encoding for exception messages (prefer lossless).
>>
>> This is what's behind my choice of UTF-8 for the path name encoding as it gets passed down to `JVM_LoadLibrary()`. We can go with modified UTF-8, of course, in which case all platforms - not just Windows - will have to do the conversion on their own, losing the benefit of the knowledge about the original string encoding (the String.coder field of jstring).
>
> @mkartashev thank you for the detailed explanation.
>
> It is not clear to me that the JDK's conformance to being a Unicode application has significantly changed since the evaluation of JDK-8017274 - @naotoj can you comment on that and related discussion from the CCC for JDK-4958170? In particular I'm not sure that using the platform encoding is wrong, nor how we can have a path that cannot be represented by the platform encoding?
>
> Not being an expert in this area I cannot evaluate the effects of these shared code changes on other platforms, and so am reluctant to introduce any change that affects any non-Windows platforms. Also the JVM and JNI work with modified UTF-8 so I do not think we should diverge from that.
> I would hate to see Windows-specific code introduced into the JDK or JVM's shared code for these APIs, but that may be the only choice to avoid potential disruption to other platforms. Though perhaps we could push the initial conversion down into the JVM?

> I think I am hesitant to change the JVM interface from modified UTF-8 to standard UTF-8,

AFAICT all platforms except Windows already use standard UTF-8 on that path (from `Java_jdk_internal_loader_NativeLibraries_load()` to `JVM_LoadLibrary()`) because the "platform" encoding for those happens to be "UTF-8". So at the current stage this patch actually maintains status quo for all platforms except Windows, the only platform where the bug exists. But I am not against changing the encoding to modified UTF-8 and updating os::dll_load() for all platforms. Just wanted to have some consensus before proceeding with that change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4169

From iwalulya at openjdk.java.net Mon Jun 7 11:17:04 2021
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Mon, 7 Jun 2021 11:17:04 GMT
Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier
In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com>
References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com>
Message-ID:

On Sat, 5 Jun 2021 03:30:42 GMT, Kim Barrett wrote:

> Please review this change to PSPromotionManager::copy_to_survivor_space
> (ParallelGC) to remove some redundant work, and to add some missing memory
> barriers.
> > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. The "if is_forwarded() then use forwardee()" idiom is hiding > under the abstractions that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted. If you think loads from the same location can't do that, see > JDK-8229169 for a counter example. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. 
>
> (a) For the initial check whether the object is already marked, if it is
> then add an acquire fence before returning the forwardee. We could instead
> use a load-acquire to obtain the mark word, but that would be an unneeded
> acquire barrier on the much more common unmarked case. Also removed
> forwardee_acquire(), which is no longer used.
>
> (b) If the cmpxchg race is lost, add an acquire fence before fetching and
> returning the forwardee. The failed release-cmpxchg effectively behaves
> like a relaxed-load, which must precede the forwardee access and any reads
> from it.
>
> I've also changed to only log copying when actually copied, not when already
> copied and forwarded. Also changed a guarantee to an assert.
>
> I looked at all uses of forwardee() in light of problem (2), and did not
> find any additional problems. (That doesn't mean there aren't any, just
> that I didn't spot any. This is low-level atomics, after all.)
>
> Testing:
> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done).
> Performance testing showed no significant change.

lgtm!

-------------

Marked as reviewed by iwalulya (Committer).

PR: https://git.openjdk.java.net/jdk/pull/4371

From luhenry at openjdk.java.net Mon Jun 7 11:21:32 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Mon, 7 Jun 2021 11:21:32 GMT
Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v4]
In-Reply-To:
References:
Message-ID:

> Whether and how a frame is setup is controlled by the code generator
> for the specific CodeBlock. The CodeBlock is then in the best place to know how
> to parse the sender's frame from the current frame in the given CodeBlock.
>
> This refactoring proposes to extract this parsing out of `frame` and into a
> `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant
> inherited children of CodeBlock.
>
> This change is to largely facilitate adding new supported cases for JDK-8252417 [1]
> like runtime stubs.
> > [1] https://bugs.openjdk.java.net/browse/JDK-8252417 Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: 8268178: Extract sender frame parsing to CodeBlock::FrameParser Whether and how a frame is setup is controlled by the code generator for the specific CodeBlock. The CodeBlock is then in the best place to know how to parse the sender's frame from the current frame in the given CodeBlock. This refactoring proposes to extract this parsing out of `frame` and into a `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant inherited children of CodeBlock. This change is to largely facilitate adding new supported cases for JDK-8252417 like runtime stubs. ------------- Changes: https://git.openjdk.java.net/jdk/pull/4337/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=03 Stats: 696 lines in 20 files changed: 511 ins; 117 del; 68 mod Patch: https://git.openjdk.java.net/jdk/pull/4337.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4337/head:pull/4337 PR: https://git.openjdk.java.net/jdk/pull/4337 From tschatzl at openjdk.java.net Mon Jun 7 12:18:03 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 7 Jun 2021 12:18:03 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: <9mPEpIVsXsslIDA63ChIoQOW3MmiE3QBcNpvuYah1mc=.8874cad7-99c6-438a-b5f6-ad7735d578dd@github.com> On Sat, 5 Jun 2021 03:30:42 GMT, Kim Barrett wrote: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. 
> > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. The "if is_forwarded() then use forwardee()" idiom is hiding > under the abstractions that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted. If you think loads from the same location can't do that, see > JDK-8229169 for a counter example. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. 
>
> (a) For the initial check whether the object is already marked, if it is
> then add an acquire fence before returning the forwardee. We could instead
> use a load-acquire to obtain the mark word, but that would be an unneeded
> acquire barrier on the much more common unmarked case. Also removed
> forwardee_acquire(), which is no longer used.
>
> (b) If the cmpxchg race is lost, add an acquire fence before fetching and
> returning the forwardee. The failed release-cmpxchg effectively behaves
> like a relaxed-load, which must precede the forwardee access and any reads
> from it.
>
> I've also changed to only log copying when actually copied, not when already
> copied and forwarded. Also changed a guarantee to an assert.
>
> I looked at all uses of forwardee() in light of problem (2), and did not
> find any additional problems. (That doesn't mean there aren't any, just
> that I didn't spot any. This is low-level atomics, after all.)
>
> Testing:
> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done).
> Performance testing showed no significant change.

Lgtm.

src/hotspot/share/gc/parallel/psPromotionManager.inline.hpp line 145:

> 143: // the release-cmpxchg that performed the forwarding, possibly in some
> 144: // other thread.
> 145: OrderAccess::acquire();

Maybe add a comment here that `copy_unmarked_to_survivor_space` also guarantees this for all paths (i.e. both successful as well as failing CAS). Maybe this is not the correct place here, maybe in a comment for `copy_unmarked_to_survivor_space`; or something like "`copy_to_survivor_space` acts as an acquire" or something, so that the returned object can be used "safely" - idk.

-------------

Marked as reviewed by tschatzl (Reviewer).
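
For readers following the barrier discussion in this thread, the pattern described in points (2), (a) and (b) of the quoted description can be sketched in portable C++. This is only a minimal illustration under stated assumptions, not the HotSpot code: `Obj`, `copy_unmarked` and `copy_to_survivor` are hypothetical names, a plain pointer field stands in for the mark word, and `std::atomic_thread_fence(std::memory_order_acquire)` stands in for `OrderAccess::acquire()`. The sketch reads the forwarding field once and derives both the "is forwarded" answer and the forwardee from that single value, then fences only on the already-forwarded and lost-race paths.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a heap object; NOT HotSpot's oopDesc/markWord.
// A null forwardee means "not yet forwarded".
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};
    int payload{0};
};

// Slow path (assumed name): copy the object, then try to install the
// forwarding pointer with a release-CAS so the copy is published with it.
inline Obj* copy_unmarked(Obj* obj, Obj* to_space_slot) {
    to_space_slot->payload = obj->payload;  // "copy" the object contents
    Obj* expected = nullptr;
    if (obj->forwardee.compare_exchange_strong(
            expected, to_space_slot,
            std::memory_order_release,      // success: publish our copy
            std::memory_order_relaxed)) {   // failure: just a relaxed load
        return to_space_slot;
    }
    // Lost the race: 'expected' was filled in by a relaxed load, so fence
    // before anyone reads through the winner's pointer (point (b)).
    std::atomic_thread_fence(std::memory_order_acquire);
    return expected;
}

// Fast path (assumed name): ONE relaxed load of the forwarding field; the
// "is forwarded" test and the forwardee both come from that same value.
inline Obj* copy_to_survivor(Obj* obj, Obj* to_space_slot) {
    Obj* fwd = obj->forwardee.load(std::memory_order_relaxed);
    if (fwd != nullptr) {
        // Already forwarded: acquire fence only on this rarer path (point (a)).
        std::atomic_thread_fence(std::memory_order_acquire);
        return fwd;
    }
    return copy_unmarked(obj, to_space_slot);
}
```

Called twice on the same object with different destination slots, `copy_to_survivor` should return the first slot both times, since the second call takes the already-forwarded path. The fence placement relies on the C++ rule that a relaxed load followed by an acquire fence synchronizes with the release operation whose value the load observed.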
PR: https://git.openjdk.java.net/jdk/pull/4371 From erikj at openjdk.java.net Mon Jun 7 12:37:03 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Mon, 7 Jun 2021 12:37:03 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 20:55:51 GMT, Scott Gibbons wrote: > Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. > > A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. > > The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. > > Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. > > **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**. 
> > > Benchmark Name | Base Score | Optimized Score | Gain > -- | -- | -- | -- > testBase64Decode size 1 | 15.36 | 15.32 | 1.00 > testBase64Decode size 3 | 17.00 | 16.72 | 1.02 > testBase64Decode size 7 | 20.60 | 18.82 | 1.09 > testBase64Decode size 32 | 34.21 | 26.77 | 1.28 > testBase64Decode size 64 | 54.43 | 38.35 | 1.42 > testBase64Decode size 80 | 66.40 | 48.34 | 1.37 > testBase64Decode size 96 | 73.16 | 52.90 | 1.38 > testBase64Decode size 112 | 84.93 | 51.82 | 1.64 > testBase64Decode size 512 | 288.81 | 32.04 | 9.01 > testBase64Decode size 1000 | 560.48 | 40.79 | 13.74 > testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72 > testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15 > testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07 > testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10 > testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02 > testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10 > testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05 > testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00 > testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05 > testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20 > testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09 > testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12 > testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09 > testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21 > testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29 > testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12 > testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05 > testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18 > testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02 > testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24 > testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23 > testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24 > testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 
| 1.14 > testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19 > testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23 > testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26 The gitignore change looks ok, but should maybe be a separate change. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4368 From luhenry at openjdk.java.net Mon Jun 7 12:56:53 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Mon, 7 Jun 2021 12:56:53 GMT Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v5] In-Reply-To: References: Message-ID: > Whether and how a frame is setup is controlled by the code generator > for the specific CodeBlock. The CodeBlock is then in the best place to know how > to parse the sender's frame from the current frame in the given CodeBlock. > > This refactoring proposes to extract this parsing out of `frame` and into a > `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant > inherited children of CodeBlock. > > This change is to largely facilitate adding new supported cases for JDK-8252417 [1] > like runtime stubs. > > [1] https://bugs.openjdk.java.net/browse/JDK-8252417 Ludovic Henry has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: 8268178: Extract sender frame parsing to CodeBlock::FrameParser Whether and how a frame is setup is controlled by the code generator for the specific CodeBlock. The CodeBlock is then in the best place to know how to parse the sender's frame from the current frame in the given CodeBlock. This refactoring proposes to extract this parsing out of `frame` and into a `CodeBlock::FrameParser`. This FrameParser is then specialized in the relevant inherited children of CodeBlock. 
This change is to largely facilitate adding new supported cases for JDK-8252417 like runtime stubs. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4337/files - new: https://git.openjdk.java.net/jdk/pull/4337/files/2800cc50..3b445203 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4337&range=03-04 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/4337.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4337/head:pull/4337 PR: https://git.openjdk.java.net/jdk/pull/4337 From github.com+6704669+asgibbons at openjdk.java.net Mon Jun 7 13:20:20 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Mon, 7 Jun 2021 13:20:20 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v2] In-Reply-To: References: Message-ID: > Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. > > A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. > > The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. 
The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. > > Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. > > **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**. > > > Benchmark Name | Base Score | Optimized Score | Gain > -- | -- | -- | -- > testBase64Decode size 1 | 15.36 | 15.32 | 1.00 > testBase64Decode size 3 | 17.00 | 16.72 | 1.02 > testBase64Decode size 7 | 20.60 | 18.82 | 1.09 > testBase64Decode size 32 | 34.21 | 26.77 | 1.28 > testBase64Decode size 64 | 54.43 | 38.35 | 1.42 > testBase64Decode size 80 | 66.40 | 48.34 | 1.37 > testBase64Decode size 96 | 73.16 | 52.90 | 1.38 > testBase64Decode size 112 | 84.93 | 51.82 | 1.64 > testBase64Decode size 512 | 288.81 | 32.04 | 9.01 > testBase64Decode size 1000 | 560.48 | 40.79 | 13.74 > testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72 > testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15 > testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07 > testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10 > testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02 > testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10 > testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05 > testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00 > testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05 > testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20 > testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09 > testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12 > testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09 > testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21 > testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29 > testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12 > testBase64WithErrorInputsDecode size 7 | 1238.11 | 
1176.62 | 1.05 > testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18 > testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02 > testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24 > testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23 > testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24 > testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14 > testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19 > testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23 > testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Update full name ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4368/files - new: https://git.openjdk.java.net/jdk/pull/4368/files/e527557a..00fd5621 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4368.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368 PR: https://git.openjdk.java.net/jdk/pull/4368 From chagedorn at openjdk.java.net Mon Jun 7 14:15:40 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Mon, 7 Jun 2021 14:15:40 GMT Subject: Integrated: 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests In-Reply-To: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> References: <2iYQOJ5yeu7SvGcScLPBOWCPMLv69e1ksOL1vW3ytL8=.0c27621d-ef3d-422c-9d8c-922078ca3160@github.com> Message-ID: On Thu, 15 Apr 2021 07:45:50 GMT, Christian Hagedorn wrote: > This RFE provides an IR test framework to perform regex-based checks on the C2 IR shape of test methods emitted by the VM flags 
`-XX:+PrintIdeal` and `-XX:+PrintOptoAssembly`. The framework can also be used for other non-IR matching (and non-compiler) tests by providing easy to use annotations for commonly used testing patterns and compiler control flags. > > The framework is based on the ideas of the currently present IR test framework in [Valhalla](https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java) (mainly implemented by @TobiHartmann) which is being used with great success. This new framework aims to replace the old one in Valhalla at some point. > > A detailed description about how this new IR test framework works and how it is used is provided in the [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) file and in the [Javadocs](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc/jdk/test/lib/hotspot/ir_framework/package-summary.html) written for the framework classes. > > To finish a first version of this framework for JDK 17, I decided to leave some improvement possibilities and ideas to be followed up on in additional RFEs. Some ideas are mentioned in "Future Work" in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md) and were also created as subtasks of this RFE. > > Testing (also described in "Internal Framework Tests in [README.md](https://github.com/chhagedorn/jdk/blob/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/README.md)): > There are various tests to verify the correctness of the test framework which can be found as JTreg tests in the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) folder. 
Additional testing was performed by converting all compiler Inline Types test of project Valhalla (done by @katyapav in [JDK-8263024](https://bugs.openjdk.java.net/browse/JDK-8263024)) that used the old framework to the new framework. This provided additional testing for the framework itself. We ran the converted tests with all the flag settings used in hs-tier1-9 and hs-precheckin-comp. For sanity checking, this was also done with a sample IR test in mainline. > > Some stats about the framework code added to [ir_framework](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework): > > - without the [Javadocs files](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/doc) : 60 changed files, 13212 insertions, 0 deletions. > - without the [tests](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/tests) and [examples](https://github.com/chhagedorn/jdk/tree/aa005f384a4567c6c0b5f08f7c5df57f705dc540/test/lib/jdk/test/lib/hotspot/ir_framework/examples) folder: 40 files changed, 6781 insertions > - comments: 2399 insertions (calculated with `git diff --cached !(tests|examples) | grep -c -E "(^[+-]\s*(/)?*)|(^[+-]\s*//)"`) > - which leaves 4382 lines of code inserted > > Big thanks to: > - @TobiHartmann for all his help by discussing the new framework and for providing insights from his IR test framework in Valhalla. > - @katyapav for converting the Valhalla tests to use the new framework which found some harder to catch bugs in the framework and also some actual C2 bugs. > - @iignatev for helping to simplify the framework usage with JTreg and with the framework internal VM calling structure. > - and others who provided valuable feedback. > > Thanks, > Christian This pull request has now been integrated. 
Changeset: 3396b69f Author: Christian Hagedorn URL: https://git.openjdk.java.net/jdk/commit/3396b69fc91db4a9e29806562215f92179ba4757 Stats: 13454 lines in 67 files changed: 13454 ins; 0 del; 0 mod 8254129: IR Test Framework to support regex-based matching on the IR in JTreg compiler tests Co-authored-by: Christian Hagedorn Co-authored-by: Tobias Hartmann Reviewed-by: iignatyev ------------- PR: https://git.openjdk.java.net/jdk/pull/3508 From mdoerr at openjdk.java.net Mon Jun 7 14:38:15 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 7 Jun 2021 14:38:15 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: <-B5S5aS1AxTb4vCwvZtXnuM03m3ZIQ4y4gdS1lESgYc=.aa5696b9-4e38-4673-b165-09c67a6309ba@github.com> On Sat, 5 Jun 2021 03:30:42 GMT, Kim Barrett wrote: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. > > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. 
The "if is_forwarded() then use forwardee()" idiom is hiding
> under the abstractions that we're doing two relaxed atomic loads of the mark
> word, and there is nothing here to prevent the second from reading a value
> older than that read by the first, with bad consequences. This possibility
> came up in the discussion of JDK-8154736, but seems to have been either lost
> or discounted. If you think loads from the same location can't do that, see
> JDK-8229169 for a counterexample.
>
> Part of this change involves removing the conditionalization of the calls to
> copy_to_survivor_space; just call it directly. However, it turns out that
> some compilers don't inline copy_to_survivor_space because of its size. So
> we refactored it into two functions, one doing the already marked check and
> then calling the other to do most of the work. This is enough for the check
> to be inlined into callers, so we've effectively removed the redundant inner
> check. Note: This part of the change introduces a large block of whitespace
> differences due to removal of an if-else and outdenting the body; I recommend
> using a view that suppresses those when reviewing.
>
> The second part of the change involves adding or moving some acquire barriers.
>
> (a) For the initial check whether the object is already marked, if it is
> then add an acquire fence before returning the forwardee. We could instead
> use a load-acquire to obtain the mark word, but that would be an unneeded
> acquire barrier on the much more common unmarked case. Also removed
> forwardee_acquire(), which is no longer used.
>
> (b) If the cmpxchg race is lost, add an acquire fence before fetching and
> returning the forwardee. The failed release-cmpxchg effectively behaves
> like a relaxed-load, which must precede the forwardee access and any reads
> from it.
>
> I've also changed to only log copying when actually copied, not when already
> copied and forwarded. Also changed a guarantee to an assert.
> > I looked at all uses of forwardee() in light of problem (2), and did not > find any additional problems. (That doesn't mean there aren't any, just > that I didn't spot any. This is low-level atomics, after all.) > > Testing: > mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). > Performance testing showed no significant change. I have trouble understanding (2). I have no idea how it can happen that reading the same volatile memory location a second time can retrieve an older value. Regarding JDK-8229169, age() and age_top() read different sizes. The ordering issue was related to different Bytes which weren't read before AFAICS. ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From ayang at openjdk.java.net Mon Jun 7 14:51:13 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Mon, 7 Jun 2021 14:51:13 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Sat, 5 Jun 2021 03:30:42 GMT, Kim Barrett wrote: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. > > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. 
It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. The "if is_forwarded() then use forwardee()" idiom is hiding > under the abstractions that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted. If you think loads from the same location can't do that, see > JDK-8229169 for a counter example. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. > > (a) For the initial check whether the object is already marked, if it is > then add an acquire fence before returning the forwardee. We could instead > use a load-acquire to obtain the mark word, but that would be an unneeded > acquire barrier on the much more common unmarked case. Also removed > forwardee_acquire(), which is no longer used. > > (b) If the cmpxchg race is lost, add an acquire fence before fetching and > returning the forwardee. 
The failed release-cmpxchg effectively behaves > like a relaxed-load, which must precede the forwardee access and any reads > from it. > > I've also changed to only log copying when actually copied, not when already > copied and forwarded. Also changed a guarantee to an assert. > > I looked at all uses of forwardee() in light of problem (2), and did not > find any additional problems. (That doesn't mean there aren't any, just > that I didn't spot any. This is low-level atomics, after all.) > > Testing: > mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). > Performance testing showed no significant change. src/hotspot/share/gc/parallel/psPromotionManager.inline.hpp line 308: > 306: } > 307: > 308: // don't update this before the unallocation! This comment gives me the impression that there is something racy going on here, and deallocation and update to `new_obj` must be ordered this way. However, the actual reason is that the old value of `new_obj` is used for deallocation. I think it's best to remove this comment; it doesn't really say anything interesting. ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From github.com+28651297+mkartashev at openjdk.java.net Mon Jun 7 16:25:24 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Mon, 7 Jun 2021 16:25:24 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v5] In-Reply-To: References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> Message-ID: On Fri, 4 Jun 2021 13:36:27 GMT, Maxim Kartashev wrote: >> Character strings within JVM are produced and consumed in several formats. Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() or dlopen()) consume strings also in UTF8.
On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can depends on the current "code page". >> >> This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code. >> >> Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug). >> >> The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or MacOS. >> >> Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (`jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly run on those platforms as well.
>> >> Results from Linux: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg:tier1 1784 1784 0 0 >> ============================== >> TEST SUCCESS >> >> >> Building target 'run-test-only' in configuration 'linux-x86_64-server-release' >> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 >> >> >> Results from Windows 10: >> >> Test summary >> ============================== >> TEST TOTAL PASS FAIL ERROR >> jtreg:test/hotspot/jtreg/runtime 746 746 0 0 >> ============================== >> TEST SUCCESS >> Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' >> >> >> Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' >> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: >> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode >> >> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' >> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java >> Test results: passed: 1 > > Maxim Kartashev has updated the pull request incrementally with one additional commit since the last revision: > > Updated the test to run on Windows only and to use a character from the > supplementary plane in the path name. I came to realize that changing `os::dll_load()` to accept UTF-8 (standard or modified) will break all the users of that function except `JVM_LoadLibrary()`. Consider `os::native_java_library()` that still operates with the platform encoding on Windows and works correctly if CWD contains Latin-1 characters (assuming 1252 code page). With this change, `java` will fail to start if its path name contains, say, Æ
because `os::dll_load()` will expect it to be encoded as `c3 86` (UTF-8), but will get `c6` (Latin-1) instead. One possible solution is to update all the call sites of `os::dll_load()` (quite laborious), another is to introduce `os::dll_load_utf8()` and change only `JVM_LoadLibrary()` at this point in time. Advice is welcome. ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From naoto at openjdk.java.net Mon Jun 7 18:49:20 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Mon, 7 Jun 2021 18:49:20 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> Message-ID: <5RuG3bGJjtM6zQu5tXdSlNwCg89bNOKqwsaz98t1iHQ=.eb527caf-5bde-4282-abfd-e8a745f5e4e6@github.com> On Sun, 6 Jun 2021 22:25:44 GMT, David Holmes wrote: >> I think we need to establish some common ground before proceeding further with this fix. It's a bit of a long read; please, bear with me. >> >> The path name starts its life as a `jstring` in `Java_jdk_internal_loader_NativeLibraries_load()`, its encoding is irrelevant at this point. >> >> Next, the name has to be passed down to `JVM_LoadLibrary()` that takes `char*`. So we need to convert from `jstring` to `char*` (point (a)). Following that, `os::dll_load()` that actually performs loading in a platform-specific manner also receives `char*`.
All platform implementations of `os::dll_load()` pass the path name down to their respective platform's APIs unmodified, but I think that's just incidental and here we have another possible point of conversion (point (b)). Other consumers of the path name are exception(c) and logging(d) messages; they also take `char*`, but potentially of a different encoding. >> >> Let me try to enumerate all conceivably valid conversions for `JVM_LoadLibrary()` consumption (point (a)): >> 1. jstring -> platform-specific encoding (status quo meaning possibly lossy encoding on Windows and UTF-8 elsewhere AFAICT), >> 2. jstring -> modified UTF-8, >> 3. jstring -> UTF-8. >> >> This bug [8195129](https://bugs.openjdk.java.net/browse/JDK-8195129) occurs because conversion (1) may lose information on Windows if the platform encoding happens to be NOT UTF-8 (which it often - or even always - is). So that's a no-go and we are left with either (2) or (3). >> >> On MacOS and Linux, "platform" encoding already is UTF-8 and since all the platform APIs happily consume UTF-8, no further conversion is necessary (neither for actual library loading, nor for log or exception messages; the latter have to convert to UTF-16, but do that under the hood). >> >> On Windows, we require at least these variants of the path name: >> 1. UTF16 for library loading (Unicode Windows API), >> 2. "platform" encoding for logging (yes, losing information here, but that's tolerable), >> 3. "platform" (lossy) or UTF8 (lossless) encoding for exception messages (prefer lossless). >> >> This is what's behind my choice of UTF-8 for the path name encoding as it gets passed down to `JVM_LoadLibrary()`. We can go with modified UTF-8, of course, in which case all platforms - not just Windows - will have to do the conversion on their own, losing the benefit of the knowledge about the original string encoding (the String.coder field of jstring). > > @mkartashev thank you for the detailed explanation.
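
[Editorial sketch, not part of the original mail.] The encoding mismatch discussed in this thread (`c3 86` in UTF-8 versus `c6` in Latin-1, and the lossy conversion (1)) can be made concrete with a small example. This is a minimal sketch under stated assumptions: `to_utf8` and `to_latin1` are hypothetical helpers written for illustration and are not JDK or HotSpot functions; Latin-1 stands in for the Windows "platform" encoding, and `?` is assumed as the replacement for unrepresentable characters.

```cpp
#include <string>

// Standard UTF-8 encoding of one Unicode code point.
// Assumes cp <= 0x10FFFF and that cp is not a surrogate.
inline std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Latin-1 can represent only U+0000..U+00FF; anything else is replaced
// by '?', which is where the information is lost.
inline std::string to_latin1(char32_t cp) {
    return std::string(1, cp <= 0xFF ? static_cast<char>(cp) : '?');
}
```

With these helpers, U+00C6 encodes to the two bytes `c3 86` in UTF-8 but to the single byte `c6` in Latin-1, so a callee expecting one form and receiving the other sees a different path name. A code point outside Latin-1, such as U+0416, collapses to `?` and cannot be recovered.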
> > It is not clear to me that the JDK's conformance to being a Unicode application has significantly changed since the evaluation of JDK-8017274 - @naotoj can you comment on that and related discussion from the CCC for JDK-4958170 ? In particular I'm not sure that using the platform encoding is wrong, nor how we can have a path that cannot be represented by the platform encoding? > > Not being an expert in this area I cannot evaluate the effects of these shared code changes on other platforms, and so am reluctant to introduce any change that affects any non-Windows platforms. Also the JVM and JNI work with modified-UTF8 so I do not think we should diverge from that. > I would hate to see windows specific code introduced into the JDK or JVM's shared code for these APIs, but that may be the only choice to avoid potential disruption to other platforms. Though perhaps we could push the initial conversion down into the JVM? @dholmes-ora Sorry, I don't think anything has changed as to the encoding as of JDK-8017274. For some reason, I had the impression that JVM_LoadLibrary() accepts UTF-8 (either modified or standard), but that was not correct. It is using the platform encoded string for the pathname. @mkartashev As you mentioned in another comment, the only way to fix this issue is to pass UTF-8 down to JVM_LoadLibrary, but I don't think it is feasible. One reason is the effort is too great, and the other is that all VM implementations would need to be modified. ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From kevinw at openjdk.java.net Mon Jun 7 19:43:19 2021 From: kevinw at openjdk.java.net (Kevin Walls) Date: Mon, 7 Jun 2021 19:43:19 GMT Subject: RFR: 8266967: debug.cpp utility find() should print Java Object fields.
[v2] In-Reply-To: <3q3pcFTsL_lG-lh78-zZSkTomOw0vPLLAoPm6ez-TAM=.45e3cf48-009b-4899-ac3d-851338987959@github.com> References: <3q3pcFTsL_lG-lh78-zZSkTomOw0vPLLAoPm6ez-TAM=.45e3cf48-009b-4899-ac3d-851338987959@github.com> Message-ID: On Thu, 13 May 2021 13:24:17 GMT, Kevin Walls wrote: >> This change enables debug.cpp's find() utility to print Java Objects with their fields. >> >> find() calls os::print_location, and Java heap objects are printed with instanceKlass oop_print_on. >> Removing the ifdef for defining oop_print_on for instanceKlass, and also on methods in FieldPrinter and FieldDescriptor, make this work. >> >> >> Checking other uses of os::print_location this might affect: >> >> macroAssembler_x86.cpp has MacroAssembler::print_state32 and MacroAssembler::print_state64 >> which use os::print_location to print register contents and print words at top of stack. >> These will be more verbose, as it already is in non-PRODUCT builds. >> >> vmError uses os::print_location when showing the stack, i.e. this output: >> >> Stack slot to memory mapping: >> stack at sp + 0 slots: 0x0000000000000002 is an unknown value >> ..etc... >> >> ...will be more verbose when Java object references are found (for the 8 stack slots it tries to show). >> >> >> Shenandoah uses os::print_location once, but for non-Java heap objects so nothing changes. 
>> >> >> Manual testing on Linux-x64 and Windows: old behaviour shows these two lines only: >> >> "Executing find" >> 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader >> {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' >> >> ...then with the change the full info: >> >> "Executing find" >> 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader >> {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' >> - ---- fields (total size 13 words): >> - private 'defaultAssertionStatus' 'Z' @12 false >> - private final 'parent' 'Ljava/lang/ClassLoader;' @24 a 'jdk/internal/loader/ClassLoaders$PlatformClassLoader'{0x00000000ff0a0a >> 40} (ff0a0a40) >> - private final 'name' 'Ljava/lang/String;' @28 "app"{0x00000000ff0d0060} (ff0d0060) >> - private final 'unnamedModule' 'Ljava/lang/Module;' @32 a 'java/lang/Module'{0x00000000ff0a0448} (ff0a0448) >> ...etc... > > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ifdef correction Thanks Serguei and Coleen! ------------- PR: https://git.openjdk.java.net/jdk/pull/4011 From kevinw at openjdk.java.net Mon Jun 7 19:43:17 2021 From: kevinw at openjdk.java.net (Kevin Walls) Date: Mon, 7 Jun 2021 19:43:17 GMT Subject: RFR: 8266967: debug.cpp utility find() should print Java Object fields. [v3] In-Reply-To: References: Message-ID: > This change enables debug.cpp's find() utility to print Java Objects with their fields. > > find() calls os::print_location, and Java heap objects are printed with instanceKlass oop_print_on. > Removing the ifdef for defining oop_print_on for instanceKlass, and also on methods in FieldPrinter and FieldDescriptor, make this work. 
> > > Checking other uses of os::print_location this might affect: > > macroAssembler_x86.cpp has MacroAssembler::print_state32 and MacroAssembler::print_state64 > which use os::print_location to print register contents and print words at top of stack. > These will be more verbose, as it already is in non-PRODUCT builds. > > vmError uses os::print_location when showing the stack, i.e. this output: > > Stack slot to memory mapping: > stack at sp + 0 slots: 0x0000000000000002 is an unknown value > ..etc... > > ...will be more verbose when Java object references are found (for the 8 stack slots it tries to show). > > > Shenandoah uses os::print_location once, but for non-Java heap objects so nothing changes. > > > Manual testing on Linux-x64 and Windows: old behaviour shows these two lines only: > > "Executing find" > 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader > {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' > > ...then with the change the full info: > > "Executing find" > 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader > {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' > - ---- fields (total size 13 words): > - private 'defaultAssertionStatus' 'Z' @12 false > - private final 'parent' 'Ljava/lang/ClassLoader;' @24 a 'jdk/internal/loader/ClassLoaders$PlatformClassLoader'{0x00000000ff0a0a > 40} (ff0a0a40) > - private final 'name' 'Ljava/lang/String;' @28 "app"{0x00000000ff0d0060} (ff0d0060) > - private final 'unnamedModule' 'Ljava/lang/Module;' @32 a 'java/lang/Module'{0x00000000ff0a0448} (ff0a0448) > ...etc... Kevin Walls has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into 8266967_objectprint - ifdef correction - 8266967: debug.cpp utility find() should print Java Object fields. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4011/files - new: https://git.openjdk.java.net/jdk/pull/4011/files/f6454294..a1170727 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4011&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4011&range=01-02 Stats: 578299 lines in 4725 files changed: 494681 ins; 69830 del; 13788 mod Patch: https://git.openjdk.java.net/jdk/pull/4011.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4011/head:pull/4011 PR: https://git.openjdk.java.net/jdk/pull/4011 From kevinw at openjdk.java.net Mon Jun 7 22:29:18 2021 From: kevinw at openjdk.java.net (Kevin Walls) Date: Mon, 7 Jun 2021 22:29:18 GMT Subject: Integrated: 8266967: debug.cpp utility find() should print Java Object fields. In-Reply-To: References: Message-ID: On Thu, 13 May 2021 12:12:42 GMT, Kevin Walls wrote: > This change enables debug.cpp's find() utility to print Java Objects with their fields. > > find() calls os::print_location, and Java heap objects are printed with instanceKlass oop_print_on. > Removing the ifdef for defining oop_print_on for instanceKlass, and also on methods in FieldPrinter and FieldDescriptor, make this work. > > > Checking other uses of os::print_location this might affect: > > macroAssembler_x86.cpp has MacroAssembler::print_state32 and MacroAssembler::print_state64 > which use os::print_location to print register contents and print words at top of stack. > These will be more verbose, as it already is in non-PRODUCT builds. > > vmError uses os::print_location when showing the stack, i.e. this output: > > Stack slot to memory mapping: > stack at sp + 0 slots: 0x0000000000000002 is an unknown value > ..etc... 
> > ...will be more verbose when Java object references are found (for the 8 stack slots it tries to show). > > > Shenandoah uses os::print_location once, but for non-Java heap objects so nothing changes. > > > Manual testing on Linux-x64 and Windows: old behaviour shows these two lines only: > > "Executing find" > 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader > {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' > > ...then with the change the full info: > > "Executing find" > 0x00000000ff0a03e0 is an oop: jdk.internal.loader.ClassLoaders$AppClassLoader > {0x00000000ff0a03e0} - klass: 'jdk/internal/loader/ClassLoaders$AppClassLoader' > - ---- fields (total size 13 words): > - private 'defaultAssertionStatus' 'Z' @12 false > - private final 'parent' 'Ljava/lang/ClassLoader;' @24 a 'jdk/internal/loader/ClassLoaders$PlatformClassLoader'{0x00000000ff0a0a > 40} (ff0a0a40) > - private final 'name' 'Ljava/lang/String;' @28 "app"{0x00000000ff0d0060} (ff0d0060) > - private final 'unnamedModule' 'Ljava/lang/Module;' @32 a 'java/lang/Module'{0x00000000ff0a0448} (ff0a0448) > ...etc... This pull request has now been integrated. Changeset: 5e557d86 Author: Kevin Walls URL: https://git.openjdk.java.net/jdk/commit/5e557d8650d81f9f81938892de28a6dd8fea98b0 Stats: 20 lines in 4 files changed: 5 ins; 12 del; 3 mod 8266967: debug.cpp utility find() should print Java Object fields. 
Reviewed-by: sspitsyn, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/4011 From cashford at openjdk.java.net Mon Jun 7 22:55:13 2021 From: cashford at openjdk.java.net (Corey Ashford) Date: Mon, 7 Jun 2021 22:55:13 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v2] In-Reply-To: References: Message-ID: <2R-VOwcHuj-BCU7K5DyfNTyS4sGc_BcGtaPH321wm2w=.76034c06-b609-4035-8986-e285c2748d59@github.com> On Mon, 7 Jun 2021 13:20:20 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. >> >> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. >> >> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. >> >> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. >> >> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**. 
>>
>>
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
> > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Update full name Thanks for making this interesting update, which improves the flexibility of intrinsics to make use of isMIME. src/java.base/share/classes/java/util/Base64.java line 813: > 811: while (sp < sl) { > 812: if (shiftto == 18 && sp < sl - 4) { // fast path > 813: int dl = decodeBlock(src, sp, sl, dst, dp, isURL, isMIME); This new param is passed all the way down to the intrinsic. I think existing intrinsics can safely ignore this parameter if it doesn't help the implementation (for example PPC64-LE has 16-byte vector registers, so isn't quite as seriously impacted by MIME). However, in the code for the PPC64-LE intrinsic, this new parameter isn't mentioned. I think if you're going to add a new parameter, it should be mentioned in the existing intrinsics as being present, but unused. src/java.base/share/classes/java/util/Base64.java line 818: > 816: * bytes of data were returned. > 817: */ > 818: int chars_decoded = ((dl + 2) / 3) * 4; In the PR comments, you say, "A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.", however there's still a comment in the code above that says: * If the intrinsic function does not process all of the bytes in * src, it must process a multiple of four of them, making the * returned destination length a multiple of three. So this comment needs to be changed or removed to reflect your commit. ------------- Changes requested by cashford (Author).
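
[Editorial sketch, not part of the original mail.] The expression `((dl + 2) / 3) * 4` quoted in the review above rounds the decoded byte count up to whole 3-byte groups and multiplies by the 4 Base64 characters each group consumes. A minimal sketch follows, assuming only that arithmetic; `chars_consumed` is a hypothetical name used for illustration and does not appear in `Base64.java`.

```cpp
// Hypothetical helper mirroring the quoted expression: given dl decoded
// bytes, return how many Base64 source characters those bytes account for.
constexpr int chars_consumed(int dl) {
    // Integer division rounds down, so adding 2 first yields ceil(dl / 3).
    return ((dl + 2) / 3) * 4;
}
```

Because of the rounding up, a return value that is not a multiple of three (for example 1 or 2 bytes from a padded final group) still maps to a whole 4-character group, which is consistent with the PR's stated removal of the multiple-of-three restriction.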
PR: https://git.openjdk.java.net/jdk/pull/4368 From cashford at openjdk.java.net Mon Jun 7 23:39:23 2021 From: cashford at openjdk.java.net (Corey Ashford) Date: Mon, 7 Jun 2021 23:39:23 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v2] In-Reply-To: References: Message-ID: On Sun, 6 Jun 2021 20:28:27 GMT, Kazunori Ogata wrote: >> The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. >> >> Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. >> >> I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. > > Kazunori Ogata has updated the pull request incrementally with one additional commit since the last revision: > > Improve comments in macroAssembler_ppc.cpp I didn't review the details of the commit's functionality, because there are hundreds of details to check there, and to be honest there's a lot I don't understand about working with C2. Do you have a set of tests that check different sizes of immediate loads to guarantee you hit every case and emit the correct code? 
src/hotspot/cpu/ppc/assembler_ppc.cpp line 359: > 357: code_section()->scratch_emit()) { > 358: // Always emit a nop if the target is a scratch buffer, otherwise fill_buffer() may raise > 359: // an assertion failure because the size of actually generated code can be larger than that size of the* actual* generated code src/hotspot/cpu/ppc/assembler_ppc.cpp line 360: > 358: // Always emit a nop if the target is a scratch buffer, otherwise fill_buffer() may raise > 359: // an assertion failure because the size of actually generated code can be larger than that > 360: // in scratch_emit phase. A difference of code buffer addresses for the two phases can result in the* scratch_emit phase. src/hotspot/cpu/ppc/assembler_ppc.cpp line 362: > 360: // in scratch_emit phase. A difference of code buffer addresses for the two phases can result > 361: // in different number of nops for alignment. By emitting a nop before every paddi, we avoid > 362: // buffer overrun in acrual code generation phase. a* buffer overrun in the* acrual->actual* code generation phase. src/hotspot/cpu/ppc/assembler_ppc.cpp line 396: > 394: > 395: // pli can require a nop for alignement depending on the code address, so we don't use pli > 396: // when the caller expects the number of generated code is always the same. the amount* of generated code ... or the size* of the* generated code ... src/hotspot/cpu/ppc/assembler_ppc.cpp line 454: > 452: if (xd) { ori( d, d, (unsigned short)xd); } > 453: } else { > 454: // Exploit instruction level parallelism if we have a tmp register. instruction-level (hyphenated) src/hotspot/cpu/ppc/assembler_ppc.cpp line 600: > 598: // Case 3: Can use paddi. (However, paddi can require a nop for alignement depending > 599: // on the code address, so we don't use paddi when the caller > 600: // expects the number of generated code is always the same. same comment as earlier about "number" vs. 
amount or size src/hotspot/cpu/ppc/ppc.ad line 6042: > 6040: // costs do not prevent matching in this case. For that reason the > 6041: // operand immL_NM with predicate(false) is used. > 6042: // On Power 10 and up, this instruction is also used for larger offset upto signed 32-bit. larger offsets* src/hotspot/cpu/ppc/ppc.ad line 6327: > 6325: // costs do not prevent matching in this case. For that reason the > 6326: // operand immP_NM with predicate(false) is used. > 6327: // On Power 10 and up, this instruction is also used for larger offset upto signed 32-bit. offsets* src/hotspot/cpu/ppc/ppc.ad line 6397: > 6395: // costs do not prevent matching in this case. For that reason the > 6396: // operand immF_NM with predicate(false) is used. > 6397: // On Power 10 and up, this instruction is also used for larger offset upto signed 32-bit. offsets* src/hotspot/cpu/ppc/ppc.ad line 6472: > 6470: // costs do not prevent matching in this case. For that reason the > 6471: // operand immD_NM with predicate(false) is used. > 6472: // On Power 10 and up, this instruction is also used for larger offset upto signed 32-bit. offsets* ------------- Changes requested by cashford (Author). 
PR: https://git.openjdk.java.net/jdk/pull/4267 From github.com+6704669+asgibbons at openjdk.java.net Tue Jun 8 00:14:16 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Tue, 8 Jun 2021 00:14:16 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v2] In-Reply-To: <2R-VOwcHuj-BCU7K5DyfNTyS4sGc_BcGtaPH321wm2w=.76034c06-b609-4035-8986-e285c2748d59@github.com> References: <2R-VOwcHuj-BCU7K5DyfNTyS4sGc_BcGtaPH321wm2w=.76034c06-b609-4035-8986-e285c2748d59@github.com> Message-ID: On Mon, 7 Jun 2021 22:34:33 GMT, Corey Ashford wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Update full name > > src/java.base/share/classes/java/util/Base64.java line 813: > >> 811: while (sp < sl) { >> 812: if (shiftto == 18 && sp < sl - 4) { // fast path >> 813: int dl = decodeBlock(src, sp, sl, dst, dp, isURL, isMIME); > > This new param is passed all the way down to the intrinsic. I think existing intrinsics can safely ignore this parameter if it doesn't help the implementation (for example PPC64-LE has 16-byte vector registers, so isn't quite as seriously impacted by MIME). However, in the code for the PPC64-LE intrinsic, this new parameter isn't mentioned. I think if you're going to add a new parameter, it should be mentioned in the existing intrinsics as being present, but unused. Are you suggesting that I change *all* intrinsic implementations (aarch64, ppc, etc.)? I have no problem doing that - just checking if this is what's desired. > src/java.base/share/classes/java/util/Base64.java line 818: > >> 816: * bytes of data were returned. 
>> 817: */ >> 818: int chars_decoded = ((dl + 2) / 3) * 4; > > In the PR comments, you say, "A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.", however there's still a comment in the code above that says: > > * If the intrinsic function does not process all of the bytes in > * src, it must process a multiple of four of them, making the > * returned destination length a multiple of three. > > So this comment needs to be changed or removed to reflect your commit. I will change the comment, and add verbiage regarding the new parameter. Thank you. ------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From cashford at openjdk.java.net Tue Jun 8 00:20:17 2021 From: cashford at openjdk.java.net (Corey Ashford) Date: Tue, 8 Jun 2021 00:20:17 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v2] In-Reply-To: References: <2R-VOwcHuj-BCU7K5DyfNTyS4sGc_BcGtaPH321wm2w=.76034c06-b609-4035-8986-e285c2748d59@github.com> Message-ID: On Tue, 8 Jun 2021 00:11:42 GMT, Scott Gibbons wrote: >> src/java.base/share/classes/java/util/Base64.java line 813: >> >>> 811: while (sp < sl) { >>> 812: if (shiftto == 18 && sp < sl - 4) { // fast path >>> 813: int dl = decodeBlock(src, sp, sl, dst, dp, isURL, isMIME); >> >> This new param is passed all the way down to the intrinsic. I think existing intrinsics can safely ignore this parameter if it doesn't help the implementation (for example PPC64-LE has 16-byte vector registers, so isn't quite as seriously impacted by MIME). However, in the code for the PPC64-LE intrinsic, this new parameter isn't mentioned. I think if you're going to add a new parameter, it should be mentioned in the existing intrinsics as being present, but unused. > > Are you suggesting that I change *all* intrinsic implementations (aarch64, ppc, etc.)? I have no problem doing that - just checking if this is what's desired. Yes. 
I didn't realize that there's a decodeBlock intrinsic for aarch64 already, but yeah it should only be a couple of lines of comments for each. ------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From github.com+6704669+asgibbons at openjdk.java.net Tue Jun 8 00:30:38 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Tue, 8 Jun 2021 00:30:38 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: References: Message-ID: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> > Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. > > A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. > > The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. > > Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. 
>
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>
>
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fixing review comments. Adding notes about isMIME parameter for other architectures; clarifying decodeBlock comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4368/files - new: https://git.openjdk.java.net/jdk/pull/4368/files/00fd5621..d66e32e3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=01-02 Stats: 19 lines in 3 files changed: 8 ins; 4 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/4368.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368 PR: https://git.openjdk.java.net/jdk/pull/4368 From sspitsyn at openjdk.java.net Tue Jun 8 01:47:26 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Tue, 8 Jun 2021 01:47:26 GMT Subject: RFR: JDK-8268241: deprecate JVMTI Heap functions 1.0 Message-ID: The JVM TI Heap functions 1.0 were superseded by newer functions in JVM TI 1.2 (Java 6) and should be deprecated so they can be removed in a future release. We need to replace this sentence: "These functions and data types were introduced in the original JVM TI version 1.0 and have been superseded by more powerful and flexible versions which:" with: "These functions and data types were introduced in the original JVM TI version 1.0. They are deprecated and will be changed to return an error in a future release. 
They were superseded in JVM TI version 1.2 (Java 6) by more powerful and flexible versions which:" The CSR has been approved: https://bugs.openjdk.java.net/browse/JDK-8268242 ------------- Commit messages: - deprecate JVMTI Heap functions 1.0 Changes: https://git.openjdk.java.net/jdk/pull/4406/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4406&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268241 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4406.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4406/head:pull/4406 PR: https://git.openjdk.java.net/jdk/pull/4406 From jbhateja at openjdk.java.net Tue Jun 8 02:02:19 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 8 Jun 2021 02:02:19 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 00:30:38 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. >> >> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. 
This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. >> >> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. >> >> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below.
>>
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>>
>>
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
> > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fixing review comments. Adding notes about isMIME parameter for other architectures; clarifying decodeBlock comments. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6239: > 6237: > 6238: __ align(32); > 6239: __ BIND(L_bruteForce); Is this alignment needed, given that the brute-force loop is already aligned? 
------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From github.com+4146708+a74nh at openjdk.java.net Tue Jun 8 02:28:20 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Tue, 8 Jun 2021 02:28:20 GMT Subject: Integrated: 8266749: AArch64: Backtracing broken on PAC enabled systems In-Reply-To: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> References: <3Ak1iZsEfTEKadfUcF6zGVuzsDoiQbaupm66NvSwlaY=.8323de39-d7e6-4049-9243-7c31a75bbc9f@github.com> Message-ID: On Fri, 14 May 2021 11:22:38 GMT, Alan Hayward wrote: > On PAC systems, native code may sign return addresses before saving > them to the stack. We must ensure we strip any signed bits in > order to walk the stack. > Add extra asserts in places where we do not expect saved return > addresses to be signed. > > On non-PAC systems, all PAC instructions are treated as NOPs. > > On Apple, use the provided ptrauth interface instead of asm > as the compiler may optimise further. > > Fedora 33 compiles all distro packages using PAC. 
Running the distro > provided OpenJDK-latest in GDB on a PAC system: > > Thread 2 "java" hit Breakpoint 1, 0x0000fffff68d7fe4 in init_globals() () > from /usr/lib/jvm/java-16-openjdk-16.0.1.0.9-1.rolling.fc33.aarch64-fastdebug/lib/server/libjvm.so > (gdb) call (int)pns($sp, $fp, $pc) > > "Executing pns" > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0xe26fe4] init_globals()+0x10 > C 0x006ffffff74750c4 > C 0x0042fffff6a7f84c > C 0x0037fffff7fa0954 > C 0x0030fffff7fa4540 > C 0x0078fffff7d980c8 > > OpenJDK with this patch at the same breakpoint: > > (gdb) call (int)pns($sp, $fp, $pc) > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x189c47c] Threads::create_vm(JavaVMInitArgs*, bool*)+0x27c > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 > > OpenJDK with this patch breakpointed at pd_hotspot_signal_handler: > > "Executing pns" > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x148a730] PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x0 > C [linux-vdso.so.1+0x80c] __kernel_rt_sigreturn+0x0 > J 53 c1 jdk.internal.org.objectweb.asm.SymbolTable.addConstantUtf8(Ljava/lang/String;)I java.base (98 bytes) @ 0x0000ffffe159cc3c [0x0000ffffe159cb40+0x00000000000000fc] > j jdk.internal.org.objectweb.asm.SymbolTable.setMajorVersionAndClassName(ILjava/lang/String;)I+12 java.base > j jdk.internal.org.objectweb.asm.ClassWriter.visit(IILjava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)V+20 java.base > j java.lang.invoke.InvokerBytecodeGenerator.classFilePrologue()Ljdk/internal/org/objectweb/asm/ClassWriter;+30 java.base > j java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCodeBytes()[B+1 java.base > j 
java.lang.invoke.InvokerBytecodeGenerator.generateCustomizedCode(Ljava/lang/invoke/LambdaForm;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/MemberName;+27 java.base > j java.lang.invoke.LambdaForm.compileToBytecode()V+69 java.base > j java.lang.invoke.DirectMethodHandle.makePreparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+792 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/LambdaForm;+17 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;Z)Ljava/lang/invoke/LambdaForm;+163 java.base > j java.lang.invoke.DirectMethodHandle.preparedLambdaForm(Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/LambdaForm;+2 java.base > j java.lang.invoke.DirectMethodHandle.make(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/DirectMethodHandle;+159 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodCommon(BLjava/lang/Class;Ljava/lang/invoke/MemberName;ZZLjava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+210 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodNoSecurityManager(BLjava/lang/Class;Ljava/lang/invoke/MemberName;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandle;+14 java.base > j java.lang.invoke.MethodHandles$Lookup.getDirectMethodForConstant(BLjava/lang/Class;Ljava/lang/invoke/MemberName;)Ljava/lang/invoke/MethodHandle;+31 java.base > j java.lang.invoke.MethodHandles$Lookup.linkMethodHandleConstant(BLjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+153 java.base > j java.lang.invoke.MethodHandleNatives.linkMethodHandleConstant(Ljava/lang/Class;ILjava/lang/Class;Ljava/lang/String;Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;+38 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V 
[libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x184b778] SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, Thread*)+0x398 > V [libjvm.so+0xa1f104] ConstantPool::resolve_constant_at_impl(constantPoolHandle const&, int, int, bool*, Thread*)+0xca0 > V [libjvm.so+0xa1fb6c] ConstantPool::copy_bootstrap_arguments_at_impl(constantPoolHandle const&, int, int, int, objArrayHandle, int, bool, Handle, Thread*)+0x3fc > V [libjvm.so+0x6bef6c] BootstrapInfo::resolve_args(Thread*)+0xcbc > V [libjvm.so+0x6c1538] BootstrapInfo::resolve_bsm(Thread*)+0x1194 > V [libjvm.so+0x184d300] SystemDictionary::invoke_bootstrap_method(BootstrapInfo&, Thread*)+0x30 > V [libjvm.so+0x120450c] LinkResolver::resolve_dynamic_call(CallInfo&, BootstrapInfo&, Thread*)+0x2c > V [libjvm.so+0x1204b1c] LinkResolver::resolve_invokedynamic(CallInfo&, constantPoolHandle const&, int, Thread*)+0x1bc > V [libjvm.so+0xe0ecc4] InterpreterRuntime::resolve_invokedynamic(JavaThread*)+0x190 > V [libjvm.so+0xe123a0] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x160 > j jdk.internal.module.ModulePath.explodedPackages(Ljava/nio/file/Path;)Ljava/util/Set;+5 java.base > j jdk.internal.module.ModulePath.lambda$readExplodedModule$9(Ljava/nio/file/Path;)Ljava/util/Set;+2 java.base > j jdk.internal.module.ModulePath$$Lambda$2+0x000000010003bbe0.get()Ljava/lang/Object;+8 java.base > j jdk.internal.module.ModuleInfo.doRead(Ljava/io/DataInput;)Ljdk/internal/module/ModuleInfo$Attributes;+762 java.base > j jdk.internal.module.ModuleInfo.read(Ljava/io/InputStream;Ljava/util/function/Supplier;)Ljdk/internal/module/ModuleInfo$Attributes;+16 java.base > j jdk.internal.module.ModulePath.readExplodedModule(Ljava/nio/file/Path;)Ljava/lang/module/ModuleReference;+35 java.base > j 
jdk.internal.module.ModulePath.readModule(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Ljava/lang/module/ModuleReference;+11 java.base > j jdk.internal.module.ModulePath.scanDirectory(Ljava/nio/file/Path;)Ljava/util/Map;+69 java.base > j jdk.internal.module.ModulePath.scan(Ljava/nio/file/Path;)Ljava/util/Map;+60 java.base > j jdk.internal.module.ModulePath.scanNextEntry()V+23 java.base > j jdk.internal.module.ModulePath.find(Ljava/lang/String;)Ljava/util/Optional;+36 java.base > j jdk.internal.module.SystemModuleFinders$1.lambda$find$0(Ljava/lang/module/ModuleFinder;Ljava/lang/String;)Ljava/util/Optional;+2 java.base > j jdk.internal.module.SystemModuleFinders$1$$Lambda$1+0x0000000100033b00.run()Ljava/lang/Object;+8 java.base > j java.security.AccessController.executePrivileged(Ljava/security/PrivilegedAction;Ljava/security/AccessControlContext;Ljava/lang/Class;)Ljava/lang/Object;+29 java.base > j java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;+5 java.base > j jdk.internal.module.SystemModuleFinders$1.find(Ljava/lang/String;)Ljava/util/Optional;+12 java.base > j jdk.internal.module.ModuleBootstrap.boot2()Ljava/lang/ModuleLayer;+304 java.base > j jdk.internal.module.ModuleBootstrap.boot()Ljava/lang/ModuleLayer;+64 java.base > j java.lang.System.initPhase2(ZZ)I+0 java.base > v ~StubRoutines::call_stub > V [libjvm.so+0xe20118] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x5c8 > V [libjvm.so+0xe20f64] JavaCalls::call_static(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x284 > V [libjvm.so+0x189c7bc] Threads::create_vm(JavaVMInitArgs*, bool*)+0x5bc > V [libjvm.so+0xf527a0] JNI_CreateJavaVM+0xc0 > C [libjli.so+0x3860] JavaMain+0x7c > C [libjli.so+0x732c] ThreadJavaMain+0xc > C [libpthread.so.0+0x80c8] start_thread+0xd8 This pull request has now been integrated. 
Changeset: ae986bc8 Author: Alan Hayward Committer: Ningsheng Jian URL: https://git.openjdk.java.net/jdk/commit/ae986bc8dff92a77e91e6ee640aa27c68abb8def Stats: 180 lines in 7 files changed: 175 ins; 0 del; 5 mod 8266749: AArch64: Backtracing broken on PAC enabled systems Reviewed-by: gziemski, aph ------------- PR: https://git.openjdk.java.net/jdk/pull/4029 From kbarrett at openjdk.java.net Tue Jun 8 03:39:13 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 8 Jun 2021 03:39:13 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Mon, 7 Jun 2021 04:55:47 GMT, Kim Barrett wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > I think JFR is the only VM subsystem that currently uses the "fast" time > that is based on TSC, with a fallback to OS time facilities if "fast" time > is not enabled. (There has been discussion (under JDK-8211240) about > eliminating that distinction and just always using OS time facilities, but > it hasn't received much attention.) GC (and maybe other places?) uses the > dual time mechanism because we want reliable time but also send JFR events. > So some of the major VM clients for time information are currently > paying some cost for having both implementations. > > Currently the TSC frequency is always obtained from the CPUID brand string, > with the bogomips style estimate in initialize_frequency never being used. > Rdtsc::is_supported() is true iff VM_Version_Ext::supports_tscinv_ext(). > And initialize_frequency() uses the brand string if supports_tscinv_ext(). > > I think the current implementation of the bogomips calculation can > intermittently produce catastrophically wrong results. Descheduling at the > wrong place(s) in the loop can badly mess things up. I thought there was a > bug for this, but can't find one. 
> > Also, there are things like the Intel erratum referenced here: > http://lkml.iu.edu/hypermail/linux/kernel/1511.1/01048.html > that make things even more fun. > > I think that detecting a "good" TSC and its properties (like frequency) is > pretty hard, and we should not try to duplicate the OS detection or second > guess it. I also think that using the "fast" time when the TSC is not > "good" is a mistake, but I have so far not convinced the JFR folks. > > So I'm not in favor of this change. I think we should be moving away from > direct TSC access rather than trying to use it in more cases. > @kimbarrett Did you mean we cannot detect a "good" TSC from the invariant TSC flag? If so, I have to withdraw this PR. TSC support for Intel processors should then also be removed ASAP (will that happen in JDK-8211240?). According to the previously referenced erratum, the invariant TSC flag is not sufficient; one may also need to check the kernel version or some such. So there may be a JDK bug to be addressed there, though maybe it's aged out with support for older kernels. > If TSC support remains, the frequency should be detected from CPUID with EAX = 16H; however, that is not available on AMD processors, as I said before, so we need the bogomips-style calculation. The current bogomips-style estimator has the potential for catastrophic failure even if all the requirements for "safe" direct use of TSC are met. If those requirements are not met then such failures become more likely and solutions much harder. My point was that the bogomips estimator is currently unused unless one explicitly opts in to potentially bogus data. 
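
To make the failure mode discussed above concrete, here is a minimal sketch of a bogomips-style estimate, which divides counter ticks by the elapsed time reported by the OS clock. This is not the HotSpot code in rdtsc_x86.cpp: the `Sample` struct, the function names, and all numbers are hypothetical and chosen only to show how a single badly timed deschedule skews the result.

```cpp
#include <cstdint>

// Hypothetical calibration sample (not a HotSpot type): counter ticks and
// OS-clock nanoseconds that were meant to cover the same interval.
struct Sample {
  uint64_t ticks;
  uint64_t nanos;
};

// Bogomips-style estimate in Hz: ticks per second of OS-measured time.
// Returns 0 for an empty interval to avoid dividing by zero.
inline double frequency_from_sample(const Sample& s) {
  if (s.nanos == 0) return 0.0;
  return static_cast<double>(s.ticks) * 1e9 / static_cast<double>(s.nanos);
}

// Made-up numbers for a counter that really runs at 3.8 GHz.
// Undisturbed: both clocks observe the same 1 ms interval.
inline Sample undisturbed_sample() { return Sample{3800000, 1000000}; }

// Disturbed: the thread loses the CPU for 9 ms after the OS clock has been
// read but before the counter is read, so the ticks span 10 ms while the
// OS time still reports 1 ms.
inline Sample descheduled_sample() { return Sample{38000000, 1000000}; }
```

With these assumed inputs, `frequency_from_sample(undisturbed_sample())` gives 3.8 GHz, while `frequency_from_sample(descheduled_sample())` gives 38 GHz, ten times too high, even though the counter itself behaved perfectly.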
------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From ysuenaga at openjdk.java.net Tue Jun 8 05:09:15 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 8 Jun 2021 05:09:15 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: >> I ran the JVM on a Ryzen 3 3300X and got the following `jdk.CPUTimeStampCounter` event. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:41:14.993 >> fastTimeEnabled = false >> fastTimeAutoEnabled = true >> osFrequency = 1000000000 >> fastTimeFrequency = 1000000000 >> } >> >> >> I confirmed that the 3300X supports invariant TSC (so `fastTimeAutoEnabled` is set to `true`); however, it does not seem to be used (`fastTimeEnabled` is `false`). >> >> The frequency comes from the CPUID brand string (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However, AMD processors (Ryzen at least) do not include it ("AMD Ryzen 3 3300X 4-Core Processor"). >> Fortunately, rdtsc_x86.cpp can calculate the frequency in a bogomips-like way. We should fall back to that if we cannot get the frequency even if invariant TSC is supported. >> >> After this change, I got the following `jdk.CPUTimeStampCounter` event. The base clock of the Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:33:52.884 >> fastTimeEnabled = true >> fastTimeAutoEnabled = true >> osFrequency = 10000000 Hz >> fastTimeFrequency = 3792929124 Hz >> } >> >> >> This problem is not only for JFR. I confirmed that the `Rdtsc` class is used in ticks.cpp, and it relates to GC code at least. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments To be honest, I like TSC for calculating short time differences because its execution cost is much lower than that of syscalls. 
So I want to use it on AMD processors, especially with JFR, which requires an event duration calculated from the TSC or from a timestamp. I don't think there is a big problem for this purpose. However, if TSC should not be used in HotSpot in general, I will withdraw this PR and close the JBS issue. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From david.holmes at oracle.com Tue Jun 8 06:16:42 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jun 2021 16:16:42 +1000 Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On 8/06/2021 3:09 pm, Yasumasa Suenaga wrote: > On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: > >>> I ran the JVM on a Ryzen 3 3300X and got the following `jdk.CPUTimeStampCounter` event. >>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:41:14.993 >>> fastTimeEnabled = false >>> fastTimeAutoEnabled = true >>> osFrequency = 1000000000 >>> fastTimeFrequency = 1000000000 >>> } >>> >>> >>> I confirmed that the 3300X supports invariant TSC (so `fastTimeAutoEnabled` is set to `true`); however, it does not seem to be used (`fastTimeEnabled` is `false`). >>> >>> The frequency comes from the CPUID brand string (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However, AMD processors (Ryzen at least) do not include it ("AMD Ryzen 3 3300X 4-Core Processor"). >>> Fortunately, rdtsc_x86.cpp can calculate the frequency in a bogomips-like way. We should fall back to that if we cannot get the frequency even if invariant TSC is supported. >>> >>> After this change, I got the following `jdk.CPUTimeStampCounter` event. The base clock of the Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:33:52.884 >>> fastTimeEnabled = true >>> fastTimeAutoEnabled = true >>> osFrequency = 10000000 Hz >>> fastTimeFrequency = 3792929124 Hz >>> } >>> >>> >>> This problem is not only for JFR. 
I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. >> >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > To be honest I like TSC for calculating short time difference because execution cost is much lesser than syscalls. So I want to use it on AMD processor especially with JFR - it requires event duration which is calculated by TSC or by timestamp. I think there is not so big problem for this purpose. > > However, in general, if TSC should not be used in HotSpot, I will withdraw this PR and will close JBS. The problem here is twofold: 1. Your change potentially affects thousands of users with AMD systems that don't currently use the TSC, in ways we can't be sure of. 2. As Kim points out, you may be the first to actually use the bogomips code with this change, which is even more scarey! So my recommendation is to withdraw the PR. Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4350 > From ysuenaga at openjdk.java.net Tue Jun 8 08:24:21 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 8 Jun 2021 08:24:21 GMT Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: >> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:41:14.993 >> fastTimeEnabled = false >> fastTimeAutoEnabled = true >> osFrequency = 1000000000 >> fastTimeFrequency = 1000000000 >> } >> >> >> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >> >> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). 
However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. >> >> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >> >> >> jdk.CPUTimeStampCounter { >> startTime = 10:33:52.884 >> fastTimeEnabled = true >> fastTimeAutoEnabled = true >> osFrequency = 10000000 Hz >> fastTimeFrequency = 3792929124 Hz >> } >> >> >> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments Ok, I will withdraw this PR, but... > 2. As Kim points out, you may be the first to actually use the bogomips > code with this change, which is even more scarey! No, the bogomips might be used in some systems. https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/rdtsc_x86.cpp#L102-L119 For example, if the user runs JVM with -XX:+UseFastUnorderedTimeStamps on the machine which is not supported invariant TSC (e.g. virtualization guest), bogomips will be used. We can get following flight record on it: jdk.CPUTimeStampCounter { jdk.CPUTimeStampCounter { startTime = 17:11:52.351 fastTimeEnabled = true fastTimeAutoEnabled = false osFrequency = 1000000000 fastTimeFrequency = 3792659755 } But we can see following warnings, so most of user can understood it is unstable: The hardware does not support invariant tsc (INVTSC) register and/or cannot guarantee tsc synchronization between sockets at startup. Values returned via rdtsc() are not guaranteed to be accurate, esp. when comparing values from cross sockets reads. Enabling UseFastUnorderedTimeStamps on non-invariant tsc hardware should be considered experimental. 
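
As a rough illustration of the measurement-based fallback discussed in this thread (estimating the counter frequency instead of reading it from the CPUID brand string), here is a minimal C++ sketch. It is not the code in rdtsc_x86.cpp. `estimate_frequency_hz` and `ticks_to_nanos` are hypothetical helpers, and the tick and time deltas are assumed to have been sampled by the caller around the same interval.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: estimate a counter's frequency from the number of
// ticks observed during an interval measured with a trusted OS clock.
uint64_t estimate_frequency_hz(uint64_t tick_delta, uint64_t elapsed_ns) {
  assert(elapsed_ns > 0);
  const double seconds = static_cast<double>(elapsed_ns) / 1e9;
  return static_cast<uint64_t>(
      std::llround(static_cast<double>(tick_delta) / seconds));
}

// Hypothetical helper: convert a tick delta into nanoseconds using a
// previously estimated frequency.
uint64_t ticks_to_nanos(uint64_t tick_delta, uint64_t frequency_hz) {
  assert(frequency_hz > 0);
  return static_cast<uint64_t>(std::llround(
      static_cast<double>(tick_delta) * 1e9 /
      static_cast<double>(frequency_hz)));
}
```

The estimate is only as good as the sampling interval and is meaningful only if the counter runs at a constant rate, which is the concern raised about non-invariant TSC hardware.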
I know it is corner case, so we can say bogomips is not widely used. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From github.com+28651297+mkartashev at openjdk.java.net Tue Jun 8 09:48:20 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Tue, 8 Jun 2021 09:48:20 GMT Subject: Withdrawn: 8195129: System.load() fails to load from unicode paths In-Reply-To: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> Message-ID: On Mon, 24 May 2021 16:43:09 GMT, Maxim Kartashev wrote: > Character strings within JVM are produced and consumed in several formats. Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() or dlopen()) consume strings also in UTF8. On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can depends on the current "code page". > > This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code. > > Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug). > > The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or MacOS. 
> > Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (` jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly ran on those platforms as well. > > Results from Linux: > > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg:tier1 1784 1784 0 0 > ============================== > TEST SUCCESS > > > Building target 'run-test-only' in configuration 'linux-x86_64-server-release' > Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: > * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode > > Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' > Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java > Test results: passed: 1 > > > Results from Windows 10: > > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/runtime 746 746 0 0 > ============================== > TEST SUCCESS > Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' > > > Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug' > Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run: > * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode > > Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode' > Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java > Test results: passed: 1 This pull request has been closed without being integrated. 
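
For readers unfamiliar with the encoding step that a wide-character API such as `LoadLibraryW()` requires, the following is a minimal, portable C++ sketch of a UTF-8 to UTF-16 conversion. It is not the code from this PR. `utf8_to_utf16` is a hypothetical helper, it handles standard UTF-8 only (not the modified UTF-8 used by JNI), and it does not reject overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert standard UTF-8 to UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    uint32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else throw std::invalid_argument("invalid UTF-8 lead byte");

    if (i + extra >= in.size()) {
      throw std::invalid_argument("truncated UTF-8 sequence");
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char cont = static_cast<unsigned char>(in[i + k]);
      if ((cont & 0xC0) != 0x80) {
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      }
      cp = (cp << 6) | (cont & 0x3F);
    }
    i += extra + 1;

    if (cp >= 0x10000) {
      // Code points above the BMP become a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<char16_t>(cp));
    }
  }
  return out;
}
```

On Windows the same step would normally be delegated to the platform's own conversion routines; the sketch only shows why a path that cannot be expressed in the current code page can still be expressed in UTF-16.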
------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From github.com+28651297+mkartashev at openjdk.java.net Tue Jun 8 09:48:19 2021 From: github.com+28651297+mkartashev at openjdk.java.net (Maxim Kartashev) Date: Tue, 8 Jun 2021 09:48:19 GMT Subject: RFR: 8195129: System.load() fails to load from unicode paths [v3] In-Reply-To: <5RuG3bGJjtM6zQu5tXdSlNwCg89bNOKqwsaz98t1iHQ=.eb527caf-5bde-4282-abfd-e8a745f5e4e6@github.com> References: <6qzdQJy3fcfn-PjXHjGNRZH7ZTBt_Sehohf4zRkMWKc=.0e5fa6d7-0182-4242-bed6-bf4b602abafe@github.com> <3y0nPfUyTPbNksPn1y5pvopzN2AReOgIl2CafPKD4b4=.3b490e90-5098-4d9f-8d7e-2770f5548895@github.com> <923qsXnidgxLrhNUc8Bxw3zDCiw1ZNLC6OmIYqIeSOE=.783c385d-1d4c-4c87-b975-3ee27a71513d@github.com> <-V3-GuFQLcbKVotN0nKemAI3s3mkmbtHW0WgpYL6cvc=.e4eb2552-ff8f-459f-afa5-4a312508228e@github.com> <5RuG3bGJjtM6zQu5tXdSlNwCg89bNOKqwsaz98t1iHQ=.eb527caf-5bde-4282-abfd-e8a745f5e4e6@github.com> Message-ID: On Mon, 7 Jun 2021 18:46:11 GMT, Naoto Sato wrote: >> @mkartashev thank you for the detailed explanation. >> >> It is not clear to me that the JDK's conformance to being a Unicode application has significantly changed since the evaluation of JDK-8017274 - @naotoj can you comment on that and related discussion from the CCC for JDK-4958170 ? In particular I'm not sure that using the platform encoding is wrong, nor how we can have a path that cannot be represented by the platform encoding? >> >> Not being an expert in this area I cannot evaluate the affects of these shared code changes on other platforms, and so am reluctant to introduce any change that affects any non-Windows platforms. Also the JVM and JNI work with modified-UTF8 so I do not think we should diverge from that. >> I would hate to see windows specific code introduced into the JDK or JVM's shared code for these APIs, but that may be the only choice to avoid potential disruption to other platforms. Though perhaps we could push the initial conversion down into the JVM? 
> > @dholmes-ora Sorry, I don't think anything has changed as to the encoding as of JDK-8017274. For some reason, I had the impression that JVM_LoadLibrary() accepts UTF-8 (either modified or standard), but that was not correct. It is using the platform encoded string for the pathname. > > @mkartashev As you mentioned in another comment, the only way to fix this issue is to pass UTF-8 down to JVM_LoadLibray, but I don't think it is feasible. One reason is the effort is too great, and the other is that all VM implementations would need to be modified. @naotoj Then I guess this bug will have to wait until Windows evolves to the point when its platform encoding is UTF-8. In the mean time, I'm closing this PR. Thank you all so much for your time! ------------- PR: https://git.openjdk.java.net/jdk/pull/4169 From david.holmes at oracle.com Tue Jun 8 13:02:35 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jun 2021 23:02:35 +1000 Subject: RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2] In-Reply-To: References: Message-ID: <0a42bc7c-ef06-ca5d-f96a-d0b6c1690c81@oracle.com> On 8/06/2021 6:24 pm, Yasumasa Suenaga wrote: > On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga wrote: > >>> I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. >>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:41:14.993 >>> fastTimeEnabled = false >>> fastTimeAutoEnabled = true >>> osFrequency = 1000000000 >>> fastTimeFrequency = 1000000000 >>> } >>> >>> >>> I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). >>> >>> Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). >>> Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. 
We should fallback to it if we cannot get the frequency even if invariant TSC is supported. >>> >>> After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. >>> >>> >>> jdk.CPUTimeStampCounter { >>> startTime = 10:33:52.884 >>> fastTimeEnabled = true >>> fastTimeAutoEnabled = true >>> osFrequency = 10000000 Hz >>> fastTimeFrequency = 3792929124 Hz >>> } >>> >>> >>> This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. >> >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > Ok, I will withdraw this PR, but... > >> 2. As Kim points out, you may be the first to actually use the bogomips >> code with this change, which is even more scarey! > > No, the bogomips might be used in some systems. > > https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/rdtsc_x86.cpp#L102-L119 > > For example, if the user runs JVM with -XX:+UseFastUnorderedTimeStamps on the machine which is not supported invariant TSC (e.g. virtualization guest), bogomips will be used. We can get following flight record on it: Right but we don't know if anybody is actually using the flag to opt-in. Hence I said you _might_ be the first. David ----- > > jdk.CPUTimeStampCounter { > jdk.CPUTimeStampCounter { > startTime = 17:11:52.351 > fastTimeEnabled = true > fastTimeAutoEnabled = false > osFrequency = 1000000000 > fastTimeFrequency = 3792659755 > } > > > But we can see following warnings, so most of user can understood it is unstable: > > > The hardware does not support invariant tsc (INVTSC) register and/or cannot guarantee tsc synchronization between sockets at startup. > Values returned via rdtsc() are not guaranteed to be accurate, esp. when comparing values from cross sockets reads. 
Enabling UseFastUnorderedTimeStamps on non-invariant tsc hardware should be considered experimental. > > > I know it is corner case, so we can say bogomips is not widely used. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/4350 > From github.com+6704669+asgibbons at openjdk.java.net Tue Jun 8 13:28:17 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Tue, 8 Jun 2021 13:28:17 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 01:56:42 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixing review comments. Adding notes about isMIME parameter for other architectures; clarifying decodeBlock comments. > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6239: > >> 6237: >> 6238: __ align(32); >> 6239: __ BIND(L_bruteForce); > > Is this alignment needed ? Given that brute force loop is already aligned. I must be missing something. How is the brute force loop aligned if not by this directive? I don't see an alignment anywhere else that could force it. After the entry(), there are pushes and length comparisons followed by the conditional on VBMI. The only thing I can guess would be that the jmp aligns, but I see no indication that that occurs. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From jbhateja at openjdk.java.net Tue Jun 8 14:17:18 2021 From: jbhateja at openjdk.java.net (Jatin Bhateja) Date: Tue, 8 Jun 2021 14:17:18 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 13:25:00 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6239: >> >>> 6237: >>> 6238: __ align(32); >>> 6239: __ BIND(L_bruteForce); >> >> Is this alignment needed ? Given that brute force loop is already aligned. > > I must be missing something. How is the brute force loop aligned if not by this directive? I don't see an alignment anywhere else that could force it. After the entry(), there are pushes and length comparisons followed by the conditional on VBMI. The only thing I can guess would be that the jmp aligns, but I see no indication that that occurs. > > Perhaps what you missed was that L_forceLoop is aligned (line 6288). This is not the same label as L_bruteForce, which is a jump target from within the VBMI conditional (which should be aligned)? Otherwise, I don't see how L_bruteForce could possibly already be aligned. Yes, I meant force loop already has alignment so earlier one can be removed. 
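
The point under discussion, that aligning one label says nothing about a different label bound elsewhere, can be shown with a small C++ model. This is only a sketch: `CodeBuffer`, `emit`, `align` and `bind` are hypothetical stand-ins for an assembler's code buffer, not HotSpot APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a code buffer that tracks only the current offset.
struct CodeBuffer {
  std::size_t offset = 0;

  // Emit n bytes of instructions.
  void emit(std::size_t n) { offset += n; }

  // Pad up to the next multiple of `modulus` (must be a power of two).
  void align(std::size_t modulus) {
    assert(modulus != 0 && (modulus & (modulus - 1)) == 0);
    offset = (offset + modulus - 1) & ~(modulus - 1);
  }

  // Binding a label simply records the current offset.
  std::size_t bind() const { return offset; }
};

bool is_aligned(std::size_t offset, std::size_t modulus) {
  return (offset & (modulus - 1)) == 0;
}
```

In this model a label bound right after `align(32)` is aligned, a label bound after further instructions generally is not, and it becomes aligned again only if another `align` precedes it.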
-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From luhenry at openjdk.java.net Tue Jun 8 15:12:17 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Tue, 8 Jun 2021 15:12:17 GMT
Subject: Withdrawn: 8267985: Allow AsyncGetCallTrace and JFR to walk a stub frame
In-Reply-To:
References:
Message-ID: <3HZgeMReIGp-iPHsK-RZnzO9tIrcBwmB2n6RQkL-IIA=.c08e55c2-dcac-4a6d-98b5-28b61f27c10f@github.com>

On Mon, 31 May 2021 16:06:10 GMT, Ludovic Henry wrote:

> When the signal sent for AsyncGetCallTrace or JFR would land on a stub
> (like arraycopy), it wouldn't be able to detect the sender (caller)
> frame because `_cb->frame_size() == 0`.
>
> Because we fully control how the prolog and epilog of stub code are
> generated, we know there are two cases:
> 1. A stack frame is allocated via macroAssembler->enter(), and consists
> of `push rbp; mov rsp, rbp;`.
> 2. No stack frames are allocated and rbp is left unchanged and rsp is
> decremented with the `call` instruction that pushes the return `pc` on the
> stack.
>
> For case 1., we can easily know the sender frame by simply looking at
> rbp, especially since we know that all stubs preserve the frame pointer
> (on x86 at least).
>
> For case 2., we end up returning the sender's sender, but that already
> gives us more information than what we have today.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4274

From luhenry at openjdk.java.net Tue Jun 8 15:12:19 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Tue, 8 Jun 2021 15:12:19 GMT
Subject: RFR: 8268178: Extract sender frame parsing to CodeBlob::FrameParser [v5]
In-Reply-To:
References:
Message-ID:

On Mon, 7 Jun 2021 12:56:53 GMT, Ludovic Henry wrote:

>> Whether and how a frame is set up is controlled by the code generator
>> for the specific CodeBlob. The CodeBlob is then in the best place to know how
>> to parse the sender's frame from the current frame in the given CodeBlob.
>>
>> This refactoring proposes to extract this parsing out of `frame` and into a
>> `CodeBlob::FrameParser`. This FrameParser is then specialized in the relevant
>> inherited children of CodeBlob.
>>
>> This change is to largely facilitate adding new supported cases for JDK-8252417 [1]
>> like runtime stubs.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8252417
>
> Ludovic Henry has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

Closing for now as there are still failing tests which I can't seem to reproduce locally. I'll reopen when I've figured out the root cause.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4337

From luhenry at openjdk.java.net Tue Jun 8 15:12:19 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Tue, 8 Jun 2021 15:12:19 GMT
Subject: Withdrawn: 8268178: Extract sender frame parsing to CodeBlob::FrameParser
In-Reply-To:
References:
Message-ID:

On Thu, 3 Jun 2021 14:25:15 GMT, Ludovic Henry wrote:

> Whether and how a frame is set up is controlled by the code generator
> for the specific CodeBlob. The CodeBlob is then in the best place to know how
> to parse the sender's frame from the current frame in the given CodeBlob.
>
> This refactoring proposes to extract this parsing out of `frame` and into a
> `CodeBlob::FrameParser`. This FrameParser is then specialized in the relevant
> inherited children of CodeBlob.
>
> This change is to largely facilitate adding new supported cases for JDK-8252417 [1]
> like runtime stubs.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8252417

This pull request has been closed without being integrated.
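
The shape of the proposed refactoring, a per-blob parser that knows how its own frames were laid out, might look roughly like the C++ sketch below. It is an illustration only and not the PR's code: `Frame`, `FrameParser` and `FramePointerParser` are hypothetical types, the stack is modelled as an array of words with `sp` and `fp` as indices into it, and only the case where the code saved the caller's frame pointer on entry is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame description: sp and fp are word indices into a
// simulated stack (higher index = higher address), pc is a code address.
struct Frame {
  std::size_t sp;
  std::size_t fp;
  uint64_t pc;
};

// Hypothetical per-blob parser interface.
struct FrameParser {
  virtual ~FrameParser() = default;
  virtual Frame sender(const Frame& f,
                       const std::vector<uint64_t>& stack) const = 0;
};

// Parser for code that pushed the caller's frame pointer on entry, so
// stack[fp] holds the saved fp and stack[fp + 1] holds the return address.
struct FramePointerParser : FrameParser {
  Frame sender(const Frame& f,
               const std::vector<uint64_t>& stack) const override {
    assert(f.fp + 1 < stack.size());
    Frame s;
    s.fp = static_cast<std::size_t>(stack[f.fp]);  // saved caller fp
    s.pc = stack[f.fp + 1];                        // return address
    s.sp = f.fp + 2;                               // caller's sp after return
    return s;
  }
};
```

Other kinds of code blob would supply their own `sender` implementation, which is the specialization the description refers to.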
------------- PR: https://git.openjdk.java.net/jdk/pull/4337 From github.com+6704669+asgibbons at openjdk.java.net Tue Jun 8 16:13:19 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Tue, 8 Jun 2021 16:13:19 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 14:13:53 GMT, Jatin Bhateja wrote: >> I must be missing something. How is the brute force loop aligned if not by this directive? I don't see an alignment anywhere else that could force it. After the entry(), there are pushes and length comparisons followed by the conditional on VBMI. The only thing I can guess would be that the jmp aligns, but I see no indication that that occurs. >> >> Perhaps what you missed was that L_forceLoop is aligned (line 6288). This is not the same label as L_bruteForce, which is a jump target from within the VBMI conditional (which should be aligned)? Otherwise, I don't see how L_bruteForce could possibly already be aligned. > > Yes, I meant force loop already has alignment so earlier one can be removed. Sorry - still confused. These are two different labels, bound to two different locations. I believe the alignments for both are justified. ------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From ysuenaga at openjdk.java.net Tue Jun 8 23:53:15 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 8 Jun 2021 23:53:15 GMT Subject: Withdrawn: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor In-Reply-To: References: Message-ID: On Fri, 4 Jun 2021 01:56:50 GMT, Yasumasa Suenaga wrote: > I ran JVM on Ryzen 3300X, and I got following `jdk.CPUTimeStampCounter` event. 
> > > jdk.CPUTimeStampCounter { > startTime = 10:41:14.993 > fastTimeEnabled = false > fastTimeAutoEnabled = true > osFrequency = 1000000000 > fastTimeFrequency = 1000000000 > } > > > I confirmed 3300X supports Invariant TSC (so `fastTimeAutoEnabled` is set to `true`), however it does not seem to be used (`fastTimeEnabled` is `false`). > > Frequency is come from brand string from CPUID (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However AMD processor (Ryzen at least) does not have it ("AMD Ryzen 3 3300X 4-Core Processor"). > Fortunately rdtsc_x86.cpp can calculate the frequency like bogomips. We should fallback to it if we cannot get the frequency even if invariant TSC is supported. > > After this change, I got following `jdk.CPUTimeStampCounter` event. Base clock of Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good. > > > jdk.CPUTimeStampCounter { > startTime = 10:33:52.884 > fastTimeEnabled = true > fastTimeAutoEnabled = true > osFrequency = 10000000 Hz > fastTimeFrequency = 3792929124 Hz > } > > > This problem is not only for JFR. I confirmed `Rdtsc` class is used in ticks.cpp , and it relates to GC code at least. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/4350 From jiefu at openjdk.java.net Wed Jun 9 00:58:25 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 9 Jun 2021 00:58:25 GMT Subject: RFR: 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774 Message-ID: Hi all, A new gc cause called 'G1 Preventive Collection' was added in JDK-8257774. So the following two jfr tests should also be updated to fix the test failures. jdk/jfr/event/gc/collection/TestGCCauseWithG1ConcurrentMark.java jdk/jfr/event/gc/collection/TestGCCauseWithG1FullCollection.java Thanks. 
Best regards, Jie ------------- Commit messages: - 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774 Changes: https://git.openjdk.java.net/jdk/pull/4422/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4422&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268424 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/4422.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4422/head:pull/4422 PR: https://git.openjdk.java.net/jdk/pull/4422 From sviswanathan at openjdk.java.net Wed Jun 9 00:59:19 2021 From: sviswanathan at openjdk.java.net (Sandhya Viswanathan) Date: Wed, 9 Jun 2021 00:59:19 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 00:30:38 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. >> >> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. 
>>
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>>
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below.
>>
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>>
>>
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Fixing review comments. Adding notes about isMIME parameter for other architectures; clarifying decodeBlock comments.

@asgibbons Thanks a lot for contributing this. The performance gain is impressive. I have some minor comments. Please take a look.

src/hotspot/cpu/x86/assembler_x86.cpp line 4555:

> 4553: void Assembler::evpmaddubsw(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) {
> 4554: assert(VM_Version::supports_avx512bw(), "");
> 4555: InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);

This instruction is also supported on AVX platforms. The assert check could be as follows:

    assert(vector_len == AVX_128bit? VM_Version::supports_avx() :
           vector_len == AVX_256bit? VM_Version::supports_avx2() :
           vector_len == AVX_512bit? VM_Version::supports_avx512bw() : 0, "");

Accordingly the instruction could be named as vpmaddubsw.
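
The per-vector-length check suggested in the comment above can be read as a small lookup from vector length to the CPU feature it needs. The C++ sketch below restates it outside HotSpot under stated assumptions: `CpuFeatures` is a hypothetical stand-in for the `VM_Version` queries, the enum values are illustrative, and `supports_vpmaddubsw` is not an existing function.

```cpp
#include <cassert>

// Names mirror the review comment; the numeric values are assumptions.
enum VectorLen { AVX_128bit = 0, AVX_256bit = 1, AVX_512bit = 2 };

// Hypothetical stand-in for the VM_Version::supports_*() queries.
struct CpuFeatures {
  bool avx;
  bool avx2;
  bool avx512bw;
};

// Returns true when the CPU has the feature required for the instruction
// at the given vector length.
bool supports_vpmaddubsw(const CpuFeatures& cpu, VectorLen vector_len) {
  switch (vector_len) {
    case AVX_128bit: return cpu.avx;
    case AVX_256bit: return cpu.avx2;
    case AVX_512bit: return cpu.avx512bw;
  }
  return false;  // unknown vector length
}
```

Written this way, a 128-bit or 256-bit use of the instruction no longer demands AVX-512BW, which is the relaxation the comment asks for.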

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5688:

> 5686: address base64_vbmi_lookup_lo_addr() {
> 5687: __ align(64, (unsigned long) __ pc());
> 5688: StubCodeMark mark(this, "StubRoutines", "lookup_lo");

It will be good to add base64 to the StubCodeMark name for this and all the tables.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5983:

> 5981: // calculate length from offsets
> 5982: __ movq(length, end_offset);
> 5983: __ subq(length, start_offset);

These are 32bit, so movl, subl instead of movq, subq. Similar for all length-related instructions below.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5987:

> 5985:
> 5986: // If AVX512 VBMI not supported, just compile non-AVX code
> 5987: if(VM_Version::supports_avx512_vbmi()) {

Need to also check for VM_Version::supports_avx512bw() support. Could you please check if VM_Version::supports_avx512dq is needed as well?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6134:

> 6132: __ subq(length, 64);
> 6133: __ addq(source, 64);
> 6134: __ addq(dest, 48);

All address-related instructions here and below could use addptr, subptr etc.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6273:

> 6271:
> 6272: __ shrq(length, 2); // Multiple of 4 bytes only - length is # 4-byte chunks
> 6273: __ cmpq(length, 0);

Should these be shrl, cmpl?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6278:

> 6276: // Set up src and dst pointers properly
> 6277: __ addq(source, start_offset); // Initial offset
> 6278: __ addq(dest, dp);

The convention is to use addptr for pointers.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6284:

> 6282: __ shll(isURL, 8); // index into decode table based on isURL
> 6283: __ lea(decode_table, ExternalAddress(StubRoutines::x86::base64_decoding_table_addr()));
> 6284: __ addq(decode_table, isURL);

addptr here.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6297:

> 6295: __ orl(byte1, byte4);
> 6296:
> 6297: __ incrementq(source, 4);

addptr here.
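
For orientation, the scalar loop these comments refer to (32-bit length arithmetic, a count of 4-character groups, a table lookup per input byte, and an OR of the looked-up values to detect invalid input) can be sketched in portable C++ as follows. This is an illustration, not the stub or the Java `decodeBlock`: `make_decode_table` and `decode_block` are hypothetical names, the table is rebuilt on each call for brevity, and pad characters are simply treated as invalid, so decoding stops at the first group that contains one.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Build a 256-entry decode table: 0..63 for valid characters, -1 otherwise.
std::array<int8_t, 256> make_decode_table(bool is_url) {
  std::array<int8_t, 256> table;
  table.fill(-1);
  const std::string alphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
  for (int i = 0; i < 62; ++i) {
    table[static_cast<unsigned char>(alphabet[i])] = static_cast<int8_t>(i);
  }
  table[static_cast<unsigned char>(is_url ? '-' : '+')] = 62;
  table[static_cast<unsigned char>(is_url ? '_' : '/')] = 63;
  return table;
}

// Decode whole 4-character groups from src[start_offset, end_offset) and
// append the bytes to dst. Returns the number of bytes written; stops at
// the first group containing a character outside the alphabet.
int decode_block(const std::string& src, int start_offset, int end_offset,
                 std::vector<uint8_t>& dst, bool is_url) {
  const std::array<int8_t, 256> table = make_decode_table(is_url);
  const int length = end_offset - start_offset;  // 32-bit arithmetic
  const int chunks = length >> 2;                // whole 4-char groups only
  int sp = start_offset;
  int written = 0;
  for (int i = 0; i < chunks; ++i, sp += 4) {
    const int b1 = table[static_cast<unsigned char>(src[sp])];
    const int b2 = table[static_cast<unsigned char>(src[sp + 1])];
    const int b3 = table[static_cast<unsigned char>(src[sp + 2])];
    const int b4 = table[static_cast<unsigned char>(src[sp + 3])];
    // Any -1 entry sets the sign bit of the OR, flagging invalid input.
    if ((b1 | b2 | b3 | b4) < 0) break;
    const uint32_t bits = (b1 << 18) | (b2 << 12) | (b3 << 6) | b4;
    dst.push_back(static_cast<uint8_t>(bits >> 16));
    dst.push_back(static_cast<uint8_t>(bits >> 8));
    dst.push_back(static_cast<uint8_t>(bits));
    written += 3;
  }
  return written;
}
```

Offsets and lengths are plain `int` here, which matches the observation that the length computations are 32-bit quantities, while the source and destination positions play the role of the pointer-sized values.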

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6317:

> 6315: __ load_signed_byte(byte4, Address(source, RegisterOrConstant(), Address::times_1, 3));
> 6316: __ load_signed_byte(byte3, Address(decode_table, byte3, Address::times_1, 0));
> 6317: __ load_signed_byte(byte4, Address(decode_table, byte4, Address::times_1, 0));

You could use Address(base, offset) form directly here and other places: e.g. Address (source, 1) instead of Address(source, RegisterOrConstant(), Address::times_1, 1).

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6329:

> 6327: __ subq(dest, rax); // Number of bytes converted
> 6328: __ movq(rax, dest);
> 6329: __ pop(rbx);

subptr, movptr here.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 7627:

> 7625: StubRoutines::x86::_right_shift_mask = base64_right_shift_mask_addr();
> 7626: StubRoutines::_base64_encodeBlock = generate_base64_encodeBlock();
> 7627: if (VM_Version::supports_avx512_vbmi()) {

Need to add avx512bw check here also.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 7628:

> 7626: StubRoutines::_base64_encodeBlock = generate_base64_encodeBlock();
> 7627: if (VM_Version::supports_avx512_vbmi()) {
> 7628: StubRoutines::x86::_lookup_lo = base64_vbmi_lookup_lo_addr();

It would be good to add base64 to these names.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From dholmes at openjdk.java.net Wed Jun 9 02:21:13 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 9 Jun 2021 02:21:13 GMT
Subject: RFR: 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774
In-Reply-To:
References:
Message-ID: <_d9zQkzykknRN4zrKFslRjJV3lxvx8jSln_GebGYMdU=.1de982b2-ced7-4e35-a5f3-688839159acd@github.com>

On Wed, 9 Jun 2021 00:51:08 GMT, Jie Fu wrote:

> Hi all,
>
> A new gc cause called 'G1 Preventive Collection' was added in JDK-8257774.
> So the following two jfr tests should also be updated to fix the test failures.
> > > jdk/jfr/event/gc/collection/TestGCCauseWithG1ConcurrentMark.java > jdk/jfr/event/gc/collection/TestGCCauseWithG1FullCollection.java > > > Thanks. > Best regards, > Jie Hi Jie, This looks good and trivial IMO. Thanks for fixing it! David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4422 From jiefu at openjdk.java.net Wed Jun 9 02:26:21 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 9 Jun 2021 02:26:21 GMT Subject: RFR: 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774 In-Reply-To: <_d9zQkzykknRN4zrKFslRjJV3lxvx8jSln_GebGYMdU=.1de982b2-ced7-4e35-a5f3-688839159acd@github.com> References: <_d9zQkzykknRN4zrKFslRjJV3lxvx8jSln_GebGYMdU=.1de982b2-ced7-4e35-a5f3-688839159acd@github.com> Message-ID: On Wed, 9 Jun 2021 02:18:05 GMT, David Holmes wrote: > Hi Jie, > > This looks good and trivial IMO. > > Thanks for fixing it! > > David Thanks @dholmes-ora . ------------- PR: https://git.openjdk.java.net/jdk/pull/4422 From jiefu at openjdk.java.net Wed Jun 9 02:26:22 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 9 Jun 2021 02:26:22 GMT Subject: Integrated: 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774 In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 00:51:08 GMT, Jie Fu wrote: > Hi all, > > A new gc cause called 'G1 Preventive Collection' was added in JDK-8257774. > So the following two jfr tests should also be updated to fix the test failures. > > > jdk/jfr/event/gc/collection/TestGCCauseWithG1ConcurrentMark.java > jdk/jfr/event/gc/collection/TestGCCauseWithG1FullCollection.java > > > Thanks. > Best regards, > Jie This pull request has now been integrated. 
Changeset: 2cc1977a Author: Jie Fu URL: https://git.openjdk.java.net/jdk/commit/2cc1977a9698af9538101a5842c311659521a0aa Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8268424: JFR tests fail due to GC cause 'G1 Preventive Collection' not in the valid causes after JDK-8257774 Reviewed-by: dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/4422 From ogatak at openjdk.java.net Wed Jun 9 02:30:13 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Wed, 9 Jun 2021 02:30:13 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v2] In-Reply-To: References: Message-ID: On Sun, 6 Jun 2021 20:08:41 GMT, Kazunori Ogata wrote: >> src/hotspot/cpu/ppc/ppc.ad line 2894: >> >>> 2892: if (loadConLNodes._small) nodes->push(loadConLNodes._small); >>> 2893: if (loadConLNodes._large_hi) nodes->push(loadConLNodes._large_hi); >>> 2894: if (loadConLNodes._large_lo) nodes->push(loadConLNodes._large_lo); >> >> Is removing the _last checking needed? If it's needed, code related to _last should be removed, such as in loadConLNodesTuple_create. Also, it would be better to use an if-else condition because _small and _large_hi cannot both be non-null. > > loadConLNodesTuple_create initializes loadConLNodes._last so that it points to the same node as either loadConLNodes._small or loadConLNodes._large_lo. So loadConLNode is added twice if we don't remove the _last check. (I actually introduced this bug and spent a few days fixing it...) > > The correct code here should be either: 1) use the code before this change, i.e., don't add _small and _large_lo, and only use _last (and _large_hi), or 2) use _small and _large_lo, and remove _last, as I modified.
> > I chose the option 2 to avoid confusion and to make the change symmetrical to change at [L.3459](https://github.com/openjdk/jdk/pull/4267/commits/403b789cc068ea74a0768406852bf79149b23e32#diff-d21a64a4949f298476bf91083d3b956face9a6393a08a706b071068898533082R3459), where adding _small is mandatory (_last points to other node here). > > If you (or other reviewers) think the option 1 is better, I can revert this change and add comments as a caveat. I double checked the code and I realized we can't remove _last because the non-"ABI_ELFv2" version of postalloc_expand_java_to_runtime_call() uses _last field in more complex way. It only uses _last, and set _small, _large_hi, and _large_lo to NULL. So I think it's better to revert the changes w.r.t. _last and add comment to avoid confusion. ------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From whuang at openjdk.java.net Wed Jun 9 03:21:30 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Wed, 9 Jun 2021 03:21:30 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals Message-ID: Dear all, Could you give me a favor to review this patch? It improves the performance of the intrinsic of `String.equals` on Neon backend of Aarch64. 
We profile the performance by using this JMH case: ```java package com.huawei.string; import java.util.*; import java.util.concurrent.TimeUnit; import org.openjdk.jmh.annotations.CompilerControl; import org.openjdk.jmh.annotations.Benchmark; import org.openjdk.jmh.annotations.Level; import org.openjdk.jmh.annotations.OutputTimeUnit; import org.openjdk.jmh.annotations.Param; import org.openjdk.jmh.annotations.Scope; import org.openjdk.jmh.annotations.Setup; import org.openjdk.jmh.annotations.State; import org.openjdk.jmh.annotations.Fork; import org.openjdk.jmh.infra.Blackhole; @State(Scope.Thread) @OutputTimeUnit(TimeUnit.MILLISECONDS) public class StringEqual { @Param({"8", "64", "4096"}) int size; String str1; String str2; @Setup(Level.Trial) public void init() { str1 = newString(size, 'c', '1'); str2 = newString(size, 'c', '2'); } public String newString(int length, char charToFill, char lastChar) { if (length > 0) { char[] array = new char[length]; Arrays.fill(array, charToFill); array[length - 1] = lastChar; return new String(array); } return ""; } @Benchmark @CompilerControl(CompilerControl.Mode.DONT_INLINE) public boolean EqualString() { return str1.equals(str2); } } ``` The results are as follows (Linux aarch64 with 128 cores): Benchmark | (size) | Mode | Cnt | Score | Error | Units ----------------------------------|-------|---------|-------|------------|------------|---------- StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ±
15.589 | ops/ms Yours, WANG Huang ------------- Commit messages: - 8268229: Aarch64: Use Neon in intrinsics for String.equals Changes: https://git.openjdk.java.net/jdk/pull/4423/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4423&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268229 Stats: 28 lines in 2 files changed: 25 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/4423.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4423/head:pull/4423 PR: https://git.openjdk.java.net/jdk/pull/4423 From tschatzl at openjdk.java.net Wed Jun 9 07:11:38 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Jun 2021 07:11:38 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v11] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that significantly refactors the remembered set for more scalability. > > The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. > > Over time many problems with performance and in particular memory usage have been observed: > > * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). 
> > * there is a substantial memory overhead for managing the data structures: examples are > * using separate (hash) tables for the three different types of card containers > * there is significant unnecessary preallocation of memory for some of the card set containers > * Containers store redundant information > > * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. > Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. > > * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. > > * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container. > > * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. > > The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. 
the memory usage difference when moving from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to a "fine" container that is able to hold 65k entries using 8kB is significant. > For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times. > > Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). > > This change is effectively a rewrite of the Java heap card based part of a region's remembered set. > > This initial fully working change can be roughly described with the following properties: > > * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. > > * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. > > * there are now four different container types and one meta-container type. These four actual containers are: > * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory. > * array of cards: similar to the "sparse" container, an array of cards with a configurable number of entries. However, bulk allocation of memory is now managed at a lower level so there is much less waste.
> * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory > * full: same as the old "full", indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. > * howl: the Howl container subdivides a given memory range into subranges, in which any of the other containers describing that sub-range of the heap may be stored. This is somewhat similar to the idea suggested in JDK-8048504. > > * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general by specifying them carefully. They have been designed with future enhancements in mind. > > In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are the same as or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now. > > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: - Merge branch 'master' into submit/8017163-refactor-remembered-set - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) - Improved documentation - Improve comment - Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand - sjohanss-review 3 - Merge branch 'master' of gh:openjdk/jdk into 8017163-refactor-remembered-set - More cleanup after sjohanss comments - Rename FOUND - ...
and 6 more: https://git.openjdk.java.net/jdk/compare/4d1cf51b...338b4829 ------------- Changes: https://git.openjdk.java.net/jdk/pull/4116/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=10 Stats: 6131 lines in 64 files changed: 4558 ins; 1315 del; 258 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From yyang at openjdk.java.net Wed Jun 9 08:17:40 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 9 Jun 2021 08:17:40 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v12] In-Reply-To: References: Message-ID: > The JDK codebase contains many re-created variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and only has a corresponding C1 intrinsic version. > > In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex. We can replace these re-created variants with Objects.checkIndex, which would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot. > > The problem is that HotSpot currently only implements the C2 version of Preconditions.checkIndex. To reuse it globally in JDK code, I think we can first implement its C1 counterpart. There are also a few things we can do later: > > 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. > 2.
Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag > > Testing: cds, compiler and jdk Yi Yang has updated the pull request incrementally with two additional commits since the last revision: - c1 can not handle 0 constant value when using cmp - fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3615/files - new: https://git.openjdk.java.net/jdk/pull/3615/files/289d752c..63f1c30d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=10-11 Stats: 51 lines in 1 file changed: 23 ins; 17 del; 11 mod Patch: https://git.openjdk.java.net/jdk/pull/3615.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3615/head:pull/3615 PR: https://git.openjdk.java.net/jdk/pull/3615 From aph at openjdk.java.net Wed Jun 9 08:28:22 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 9 Jun 2021 08:28:22 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 03:10:45 GMT, Wang Huang wrote: > Dear all, > Could you give me a favor to review this patch? It improves the performance of the intrinsic of `String.equals` on Neon backend of Aarch64. 
> We profile the performance by using this JMH case: > > > ```java > package com.huawei.string; > import java.util.*; > import java.util.concurrent.TimeUnit; > > import org.openjdk.jmh.annotations.CompilerControl; > import org.openjdk.jmh.annotations.Benchmark; > import org.openjdk.jmh.annotations.Level; > import org.openjdk.jmh.annotations.OutputTimeUnit; > import org.openjdk.jmh.annotations.Param; > import org.openjdk.jmh.annotations.Scope; > import org.openjdk.jmh.annotations.Setup; > import org.openjdk.jmh.annotations.State; > import org.openjdk.jmh.annotations.Fork; > import org.openjdk.jmh.infra.Blackhole; > > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.MILLISECONDS) > public class StringEqual { > @Param({"8", "64", "4096"}) > int size; > > String str1; > String str2; > > @Setup(Level.Trial) > public void init() { > str1 = newString(size, 'c', '1'); > str2 = newString(size, 'c', '2'); > } > > public String newString(int length, char charToFill, char lastChar) { > if (length > 0) { > char[] array = new char[length]; > Arrays.fill(array, charToFill); > array[length - 1] = lastChar; > return new String(array); > } > return ""; > } > > @Benchmark > @CompilerControl(CompilerControl.Mode.DONT_INLINE) > public boolean EqualString() { > return str1.equals(str2); > } > } > > ``` > The results are as follows (Linux aarch64 with 128 cores): > > Benchmark | (size) | Mode | Cnt | Score | Error | Units > ----------------------------------|-------|---------|-------|------------|------------|---------- > StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms > StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms > StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms > StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms > StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms > StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ±
15.589 | ops/ms > > Yours, > WANG Huang src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4807: > 4805: mov(tmp2, v0, T2D, 1); > 4806: cbnz(tmp2, DONE); > 4807: b(SAME); Shouldn't this be mov(tmp1, v0, T2D, 0); mov(tmp2, v0, T2D, 1); orr(tmp1, tmp1, tmp2); cbnz(tmp1, DONE); ... which would use up fewer branch prediction resources. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From aph at openjdk.java.net Wed Jun 9 08:31:17 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 9 Jun 2021 08:31:17 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:25:23 GMT, Andrew Haley wrote: >> Dear all, >> Could you give me a favor to review this patch? It improves the performance of the intrinsic of `String.equals` on Neon backend of Aarch64. >> We profile the performance by using this JMH case: >> >> >> ```java >> package com.huawei.string; >> import java.util.*; >> import java.util.concurrent.TimeUnit; >> >> import org.openjdk.jmh.annotations.CompilerControl; >> import org.openjdk.jmh.annotations.Benchmark; >> import org.openjdk.jmh.annotations.Level; >> import org.openjdk.jmh.annotations.OutputTimeUnit; >> import org.openjdk.jmh.annotations.Param; >> import org.openjdk.jmh.annotations.Scope; >> import org.openjdk.jmh.annotations.Setup; >> import org.openjdk.jmh.annotations.State; >> import org.openjdk.jmh.annotations.Fork; >> import org.openjdk.jmh.infra.Blackhole; >> >> @State(Scope.Thread) >> @OutputTimeUnit(TimeUnit.MILLISECONDS) >> public class StringEqual { >> @Param({"8", "64", "4096"}) >> int size; >> >> String str1; >> String str2; >> >> @Setup(Level.Trial) >> public void init() { >> str1 = newString(size, 'c', '1'); >> str2 = newString(size, 'c', '2'); >> } >> >> public String newString(int length, char charToFill, char lastChar) { >> if (length > 0) { >> char[] array = new char[length]; >> Arrays.fill(array, charToFill); >> array[length - 1] = lastChar; 
>> return new String(array); >> } >> return ""; >> } >> >> @Benchmark >> @CompilerControl(CompilerControl.Mode.DONT_INLINE) >> public boolean EqualString() { >> return str1.equals(str2); >> } >> } >> >> ``` >> The results are as follows (Linux aarch64 with 128 cores): >> >> Benchmark | (size) | Mode | Cnt | Score | Error | Units >> ----------------------------------|-------|---------|-------|------------|------------|---------- >> StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms >> StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms >> StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms >> StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms >> StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms >> StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ± 15.589 | ops/ms >> >> Yours, >> WANG Huang > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4807: > >> 4805: mov(tmp2, v0, T2D, 1); >> 4806: cbnz(tmp2, DONE); >> 4807: b(SAME); > > Shouldn't this be > > mov(tmp1, v0, T2D, 0); > mov(tmp2, v0, T2D, 1); > orr(tmp1, tmp1, tmp2); > cbnz(tmp1, DONE); > > > ... which would use up fewer branch prediction resources. ... or maybe do the OR in the vector unit? ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From sjohanss at openjdk.java.net Wed Jun 9 08:39:25 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 9 Jun 2021 08:39:25 GMT Subject: RFR: 8268388: Update large pages information in Java manpage Message-ID: Please review this update to the text for large pages in the Java man page. The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate them into the actual man page. The *Large Pages* section further down in the man page was also a bit out-dated and have been brushed up a bit.
------------- Commit messages: - Thomas review. - 8268388: Update large pages information in Java manpage Changes: https://git.openjdk.java.net/jdk/pull/4425/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4425&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268388 Stats: 73 lines in 1 file changed: 18 ins; 8 del; 47 mod Patch: https://git.openjdk.java.net/jdk/pull/4425.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4425/head:pull/4425 PR: https://git.openjdk.java.net/jdk/pull/4425 From tschatzl at openjdk.java.net Wed Jun 9 08:39:26 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Jun 2021 08:39:26 GMT Subject: RFR: 8268388: Update large pages information in Java manpage In-Reply-To: References: Message-ID: <9zquajD0fEAVNt-q18UUsWTCGQB-TgB4JKF0zGZIdKw=.6237ac75-7a42-4e55-af8e-0c48775bb11b@github.com> On Wed, 9 Jun 2021 08:01:46 GMT, Stefan Johansson wrote: > Please review this update to the text for large pages in the Java man page. > > The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate them into the actual man page. The *Large Pages* section further down in the man page was also a bit out-dated and have been brushed up a bit. Lgtm, one additional remark. src/java.base/share/man/java.1 line 5125: > 5123: .PP > 5124: However, large pages page memory can negatively affect system > 5125: performance. Suggestion: However, using large pages can negatively affect system performance. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4425 From lkorinth at openjdk.java.net Wed Jun 9 08:39:26 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Wed, 9 Jun 2021 08:39:26 GMT Subject: RFR: 8268388: Update large pages information in Java manpage In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:01:46 GMT, Stefan Johansson wrote: > Please review this update to the text for large pages in the Java man page. > > The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate them into the actual man page. The *Large Pages* section further down in the man page was also a bit out-dated and have been brushed up a bit. Looks good to me. ------------- Marked as reviewed by lkorinth (Committer). PR: https://git.openjdk.java.net/jdk/pull/4425 From aph at openjdk.java.net Wed Jun 9 08:47:20 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 9 Jun 2021 08:47:20 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 03:10:45 GMT, Wang Huang wrote: > Dear all, > Could you give me a favor to review this patch? It improves the performance of the intrinsic of `String.equals` on Neon backend of Aarch64. 
> We profile the performance by using this JMH case: > > > ```java > package com.huawei.string; > import java.util.*; > import java.util.concurrent.TimeUnit; > > import org.openjdk.jmh.annotations.CompilerControl; > import org.openjdk.jmh.annotations.Benchmark; > import org.openjdk.jmh.annotations.Level; > import org.openjdk.jmh.annotations.OutputTimeUnit; > import org.openjdk.jmh.annotations.Param; > import org.openjdk.jmh.annotations.Scope; > import org.openjdk.jmh.annotations.Setup; > import org.openjdk.jmh.annotations.State; > import org.openjdk.jmh.annotations.Fork; > import org.openjdk.jmh.infra.Blackhole; > > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.MILLISECONDS) > public class StringEqual { > @Param({"8", "64", "4096"}) > int size; > > String str1; > String str2; > > @Setup(Level.Trial) > public void init() { > str1 = newString(size, 'c', '1'); > str2 = newString(size, 'c', '2'); > } > > public String newString(int length, char charToFill, char lastChar) { > if (length > 0) { > char[] array = new char[length]; > Arrays.fill(array, charToFill); > array[length - 1] = lastChar; > return new String(array); > } > return ""; > } > > @Benchmark > @CompilerControl(CompilerControl.Mode.DONT_INLINE) > public boolean EqualString() { > return str1.equals(str2); > } > } > > ``` > The results are as follows (Linux aarch64 with 128 cores): > > Benchmark | (size) | Mode | Cnt | Score | Error | Units > ----------------------------------|-------|---------|-------|------------|------------|---------- > StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms > StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms > StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms > StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms > StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms > StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ±
15.589 | ops/ms > > Yours, > WANG Huang So, this is a 30% gain for bulk comparisons. It's not a complete waste of time, but we should concentrate on shortish strings because that's the common case. We must not do anything to compromise performance in this case. The JMH test must be part of your patch. It should be in test/micro/org/openjdk/bench/java/lang. We also need to look at performance around lengths of 32 characters, which is very common. Let's see 8, 16, 32, 64. Did you try comparing long strings that differ in, say, the 31st character? ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From iignatyev at openjdk.java.net Wed Jun 9 08:50:24 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 9 Jun 2021 08:50:24 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: On Wed, 2 Jun 2021 01:00:53 GMT, Leonid Mesnik wrote: >> EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > spaces updated. Changes requested by iignatyev (Reviewer). test/failure_handler/src/share/classes/jdk/test/failurehandler/GathererFactory.java line 32: > 30: import java.io.FileWriter; > 31: import java.io.PrintWriter; > 32: import java.nio.file.Files; I don't see why we need these 3 new imports.
test/failure_handler/src/share/classes/jdk/test/failurehandler/ToolKit.java line 28: > 26: import jdk.test.failurehandler.action.ActionSet; > 27: import jdk.test.failurehandler.action.ActionHelper; > 28: import jdk.test.failurehandler.action.PatternAction; redundant import test/failure_handler/src/share/conf/linux.properties line 62: > 60: cores=native.gdb > 61: native.gdb.app=gdb > 62: native.gdb.args=%java\0-c\0%p\0-batch\0-ex\0thread apply all backtrace could you please add a comment similar to the one in `common.properties` file? test/failure_handler/src/share/conf/mac.properties line 71: > 69: native.lldb.app=lldb > 70: native.lldb.delimiter=\0 > 71: native.lldb.args=--core\0%p\0%java\0-o\0thread backtrace all\0-o\0quit could you please add a comment similar to the one in common.properties file? test/failure_handler/src/share/conf/mac.properties line 72: > 70: native.lldb.delimiter=\0 > 71: native.lldb.args=--core\0%p\0%java\0-o\0thread backtrace all\0-o\0quit > 72: native.lldb.params.timeout=3600000 why does `lldb` require an increased timeout, but `gdb` and `jhsdb` do not? ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From jiefu at openjdk.java.net Wed Jun 9 08:52:15 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Wed, 9 Jun 2021 08:52:15 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: <6Tkv6fBBJ55PVaEpossADUDoOphtkagfGLXTB8qK58U=.c388ebf9-7a5e-4c56-8ae1-e740c897229a@github.com> On Fri, 4 Jun 2021 11:43:26 GMT, Nils Eliasson wrote: > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC.
I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. > > Please review, > Best regards, > Nils Eliasson Hi @neliasso, The patch seems to fix the zgc-related failures in our CI/CD. Please go ahead. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From yyang at openjdk.java.net Wed Jun 9 08:53:40 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Wed, 9 Jun 2021 08:53:40 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: > The JDK codebase contains many re-created variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and only has a corresponding C1 intrinsic version. > > In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex. We can replace these re-created variants with Objects.checkIndex, which would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot. > > The problem is that HotSpot currently only implements the C2 version of Preconditions.checkIndex. To reuse it globally in JDK code, I think we can first implement its C1 counterpart. There are also a few things we can do later: > > 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. > 2.
Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag > > Testing: cds, compiler and jdk Yi Yang has updated the pull request incrementally with one additional commit since the last revision: more comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3615/files - new: https://git.openjdk.java.net/jdk/pull/3615/files/63f1c30d..87d8b399 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3615&range=11-12 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3615.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3615/head:pull/3615 PR: https://git.openjdk.java.net/jdk/pull/3615 From stuefe at openjdk.java.net Wed Jun 9 09:50:14 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Jun 2021 09:50:14 GMT Subject: RFR: 8268388: Update large pages information in Java manpage In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:01:46 GMT, Stefan Johansson wrote: > Please review this update to the text for large pages in the Java man page. > > The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate it into the actual man page. The *Large Pages* section further down in the man page was also a bit outdated and has been brushed up a bit. Hi Stefan, just idle nitpicking, looks good otherwise. Cheers, Thomas src/java.base/share/man/java.1 line 5174: > 5172: .RS > 5173: .PP > 5174: \f[CB]#\ echo\ 4096\ >\ /sys/kernel/mm/hugepages/hugepages\-2048kB/nr_hugepages\f[R] - vm.nr_hugepages is not the only way to do this setup, since we have vm.nr_overcommit_hugepages, which is a bit more flexible to use. Is the java man page supposed to be complete in explaining this, or should it only show one possible way of many? - most users probably just use UseLargePages and are not aware of the different flavors.
So they would not know what "explicit" means. How about, instead of mentioning UseSHM and UseHugeTLBFS, not just: "if you use large pages but don't use transparent huge pages (UseTransparentHugePages), large pages need to be preallocated...". - Otherwise, at least reverse the order of the two options? UseSHM is mentioned first, which is maybe historical, but TLBFS is the standard. ------------- PR: https://git.openjdk.java.net/jdk/pull/4425 From sjohanss at openjdk.java.net Wed Jun 9 10:04:15 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 9 Jun 2021 10:04:15 GMT Subject: RFR: 8268388: Update large pages information in Java manpage In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 09:46:17 GMT, Thomas Stuefe wrote: > * vm.nr_hugepages is not the only way to do this setup, since we have vm.nr_overcommit_hugepages, which are a bit more flexible to use. Is the java man page supposed to be complete in explaining this, or should it only show one possible way of many? > I think mentioning one way to configure it is fine, we don't want to try to cover every possibility. I mean there are multiple ways of specifying each of the options as well. Most advanced users will likely refer to the kernel documentation for more details. I see this as a short explanation of one way to enable large pages. > * most users probably just use UseLargePages and are not aware of the different flavors. So they would not know what "explicit" means. How about, instead of mentioning UseSHM and UseHugeTLBFS, not just: "if you use large pages but don't use transparent huge pages (UseTransparentHugePages), large pages need to be preallocated...". > Something like: - When using explicit large pages (options `-XX:+UseSHM` or `-XX:+UseHugeTLBFS`), the number of... + When using large pages and not enabling transparent huge pages (option `-XX:+UseTransparentHugePages`), the number of... > * Otherwise, at least reverse the order of the two options? 
UseSHM is mentioned first, which is maybe historical, but TLBFS is the standard. Good point, I reversed the order of configuring the two but forgot it here. But going with the above is better, I think. ------------- PR: https://git.openjdk.java.net/jdk/pull/4425 From kbarrett at openjdk.java.net Wed Jun 9 11:15:35 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Jun 2021 11:15:35 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. > > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. The "if is_forwarded() then use forwardee()" idiom is hiding > under the abstractions that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted.
If you think loads from the same location can't do that, see > JDK-8229169 for a counterexample. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. > > (a) For the initial check whether the object is already marked, if it is > then add an acquire fence before returning the forwardee. We could instead > use a load-acquire to obtain the mark word, but that would be an unneeded > acquire barrier on the much more common unmarked case. Also removed > forwardee_acquire(), which is no longer used. > > (b) If the cmpxchg race is lost, add an acquire fence before fetching and > returning the forwardee. The failed release-cmpxchg effectively behaves > like a relaxed-load, which must precede the forwardee access and any reads > from it. > > I've also changed to only log copying when actually copied, not when already > copied and forwarded. Also changed a guarantee to an assert. > > I looked at all uses of forwardee() in light of problem (2), and did not > find any additional problems. (That doesn't mean there aren't any, just > that I didn't spot any. This is low-level atomics, after all.) > > Testing: > mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). > Performance testing showed no significant change.
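[Editorial illustration, not part of the original mail.] The barrier placement described in (a) and (b) above can be sketched in standalone C++ with `std::atomic`. This is only a minimal sketch under stated assumptions, not HotSpot code: `Obj`, `mark`, `payload`, and `forward_or_install` are hypothetical names, the mark word is simplified to hold either 0 (not forwarded) or the forwardee's address, and HotSpot uses its own `Atomic` and `OrderAccess` primitives rather than `std::atomic`. The mark word is loaded once and that single value is used for both the check and the result, so the question of a second load seeing an older value does not arise.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an object header: mark == 0 means "not forwarded",
// any other value is the address of the forwardee (the copy).
struct Obj {
    std::atomic<std::uintptr_t> mark{0};
    int payload = 0;
};

// Returns the object that obj is forwarded to. If obj is not yet forwarded,
// tries to install `copy` as the forwardee; if another thread wins the race,
// returns that thread's forwardee instead.
Obj* forward_or_install(Obj& obj, Obj* copy) {
    assert(copy != nullptr);

    // Single relaxed load: the same value is used for the check and the result.
    std::uintptr_t m = obj.mark.load(std::memory_order_relaxed);
    if (m != 0) {
        // (a) Already forwarded: acquire fence before handing out the forwardee,
        // so later reads through it see the copying thread's writes.
        std::atomic_thread_fence(std::memory_order_acquire);
        return reinterpret_cast<Obj*>(m);
    }

    std::uintptr_t expected = 0;
    const auto desired = reinterpret_cast<std::uintptr_t>(copy);
    if (obj.mark.compare_exchange_strong(expected, desired,
                                         std::memory_order_release,
                                         std::memory_order_relaxed)) {
        return copy;  // We won the race and published our copy.
    }

    // (b) Lost the race: the failed CAS behaved like a relaxed load and left
    // the winner's value in `expected`; fence before using that forwardee.
    std::atomic_thread_fence(std::memory_order_acquire);
    return reinterpret_cast<Obj*>(expected);
}
```

Returning the value observed by the failed compare-exchange, instead of reloading the mark word afterwards, mirrors the "avoid reloading forwardee" idea mentioned in the update note that follows.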
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: avoid reloading forwardee ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4371/files - new: https://git.openjdk.java.net/jdk/pull/4371/files/caef6a51..9ded099a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=00-01 Stats: 13 lines in 1 file changed: 3 ins; 6 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/4371.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4371/head:pull/4371 PR: https://git.openjdk.java.net/jdk/pull/4371 From kbarrett at openjdk.java.net Wed Jun 9 11:15:35 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Jun 2021 11:15:35 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier In-Reply-To: <-B5S5aS1AxTb4vCwvZtXnuM03m3ZIQ4y4gdS1lESgYc=.aa5696b9-4e38-4673-b165-09c67a6309ba@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> <-B5S5aS1AxTb4vCwvZtXnuM03m3ZIQ4y4gdS1lESgYc=.aa5696b9-4e38-4673-b165-09c67a6309ba@github.com> Message-ID: On Mon, 7 Jun 2021 14:34:56 GMT, Martin Doerr wrote: > I have trouble understanding (2). I have no idea how it can happen that reading the same volatile memory location a second time can retrieve an older value. Regarding JDK-8229169, age() and age_top() read different sizes. The ordering issue was related to different Bytes which weren't read before AFAICS. Dredging through the Standard, the C++ memory model does require read-read coherence, i.e. if there are sequential reads of an "atomic object" then the second cannot obtain an older value than the first. (C++14 1.9/13-14, 1.10/14, 1.10/18). Of course, we're outside the Standard, since we aren't using std::atomic<>. But it seems likely to be okay to assume read-read coherence for our Atomic accesses. 
It's unclear to me what the Standard says about a case like JDK-8229169. The code is reading a 32bit value, then later reading a 64bit value from the same location. That can't be described within the Standard, since "atomic object" is defined in terms of std::atomic<>. C++20 adds std::atomic_ref<> which can temporarily make an existing object atomic, but that doesn't cover something like this either. So maybe the reordering behavior leading to that bug report is allowed? Of course, that code is also outside the Standard since it is writing to one union member and reading from another. Be that as it may, I've found a way to entirely bypass the whole question. Rather than using cas_forward_to and later rereading the forwardee, instead use forward_to_atomic and use the returned forwardee. Mostly done re-running tests. ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From kbarrett at openjdk.java.net Wed Jun 9 11:15:36 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 9 Jun 2021 11:15:36 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Mon, 7 Jun 2021 14:48:11 GMT, Albert Mingkun Yang wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> avoid reloading forwardee > > src/hotspot/share/gc/parallel/psPromotionManager.inline.hpp line 308: > >> 306: } >> 307: >> 308: // don't update this before the unallocation! > > This comment gives me the impression that there is some racy going on here, and deallocation and update to `new_obj` must be ordered this way. However, the actual reason is that the old value of`new_obj` is used for deallocation. I think it's best to remove this comment; it doesn't really say anything interesting. 
That comment went away as part of latest change to avoid reloading the forwardee. ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From ogatak at openjdk.java.net Wed Jun 9 11:24:33 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Wed, 9 Jun 2021 11:24:33 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v3] In-Reply-To: References: Message-ID: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> > The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. > > Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. > > I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. 
Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: - Revert changes for pusing nodes in loadConLNodesTuple and add comments about the node _last points to - Fix grammatical errors in comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4267/files - new: https://git.openjdk.java.net/jdk/pull/4267/files/ea87e2c0..0615adac Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=01-02 Stats: 27 lines in 2 files changed: 0 ins; 6 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/4267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4267/head:pull/4267 PR: https://git.openjdk.java.net/jdk/pull/4267 From mdoerr at openjdk.java.net Wed Jun 9 12:20:15 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 9 Jun 2021 12:20:15 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Wed, 9 Jun 2021 11:15:35 GMT, Kim Barrett wrote: >> Please review this change to PSPromotionManager::copy_to_survivor_space >> (ParallelGC) to remove some redundant work, and to add some missing memory >> barriers. >> >> There are two callers of copy_to_survivor_space, both of which wrap that >> call with the idiom >> >> if obj->is_forwarded() then >> new_obj = obj->forwardee() >> else >> new_obj = copy_to_survivor_space(obj) >> endif >> >> There are problems with this. >> >> (1) The first thing copy_to_survivor_space does is check whether the object >> is already forwarded, and if so then return obj->forwardee_acquire(). The >> idiom used by the callers is a redundant check, and the redundancy can't be >> optimized away. It is also missing the acquire barrier that was added by >> JDK-8154736 after long discussion. 
>> >> (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient >> after all. The "if is_forwarded() then use forwardee()" idiom is hiding >> under the abstractions that we're doing two relaxed atomic loads of the mark >> word, and there is nothing here to prevent the second from reading a value >> older than that read by the first, with bad consequences. This possibility >> came up in the discussion of JDK-8154736, but seems to have been either lost >> or discounted. If you think loads from the same location can't do that, see >> JDK-8229169 for a counter example. >> >> Part of this change involves removing the conditionalization of the calls to >> copy_to_survivor_space; just call it directly. However, it turns out that >> some compilers don't inline copy_to_survivor_space because of its size. So >> we refactored it into two functions, one doing the already marked check and >> then calling the other to do most of the work. This is enough for the check >> to be inlined into callers, so we've effectively removed the redundant inner >> check. Note: This part of the change introduces a large block of whitespace >> differences due to removal of an if-else and outdenting the body; I recommend >> using a view that suppresses those when reviewing. >> >> The second part of the change involves adding or moving some acquire barriers. >> >> (a) For the initial check whether the object is already marked, if it is >> then add an acquire fence before returning the forwardee. We could instead >> use a load-acquire to obtain the mark word, but that would be an unneeded >> acquire barrier on the much more common unmarked case. Also removed >> forwardee_acquire(), which is no longer used. >> >> (b) If the cmpxchg race is lost, add an acquire fence before fetching and >> returning the forwardee. The failed release-cmpxchg effectively behaves >> like a relaxed-load, which must preceed the forwardee access and any reads >> from it. 
>> >> I've also changed to only log copying when actually copied, not when already >> copied and forwarded. Also changed a guarantee to an assert. >> >> I looked at all uses of forwardee() in light of problem (2), and did not >> find any additional problems. (That doesn't mean there aren't any, just >> that I didn't spot any. This is low-level atomics, after all.) >> >> Testing: >> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). >> Performance testing showed no significant change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > avoid reloading forwardee Thanks a lot for checking! That makes sense. ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From mdoerr at openjdk.java.net Wed Jun 9 14:10:16 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 9 Jun 2021 14:10:16 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v3] In-Reply-To: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> References: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> Message-ID: <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> On Wed, 9 Jun 2021 11:24:33 GMT, Kazunori Ogata wrote: >> The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. >> >> Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. >> >> I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. 
I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. > > Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: > > - Revert changes for pusing nodes in loadConLNodesTuple and add comments about the node _last points to > - Fix grammatical errors in comments Sorry for my late response. I was busy with other things. I've looked at this change for some time and I wonder if such a complex change should be done at all. I like the idea, but does it really improve performance for any real applications or benchmarks? At least those parts which only increase complexity should not get done. src/hotspot/cpu/ppc/assembler_ppc.cpp line 352: > 350: void Assembler::paddi_or_addi(Register d, Register s, long si34) { > 351: if (is_simm16(si34)) { > 352: addi_r0ok(d, s, (int)si34); If r0 is ok, it should be named paddi_or_addi_r0ok and users should assert not to use r0 for a real addition. src/hotspot/cpu/ppc/assembler_ppc.cpp line 364: > 362: // we avoid a buffer overrun in the actual code generation phase. > 363: nop(); > 364: } Scratch emit should be able to determine the size precisely, not just a pessimistic estimation. Please don't break this design. src/hotspot/cpu/ppc/ppc.ad line 6400: > 6398: > 6399: format %{ "LFS $dst, offset, $toc \t// load float $src from TOC" %} > 6400: size(8); sizes should be precise. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From tschatzl at openjdk.java.net Wed Jun 9 15:19:15 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 9 Jun 2021 15:19:15 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Wed, 9 Jun 2021 11:15:35 GMT, Kim Barrett wrote: >> Please review this change to PSPromotionManager::copy_to_survivor_space >> (ParallelGC) to remove some redundant work, and to add some missing memory >> barriers. >> >> There are two callers of copy_to_survivor_space, both of which wrap that >> call with the idiom >> >> if obj->is_forwarded() then >> new_obj = obj->forwardee() >> else >> new_obj = copy_to_survivor_space(obj) >> endif >> >> There are problems with this. >> >> (1) The first thing copy_to_survivor_space does is check whether the object >> is already forwarded, and if so then return obj->forwardee_acquire(). The >> idiom used by the callers is a redundant check, and the redundancy can't be >> optimized away. It is also missing the acquire barrier that was added by >> JDK-8154736 after long discussion. >> >> (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient >> after all. The "if is_forwarded() then use forwardee()" idiom is hiding >> under the abstractions that we're doing two relaxed atomic loads of the mark >> word, and there is nothing here to prevent the second from reading a value >> older than that read by the first, with bad consequences. This possibility >> came up in the discussion of JDK-8154736, but seems to have been either lost >> or discounted. If you think loads from the same location can't do that, see >> JDK-8229169 for a counter example. >> >> Part of this change involves removing the conditionalization of the calls to >> copy_to_survivor_space; just call it directly. 
However, it turns out that >> some compilers don't inline copy_to_survivor_space because of its size. So >> we refactored it into two functions, one doing the already marked check and >> then calling the other to do most of the work. This is enough for the check >> to be inlined into callers, so we've effectively removed the redundant inner >> check. Note: This part of the change introduces a large block of whitespace >> differences due to removal of an if-else and outdenting the body; I recommend >> using a view that suppresses those when reviewing. >> >> The second part of the change involves adding or moving some acquire barriers. >> >> (a) For the initial check whether the object is already marked, if it is >> then add an acquire fence before returning the forwardee. We could instead >> use a load-acquire to obtain the mark word, but that would be an unneeded >> acquire barrier on the much more common unmarked case. Also removed >> forwardee_acquire(), which is no longer used. >> >> (b) If the cmpxchg race is lost, add an acquire fence before fetching and >> returning the forwardee. The failed release-cmpxchg effectively behaves >> like a relaxed-load, which must preceed the forwardee access and any reads >> from it. >> >> I've also changed to only log copying when actually copied, not when already >> copied and forwarded. Also changed a guarantee to an assert. >> >> I looked at all uses of forwardee() in light of problem (2), and did not >> find any additional problems. (That doesn't mean there aren't any, just >> that I didn't spot any. This is low-level atomics, after all.) >> >> Testing: >> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). >> Performance testing showed no significant change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > avoid reloading forwardee Still good. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4371 From sjohanss at openjdk.java.net Wed Jun 9 15:59:36 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 9 Jun 2021 15:59:36 GMT Subject: RFR: 8268388: Update large pages information in Java manpage [v2] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 09:47:11 GMT, Thomas Stuefe wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Stufe review. > > Hi Stefan, > > just idle nitpicking, looks good otherwise. > > Cheers, Thomas @tstuefe, I updated the PR with your suggestion to mention `UseTransparentHugePages`. I think that is a good way to also get that option into the man page. I intend to integrate this before the fork tomorrow, I hope this is good with you. ------------- PR: https://git.openjdk.java.net/jdk/pull/4425 From sjohanss at openjdk.java.net Wed Jun 9 15:59:36 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 9 Jun 2021 15:59:36 GMT Subject: RFR: 8268388: Update large pages information in Java manpage [v2] In-Reply-To: References: Message-ID: > Please review this update to the text for large pages in the Java man page. > > The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate them into the actual man page. The *Large Pages* section further down in the man page was also a bit out-dated and have been brushed up a bit. Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: Stufe review. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4425/files - new: https://git.openjdk.java.net/jdk/pull/4425/files/b9032051..0b9b079f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4425&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4425&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/4425.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4425/head:pull/4425 PR: https://git.openjdk.java.net/jdk/pull/4425 From minqi at openjdk.java.net Wed Jun 9 16:31:24 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 16:31:24 GMT Subject: RFR: 8265954: Shared classes that failed to load should not be loaded again Message-ID: Hi, please review. Shared classes should not be loaded again after loading them from CDS has failed. In the failing case, restore_unshareable_info failed for some reason (OOM), but the class was already polluted and failed to load again. This change uses the unused bit in _misc_flags to record the shared loading status and prevent the class from being loaded again. Tests: tier1,tier2,tier3,tier4,tier7 Local tests: jtreg/hotspot/runtime/cds TestDynamicDumpAtOom.java (which failed in tier7) with various allocation sizes (used to reproduce the failure) passed.
Thanks Yumin ------------- Commit messages: - 8265954: Shared classes that failed to load should not be loaded again Changes: https://git.openjdk.java.net/jdk/pull/4434/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4434&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265954 Stats: 72 lines in 6 files changed: 36 ins; 21 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/4434.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4434/head:pull/4434 PR: https://git.openjdk.java.net/jdk/pull/4434 From iklam at openjdk.java.net Wed Jun 9 17:17:12 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Jun 2021 17:17:12 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again In-Reply-To: References: Message-ID: <-o1FR_6QChxTo2ExNWFYH0Hjb-lrKrWnDcXMI_jjOwo=.9b06d9d6-b98a-4a5f-8292-1b24d4757094@github.com> On Wed, 9 Jun 2021 16:24:42 GMT, Yumin Qi wrote: > Hi, Please review > Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. > Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. > > Tests: tier1,tier2,tier3,tier4,tier7 > Local tests: jtreg/hotspot/runtime/cds > TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. > > Thanks > Yumin LGTM. Some small nits. 
src/hotspot/share/classfile/systemDictionary.hpp line 84: > 82: class TableStatistics; > 83: > 84: class SharedClassLoadingMark { I think it's better to put this class into systemDictionaryShared.hpp src/hotspot/share/classfile/systemDictionaryShared.cpp line 1054: > 1052: if ((SystemDictionary::is_system_class_loader(class_loader()) && ik->is_shared_app_class()) || > 1053: (SystemDictionary::is_platform_class_loader(class_loader()) && ik->is_shared_platform_class())) { > 1054: SharedClassLoadingMark slm(THREAD, ik); `!ik->is_shared_boot_class()` is not needed because lines 1052 and 1053 will check for the proper loader type. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4434 From luhenry at openjdk.java.net Wed Jun 9 17:25:52 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Wed, 9 Jun 2021 17:25:52 GMT Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks Message-ID: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> When the signal sent for AsyncGetCallTrace or JFR lands on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, the sender (caller) frame cannot be detected, for multiple reasons. This patch fixes these cases by adding CodeBlob-specific frame parsers, which are in the best position to know how a frame is set up. The following examples have been profiled with honest-profiler, which uses `AsyncGetCallTrace`.
# `Prof1`

    public class Prof1 {

        public static void main(String[] args) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000000; i++) {
                sb.append("ab");
                sb.delete(0, 1);
            }
            System.out.println(sb.length());
        }
    }

- Baseline:

    Flat Profile (by method):
    (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
    (t  0.5,s  0.2) Prof1::main
    (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
    (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
    (t  0.0,s  0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal
    (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
    (t  0.0,s  0.0) java.lang.String::getBytes
    (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
    (t  0.0,s  0.0) java.lang.StringBuilder::delete
    (t  0.2,s  0.0) java.lang.StringBuilder::append
    (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
    (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt

- With `StubRoutinesBlob::FrameParser`:

    Flat Profile (by method):
    (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal
    (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
    (t 99.8,s  0.2) Prof1::main
    (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
    (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
    (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
    (t 98.8,s  0.0) java.lang.StringBuilder::append
    (t  0.9,s  0.0) java.lang.StringBuilder::delete

# `Prof2`

    import java.util.function.Supplier;

    public class Prof2 {

        public static void main(String[] args) {
            var rand = new java.util.Random(0);
            Supplier[] suppliers = {
                () -> 0,
                () -> 1,
                () -> 2,
                () -> 3,
            };

            long sum = 0;
            for (int i = 0; i >= 0; i++) {
                sum += (int)suppliers[i % suppliers.length].get();
            }
        }
    }

- Baseline:

    Flat Profile (by method):
    (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
    (t 39.2,s 35.2) Prof2::main
    (t  1.4,s  1.4) Prof2::lambda$main$3
    (t  1.0,s  1.0) Prof2::lambda$main$2
    (t  0.9,s  0.9) Prof2::lambda$main$1
    (t  0.7,s  0.7) Prof2::lambda$main$0
    (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
    (t  0.0,s  0.0) java.lang.Thread::exit
    (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
    (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
    (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
    (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get

- With `VtableBlob::FrameParser` and `nmethod::FrameParser`:

    Flat Profile (by method):
    (t 74.1,s 70.3) Prof2::main
    (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
    (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
    (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
    (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
    (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
    (t  1.2,s  1.2) Prof2::lambda$main$2
    (t  0.9,s  0.9) Prof2::lambda$main$3
    (t  0.9,s  0.9) Prof2::lambda$main$1
    (t  0.7,s  0.7) Prof2::lambda$main$0
    (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]

-------------

Commit messages:
 - Disable checks in FrameParser when known to be safe
 - Allow AsyncGetCallTrace and JFR to unwind stack from vtable stub
 - Allow AsyncGetCallTrace and JFR to unwind stack from nmethod's prolog
 - JDK-8267985: Allow AsyncGetCallTrace and JFR to walk a stub frame
 - 8268178: Extract sender frame parsing to CodeBlock::FrameParser

Changes: https://git.openjdk.java.net/jdk/pull/4436/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4436&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8178287
Stats: 1303 lines in 26 files changed: 1057 ins; 166 del; 80 mod
Patch: https://git.openjdk.java.net/jdk/pull/4436.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4436/head:pull/4436

PR: https://git.openjdk.java.net/jdk/pull/4436

From stuefe at openjdk.java.net Wed Jun 9 17:26:17 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 9 Jun 2021 17:26:17 GMT Subject: RFR: 8268388: Update large pages information in Java manpage [v2] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 15:59:36 GMT, Stefan Johansson wrote: >> Please review this update to the text for large pages in the Java man page.
>> >> The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate them into the actual man page. The *Large Pages* section further down in the man page was also a bit out-dated and have been brushed up a bit. > > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Stufe review. Looks good Stefan. Thanks for taking my suggestion. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4425 From egahlin at openjdk.java.net Wed Jun 9 17:39:20 2021 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Wed, 9 Jun 2021 17:39:20 GMT Subject: RFR: 8247471: Enhance CPU load events with the actual elapsed CPU time In-Reply-To: References: Message-ID: On Thu, 21 Jan 2021 17:34:58 GMT, Jaroslav Bachorik wrote: > A continuation of an RFR thread started last year - https://mail.openjdk.java.net/pipermail/hotspot-jfr-dev/2020-June/001533.html > > This change adds the raw CPU time value to CPU load events (per-thread and per-process as well). > The CPU time value is already known and used to calculate the load so adding it to the events does not incur any extra overhead while making it much easier for the end users to eg. aggregate and compare the active execution time per time period without the detailed knowledge how JFR computes and normalizes the CPU load. 
Looks good

-------------

PR: https://git.openjdk.java.net/jdk/pull/2186

From minqi at openjdk.java.net Wed Jun 9 17:42:19 2021
From: minqi at openjdk.java.net (Yumin Qi)
Date: Wed, 9 Jun 2021 17:42:19 GMT
Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again
In-Reply-To: <-o1FR_6QChxTo2ExNWFYH0Hjb-lrKrWnDcXMI_jjOwo=.9b06d9d6-b98a-4a5f-8292-1b24d4757094@github.com>
References: <-o1FR_6QChxTo2ExNWFYH0Hjb-lrKrWnDcXMI_jjOwo=.9b06d9d6-b98a-4a5f-8292-1b24d4757094@github.com>
Message-ID:

On Wed, 9 Jun 2021 17:09:52 GMT, Ioi Lam wrote:

>> Hi, Please review
>> Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again.
>> Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again.
>>
>> Tests: tier1,tier2,tier3,tier4,tier7
>> Local tests: jtreg/hotspot/runtime/cds
>> TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed.
>>
>> Thanks
>> Yumin
>
> src/hotspot/share/classfile/systemDictionary.hpp line 84:
>
>> 82: class TableStatistics;
>> 83:
>> 84: class SharedClassLoadingMark {
>
> I think it's better to put this class into systemDictionaryShared.hpp

Moving it to systemDictionaryShared.hpp will cause the zero build to fail. We would need a guard for CDS at:

1294 if (k != NULL) {
1295 SharedClassLoadingMark slm(THREAD, k);
1296 k = find_or_define_instance_class(class_name, class_loader, k, CHECK_NULL);
1297 }

That makes the code look fragmented. Do you agree to keep it where it is?
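
For readers unfamiliar with this kind of guard, here is a minimal, self-contained sketch of what wrapping a CDS-only declaration in shared code looks like. It is an illustration only, not the HotSpot source: `INCLUDE_CDS` is defined locally so the snippet compiles on its own (in a real build it would come from the build configuration), and `ScopedMark` and `define_class` are hypothetical stand-ins.

```cpp
#include <cassert>

// Assumption: defined here only so the sketch is self-contained.
#ifndef INCLUDE_CDS
#define INCLUDE_CDS 1
#endif

#if INCLUDE_CDS
// Hypothetical stand-in for a helper that only exists in CDS builds.
// It counts how many marks are currently alive.
struct ScopedMark {
    static int active;
    ScopedMark()  { ++active; }
    ~ScopedMark() { --active; }
};
int ScopedMark::active = 0;
#endif

// Hypothetical shared-code function that must compile with and without CDS.
// Returns the number of marks alive while the shared work runs.
int define_class(bool found) {
    int marks_seen = 0;
    if (found) {
#if INCLUDE_CDS
        ScopedMark mark;                 // declared only when CDS is compiled in
        marks_seen = ScopedMark::active; // 1 while the mark is in scope
#endif
        // ... work that is compiled in every configuration ...
    }
    return marks_seen;
}
```

The guard keeps builds without CDS compiling, at the cost of the preprocessor noise inside shared code that the message above calls fragmented.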
------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From minqi at openjdk.java.net Wed Jun 9 17:48:14 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 17:48:14 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again In-Reply-To: References: <-o1FR_6QChxTo2ExNWFYH0Hjb-lrKrWnDcXMI_jjOwo=.9b06d9d6-b98a-4a5f-8292-1b24d4757094@github.com> Message-ID: On Wed, 9 Jun 2021 17:39:33 GMT, Yumin Qi wrote: >> src/hotspot/share/classfile/systemDictionary.hpp line 84: >> >>> 82: class TableStatistics; >>> 83: >>> 84: class SharedClassLoadingMark { >> >> I think it's better to put this class into systemDictionaryShared.hpp > > The move to systemDictionaryShared.hpp will cause zero build failed. We need a guard for CDS at: > 1294 if (k != NULL) { > 1295 SharedClassLoadingMark slm(THREAD, k); > 1296 k = find_or_define_instance_class(class_name, class_loader, k, CHECK_NULL); > 1297 } > That makes the code looks fragmented. Do you agree to keep it not moved? Maybe we should add guard here for shared code --- putting it to systemDictionaryShared.hpp is more reasonable. ------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From ascarpino at openjdk.java.net Wed Jun 9 18:52:15 2021 From: ascarpino at openjdk.java.net (Anthony Scarpino) Date: Wed, 9 Jun 2021 18:52:15 GMT Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2] In-Reply-To: References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com> Message-ID: On Fri, 4 Jun 2021 23:49:31 GMT, Smita Kamath wrote: >> I would like to submit AES-GCM optimization for x86_64 architectures supporting AVX3+VAES (Evex encoded AES). This optimization interleaves AES and GHASH operations. >> Performance gain of ~1.5x - 2x for message sizes 8k and above. 
> > Smita Kamath has updated the pull request incrementally with one additional commit since the last revision: > > 8267125:Updated intrinsic signature to remove copies of counter, state and subkeyHtbl With JDK-8255557 integrated, I'll provide you a merged copy of your java side changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/4019 From lmesnik at openjdk.java.net Wed Jun 9 18:55:15 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Wed, 9 Jun 2021 18:55:15 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:42:13 GMT, Igor Ignatyev wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> spaces updated. > > test/failure_handler/src/share/classes/jdk/test/failurehandler/GathererFactory.java line 32: > >> 30: import java.io.FileWriter; >> 31: import java.io.PrintWriter; >> 32: import java.nio.file.Files; > > I don't see why we need these 3 new imports. fixed > test/failure_handler/src/share/classes/jdk/test/failurehandler/ToolKit.java line 28: > >> 26: import jdk.test.failurehandler.action.ActionSet; >> 27: import jdk.test.failurehandler.action.ActionHelper; >> 28: import jdk.test.failurehandler.action.PatternAction; > > redundant import fixed > test/failure_handler/src/share/conf/mac.properties line 71: > >> 69: native.lldb.app=lldb >> 70: native.lldb.delimiter=\0 >> 71: native.lldb.args=--core\0%p\0%java\0-o\0thread backtrace all\0-o\0quit > > could you please add a comment similar to the one in common.properties file? fixed > test/failure_handler/src/share/conf/mac.properties line 72: > >> 70: native.lldb.delimiter=\0 >> 71: native.lldb.args=--core\0%p\0%java\0-o\0thread backtrace all\0-o\0quit >> 72: native.lldb.params.timeout=3600000 > > why does `lldb` require an increases timeout, but `gdb` and `jhsdb` do not? 
I don't remember any specific reason for it, so I have removed it. Let's increase it later if it is actually needed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4234

From lmesnik at openjdk.java.net Wed Jun 9 18:55:12 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Wed, 9 Jun 2021 18:55:12 GMT
Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 2 Jun 2021 01:00:53 GMT, Leonid Mesnik wrote:

>> EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>
> spaces updated.

updated diff

-------------

PR: https://git.openjdk.java.net/jdk/pull/4234

From lmesnik at openjdk.java.net Wed Jun 9 19:00:39 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Wed, 9 Jun 2021 19:00:39 GMT
Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v6]
In-Reply-To:
References:
Message-ID: <6SZF9RWatyvXJXVRqFCre9XU9T7eX8Cp_Q5mABj68YQ=.ef3a10d8-000f-4436-86ca-a5b4b1bff6fb@github.com>

> EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated.
Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: fxies ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4234/files - new: https://git.openjdk.java.net/jdk/pull/4234/files/e70518bc..67b61d01 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4234&range=04-05 Stats: 7 lines in 4 files changed: 2 ins; 5 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4234.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4234/head:pull/4234 PR: https://git.openjdk.java.net/jdk/pull/4234 From minqi at openjdk.java.net Wed Jun 9 19:04:35 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 19:04:35 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v2] In-Reply-To: References: Message-ID: > Hi, Please review > Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. > Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. > > Tests: tier1,tier2,tier3,tier4,tier7 > Local tests: jtreg/hotspot/runtime/cds > TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. 
> > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Move SharedClassLoadingMark to systemDictionaryShared.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4434/files - new: https://git.openjdk.java.net/jdk/pull/4434/files/3ca038ea..4cb6d351 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4434&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4434&range=00-01 Stats: 37 lines in 4 files changed: 19 ins; 17 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4434.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4434/head:pull/4434 PR: https://git.openjdk.java.net/jdk/pull/4434 From sspitsyn at openjdk.java.net Wed Jun 9 19:08:19 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 9 Jun 2021 19:08:19 GMT Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks In-Reply-To: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> Message-ID: On Wed, 9 Jun 2021 17:16:23 GMT, Ludovic Henry wrote: > When the signal sent for AsyncGetCallTrace or JFR would land on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, it wouldn't be able to detect the sender (caller) frame for multiple reasons. This patch fixes these cases through adding CodeBlob-specific frame parser which are in the best position to know how a frame is setup. > > The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`. 
> > # `Prof1` > > public class Prof1 { > > public static void main(String[] args) { > StringBuilder sb = new StringBuilder(); > for (int i = 0; i < 1000000; i++) { > sb.append("ab"); > sb.delete(0, 1); > } > System.out.println(sb.length()); > } > } > > > - Baseline: > > Flat Profile (by method): > (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5] > (t 0.5,s 0.2) Prof1::main > (t 0.2,s 0.2) java.lang.AbstractStringBuilder::append > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::shift > (t 0.0,s 0.0) java.lang.String::getBytes > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt > (t 0.0,s 0.0) java.lang.StringBuilder::delete > (t 0.2,s 0.0) java.lang.StringBuilder::append > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::delete > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt > > - With `StubRoutinesBlob::FrameParser`: > > Flat Profile (by method): > (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal > (t 0.9,s 0.9) java.lang.AbstractStringBuilder::delete > (t 99.8,s 0.2) Prof1::main > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > (t 0.0,s 0.0) AGCT::Unknown Java[ERR=-5] > (t 98.8,s 0.0) java.lang.AbstractStringBuilder::append > (t 98.8,s 0.0) java.lang.StringBuilder::append > (t 0.9,s 0.0) java.lang.StringBuilder::delete > > > # `Prof2` > > import java.util.function.Supplier; > > public class Prof2 { > > public static void main(String[] args) { > var rand = new java.util.Random(0); > Supplier[] suppliers = { > () -> 0, > () -> 1, > () -> 2, > () -> 3, > }; > > long sum = 0; > for (int i = 0; i >= 0; i++) { > sum += (int)suppliers[i % suppliers.length].get(); > } > } > } > > > - Baseline: > > Flat Profile (by method): > (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5] > (t 39.2,s 35.2) Prof2::main > (t 1.4,s 1.4) Prof2::lambda$main$3 > (t 1.0,s 1.0) Prof2::lambda$main$2 > (t 0.9,s 0.9) Prof2::lambda$main$1 > (t 0.7,s 0.7) 
Prof2::lambda$main$0
> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3]
> (t 0.0,s 0.0) java.lang.Thread::exit
> (t 0.9,s 0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
> (t 1.0,s 0.0) Prof2$$Lambda$3.0x0000000800c01000::get
> (t 1.4,s 0.0) Prof2$$Lambda$4.0x0000000800c01220::get
> (t 0.7,s 0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>
>
> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
>
> Flat Profile (by method):
> (t 74.1,s 70.3) Prof2::main
> (t 6.5,s 5.5) Prof2$$Lambda$29.0x0000000800081220::get
> (t 6.6,s 5.4) Prof2$$Lambda$28.0x0000000800081000::get
> (t 5.7,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get
> (t 5.9,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get
> (t 4.9,s 4.9) AGCT::Unknown Java[ERR=-5]
> (t 1.2,s 1.2) Prof2::lambda$main$2
> (t 0.9,s 0.9) Prof2::lambda$main$3
> (t 0.9,s 0.9) Prof2::lambda$main$1
> (t 0.7,s 0.7) Prof2::lambda$main$0
> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3]

Hi Ludovic,

Thank you for working on this fix in the AsyncGetCallTrace.
Which JDK release do you intend to target?
Just wanted to make sure you know that the JDK 17 development cycle will be closed tomorrow for P4 bugs and enhancements.
The repository will be forked and the RDP 1 phase started.
I doubt the review of your fix will be completed by this time.
So, please, keep in mind your PR will go to 18, not 17.

Thanks,
Serguei

-------------

PR: https://git.openjdk.java.net/jdk/pull/4436

From ccheung at openjdk.java.net Wed Jun 9 19:18:18 2021
From: ccheung at openjdk.java.net (Calvin Cheung)
Date: Wed, 9 Jun 2021 19:18:18 GMT
Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v2]
In-Reply-To:
References:
Message-ID: <77KjXcKwKF-OAnVP5_HbjsSyHSguLeOJARiUbxjBAtQ=.5689ba6d-2052-436f-b6fe-46cc3cee8761@github.com>

On Wed, 9 Jun 2021 19:04:35 GMT, Yumin Qi wrote:

>> Hi, Please review
>> Shared classes should not be loaded again at failed loading from CDS.
In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. >> Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. >> >> Tests: tier1,tier2,tier3,tier4,tier7 >> Local tests: jtreg/hotspot/runtime/cds >> TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Move SharedClassLoadingMark to systemDictionaryShared.hpp Looks good. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4434 From iklam at openjdk.java.net Wed Jun 9 19:18:19 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Jun 2021 19:18:19 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v2] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 19:04:35 GMT, Yumin Qi wrote: >> Hi, Please review >> Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. >> Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. >> >> Tests: tier1,tier2,tier3,tier4,tier7 >> Local tests: jtreg/hotspot/runtime/cds >> TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. 
>> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Move SharedClassLoadingMark to systemDictionaryShared.hpp src/hotspot/share/classfile/systemDictionary.cpp line 1297: > 1295: #if INCLUDE_CDS > 1296: SharedClassLoadingMark slm(THREAD, k); > 1297: #endif I think it's cleaner to use the CDS_ONLY macro. This macro is already used elsewhere in this file. CDS_ONLY(SharedClassLoadingMark slm(THREAD, k)); ------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From mdoerr at openjdk.java.net Wed Jun 9 19:40:20 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 9 Jun 2021 19:40:20 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Wed, 9 Jun 2021 11:15:35 GMT, Kim Barrett wrote: >> Please review this change to PSPromotionManager::copy_to_survivor_space >> (ParallelGC) to remove some redundant work, and to add some missing memory >> barriers. >> >> There are two callers of copy_to_survivor_space, both of which wrap that >> call with the idiom >> >> if obj->is_forwarded() then >> new_obj = obj->forwardee() >> else >> new_obj = copy_to_survivor_space(obj) >> endif >> >> There are problems with this. >> >> (1) The first thing copy_to_survivor_space does is check whether the object >> is already forwarded, and if so then return obj->forwardee_acquire(). The >> idiom used by the callers is a redundant check, and the redundancy can't be >> optimized away. It is also missing the acquire barrier that was added by >> JDK-8154736 after long discussion. >> >> (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient >> after all. 
The "if is_forwarded() then use forwardee()" idiom is hiding
>> under the abstractions that we're doing two relaxed atomic loads of the mark
>> word, and there is nothing here to prevent the second from reading a value
>> older than that read by the first, with bad consequences. This possibility
>> came up in the discussion of JDK-8154736, but seems to have been either lost
>> or discounted. If you think loads from the same location can't do that, see
>> JDK-8229169 for a counterexample.
>>
>> Part of this change involves removing the conditionalization of the calls to
>> copy_to_survivor_space; just call it directly. However, it turns out that
>> some compilers don't inline copy_to_survivor_space because of its size. So
>> we refactored it into two functions, one doing the already marked check and
>> then calling the other to do most of the work. This is enough for the check
>> to be inlined into callers, so we've effectively removed the redundant inner
>> check. Note: This part of the change introduces a large block of whitespace
>> differences due to removal of an if-else and outdenting the body; I recommend
>> using a view that suppresses those when reviewing.
>>
>> The second part of the change involves adding or moving some acquire barriers.
>>
>> (a) For the initial check whether the object is already marked, if it is
>> then add an acquire fence before returning the forwardee. We could instead
>> use a load-acquire to obtain the mark word, but that would be an unneeded
>> acquire barrier on the much more common unmarked case. Also removed
>> forwardee_acquire(), which is no longer used.
>>
>> (b) If the cmpxchg race is lost, add an acquire fence before fetching and
>> returning the forwardee. The failed release-cmpxchg effectively behaves
>> like a relaxed-load, which must precede the forwardee access and any reads
>> from it.
>>
>> I've also changed to only log copying when actually copied, not when already
>> copied and forwarded.
Also changed a guarantee to an assert. >> >> I looked at all uses of forwardee() in light of problem (2), and did not >> find any additional problems. (That doesn't mean there aren't any, just >> that I didn't spot any. This is low-level atomics, after all.) >> >> Testing: >> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). >> Performance testing showed no significant change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > avoid reloading forwardee Looks good. + Nice cleanup! ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4371 From minqi at openjdk.java.net Wed Jun 9 21:28:33 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 21:28:33 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v3] In-Reply-To: References: Message-ID: > Hi, Please review > Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. > Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. > > Tests: tier1,tier2,tier3,tier4,tier7 > Local tests: jtreg/hotspot/runtime/cds > TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. 
> > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Use CDS_ONLY for one line to replace INCLUDE_CDS ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4434/files - new: https://git.openjdk.java.net/jdk/pull/4434/files/4cb6d351..951c0fae Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4434&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4434&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4434.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4434/head:pull/4434 PR: https://git.openjdk.java.net/jdk/pull/4434 From iklam at openjdk.java.net Wed Jun 9 21:28:34 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Jun 2021 21:28:34 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v3] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 21:25:13 GMT, Yumin Qi wrote: >> Hi, Please review >> Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. >> Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. >> >> Tests: tier1,tier2,tier3,tier4,tier7 >> Local tests: jtreg/hotspot/runtime/cds >> TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Use CDS_ONLY for one line to replace INCLUDE_CDS Latest version LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4434 From minqi at openjdk.java.net Wed Jun 9 21:38:16 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 21:38:16 GMT Subject: Integrated: 8267954: Shared classes that failed to load should not be loaded again In-Reply-To: References: Message-ID: <-JRscM5TCQn0ZiZIwjesbeG5YZsHyEES80Vp-rzPHR4=.9b9c1bb8-4220-4890-a4e4-7a64b9aa80c2@github.com> On Wed, 9 Jun 2021 16:24:42 GMT, Yumin Qi wrote: > Hi, Please review > Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. > Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. > > Tests: tier1,tier2,tier3,tier4,tier7 > Local tests: jtreg/hotspot/runtime/cds > TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. > > Thanks > Yumin This pull request has now been integrated. 
Changeset: 7ff6e7b2 Author: Yumin Qi URL: https://git.openjdk.java.net/jdk/commit/7ff6e7b2b1be088c37f50756b6822be01b4c657d Stats: 72 lines in 7 files changed: 36 ins; 21 del; 15 mod 8267954: Shared classes that failed to load should not be loaded again Reviewed-by: iklam, ccheung ------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From minqi at openjdk.java.net Wed Jun 9 21:38:15 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 9 Jun 2021 21:38:15 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v3] In-Reply-To: References: Message-ID: <0nSfJ4wLUGSDR5bA-2kGh4GyaKgRLihvgdq4nIfxETI=.048c40e9-a4a1-4983-b40d-e96d46b5c39b@github.com> On Wed, 9 Jun 2021 21:24:04 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Use CDS_ONLY for one line to replace INCLUDE_CDS > > Latest version LGTM @iklam @calvinccheung Thanks for review! ------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From iklam at openjdk.java.net Wed Jun 9 22:37:27 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 9 Jun 2021 22:37:27 GMT Subject: RFR: 8268520: VirtualSpace::print_on() should be const Message-ID: Please review this trivial patch. 
VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp ------------- Commit messages: - 8268520: VirtualSpace::print_on() should be const Changes: https://git.openjdk.java.net/jdk/pull/4448/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4448&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268520 Stats: 6 lines in 3 files changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4448.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4448/head:pull/4448 PR: https://git.openjdk.java.net/jdk/pull/4448 From kbarrett at openjdk.java.net Thu Jun 10 02:43:14 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Jun 2021 02:43:14 GMT Subject: RFR: 8268520: VirtualSpace::print_on() should be const In-Reply-To: References: Message-ID: <79d7P2K7rG0kqAbb3yBXJi-sevg0pGEX-5fufasDLJE=.00240537-2b1f-4e32-8ce6-9037e4c8c8f8@github.com> On Wed, 9 Jun 2021 22:13:02 GMT, Ioi Lam wrote: > Please review this trivial patch. VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp Looks good, and trivial. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4448 From iignatyev at openjdk.java.net Thu Jun 10 03:42:20 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 03:42:20 GMT Subject: RFR: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes [v6] In-Reply-To: <6SZF9RWatyvXJXVRqFCre9XU9T7eX8Cp_Q5mABj68YQ=.ef3a10d8-000f-4436-86ca-a5b4b1bff6fb@github.com> References: <6SZF9RWatyvXJXVRqFCre9XU9T7eX8Cp_Q5mABj68YQ=.ef3a10d8-000f-4436-86ca-a5b4b1bff6fb@github.com> Message-ID: On Wed, 9 Jun 2021 19:00:39 GMT, Leonid Mesnik wrote: >> EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. 
It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fxies Marked as reviewed by iignatyev (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From dholmes at openjdk.java.net Thu Jun 10 04:18:20 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 10 Jun 2021 04:18:20 GMT Subject: RFR: 8267954: Shared classes that failed to load should not be loaded again [v3] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 21:28:33 GMT, Yumin Qi wrote: >> Hi, Please review >> Shared classes should not be loaded again at failed loading from CDS. In the failed case, restore_unshareable_info failed due to some reason (OOM), but the class already polluted and failed to be loaded again. >> Using the unused bit in _misc_flags indicates shared loading status to prevent it from being loaded again. >> >> Tests: tier1,tier2,tier3,tier4,tier7 >> Local tests: jtreg/hotspot/runtime/cds >> TestDynamicDumpAtOom.java (which failed in tier7) with variant allocation sizes (used to reproduce the failure) passed. >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Use CDS_ONLY for one line to replace INCLUDE_CDS src/hotspot/share/oops/instanceKlass.hpp line 364: > 362: } > 363: > 364: void clear_shared_loading_failed() { This seems unused. And I would not expect loading to "unfail" so can't see why this bit once set would ever be cleared. 
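
The point about the flag never being cleared can be illustrated with a small, self-contained sketch. This is a hypothetical model and not the code under review: `FakeKlass`, `LoadingMark`, `try_load_shared` and the bit position are invented for illustration. The only things taken from the thread are the ideas of a failure bit stored in a flags word and a scoped mark object that records the failure.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class descriptor carrying a flags word.
struct FakeKlass {
    static constexpr std::uint16_t kSharedLoadingFailed = 1u << 0; // illustrative bit
    std::uint16_t misc_flags = 0;

    bool shared_loading_failed() const { return (misc_flags & kSharedLoadingFailed) != 0; }
    void set_shared_loading_failed()   { misc_flags |= kSharedLoadingFailed; }
    // Deliberately no clear function: once set, the flag stays set.
};

// Scoped mark: when the scope exits with an error pending, record the failure.
class LoadingMark {
    FakeKlass& klass_;
    const bool& error_pending_;
public:
    LoadingMark(FakeKlass& k, const bool& error_pending)
        : klass_(k), error_pending_(error_pending) {}
    ~LoadingMark() {
        if (error_pending_) klass_.set_shared_loading_failed();
    }
};

// Returns true if the (simulated) shared load succeeded.
bool try_load_shared(FakeKlass& k, bool simulate_failure) {
    if (k.shared_loading_failed()) return false; // never retry a failed class
    bool error_pending = false;                  // outlives 'mark' below
    LoadingMark mark(k, error_pending);
    if (simulate_failure) {
        error_pending = true;                    // e.g. allocation failure mid-restore
        return false;
    }
    return true;
}
```

In this model a later call to `try_load_shared` on the same object returns false without doing any work, which is why a clearing accessor would have no caller.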
------------- PR: https://git.openjdk.java.net/jdk/pull/4434 From sjohanss at openjdk.java.net Thu Jun 10 05:36:19 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 10 Jun 2021 05:36:19 GMT Subject: Integrated: 8268388: Update large pages information in Java manpage In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:01:46 GMT, Stefan Johansson wrote: > Please review this update to the text for large pages in the Java man page. > > The text for `LargePageSizeInBytes` was reviewed in this [CSR](https://bugs.openjdk.java.net/browse/JDK-8265517), and this change will integrate it into the actual man page. The *Large Pages* section further down in the man page was also a bit outdated and has been brushed up a bit. This pull request has now been integrated. Changeset: ece3ae3c Author: Stefan Johansson URL: https://git.openjdk.java.net/jdk/commit/ece3ae3cc4cc1d45b65253a9bfafdefe2656afb8 Stats: 73 lines in 1 file changed: 18 ins; 8 del; 47 mod 8268388: Update large pages information in Java manpage Reviewed-by: tschatzl, lkorinth, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/4425 From sjohanss at openjdk.java.net Thu Jun 10 05:36:18 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 10 Jun 2021 05:36:18 GMT Subject: RFR: 8268388: Update large pages information in Java manpage [v2] In-Reply-To: <9zquajD0fEAVNt-q18UUsWTCGQB-TgB4JKF0zGZIdKw=.6237ac75-7a42-4e55-af8e-0c48775bb11b@github.com> References: <9zquajD0fEAVNt-q18UUsWTCGQB-TgB4JKF0zGZIdKw=.6237ac75-7a42-4e55-af8e-0c48775bb11b@github.com> Message-ID: On Wed, 9 Jun 2021 08:07:21 GMT, Thomas Schatzl wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Stufe review. > > Lgtm, one additional remark. Thanks @tschatzl, @lkorinth and @tstuefe for the reviews.
------------- PR: https://git.openjdk.java.net/jdk/pull/4425 From dongbo at openjdk.java.net Thu Jun 10 06:19:17 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Thu, 10 Jun 2021 06:19:17 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> On Wed, 9 Jun 2021 08:28:37 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4807: >> >>> 4805: mov(tmp2, v0, T2D, 1); >>> 4806: cbnz(tmp2, DONE); >>> 4807: b(SAME); >> >> Shouldn't this be >> >> mov(tmp1, v0, T2D, 0); >> mov(tmp2, v0, T2D, 1); >> orr(tmp1, tmp1, tmp2); >> cbnz(tmp1, DONE); >> >> >> ... which would use up fewer branch prediction resources. > > ... or maybe do the OR in the vector unit? I guess it can be done with: umaxv(v1, T4S, v0); mov(tmp1, v1, T4S, 0); cbnz(tmp1, DONE0); ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From kbarrett at openjdk.java.net Thu Jun 10 06:39:15 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Jun 2021 06:39:15 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2] In-Reply-To: References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: On Wed, 9 Jun 2021 12:17:17 GMT, Martin Doerr wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> avoid reloading forwardee > > Thanks a lot for checking! That makes sense. Thanks @TheRealMDoerr , @tschatzl , @walulyai for reviews. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From stuefe at openjdk.java.net Thu Jun 10 06:42:12 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 10 Jun 2021 06:42:12 GMT Subject: RFR: 8268520: VirtualSpace::print_on() should be const In-Reply-To: References: Message-ID: <10LEl9rEqiS82sz4cdCwtjTbOVMd6wuOBoLTrYjDrIk=.eb6d59c4-c13a-42ad-9808-6099d1dcc83e@github.com> On Wed, 9 Jun 2021 22:13:02 GMT, Ioi Lam wrote: > Please review this trivial patch. VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp +1 ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4448 From kbarrett at openjdk.java.net Thu Jun 10 07:25:39 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Jun 2021 07:25:39 GMT Subject: RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v3] In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. > > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. 
> > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. The "if is_forwarded() then use forwardee()" idiom hides > under its abstractions the fact that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted. If you think loads from the same location can't do that, see > JDK-8229169 for a counterexample. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. > > (a) For the initial check whether the object is already marked, if it is > then add an acquire fence before returning the forwardee. We could instead > use a load-acquire to obtain the mark word, but that would be an unneeded > acquire barrier on the much more common unmarked case. Also removed > forwardee_acquire(), which is no longer used. > > (b) If the cmpxchg race is lost, add an acquire fence before fetching and > returning the forwardee. The failed release-cmpxchg effectively behaves > like a relaxed-load, which must precede the forwardee access and any reads > from it.
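The barrier placement in (a) and (b) can be illustrated outside HotSpot with standard C++ atomics. This is only a sketch under stated assumptions, not the ParallelGC code: `Obj`, `kForwardedBit`, `forwardee_of` and `copy_to_survivor_sketch` are hypothetical names, the forwarding pointer is assumed to be stored in the mark word with the low bit as a tag, and `std::atomic` stands in for HotSpot's own `Atomic` and `OrderAccess` facilities.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object layout: a mark word plus one payload field.
struct Obj {
  std::atomic<uintptr_t> mark{0};
  int payload = 0;
};

// Assumed encoding: low bit set means "forwarded", the rest is the pointer.
constexpr uintptr_t kForwardedBit = 1;

inline bool is_forwarded(uintptr_t m) { return (m & kForwardedBit) != 0; }
inline Obj* forwardee_of(uintptr_t m) {
  return reinterpret_cast<Obj*>(m & ~kForwardedBit);
}

Obj* copy_to_survivor_sketch(Obj* obj) {
  assert(obj != nullptr);
  // One relaxed load; the forwardee is decoded from this same value, so a
  // second (possibly older) read of the mark word can never be used.
  uintptr_t m = obj->mark.load(std::memory_order_relaxed);
  if (is_forwarded(m)) {
    // (a) Fence only on the rare already-forwarded path.
    std::atomic_thread_fence(std::memory_order_acquire);
    return forwardee_of(m);
  }

  Obj* copy = new Obj;
  copy->payload = obj->payload;
  uintptr_t desired = reinterpret_cast<uintptr_t>(copy) | kForwardedBit;
  // Release on success publishes the copied payload to other threads.
  if (obj->mark.compare_exchange_strong(m, desired,
                                        std::memory_order_release,
                                        std::memory_order_relaxed)) {
    return copy;
  }

  // (b) Lost the race: the failed exchange acted as a relaxed load and left
  // the winner's mark word in 'm'. Fence before touching the forwardee.
  delete copy;
  std::atomic_thread_fence(std::memory_order_acquire);
  return forwardee_of(m);
}
```

In this sketch the mark word only ever changes from unforwarded to forwarded, so a failed exchange implies that `m` now holds a forwarding pointer; a real mark word has more states than that.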
> > I've also changed to only log copying when actually copied, not when already > copied and forwarded. Also changed a guarantee to an assert. > > I looked at all uses of forwardee() in light of problem (2), and did not > find any additional problems. (That doesn't mean there aren't any, just > that I didn't spot any. This is low-level atomics, after all.) > > Testing: > mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). > Performance testing showed no significant change. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into forwardee_barrier - avoid reloading forwardee - more barriers, remove redundent work ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4371/files - new: https://git.openjdk.java.net/jdk/pull/4371/files/9ded099a..8bf3b272 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=01-02 Stats: 35858 lines in 526 files changed: 29835 ins; 3103 del; 2920 mod Patch: https://git.openjdk.java.net/jdk/pull/4371.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4371/head:pull/4371 PR: https://git.openjdk.java.net/jdk/pull/4371 From ogatak at openjdk.java.net Thu Jun 10 07:33:38 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Thu, 10 Jun 2021 07:33:38 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v4] In-Reply-To: References: Message-ID: > The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. 
> > Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. > > I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: - Rename paddi_or_addi to paddi_or_addi_r0ok because it accepts R0 - Remove unreachable code blocks ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4267/files - new: https://git.openjdk.java.net/jdk/pull/4267/files/0615adac..607120cb Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4267&range=02-03 Stats: 56 lines in 2 files changed: 5 ins; 23 del; 28 mod Patch: https://git.openjdk.java.net/jdk/pull/4267.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4267/head:pull/4267 PR: https://git.openjdk.java.net/jdk/pull/4267 From ogatak at openjdk.java.net Thu Jun 10 07:33:39 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Thu, 10 Jun 2021 07:33:39 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v4] In-Reply-To: References: Message-ID: On Mon, 7 Jun 2021 23:36:23 GMT, Corey Ashford wrote: >> Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: >> >> - Rename paddi_or_addi to paddi_or_addi_r0ok because it accepts R0 >> - Remove unreachable code blocks > > I didn't review the details of the commit's functionality, because there are hundreds of details to check there, and to be honest there's a lot I don't understand about working with C2. 
> > Do you have a set of tests that check different sizes of immediate loads to guarantee you hit every case and emit the correct code? @CoreyAshford Thank you for your review and heads up. I double checked if all paths are tested and found there are unreachable blocks. I removed them. ------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From kbarrett at openjdk.java.net Thu Jun 10 07:34:20 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 10 Jun 2021 07:34:20 GMT Subject: Integrated: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier In-Reply-To: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> References: <5MRc1jHL7fOsoJHThRQ-bwXTADPs0yYd-4jrIxYssRk=.183e1c2a-ec72-4e8b-9884-b2a3e0e50634@github.com> Message-ID: <7uVS8PyBNCNivSeEK2f_UOEb-WpGYHclgCbo-tnHWSY=.92af3641-1e88-4a95-84c1-b6d485ff478e@github.com> On Sat, 5 Jun 2021 03:30:42 GMT, Kim Barrett wrote: > Please review this change to PSPromotionManager::copy_to_survivor_space > (ParallelGC) to remove some redundant work, and to add some missing memory > barriers. > > There are two callers of copy_to_survivor_space, both of which wrap that > call with the idiom > > if obj->is_forwarded() then > new_obj = obj->forwardee() > else > new_obj = copy_to_survivor_space(obj) > endif > > There are problems with this. > > (1) The first thing copy_to_survivor_space does is check whether the object > is already forwarded, and if so then return obj->forwardee_acquire(). The > idiom used by the callers is a redundant check, and the redundancy can't be > optimized away. It is also missing the acquire barrier that was added by > JDK-8154736 after long discussion. > > (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient > after all. 
The "if is_forwarded() then use forwardee()" idiom is hiding > under the abstractions that we're doing two relaxed atomic loads of the mark > word, and there is nothing here to prevent the second from reading a value > older than that read by the first, with bad consequences. This possibility > came up in the discussion of JDK-8154736, but seems to have been either lost > or discounted. If you think loads from the same location can't do that, see > JDK-8229169 for a counter example. > > Part of this change involves removing the conditionalization of the calls to > copy_to_survivor_space; just call it directly. However, it turns out that > some compilers don't inline copy_to_survivor_space because of its size. So > we refactored it into two functions, one doing the already marked check and > then calling the other to do most of the work. This is enough for the check > to be inlined into callers, so we've effectively removed the redundant inner > check. Note: This part of the change introduces a large block of whitespace > differences due to removal of an if-else and outdenting the body; I recommend > using a view that suppresses those when reviewing. > > The second part of the change involves adding or moving some acquire barriers. > > (a) For the initial check whether the object is already marked, if it is > then add an acquire fence before returning the forwardee. We could instead > use a load-acquire to obtain the mark word, but that would be an unneeded > acquire barrier on the much more common unmarked case. Also removed > forwardee_acquire(), which is no longer used. > > (b) If the cmpxchg race is lost, add an acquire fence before fetching and > returning the forwardee. The failed release-cmpxchg effectively behaves > like a relaxed-load, which must preceed the forwardee access and any reads > from it. > > I've also changed to only log copying when actually copied, not when already > copied and forwarded. Also changed a guarantee to an assert. 
> > I looked at all uses of forwardee() in light of problem (2), and did not > find any additional problems. (That doesn't mean there aren't any, just > that I didn't spot any. This is low-level atomics, after all.) > > Testing: > mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done). > Performance testing showed no significant change. This pull request has now been integrated. Changeset: 5a666282 Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/5a666282a9e5b5748d85f4c012b36e5c8f7eab56 Stats: 227 lines in 6 files changed: 66 ins; 77 del; 84 mod 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier Reviewed-by: iwalulya, tschatzl, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/4371 From ogatak at openjdk.java.net Thu Jun 10 07:39:19 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Thu, 10 Jun 2021 07:39:19 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v3] In-Reply-To: <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> References: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> Message-ID: On Wed, 9 Jun 2021 14:07:11 GMT, Martin Doerr wrote: >> Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: >> >> - Revert changes for pusing nodes in loadConLNodesTuple and add comments about the node _last points to >> - Fix grammatical errors in comments > > Sorry for my late response. I was busy with other things. I've looked at this change for some time and I wonder if such a complex change should be done at all. I like the idea, but does it really improve performance for any real applications or benchmarks? 
At least those parts which only increase complexity should not get done. @TheRealMDoerr Thank you for your comment. I agree with it. Honestly, we can't measure performance in a reliable manner because we only have development chips/machines, where some of the features are not optimal or are disabled for development purposes. So I'll close this pull request. When we get measurement data, I'll revisit this change and check whether I can pick out the effective parts of it while reducing complexity. ------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From ogatak at openjdk.java.net Thu Jun 10 07:39:19 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Thu, 10 Jun 2021 07:39:19 GMT Subject: Withdrawn: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 In-Reply-To: References: Message-ID: On Mon, 31 May 2021 05:39:25 GMT, Kazunori Ogata wrote: > The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. > > Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. > > I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From ogatak at openjdk.java.net Thu Jun 10 07:42:22 2021 From: ogatak at openjdk.java.net (Kazunori Ogata) Date: Thu, 10 Jun 2021 07:42:22 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v3] In-Reply-To: <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> References: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> Message-ID: On Wed, 9 Jun 2021 12:35:21 GMT, Martin Doerr wrote: >> Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: >> >> - Revert changes for pusing nodes in loadConLNodesTuple and add comments about the node _last points to >> - Fix grammatical errors in comments > > src/hotspot/cpu/ppc/assembler_ppc.cpp line 364: > >> 362: // we avoid a buffer overrun in the actual code generation phase. >> 363: nop(); >> 364: } > > Scratch emit should be able to determine the size precisely, not just a pessimistic estimation. Please don't break this design. @TheRealMDoerr I understand the design. However, I'm wondering why the alignment can vary between scratch emit and real emit if the size can be determined precisely. It would be helpful if you could point out some sources of variation.
------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From luhenry at openjdk.java.net Thu Jun 10 07:50:57 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Thu, 10 Jun 2021 07:50:57 GMT Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks [v2] In-Reply-To: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> Message-ID: > When the signal sent for AsyncGetCallTrace or JFR would land on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, it wouldn't be able to detect the sender (caller) frame for multiple reasons. This patch fixes these cases through adding CodeBlob-specific frame parser which are in the best position to know how a frame is setup. > > The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`. > > # `Prof1` > > public class Prof1 { > > public static void main(String[] args) { > StringBuilder sb = new StringBuilder(); > for (int i = 0; i < 1000000; i++) { > sb.append("ab"); > sb.delete(0, 1); > } > System.out.println(sb.length()); > } > } > > > - Baseline: > > Flat Profile (by method): > (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5] > (t 0.5,s 0.2) Prof1::main > (t 0.2,s 0.2) java.lang.AbstractStringBuilder::append > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::shift > (t 0.0,s 0.0) java.lang.String::getBytes > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt > (t 0.0,s 0.0) java.lang.StringBuilder::delete > (t 0.2,s 0.0) java.lang.StringBuilder::append > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::delete > (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt > > - With `StubRoutinesBlob::FrameParser`: > > Flat Profile (by method): > (t 98.7,s 98.7) 
java.lang.AbstractStringBuilder::ensureCapacityInternal > (t 0.9,s 0.9) java.lang.AbstractStringBuilder::delete > (t 99.8,s 0.2) Prof1::main > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > (t 0.0,s 0.0) AGCT::Unknown Java[ERR=-5] > (t 98.8,s 0.0) java.lang.AbstractStringBuilder::append > (t 98.8,s 0.0) java.lang.StringBuilder::append > (t 0.9,s 0.0) java.lang.StringBuilder::delete > > > # `Prof2` > > import java.util.function.Supplier; > > public class Prof2 { > > public static void main(String[] args) { > var rand = new java.util.Random(0); > Supplier[] suppliers = { > () -> 0, > () -> 1, > () -> 2, > () -> 3, > }; > > long sum = 0; > for (int i = 0; i >= 0; i++) { > sum += (int)suppliers[i % suppliers.length].get(); > } > } > } > > > - Baseline: > > Flat Profile (by method): > (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5] > (t 39.2,s 35.2) Prof2::main > (t 1.4,s 1.4) Prof2::lambda$main$3 > (t 1.0,s 1.0) Prof2::lambda$main$2 > (t 0.9,s 0.9) Prof2::lambda$main$1 > (t 0.7,s 0.7) Prof2::lambda$main$0 > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > (t 0.0,s 0.0) java.lang.Thread::exit > (t 0.9,s 0.0) Prof2$$Lambda$2.0x0000000800c00c28::get > (t 1.0,s 0.0) Prof2$$Lambda$3.0x0000000800c01000::get > (t 1.4,s 0.0) Prof2$$Lambda$4.0x0000000800c01220::get > (t 0.7,s 0.0) Prof2$$Lambda$1.0x0000000800c00a08::get > > > - With `VtableBlob::FrameParser` and `nmethod::FrameParser`: > > Flat Profile (by method): > (t 74.1,s 70.3) Prof2::main > (t 6.5,s 5.5) Prof2$$Lambda$29.0x0000000800081220::get > (t 6.6,s 5.4) Prof2$$Lambda$28.0x0000000800081000::get > (t 5.7,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get > (t 5.9,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get > (t 4.9,s 4.9) AGCT::Unknown Java[ERR=-5] > (t 1.2,s 1.2) Prof2::lambda$main$2 > (t 0.9,s 0.9) Prof2::lambda$main$3 > (t 0.9,s 0.9) Prof2::lambda$main$1 > (t 0.7,s 0.7) Prof2::lambda$main$0 > (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] Ludovic Henry has refreshed the contents of this pull request, and previous commits 
have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains five new commits since the last revision: - Disable checks in FrameParser when known to be safe - Allow AsyncGetCallTrace and JFR to unwind stack from vtable stub The program is the following: ``` import java.util.function.Supplier; public class Prof2 { public static void main(String[] args) { var rand = new java.util.Random(0); Supplier[] suppliers = { () -> 0, () -> 1, () -> 2, () -> 3, }; long sum = 0; for (int i = 0; i >= 0; i++) { sum += (int)suppliers[i % suppliers.length].get(); } } } ``` The results are as follows: - Baseline (from previous commit): Flat Profile (by method): (t 39.3,s 39.3) AGCT::Unknown Java[ERR=-5] (t 40.3,s 36.1) Prof2::main (t 6.4,s 5.3) Prof2$$Lambda$28.0x0000000800081000::get (t 6.1,s 5.1) Prof2$$Lambda$29.0x0000000800081220::get (t 6.0,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get (t 6.1,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get (t 1.1,s 1.1) Prof2::lambda$main$2 (t 1.1,s 1.1) Prof2::lambda$main$0 (t 1.0,s 1.0) Prof2::lambda$main$1 (t 0.9,s 0.9) Prof2::lambda$main$3 (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] - With unwind from vtable stub Flat Profile (by method): (t 74.1,s 70.3) Prof2::main (t 6.5,s 5.5) Prof2$$Lambda$29.0x0000000800081220::get (t 6.6,s 5.4) Prof2$$Lambda$28.0x0000000800081000::get (t 5.7,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get (t 5.9,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get (t 4.9,s 4.9) AGCT::Unknown Java[ERR=-5] (t 1.2,s 1.2) Prof2::lambda$main$2 (t 0.9,s 0.9) Prof2::lambda$main$3 (t 0.9,s 0.9) Prof2::lambda$main$1 (t 0.7,s 0.7) Prof2::lambda$main$0 (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] We attribute the vtable stub to the caller and not the callee, which is already an improvement from the existing case. 
- Allow AsyncGetCallTrace and JFR to unwind stack from nmethod's prolog When sampling hits the prolog of a method, Hotspot assumes it's unable to parse the frame. This change allows to parse such frame on x86 by specializing which instruction it's hitting in the prolog. The results are as follows: - Baseline: Flat Profile (by method): (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5] (t 39.2,s 35.2) Prof2::main (t 1.4,s 1.4) Prof2::lambda$main$3 (t 1.0,s 1.0) Prof2::lambda$main$2 (t 0.9,s 0.9) Prof2::lambda$main$1 (t 0.7,s 0.7) Prof2::lambda$main$0 (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] (t 0.0,s 0.0) java.lang.Thread::exit (t 0.9,s 0.0) Prof2$$Lambda$2.0x0000000800c00c28::get (t 1.0,s 0.0) Prof2$$Lambda$3.0x0000000800c01000::get (t 1.4,s 0.0) Prof2$$Lambda$4.0x0000000800c01220::get (t 0.7,s 0.0) Prof2$$Lambda$1.0x0000000800c00a08::get - With incomplete frame parsing: Flat Profile (by method): (t 39.3,s 39.3) AGCT::Unknown Java[ERR=-5] (t 40.3,s 36.1) Prof2::main (t 6.4,s 5.3) Prof2$$Lambda$28.0x0000000800081000::get (t 6.1,s 5.1) Prof2$$Lambda$29.0x0000000800081220::get (t 6.0,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get (t 6.1,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get (t 1.1,s 1.1) Prof2::lambda$main$2 (t 1.1,s 1.1) Prof2::lambda$main$0 (t 1.0,s 1.0) Prof2::lambda$main$1 (t 0.9,s 0.9) Prof2::lambda$main$3 (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] (t 0.0,s 0.0) java.util.Locale::getInstance (t 0.0,s 0.0) AGCT::Not walkable Java[ERR=-6] (t 0.0,s 0.0) jdk.internal.loader.BuiltinClassLoader::loadClassOrNull (t 0.0,s 0.0) java.lang.ClassLoader::loadClass (t 0.0,s 0.0) sun.net.util.URLUtil::urlNoFragString (t 0.0,s 0.0) java.lang.Class::forName0 (t 0.0,s 0.0) java.util.Locale::initDefault (t 0.0,s 0.0) jdk.internal.loader.BuiltinClassLoader::loadClass (t 0.0,s 0.0) jdk.internal.loader.URLClassPath::getLoader (t 0.0,s 0.0) jdk.internal.loader.URLClassPath::getResource (t 0.0,s 0.0) java.lang.String::toLowerCase (t 0.0,s 0.0) 
sun.launcher.LauncherHelper::loadMainClass (t 0.0,s 0.0) sun.launcher.LauncherHelper::checkAndLoadMain (t 0.0,s 0.0) java.util.Locale:: (t 0.0,s 0.0) jdk.internal.loader.BuiltinClassLoader::findClassOnClassPathOrNull (t 0.0,s 0.0) jdk.internal.loader.ClassLoaders$AppClassLoader::loadClass (t 0.0,s 0.0) java.lang.Class::forName The program is as follows: ``` import java.util.function.Supplier; public class Prof2 { public static void main(String[] args) { var rand = new java.util.Random(0); Supplier[] suppliers = { () -> 0, () -> 1, () -> 2, () -> 3, }; long sum = 0; for (int i = 0; i >= 0; i++) { sum += (int)suppliers[i % suppliers.length].get(); } } } ``` We see that the results are particularly useful in this case as the methods are very short (each only returns an integer), and the probability of hitting the prolog is then very high. - Allow AsyncGetCallTrace and JFR to walk a stub frame When the signal sent for AsyncGetCallTrace or JFR would land on a stub (like arraycopy), it wouldn't be able to detect the sender (caller) frame because `_cb->frame_size() == 0`. Because we fully control how the prolog and epilog of stub code is generated, we know there are two cases: 1. A stack frame is allocated via macroAssembler->enter(), and consists of `push rbp; mov rsp, rbp;`. 2. No stack frame is allocated, rbp is left unchanged, and rsp is decremented by the `call` instruction, which pushes the return `pc` on the stack. For case 1., we can easily know the sender frame by simply looking at rbp, especially since we know that all stubs preserve the frame pointer (on x86 at least). For case 2., we end up returning the sender's sender, but that already gives us more information than what we have today.
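The two stub cases can be made concrete with a small simulation. The sketch below is not the HotSpot frame parser: `FrameSketch` and `sender_via_frame_pointer` are hypothetical names, memory is modelled as a vector of 8-byte slots, and the layout assumed is the usual x86-64 one where the saved rbp sits at `[rbp]` and the return address at `[rbp + 8]`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of a caller ("sender") frame.
struct FrameSketch {
  uint64_t pc;  // return address into the sender
  uint64_t sp;  // sender's stack pointer after the callee returns
  uint64_t fp;  // sender's frame pointer
};

// 'memory' simulates the stack in 8-byte slots (slot index = address / 8).
// 'rbp' is the frame pointer captured when the sample was taken.
FrameSketch sender_via_frame_pointer(const std::vector<uint64_t>& memory,
                                     uint64_t rbp) {
  const uint64_t slot = rbp / 8;
  assert(rbp % 8 == 0 && slot + 1 < memory.size());
  FrameSketch sender;
  sender.fp = memory[slot];      // saved rbp, stored by 'push rbp'
  sender.pc = memory[slot + 1];  // return address, stored by 'call'
  sender.sp = rbp + 16;          // just above the return address
  return sender;
}
```

In case 1 the stub has executed `push rbp; mov rsp, rbp`, so rbp points at the stub's own frame and the function yields the stub's direct caller. In case 2 rbp still belongs to the caller's frame, so the same walk yields the caller's caller, which is the "sender's sender" result described above.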
The results are as follows: - Baseline: Flat Profile (by method): (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5] (t 0.5,s 0.2) Prof1::main (t 0.2,s 0.2) java.lang.AbstractStringBuilder::append (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] (t 0.0,s 0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal (t 0.0,s 0.0) java.lang.AbstractStringBuilder::shift (t 0.0,s 0.0) java.lang.String::getBytes (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt (t 0.0,s 0.0) java.lang.StringBuilder::delete (t 0.2,s 0.0) java.lang.StringBuilder::append (t 0.0,s 0.0) java.lang.AbstractStringBuilder::delete (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt - With StubRoutinesBlob::FrameParser Flat Profile (by method): (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal (t 0.9,s 0.9) java.lang.AbstractStringBuilder::delete (t 99.8,s 0.2) Prof1::main (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] (t 0.0,s 0.0) AGCT::Unknown Java[ERR=-5] (t 98.8,s 0.0) java.lang.AbstractStringBuilder::append (t 98.8,s 0.0) java.lang.StringBuilder::append (t 0.9,s 0.0) java.lang.StringBuilder::delete The program is as follows: ``` public class Prof1 { public static void main(String[] args) { StringBuilder sb = new StringBuilder(); for (int i = 0; i < 1000000; i++) { sb.append("ab"); sb.delete(0, 1); } System.out.println(sb.length()); } } ``` We now account for the arraycopy stub which is called by AbstractStringBuilder::ensureCapacityInternal. It was previously ignored because the stack walker did not know how to parse the frame for the arraycopy stub, so the sample fell into the AGCT::Unknown Java[ERR=-5] section. However, it still isn't perfect since it doesn't point to the arraycopy stub directly. - Extract sender frame parsing to CodeBlob::FrameParser Whether and how a frame is set up is controlled by the code generator for the specific CodeBlob. The CodeBlob is then in the best place to know how to parse the sender's frame from the current frame in the given CodeBlob.
This refactoring proposes to extract this parsing out of `frame` and into a `CodeBlob::FrameParser`. This FrameParser is then specialized in the relevant subclasses of CodeBlob. This change is largely to facilitate adding new supported cases for JDK-8252417 like runtime stubs. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4436/files - new: https://git.openjdk.java.net/jdk/pull/4436/files/137a0c48..85f218c8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4436&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4436&range=00-01 Stats: 6 lines in 2 files changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4436/head:pull/4436 PR: https://git.openjdk.java.net/jdk/pull/4436 From luhenry at openjdk.java.net Thu Jun 10 08:01:18 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Thu, 10 Jun 2021 08:01:18 GMT Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks In-Reply-To: References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> Message-ID: <_gZbzPT5Ba4EwjKgTRdQ6uMOpIYmyAt-G-4qjCAN0Qw=.ae7a3154-4db7-4489-a5a7-38525a543b4c@github.com> On Wed, 9 Jun 2021 19:04:54 GMT, Serguei Spitsyn wrote: >> When the signal sent for AsyncGetCallTrace or JFR would land on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, it wouldn't be able to detect the sender (caller) frame for multiple reasons. This patch fixes these cases through adding CodeBlob-specific frame parser which are in the best position to know how a frame is setup. >> >> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`.
>> >> # `Prof1` >> >> public class Prof1 { >> >> public static void main(String[] args) { >> StringBuilder sb = new StringBuilder(); >> for (int i = 0; i < 1000000; i++) { >> sb.append("ab"); >> sb.delete(0, 1); >> } >> System.out.println(sb.length()); >> } >> } >> >> >> - Baseline: >> >> Flat Profile (by method): >> (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5] >> (t 0.5,s 0.2) Prof1::main >> (t 0.2,s 0.2) java.lang.AbstractStringBuilder::append >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::shift >> (t 0.0,s 0.0) java.lang.String::getBytes >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt >> (t 0.0,s 0.0) java.lang.StringBuilder::delete >> (t 0.2,s 0.0) java.lang.StringBuilder::append >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::delete >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt >> >> - With `StubRoutinesBlob::FrameParser`: >> >> Flat Profile (by method): >> (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal >> (t 0.9,s 0.9) java.lang.AbstractStringBuilder::delete >> (t 99.8,s 0.2) Prof1::main >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) AGCT::Unknown Java[ERR=-5] >> (t 98.8,s 0.0) java.lang.AbstractStringBuilder::append >> (t 98.8,s 0.0) java.lang.StringBuilder::append >> (t 0.9,s 0.0) java.lang.StringBuilder::delete >> >> >> # `Prof2` >> >> import java.util.function.Supplier; >> >> public class Prof2 { >> >> public static void main(String[] args) { >> var rand = new java.util.Random(0); >> Supplier[] suppliers = { >> () -> 0, >> () -> 1, >> () -> 2, >> () -> 3, >> }; >> >> long sum = 0; >> for (int i = 0; i >= 0; i++) { >> sum += (int)suppliers[i % suppliers.length].get(); >> } >> } >> } >> >> >> - Baseline: >> >> Flat Profile (by method): >> (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5] >> (t 39.2,s 35.2) Prof2::main >> (t 1.4,s 1.4) Prof2::lambda$main$3 >> (t 1.0,s 
1.0) Prof2::lambda$main$2 >> (t 0.9,s 0.9) Prof2::lambda$main$1 >> (t 0.7,s 0.7) Prof2::lambda$main$0 >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) java.lang.Thread::exit >> (t 0.9,s 0.0) Prof2$$Lambda$2.0x0000000800c00c28::get >> (t 1.0,s 0.0) Prof2$$Lambda$3.0x0000000800c01000::get >> (t 1.4,s 0.0) Prof2$$Lambda$4.0x0000000800c01220::get >> (t 0.7,s 0.0) Prof2$$Lambda$1.0x0000000800c00a08::get >> >> >> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`: >> >> Flat Profile (by method): >> (t 74.1,s 70.3) Prof2::main >> (t 6.5,s 5.5) Prof2$$Lambda$29.0x0000000800081220::get >> (t 6.6,s 5.4) Prof2$$Lambda$28.0x0000000800081000::get >> (t 5.7,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get >> (t 5.9,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get >> (t 4.9,s 4.9) AGCT::Unknown Java[ERR=-5] >> (t 1.2,s 1.2) Prof2::lambda$main$2 >> (t 0.9,s 0.9) Prof2::lambda$main$3 >> (t 0.9,s 0.9) Prof2::lambda$main$1 >> (t 0.7,s 0.7) Prof2::lambda$main$0 >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > > Hi Ludovic, > Thank you for working on this fix in the AsyncGetCallTrace. > What version of JDK release do you intend to target? > Just wanted to make sure you know the JDK 17 development cycle will be closed tomorrow for P4 bugs and enhancements. The repository will be forked and the RDP 1 phase started. > I doubt the review of your fix will be completed by this time. > So, please, keep in mind your PR will go to 18, not 17. > Thanks, > Serguei @sspitsyn thank you for the reminder. It's perfectly fine for this change to land in JDK 18. If there is demand to backport it to JDK 17 later, we'll act accordingly. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4436 From iignatyev at openjdk.java.net Thu Jun 10 08:35:25 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 08:35:25 GMT Subject: RFR: 8267448: Add "ulimit -a" to environment.html Message-ID: <4UGs3avpcoXZtLgyl_8Hj8lapTPwVGk3FIP61i-dIXM=.4827f8f7-778b-439f-966c-c139ab3abd46@github.com> Hi all, could you please review this small patch that does $subj? Thanks, -- Igor attn: @plummercj ------------- Commit messages: - 8267448 Changes: https://git.openjdk.java.net/jdk/pull/4451/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4451&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8267448 Stats: 15 lines in 3 files changed: 15 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4451.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4451/head:pull/4451 PR: https://git.openjdk.java.net/jdk/pull/4451 From jbachorik at openjdk.java.net Thu Jun 10 08:42:17 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Thu, 10 Jun 2021 08:42:17 GMT Subject: RFR: 8247471: Enhance CPU load events with the actual elapsed CPU time In-Reply-To: References: Message-ID: On Wed, 19 May 2021 16:41:45 GMT, Erik Gahlin wrote: >> A continuation of an RFR thread started last year - https://mail.openjdk.java.net/pipermail/hotspot-jfr-dev/2020-June/001533.html >> >> This change adds the raw CPU time value to CPU load events (per-thread and per-process as well). >> The CPU time value is already known and used to calculate the load so adding it to the events does not incur any extra overhead while making it much easier for the end users to eg. aggregate and compare the active execution time per time period without the detailed knowledge how JFR computes and normalizes the CPU load. > > It would be nice to have this for JDK 17, but I guess we are stuck on how the elapsed difference should be reported. 
> > I would like to avoid using the duration field, because it's not used for any other event that samples data periodically. We could be opening a can of worms. > > If we are going to measure execution time since the last sample, I think we should make it clear in the description, i.e. "Elapsed JVM User CPU Time since last sample". While not optimal, a separate field or duration, could be added in a later release, even though it could lead to bugs, if using JFR on an old release where duration is 0 s. @egahlin thanks for the review! However, I have decided to close this PR, since I was not able to get comparable behaviour across different platforms. Namely, the Windows implementation either provided inconsistent CPU time and CPU load values, because they are retrieved from disparate sources, or, when I used only the common source, the precision of the CPU load value dropped significantly (the CPU time granularity on Windows is somewhere in the range of 10-100 ms AFAIK, and this affected the CPU load now computed from CPU time). Given these problems and no good solution (even after discussing the problem with a few people skilled in Windows architecture), I decided it was better not to pursue this PR further. ------------- PR: https://git.openjdk.java.net/jdk/pull/2186 From aph at openjdk.java.net Thu Jun 10 09:50:19 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 10 Jun 2021 09:50:19 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> References: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> Message-ID: <7SOGLj60pz2mW_0DvWXIkenehUdXLQ4qb4HdELbE1mU=.d841a3f5-03bc-4cba-828a-80775c104e75@github.com> On Thu, 10 Jun 2021 05:57:35 GMT, Dong Bo wrote: >> ... or maybe do the OR in the vector unit? 
> > I guess it can be done with: > > umaxv(v1, T4S, v0); > mov(tmp1, v1, T4S, 0); > cbnz(tmp1, DONE0); Sure, great idea. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From mdoerr at openjdk.java.net Thu Jun 10 10:13:20 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 10 Jun 2021 10:13:20 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v3] In-Reply-To: References: <2o6EXLmcWJYa8EIJ6fWNYdq2bBfKfTxkMWfqS_Bt4P4=.f09a9226-ea79-4cc6-b88d-8a0b0252957a@github.com> <-tSvQWJLA_hW6MQGjY3SVFQM6BbXUb9KZPL-OkP1Xwk=.10cbd4e8-1c4b-4578-9c3c-d6edef490bfa@github.com> Message-ID: On Thu, 10 Jun 2021 07:39:43 GMT, Kazunori Ogata wrote: >> src/hotspot/cpu/ppc/assembler_ppc.cpp line 364: >> >>> 362: // we avoid a buffer overrun in the actual code generation phase. >>> 363: nop(); >>> 364: } >> >> Scratch emit should be able to determine the size precisely, not just a pessimistic estimation. Please don't break this design. > > @TheRealMDoerr I understand the design. However, I'm wondering why the alignment can vary between scratch emit and real emit if the size can be determined precisely. It is helpful if you could point out some sources of variation. Scratch emit uses an empty buffer each time. Only the real emit uses real offsets. The concept is to make padding decisions only at places where the final padding requirements are known. Regarding C2, the compiler handles the padding based on compute_padding function and the emit sections should not contain alignment dependent nops. ------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From tschatzl at openjdk.java.net Thu Jun 10 10:27:14 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 10 Jun 2021 10:27:14 GMT Subject: RFR: 8268520: VirtualSpace::print_on() should be const In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 22:13:02 GMT, Ioi Lam wrote: > Please review this trivial patch. 
VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp Lgtm too :) Marked as reviewed by tschatzl (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/4448 From mdoerr at openjdk.java.net Thu Jun 10 10:28:22 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 10 Jun 2021 10:28:22 GMT Subject: RFR: 8267968: [PPC64] Use prefixed load and addi instructions for better performance in POWER10 [v4] In-Reply-To: References: Message-ID: On Thu, 10 Jun 2021 07:33:38 GMT, Kazunori Ogata wrote: >> The POWER10 processor supports prefixed load and addi instructions that have larger displacement field of up to 34-bits. We can reduce instruction cycles to load constant from TOC and load an immediate value to a register. >> >> Assembler::{load|add}_const_optimized() and LoadCon[LPFD]Nodes are modified to use prefixed instructions, with fixing other functions that are affected by this change. >> >> I ran jtreg test on both POWER10 and POWER8 machines by using "make test-tier1" and verified no additional fails by this change. I also ran DaCapo, Renaissance, and SPECjbb2015 on both of them and verified they run successfully. > > Kazunori Ogata has updated the pull request incrementally with two additional commits since the last revision: > > - Rename paddi_or_addi to paddi_or_addi_r0ok because it accepts R0 > - Remove unreachable code blocks Thanks for postponing it. We should have nightly test on Power10 when integrating complex changes. The independent parts of this change should get evaluated individually. I'm not sure if optimizing load_const_optimized this way is beneficial at all. It doesn't reduce code size AFAICS. And the latency reduction may be pointless if Power10 strongly uses out-of-order execution which can hide the latency. We should also check how relevant large constant sections are. 
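For readers weighing the displacement argument above, here is a minimal sketch of the range check that decides whether a constant fits an immediate field. It is not HotSpot code: the class and method names are hypothetical, and the 16-bit width used for the non-prefixed `addi` is my assumption from the Power ISA, while the 34-bit width is the one quoted in the PR description.

```java
class ImmediateRangeSketch {
    // Hypothetical helper: true if v is representable as a signed
    // two's-complement value that is 'bits' wide.
    static boolean fitsSigned(long v, int bits) {
        long min = -(1L << (bits - 1));
        long max = (1L << (bits - 1)) - 1;
        return v >= min && v <= max;
    }

    public static void main(String[] args) {
        // Values on either side of the 16-bit and 34-bit boundaries.
        long[] samples = {32767L, 32768L, (1L << 33) - 1, 1L << 33};
        for (long v : samples) {
            System.out.println(v + ": 16-bit=" + fitsSigned(v, 16)
                    + ", 34-bit=" + fitsSigned(v, 34));
        }
    }
}
```

A constant that passes the 34-bit check but fails the 16-bit one is the case where a prefixed instruction could replace a multi-instruction sequence. Whether that pays off in practice is exactly the open question raised in this message.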
------------- PR: https://git.openjdk.java.net/jdk/pull/4267 From neliasso at openjdk.java.net Thu Jun 10 12:48:36 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Thu, 10 Jun 2021 12:48:36 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v2] In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: <0lI73BsUfP22589T2Xu7xFgtnFT8wP_ssxTY5qGkLEs=.ec1c9e6d-b800-448a-a439-6724bbbd8b44@github.com> > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. 
> > Please review, > Best regards, > Nils Eliasson Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: Fix acopy type ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4359/files - new: https://git.openjdk.java.net/jdk/pull/4359/files/b77f7c30..f9d403e5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=00-01 Stats: 18 lines in 1 file changed: 10 ins; 1 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From github.com+6704669+asgibbons at openjdk.java.net Thu Jun 10 16:16:32 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Thu, 10 Jun 2021 16:16:32 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3] In-Reply-To: References: <_jzUJqPGgV255ofevS6BguJqQddvfdMdr0gGwwn3DA4=.03e8dc6d-ff6b-46de-8dc5-69ed36481615@github.com> Message-ID: On Tue, 8 Jun 2021 23:42:13 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixing review comments. Adding notes about isMIME parameter for other architectures; clarifying decodeBlock comments. > > src/hotspot/cpu/x86/assembler_x86.cpp line 4555: > >> 4553: void Assembler::evpmaddubsw(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { >> 4554: assert(VM_Version::supports_avx512bw(), ""); >> 4555: InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true); > > This instruction is also supported on AVX platforms. The assert check could be as follows: > assert(vector_len == AVX_128bit? VM_Version::supports_avx() : > vector_len == AVX_256bit? 
VM_Version::supports_avx2() : > vector_len == AVX_512bit? VM_Version::supports_avx512bw() : 0, ""); > Accordingly, the instruction could be named vpmaddubsw. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5688: > >> 5686: address base64_vbmi_lookup_lo_addr() { >> 5687: __ align(64, (unsigned long) __ pc()); >> 5688: StubCodeMark mark(this, "StubRoutines", "lookup_lo"); > > It will be good to add base64 to the StubCodeMark name for this and all the tables. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5983: > >> 5981: // calculate length from offsets >> 5982: __ movq(length, end_offset); >> 5983: __ subq(length, start_offset); > > These are 32-bit, so movl, subl instead of movq, subq. Similar for all length-related instructions below. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 5987: > >> 5985: >> 5986: // If AVX512 VBMI not supported, just compile non-AVX code >> 5987: if(VM_Version::supports_avx512_vbmi()) { > > Need to also check for VM_Version::supports_avx512bw() support. > Could you please check if VM_Version::supports_avx512dq is needed as well? Done. No need for avx512dq. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6134: > >> 6132: __ subq(length, 64); >> 6133: __ addq(source, 64); >> 6134: __ addq(dest, 48); > > All address-related instructions here and below could use addptr, subptr etc. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6273: > >> 6271: >> 6272: __ shrq(length, 2); // Multiple of 4 bytes only - length is # 4-byte chunks >> 6273: __ cmpq(length, 0); > > Should these be shrl, cmpl? Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6278: > >> 6276: // Set up src and dst pointers properly >> 6277: __ addq(source, start_offset); // Initial offset >> 6278: __ addq(dest, dp); > > The convention is to use addptr for pointers. Done. 
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6284: > >> 6282: __ shll(isURL, 8); // index into decode table based on isURL >> 6283: __ lea(decode_table, ExternalAddress(StubRoutines::x86::base64_decoding_table_addr())); >> 6284: __ addq(decode_table, isURL); > > addptr here. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6297: > >> 6295: __ orl(byte1, byte4); >> 6296: >> 6297: __ incrementq(source, 4); > > addptr here. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6317: > >> 6315: __ load_signed_byte(byte4, Address(source, RegisterOrConstant(), Address::times_1, 3)); >> 6316: __ load_signed_byte(byte3, Address(decode_table, byte3, Address::times_1, 0)); >> 6317: __ load_signed_byte(byte4, Address(decode_table, byte4, Address::times_1, 0)); > > You could use Address(base, offset) form directly here and other places: e.g. Address (source, 1) instead of Address(source, RegisterOrConstant(), Address::times_1, 1). Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6329: > >> 6327: __ subq(dest, rax); // Number of bytes converted >> 6328: __ movq(rax, dest); >> 6329: __ pop(rbx); > > subptr, movptr here. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 7627: > >> 7625: StubRoutines::x86::_right_shift_mask = base64_right_shift_mask_addr(); >> 7626: StubRoutines::_base64_encodeBlock = generate_base64_encodeBlock(); >> 7627: if (VM_Version::supports_avx512_vbmi()) { > > Need to add avx512bw check here also. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 7628: > >> 7626: StubRoutines::_base64_encodeBlock = generate_base64_encodeBlock(); >> 7627: if (VM_Version::supports_avx512_vbmi()) { >> 7628: StubRoutines::x86::_lookup_lo = base64_vbmi_lookup_lo_addr(); > > It would be good to add base64 to these names. Done. 
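As a reading aid for the scalar (non-AVX) path touched by the comments above, here is a rough Java sketch of the same idea: count 4-byte chunks, pick a decode table from `isURL`, look up four bytes, and pack them into three output bytes. It is not the stub itself. The class name, the table layout (two 256-entry tables back to back, so that `isURL << 8` selects the second one, mirroring the `shll(isURL, 8)` comment) and the choice to stop at the first invalid byte are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Base64ScalarSketch {
    static final String STD =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    static final String URL =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // Assumed layout: entries 0..255 hold the standard alphabet,
    // entries 256..511 the URL-safe one; -1 marks an invalid input byte.
    static final byte[] DECODE_TABLE = buildTable();

    static byte[] buildTable() {
        byte[] t = new byte[512];
        Arrays.fill(t, (byte) -1);
        for (int i = 0; i < 64; i++) {
            t[STD.charAt(i)] = (byte) i;
            t[256 + URL.charAt(i)] = (byte) i;
        }
        return t;
    }

    // Decodes whole 4-byte groups from src[start, end) into dst starting at dp.
    // Returns the number of bytes written.
    static int decode(byte[] src, int start, int end, byte[] dst, int dp, boolean isURL) {
        int tableBase = (isURL ? 1 : 0) << 8;   // index into decode table based on isURL
        int chunks = (end - start) >> 2;        // multiples of 4 bytes only
        int out = dp;
        for (int c = 0; c < chunks; c++) {
            int s = start + 4 * c;
            int b1 = DECODE_TABLE[tableBase + (src[s] & 0xff)];
            int b2 = DECODE_TABLE[tableBase + (src[s + 1] & 0xff)];
            int b3 = DECODE_TABLE[tableBase + (src[s + 2] & 0xff)];
            int b4 = DECODE_TABLE[tableBase + (src[s + 3] & 0xff)];
            // Any -1 entry sign-extends, so the OR of all four is negative.
            if ((b1 | b2 | b3 | b4) < 0) {
                break;
            }
            int bits = (b1 << 18) | (b2 << 12) | (b3 << 6) | b4;
            dst[out++] = (byte) (bits >> 16);
            dst[out++] = (byte) (bits >> 8);
            dst[out++] = (byte) bits;
        }
        return out - dp;
    }

    public static void main(String[] args) {
        byte[] src = "TWFu".getBytes(StandardCharsets.US_ASCII);
        byte[] dst = new byte[3];
        int n = decode(src, 0, src.length, dst, 0, false);
        System.out.println(n + " bytes: " + new String(dst, 0, n, StandardCharsets.US_ASCII));
    }
}
```

The 4-in, 3-out ratio is the same one behind the `addq(source, 64)` / `addq(dest, 48)` pair in the vector loop. Padding characters and the MIME fast path are deliberately left out of this sketch.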
------------- PR: https://git.openjdk.java.net/jdk/pull/4368 From github.com+6704669+asgibbons at openjdk.java.net Thu Jun 10 16:16:08 2021 From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons) Date: Thu, 10 Jun 2021 16:16:08 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v4] In-Reply-To: References: Message-ID: <0Qj7oQT5xTnuyhwykScVOvEgS0__4xiGdNM0RhawDoU=.dcd3bf74-9042-4dde-9058-e210da195f1b@github.com> > Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. > > A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done. This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. > > The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. > > Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. > > **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**. 
> > > Benchmark Name | Base Score | Optimized Score | Gain > -- | -- | -- | -- > testBase64Decode size 1 | 15.36 | 15.32 | 1.00 > testBase64Decode size 3 | 17.00 | 16.72 | 1.02 > testBase64Decode size 7 | 20.60 | 18.82 | 1.09 > testBase64Decode size 32 | 34.21 | 26.77 | 1.28 > testBase64Decode size 64 | 54.43 | 38.35 | 1.42 > testBase64Decode size 80 | 66.40 | 48.34 | 1.37 > testBase64Decode size 96 | 73.16 | 52.90 | 1.38 > testBase64Decode size 112 | 84.93 | 51.82 | 1.64 > testBase64Decode size 512 | 288.81 | 32.04 | 9.01 > testBase64Decode size 1000 | 560.48 | 40.79 | 13.74 > testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72 > testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15 > testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07 > testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10 > testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02 > testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10 > testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05 > testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00 > testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05 > testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20 > testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09 > testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12 > testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09 > testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21 > testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29 > testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12 > testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05 > testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18 > testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02 > testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24 > testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23 > testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24 > testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 
| 1.14 > testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19 > testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23 > testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Addressing review comments. 1. Modified evpmaddubsw. Assert for avx512bw, renamed to vpmaddubsw. 2. Added base64 to StubCodeMark names and associated variables. 3. Added avx512bw check at top of vbmi loop. No need for avx512dq. 4. Fixed all length references (addq=>addl, addq=>addptr, etc.). 5. Converted to Address(base, offset) where appropriate. Compiles, and smoke-tested. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4368/files - new: https://git.openjdk.java.net/jdk/pull/4368/files/d66e32e3..247f2245 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=02-03 Stats: 104 lines in 5 files changed: 4 ins; 0 del; 100 mod Patch: https://git.openjdk.java.net/jdk/pull/4368.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368 PR: https://git.openjdk.java.net/jdk/pull/4368 From iignatyev at openjdk.java.net Thu Jun 10 16:54:52 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 16:54:52 GMT Subject: RFR: 8267448: Add "ulimit -a" to environment.html In-Reply-To: <4UGs3avpcoXZtLgyl_8Hj8lapTPwVGk3FIP61i-dIXM=.4827f8f7-778b-439f-966c-c139ab3abd46@github.com> References: <4UGs3avpcoXZtLgyl_8Hj8lapTPwVGk3FIP61i-dIXM=.4827f8f7-778b-439f-966c-c139ab3abd46@github.com> Message-ID: On Thu, 10 Jun 2021 06:26:53 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small patch that does $subj? 
> > Thanks, > -- Igor > > attn: @plummercj closing in favor of openjdk/jdk17#2 ------------- PR: https://git.openjdk.java.net/jdk/pull/4451 From iignatyev at openjdk.java.net Thu Jun 10 16:54:52 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 16:54:52 GMT Subject: Withdrawn: 8267448: Add "ulimit -a" to environment.html In-Reply-To: <4UGs3avpcoXZtLgyl_8Hj8lapTPwVGk3FIP61i-dIXM=.4827f8f7-778b-439f-966c-c139ab3abd46@github.com> References: <4UGs3avpcoXZtLgyl_8Hj8lapTPwVGk3FIP61i-dIXM=.4827f8f7-778b-439f-966c-c139ab3abd46@github.com> Message-ID: On Thu, 10 Jun 2021 06:26:53 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small patch that does $subj? > > Thanks, > -- Igor > > attn: @plummercj This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/4451 From iignatyev at openjdk.java.net Thu Jun 10 16:57:24 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 16:57:24 GMT Subject: [jdk17] RFR: 8267448: Add "ulimit -a" to environment.html Message-ID: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> (recreating openjdk/jdk#4451 against jdk17) Hi all, could you please review this small patch that does $subj? 
Thanks, -- Igor attn: @plummercj ------------- Commit messages: - 8267448 Changes: https://git.openjdk.java.net/jdk17/pull/2/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=2&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8267448 Stats: 15 lines in 3 files changed: 15 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk17/pull/2.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/2/head:pull/2 PR: https://git.openjdk.java.net/jdk17/pull/2 From lmesnik at openjdk.java.net Thu Jun 10 17:49:09 2021 From: lmesnik at openjdk.java.net (Leonid Mesnik) Date: Thu, 10 Jun 2021 17:49:09 GMT Subject: Integrated: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes In-Reply-To: References: Message-ID: On Thu, 27 May 2021 22:05:55 GMT, Leonid Mesnik wrote: > EFH is improved to process cores and get mixed stack traces with jhsdb and native stack traces with gdb/lldb. It might be useful because hs_err doesn't contain info about all threads, sometimes it is even not generated. This pull request has now been integrated. 
Changeset: 8c8422e0 Author: Leonid Mesnik URL: https://git.openjdk.java.net/jdk/commit/8c8422e0f8886d9bbfca29fd228368f88bf46f2c Stats: 159 lines in 12 files changed: 130 ins; 7 del; 22 mod 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes Reviewed-by: iignatyev ------------- PR: https://git.openjdk.java.net/jdk/pull/4234 From cjplummer at openjdk.java.net Thu Jun 10 17:54:15 2021 From: cjplummer at openjdk.java.net (Chris Plummer) Date: Thu, 10 Jun 2021 17:54:15 GMT Subject: [jdk17] RFR: 8267448: Add "ulimit -a" to environment.html In-Reply-To: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> References: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> Message-ID: On Thu, 10 Jun 2021 16:51:28 GMT, Igor Ignatyev wrote: > (recreating openjdk/jdk#4451 against jdk17) > > Hi all, > > could you please review this small patch that does $subj? > > Thanks, > -- Igor > > attn: @plummercj It looks good. Copyrights need updating. ------------- Marked as reviewed by cjplummer (Reviewer). PR: https://git.openjdk.java.net/jdk17/pull/2 From iignatyev at openjdk.java.net Thu Jun 10 18:09:35 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 18:09:35 GMT Subject: [jdk17] RFR: 8267448: Add "ulimit -a" to environment.html [v2] In-Reply-To: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> References: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> Message-ID: > (recreating openjdk/jdk#4451 against jdk17) > > Hi all, > > could you please review this small patch that does $subj? 
> > Thanks, > -- Igor > > attn: @plummercj Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: updated copyright year ------------- Changes: - all: https://git.openjdk.java.net/jdk17/pull/2/files - new: https://git.openjdk.java.net/jdk17/pull/2/files/83e0e238..43c0857d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk17&pr=2&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk17&pr=2&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk17/pull/2.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/2/head:pull/2 PR: https://git.openjdk.java.net/jdk17/pull/2 From iignatyev at openjdk.java.net Thu Jun 10 18:09:37 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 18:09:37 GMT Subject: [jdk17] RFR: 8267448: Add "ulimit -a" to environment.html In-Reply-To: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> References: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> Message-ID: On Thu, 10 Jun 2021 16:51:28 GMT, Igor Ignatyev wrote: > (recreating openjdk/jdk#4451 against jdk17) > > Hi all, > > could you please review this small patch that does $subj? > > Thanks, > -- Igor > > attn: @plummercj Thanks, Chris. I've updated the copyrights. 
------------- PR: https://git.openjdk.java.net/jdk17/pull/2 From iignatyev at openjdk.java.net Thu Jun 10 18:09:37 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 10 Jun 2021 18:09:37 GMT Subject: [jdk17] Integrated: 8267448: Add "ulimit -a" to environment.html In-Reply-To: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> References: <1wM6WR4ekbJEvda8op4B0Nl6RTATdLKo-4WT3ML-rFQ=.db5c3ada-ea43-4613-a5c9-5e3222e77ef1@github.com> Message-ID: On Thu, 10 Jun 2021 16:51:28 GMT, Igor Ignatyev wrote: > (recreating openjdk/jdk#4451 against jdk17) > > Hi all, > > could you please review this small patch that does $subj? > > Thanks, > -- Igor > > attn: @plummercj This pull request has now been integrated. Changeset: 53b6e2c8 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk17/commit/53b6e2c85cab251362d27a1cd0cd37bc7d380360 Stats: 18 lines in 3 files changed: 15 ins; 0 del; 3 mod 8267448: Add "ulimit -a" to environment.html Reviewed-by: cjplummer ------------- PR: https://git.openjdk.java.net/jdk17/pull/2 From jiefu at openjdk.java.net Fri Jun 11 00:13:04 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 11 Jun 2021 00:13:04 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails Message-ID: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Hi all, jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. It can be fixed by renaming : test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java Thanks. 
Best regards, Jie ------------- Commit messages: - 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails Changes: https://git.openjdk.java.net/jdk17/pull/11/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=11&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268576 Stats: 0 lines in 1 file changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk17/pull/11.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/11/head:pull/11 PR: https://git.openjdk.java.net/jdk17/pull/11 From egahlin at openjdk.java.net Fri Jun 11 00:56:52 2021 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Fri, 11 Jun 2021 00:56:52 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 00:06:28 GMT, Jie Fu wrote: > Hi all, > > jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. > > It can be fixed by renaming : > test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java > > Thanks. > Best regards, > Jie Marked as reviewed by egahlin (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk17/pull/11 From jiefu at openjdk.java.net Fri Jun 11 01:08:49 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 11 Jun 2021 01:08:49 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 00:54:02 GMT, Erik Gahlin wrote: >> Hi all, >> >> jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. 
>> >> It can be fixed by renaming : >> test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java >> >> Thanks. >> Best regards, >> Jie > > Marked as reviewed by egahlin (Reviewer). Thanks @egahlin for your review. Do you think it's trivial and can be pushed right now? @egahlin Thanks. ------------- PR: https://git.openjdk.java.net/jdk17/pull/11 From dholmes at openjdk.java.net Fri Jun 11 02:40:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Jun 2021 02:40:51 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 00:06:28 GMT, Jie Fu wrote: > Hi all, > > jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. > > It can be fixed by renaming : > test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java > > Thanks. > Best regards, > Jie Thanks for fixing Jie! David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk17/pull/11 From jiefu at openjdk.java.net Fri Jun 11 02:46:11 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 11 Jun 2021 02:46:11 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 02:37:51 GMT, David Holmes wrote: > Thanks for fixing Jie! > > David Thanks @dholmes-ora . Will push it soon. > Note this will fix the problem in JDK 18 not JDK 17. If you push to 18 you will need a manual backport to 17. If you push to 17 it will be automatically forward-ported to 18. This PR is in JDK17.
So I think it should go into JDK 17, right? ------------- PR: https://git.openjdk.java.net/jdk17/pull/11 From dholmes at openjdk.java.net Fri Jun 11 02:46:12 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Jun 2021 02:46:12 GMT Subject: [jdk17] RFR: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 00:06:28 GMT, Jie Fu wrote: > Hi all, > > jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. > > It can be fixed by renaming : > test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java > > Thanks. > Best regards, > Jie Note this will fix the problem in JDK 18 not JDK 17. If you push to 18 you will need a manual backport to 17. If you push to 17 it will be automatically forward-ported to 18. Right. Sorry I didn't see the "17" only the openjdk:master :) ------------- PR: https://git.openjdk.java.net/jdk17/pull/11 From jiefu at openjdk.java.net Fri Jun 11 02:50:54 2021 From: jiefu at openjdk.java.net (Jie Fu) Date: Fri, 11 Jun 2021 02:50:54 GMT Subject: [jdk17] Integrated: 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails In-Reply-To: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> References: <_WzdY0pIGAirko5s1iBkveldFq5rsiVAHUrc7Fn1T6U=.286d8f33-4748-4f6e-a50f-d1c6364ad50a@github.com> Message-ID: On Fri, 11 Jun 2021 00:06:28 GMT, Jie Fu wrote: > Hi all, > > jdk/jfr/event/gc/collection/TestSystemGc.java fails in jdk17 and jdk. > > It can be fixed by renaming : > test/jdk/jdk/jfr/event/gc/collection/TestSystemGc.java -> test/jdk/jdk/jfr/event/gc/collection/TestSystemGC.java > > Thanks. > Best regards, > Jie This pull request has now been integrated.
Changeset: e3eef3b4 Author: Jie Fu URL: https://git.openjdk.java.net/jdk17/commit/e3eef3b41ab22b3fb1e4ee33ce4a3d3457d35ff1 Stats: 0 lines in 1 file changed: 0 ins; 0 del; 0 mod 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails Reviewed-by: egahlin, dholmes ------------- PR: https://git.openjdk.java.net/jdk17/pull/11 From dholmes at openjdk.java.net Fri Jun 11 04:57:00 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 11 Jun 2021 04:57:00 GMT Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async Message-ID: Please review this update to the java manpage to describe the new -Xlog:async flag There are two places where the text is changed: 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async 2. A new subsection "-Xlog Output Mode" that explains async mode The committed file is the java.1 nroff version which is not very readable, so I've included a commit that also contains an HTML version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file.
Thanks, David ------------- Commit messages: - Removed java.html again - Temporary commit to see changes in a readable form - 8266614: update manpage for -Xlog:async Changes: https://git.openjdk.java.net/jdk17/pull/16/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=16&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8266614 Stats: 120 lines in 1 file changed: 38 ins; 78 del; 4 mod Patch: https://git.openjdk.java.net/jdk17/pull/16.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/16/head:pull/16 PR: https://git.openjdk.java.net/jdk17/pull/16 From kvn at openjdk.java.net Fri Jun 11 16:04:53 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 11 Jun 2021 16:04:53 GMT Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2] In-Reply-To: References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com> Message-ID: <5M6_B0kvIXhODk0jhkVJytNJ5oobCsGAx71x9mQbGvU=.c93aad5d-5a3d-48c2-8c23-65d9b45fb3e3@github.com> On Fri, 4 Jun 2021 23:49:31 GMT, Smita Kamath wrote: >> I would like to submit AES-GCM optimization for x86_64 architectures supporting AVX3+VAES (Evex encoded AES). This optimization interleaves AES and GHASH operations. >> Performance gain of ~1.5x - 2x for message sizes 8k and above. > > Smita Kamath has updated the pull request incrementally with one additional commit since the last revision: > > 8267125:Updated intrinsic signature to remove copies of counter, state and subkeyHtbl Do you plan to implement `decrypt` intrinsic too? src/hotspot/share/opto/library_call.cpp line 547: > 545: > 546: case vmIntrinsics::_galoisCounterMode_AESCrypt: > 547: return inline_galoisCounterMode_AESCrypt(intrinsic_id()); You don't need to pass `intrinsic_id()` for this implementation unless you plan to add decrypt intrinsic later. 
src/hotspot/share/opto/library_call.cpp line 6545: > 6543: top_out != NULL && top_out->klass() != NULL, "args are strange"); > 6544: > 6545: // checks are the responsibility of the caller Do you have all NULL for all objects and range checks in Java code for this intrinsic? src/hotspot/share/opto/library_call.cpp line 6564: > 6562: Node* subkeyHtbl = load_field_from_object(ghash_object, "subkeyHtbl", "[J"); > 6563: Node* state = load_field_from_object(ghash_object, "state", "[J"); > 6564: if (embeddedCipherObj == NULL || counter == NULL || subkeyHtbl == NULL || state == NULL) return false; Follow coding style for such long condition: if () { return false; } ------------- PR: https://git.openjdk.java.net/jdk/pull/4019 From svkamath at openjdk.java.net Fri Jun 11 17:22:51 2021 From: svkamath at openjdk.java.net (Smita Kamath) Date: Fri, 11 Jun 2021 17:22:51 GMT Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2] In-Reply-To: <5M6_B0kvIXhODk0jhkVJytNJ5oobCsGAx71x9mQbGvU=.c93aad5d-5a3d-48c2-8c23-65d9b45fb3e3@github.com> References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com> <5M6_B0kvIXhODk0jhkVJytNJ5oobCsGAx71x9mQbGvU=.c93aad5d-5a3d-48c2-8c23-65d9b45fb3e3@github.com> Message-ID: On Fri, 11 Jun 2021 15:45:02 GMT, Vladimir Kozlov wrote: >> Smita Kamath has updated the pull request incrementally with one additional commit since the last revision: >> >> 8267125:Updated intrinsic signature to remove copies of counter, state and subkeyHtbl > > src/hotspot/share/opto/library_call.cpp line 547: > >> 545: >> 546: case vmIntrinsics::_galoisCounterMode_AESCrypt: >> 547: return inline_galoisCounterMode_AESCrypt(intrinsic_id()); > > You don't need to pass `intrinsic_id()` for this implementation unless you plan to add decrypt intrinsic later. Thanks for your comments Vladimir. The intrinsic is called for encrypt as well as decrypt operation. 
> src/hotspot/share/opto/library_call.cpp line 6564: > >> 6562: Node* subkeyHtbl = load_field_from_object(ghash_object, "subkeyHtbl", "[J"); >> 6563: Node* state = load_field_from_object(ghash_object, "state", "[J"); >> 6564: if (embeddedCipherObj == NULL || counter == NULL || subkeyHtbl == NULL || state == NULL) return false; > > Follow coding style for such long condition: > > if () { > return false; > } I will make the change. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/4019 From kvn at openjdk.java.net Fri Jun 11 17:58:50 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 11 Jun 2021 17:58:50 GMT Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2] In-Reply-To: References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com> <5M6_B0kvIXhODk0jhkVJytNJ5oobCsGAx71x9mQbGvU=.c93aad5d-5a3d-48c2-8c23-65d9b45fb3e3@github.com> Message-ID: <5lSvX6Y5sMWA0SulDc3Z5ObaVV5M7G6_Zsb99AxWnv4=.1f610aa2-9124-4a46-8d83-c920de3e2a33@github.com> On Fri, 11 Jun 2021 17:19:37 GMT, Smita Kamath wrote: >> src/hotspot/share/opto/library_call.cpp line 547: >> >>> 545: >>> 546: case vmIntrinsics::_galoisCounterMode_AESCrypt: >>> 547: return inline_galoisCounterMode_AESCrypt(intrinsic_id()); >> >> You don't need to pass `intrinsic_id()` for this implementation unless you plan to add decrypt intrinsic later. > > Thanks for your comments Vladimir. The intrinsic is called for encrypt as well as decrypt operation. Only one intrinsic is declared in this change: `_galoisCounterMode_AESCrypt`. Other AES intrinsics have 2 that is why they have to pass intrinsic_id(). See lines before this. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4019 From kvn at openjdk.java.net Fri Jun 11 17:58:50 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 11 Jun 2021 17:58:50 GMT Subject: RFR: 8267125: AES Galois CounterMode (GCM) interleaved implementation using AVX512 + VAES instructions [v2] In-Reply-To: <5lSvX6Y5sMWA0SulDc3Z5ObaVV5M7G6_Zsb99AxWnv4=.1f610aa2-9124-4a46-8d83-c920de3e2a33@github.com> References: <0a7b_-PDU_JYXR7OrJRK8Z8QPRwLlV2vcHbBbW06SO8=.f0d61fd3-0205-40a7-b1a1-58caa2ea0f45@github.com> <5M6_B0kvIXhODk0jhkVJytNJ5oobCsGAx71x9mQbGvU=.c93aad5d-5a3d-48c2-8c23-65d9b45fb3e3@github.com> <5lSvX6Y5sMWA0SulDc3Z5ObaVV5M7G6_Zsb99AxWnv4=.1f610aa2-9124-4a46-8d83-c920de3e2a33@github.com> Message-ID: On Fri, 11 Jun 2021 17:54:02 GMT, Vladimir Kozlov wrote: >> Thanks for your comments Vladimir. The intrinsic is called for encrypt as well as decrypt operation. > > Only one intrinsic is declared in this change: `_galoisCounterMode_AESCrypt`. Other AES intrinsics have 2 that is why they have to pass intrinsic_id(). See lines before this. Note, _counterMode_AESCrypt is not an example to follow - it has the same issue. ------------- PR: https://git.openjdk.java.net/jdk/pull/4019 From iveresov at openjdk.java.net Fri Jun 11 17:59:54 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 11 Jun 2021 17:59:54 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote: >> The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version.
>> >> In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. >> >> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it globally in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment Alright, tests pass now. I think we're good to go. ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From iveresov at openjdk.java.net Fri Jun 11 18:08:49 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Fri, 11 Jun 2021 18:08:49 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote: >> The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version.
>> >> In fact, there is an utility method `jdk.internal.util.Preconditions.checkIndex`(wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. >> >> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it global-widely in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment I guess you need to do the "integrate" command again. ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From hseigel at openjdk.java.net Fri Jun 11 20:30:49 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 11 Jun 2021 20:30:49 GMT Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async In-Reply-To: References: Message-ID: <94YE5-G3RJZ131x-QKRSrC9vMcFCsZYDJm3r-SuacQA=.d89b00ac-b715-4e58-8a9e-0fbafd734518@github.com> On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > Please review this update to the java manpage to describe the new -Xlog:async flag > > There are two places where the text is changed: > > 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async > 2. 
A new subsection "-Xlog Output Mode" that explains async mode > > The commited file is the java.1 nroff version which is not very readable, so I've included a commit that also contains a html version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: > > https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html > > Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file. > > Thanks, > David Thanks for doing this! Harold The changes look good! Here's some optional suggestions. Under the "Description" heading where it says "The following provides quick reference to the -Xlog command and syntax for options:", perhaps add something about -Xlog:async? Possible rewording suggestions: 1. Change "The default value should be big enough to cater for most cases." to "The default value should be big enough to handle most cases." 2. Change "... trade memory overhead for log accuracy if they need to." to "... trade memory overhead for log accuracy if needed." ------------- Marked as reviewed by hseigel (Reviewer). 
PR: https://git.openjdk.java.net/jdk17/pull/16 From david.holmes at oracle.com Fri Jun 11 21:57:33 2021 From: david.holmes at oracle.com (David Holmes) Date: Sat, 12 Jun 2021 07:57:33 +1000 Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async In-Reply-To: <94YE5-G3RJZ131x-QKRSrC9vMcFCsZYDJm3r-SuacQA=.d89b00ac-b715-4e58-8a9e-0fbafd734518@github.com> References: <94YE5-G3RJZ131x-QKRSrC9vMcFCsZYDJm3r-SuacQA=.d89b00ac-b715-4e58-8a9e-0fbafd734518@github.com> Message-ID: <4241fc86-e11c-66c2-95f0-250058fc3ec1@oracle.com> Hi Harold, On 12/06/2021 6:30 am, Harold Seigel wrote: > On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > >> Please review this update to the java manpage to describe the new -Xlog:async flag >> >> There are two places where the text is changed: >> >> 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async >> 2. A new subsection "-Xlog Output Mode" that explains async mode >> >> The commited file is the java.1 nroff version which is not very readable, so I've included a commit that also contains a html version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: >> >> https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html >> >> Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file. >> >> Thanks, >> David > > Thanks for doing this! > Harold > > The changes look good! Thanks for the review. > Here's some optional suggestions. > > Under the "Description" heading where it says "The following provides quick reference to the -Xlog command and syntax for options:", perhaps add something about -Xlog:async? 
I think that would just duplicate what is said later. > Possible rewording suggestions: > > 1. Change "The default value should be big enough to cater for most cases." to > "The default value should be big enough to handle most cases." > > 2. Change "... trade memory overhead for log accuracy if they need to." > to "... trade memory overhead for log accuracy if needed." The wording is consistent with the help text which itself was set via a CSR request, so I'll just leave it as-is. Thanks, David > ------------- > > Marked as reviewed by hseigel (Reviewer). > > PR: https://git.openjdk.java.net/jdk17/pull/16 > From yyang at openjdk.java.net Sat Jun 12 01:06:58 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Sat, 12 Jun 2021 01:06:58 GMT Subject: Integrated: 8265518: C1: Intrinsic support for Preconditions.checkIndex In-Reply-To: References: Message-ID: On Thu, 22 Apr 2021 06:55:41 GMT, Yi Yang wrote: > The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version. > > In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. > > But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it globally in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: > > 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. > 2.
Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag > > Testing: cds, compiler and jdk This pull request has now been integrated. Changeset: 5cee23a9 Author: Yi Yang URL: https://git.openjdk.java.net/jdk/commit/5cee23a9ed0b7fe2657be7492d9c1f78fcd02ebf Stats: 347 lines in 11 files changed: 250 ins; 78 del; 19 mod 8265518: C1: Intrinsic support for Preconditions.checkIndex Reviewed-by: dfuchs, iveresov ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From yyang at openjdk.java.net Sat Jun 12 01:06:56 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Sat, 12 Jun 2021 01:06:56 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: <-kHu9fSLdqH287-Ud0MjAss-Jg8WBZZqRkL8PwKr-Yw=.77104be0-952d-44da-9390-22aca4ed21f0@github.com> On Fri, 11 Jun 2021 18:05:45 GMT, Igor Veresov wrote: > I guess you need to do the "integrate" command again. Okay, thank you all for taking the time to look at this. ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From simonis at openjdk.java.net Sat Jun 12 06:32:55 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Sat, 12 Jun 2021 06:32:55 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote: >> The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version.
>> >> In fact, there is an utility method `jdk.internal.util.Preconditions.checkIndex`(wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. >> >> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it global-widely in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment This change removed a product flag so I wonder how it could be integrated without a CSR? ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From stuefe at openjdk.java.net Sat Jun 12 06:53:57 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 12 Jun 2021 06:53:57 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Sat, 12 Jun 2021 06:29:50 GMT, Volker Simonis wrote: > This change removed a product flag so I wonder how it could be integrated without a CSR? And if the intention was to remove it, should it not have been marked as obsolete first? 
------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From yyang at openjdk.java.net Sat Jun 12 06:56:57 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Sat, 12 Jun 2021 06:56:57 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Sat, 12 Jun 2021 06:50:48 GMT, Thomas Stuefe wrote: > This change removed a product flag so I wonder how it could be integrated without a CSR? It's a diagnostic product flag, so it's okay to remove it without issuing a CSR. But I am not 100% sure. @dholmes-ora, do you have any comment about this? Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From simonis at openjdk.java.net Sat Jun 12 07:04:53 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Sat, 12 Jun 2021 07:04:53 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote: >> The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version. >> >> In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. >> >> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it globally in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1.
Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment Looks like you are right, Yi: https://wiki.openjdk.java.net/display/HotSpot/Hotspot+Command-line+Flags%3A+Kinds%2C+Lifecycle+and+the+CSR+Process Seems like for diagnostic flags the creation of a CSR is up to the developer. Sorry for the confusion. Yi Yang wrote on Sat, 12 Jun 2021, 08:54: > This change removed a product flag so I wonder how it could be integrated > without a CSR? > > It's a diagnostic product flag, so it's okay to remove it without issuing > CSR. But I am not 100% sure. > > @dholmes-ora , do you have any comment > about this? Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From stuefe at openjdk.java.net Sat Jun 12 08:25:52 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 12 Jun 2021 08:25:52 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: References: Message-ID: <9ixWD6Ea4OUcXAzU5s2Y68URPHJvq8RO5g-go6k1aMw=.e52e0571-e07d-4c5f-8afb-6b2408efd4a0@github.com> On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote: >> The JDK codebase re-created many variants of checkIndex(`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and it only has a corresponding C1 intrinsic version.
>> >> In fact, there is an utility method `jdk.internal.util.Preconditions.checkIndex`(wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex, we can replace these re-created variants of checkIndex by Objects.checkIndex, it would significantly reduce duplicated code and enjoys performance improvement because Preconditions.checkIndex is @IntrinsicCandidate and it has a corresponding intrinsic method in HotSpot. >> >> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex. To reuse it global-widely in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment Hi Yi, you may need to add the option to the obsolete-flags-table though as described in arguments.cpp: https://github.com/openjdk/jdk/blob/5cee23a9ed0b7fe2657be7492d9c1f78fcd02ebf/src/hotspot/share/runtime/arguments.cpp#L489-L490 I think the point is to give a customer a grace period where the option is still accepted on the command line. I am not sure if that step is optional though, if one is reasonably sure that the option is unused. Maybe @dholmes-ora can chime in. 
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From dholmes at openjdk.java.net Sun Jun 13 22:21:51 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 13 Jun 2021 22:21:51 GMT Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async In-Reply-To: References: Message-ID: <7jTsxAjwKWVzn0fUpfk0xYWSuGsi6FKlm3CZaamO7pE=.e8c7fe2b-332d-4602-bedb-aad73b5a6a2a@github.com> On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > Please review this update to the java manpage to describe the new -Xlog:async flag > > There are two places where the text is changed: > > 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async > 2. A new subsection "-Xlog Output Mode" that explains async mode > > The commited file is the java.1 nroff version which is not very readable, so I've included a commit that also contains a html version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: > > https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html > > Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file. > > Thanks, > David @navyxliu could you review this please. 
Thanks, David ------------- PR: https://git.openjdk.java.net/jdk17/pull/16 From xliu at openjdk.java.net Mon Jun 14 05:28:51 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Mon, 14 Jun 2021 05:28:51 GMT Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async In-Reply-To: References: Message-ID: On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > Please review this update to the java manpage to describe the new -Xlog:async flag > > There are two places where the text is changed: > > 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async > 2. A new subsection "-Xlog Output Mode" that explains async mode > > The committed file is the java.1 nroff version which is not very readable, so I've included a commit that also contains an HTML version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: > > https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html > > Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file. > > Thanks, > David Thanks for updating the manpage. LGTM.
------------- PR: https://git.openjdk.java.net/jdk17/pull/16 From iklam at openjdk.java.net Mon Jun 14 05:58:59 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Jun 2021 05:58:59 GMT Subject: RFR: 8268520: VirtualSpace::print_on() should be const In-Reply-To: <10LEl9rEqiS82sz4cdCwtjTbOVMd6wuOBoLTrYjDrIk=.eb6d59c4-c13a-42ad-9808-6099d1dcc83e@github.com> References: <10LEl9rEqiS82sz4cdCwtjTbOVMd6wuOBoLTrYjDrIk=.eb6d59c4-c13a-42ad-9808-6099d1dcc83e@github.com> Message-ID: <1x8b0fnlXaxHFae0cfsmGPs1yqRbmKVS4HcddldGfog=.a846cad5-9262-481e-a4bd-4c4213f7ec2d@github.com> On Thu, 10 Jun 2021 06:39:45 GMT, Thomas Stuefe wrote: >> Please review this trivial patch. VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp > > +1 Thanks @tstuefe @tschatzl @kimbarrett for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/4448 From iklam at openjdk.java.net Mon Jun 14 05:58:59 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 14 Jun 2021 05:58:59 GMT Subject: Integrated: 8268520: VirtualSpace::print_on() should be const In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 22:13:02 GMT, Ioi Lam wrote: > Please review this trivial patch. VirtualSpace::print_on() should be const so we can avoid the weird casting in epsilonHeap.cpp This pull request has now been integrated. 
Changeset: ba601b84 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/ba601b8407c6d56e48b57a9524a11bb275e08adc Stats: 6 lines in 3 files changed: 0 ins; 1 del; 5 mod 8268520: VirtualSpace::print_on() should be const Reviewed-by: kbarrett, stuefe, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/4448 From sjohanss at openjdk.java.net Mon Jun 14 12:22:06 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Mon, 14 Jun 2021 12:22:06 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v11] In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 07:11:38 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that significantly refactors the remembered set for more scalability. >> >> The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. >> >> Over time many problems with performance and in particular memory usage have been observed: >> >> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). >> >> * there is a substantial memory overhead for managing the data structures: examples are >> * using separate (hash) tables for the three different types of card containers >> * there is significant unnecessary preallocation of memory for some of the card set containers >> * Containers store redundant information >> >> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory.
Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. >> Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. >> >> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. >> >> * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, the implementation and handling are different for every container. >> >> * the existing granularity of containers is unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. >> >> The problem is that there is nothing between "no card at all" and "sparse", and in particular that there is a large gap between the capacity of "sparse" and "fine". I.e. the memory usage difference when going from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to "fine", which is able to hold 65k entries using 8kB, is significant. >> For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times.
>> >> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). >> >> This change is effectively a rewrite of the Java heap card based part of a region's remembered set. >> >> This initial fully working change can be roughly described with the following properties: >> >> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. >> >> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. >> >> * there are now four different container types and one meta-container type. These four actual containers are: >> * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory. >> * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. >> * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory >> * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. >> * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504.
>> >> * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general by specifying them carefully. They have been designed with future enhancements in mind. >> >> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now. >> >> Testing: tier1-8 many times, manual and automated perf testing > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into submit/8017163-refactor-remembered-set > - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set > - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) > - Improved documentation > - Improve comment > - Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand > - sjohanss-review 3 > - Merge branch 'master' of gh:openjdk/jdk into 8017163-refactor-remembered-set > - More cleanup after sjohanss comments > - Rename FOUND > - ... and 6 more: https://git.openjdk.java.net/jdk/compare/4d1cf51b...338b4829 Took a closer look at the new tests and overall they look good, just a couple of small comments. test/hotspot/gtest/gc/g1/test_g1CardSet.cpp line 421: > 419: const uint CardsPerRegion = 16384; > 420: const double FullCardSetThreshold = 1.0; > 421: const uint BitmapCoarsenThreshold = 1.0; Would it make sense to run this test with a few different config thresholds? To test the different levels of the card-set. If I understand those thresholds correctly, this card-set will never consider a region to be coarsened or full.
I get that the accounting might turn into everything being "found" rather than added, but might be worth testing. test/hotspot/gtest/gc/g1/test_g1CardSet.cpp line 440: > 438: } > 439: > 440: log_error(gc)("MT parallel part, added " SIZE_FORMAT " duplicate " SIZE_FORMAT, cl.added(), cl.found()); Is there a reason to use `error`-level? I would prefer using `info`-level to avoid seeing this output every test run. ------------- PR: https://git.openjdk.java.net/jdk/pull/4116 From neliasso at openjdk.java.net Mon Jun 14 13:56:24 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 13:56:24 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v3] In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: <_zTt9jmrmFzioQ6957w8vasnwTDm17SfLuCjXQbSbaI=.c5fd97f5-f58c-49ef-a566-8d2a2532bdad@github.com> > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. 
> > Please review, > Best regards, > Nils Eliasson Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: update test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4359/files - new: https://git.openjdk.java.net/jdk/pull/4359/files/f9d403e5..3d3873c8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=01-02 Stats: 277 lines in 1 file changed: 269 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 14:06:54 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 14:06:54 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v3] In-Reply-To: References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: On Fri, 4 Jun 2021 20:24:28 GMT, Vladimir Kozlov wrote: >> Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: >> >> update test > > test/hotspot/jtreg/compiler/arraycopy/TestObjectArrayClone.java line 33: > >> 31: * compiler.arraycopy.TestObjectArrayClone >> 32: * >> 33: * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UseZGC > > I suggest cloning it into a separate `@test` block because you need `@requires vm.gc.Z` for it. I removed the explicit GC. It should be run in all configs.
------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 14:17:17 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 14:17:17 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v4] In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. > > Please review, > Best regards, > Nils Eliasson Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: Remove whitespace ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4359/files - new: https://git.openjdk.java.net/jdk/pull/4359/files/3d3873c8..8439f915 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 14:25:50 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 14:25:50 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v4] In-Reply-To: References: 
<3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: On Mon, 14 Jun 2021 14:17:17 GMT, Nils Eliasson wrote: >> Hi, >> >> This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. >> >> In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. >> >> Please review, >> Best regards, >> Nils Eliasson > > Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: > > Remove whitespace I've added an assert(StressReflectiveCode, "...") for the clone_at_expansion case when we don't find a TypeAryPtr. This case only happens with StressReflectiveCode and the code generated is unreachable, but not yet removed. It happens because the StressReflectiveCode flag prevents the check for array from being resolved at compile time - leaving an unfolded check in compiled code. inline_clone will create a clone with cases for both array and oop. On the array path there will be an allocate_array with a checkcast to an oop, and an arraycopy without a TypeAryPtr on the src and dest. That code is unreachable - but must be tolerated.
Please review, Nils Eliasson ------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From jwilhelm at openjdk.java.net Mon Jun 14 14:36:59 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Mon, 14 Jun 2021 14:36:59 GMT Subject: RFR: Merge jdk17 Message-ID: Forwardport JDK 17 -> JDK 18 ------------- Commit messages: - 8267579: Thread::cooked_allocated_bytes() hits assert(left >= right) failed: avoid underflow - 8268342: java/foreign/channels/TestAsyncSocketChannels.java fails with "IllegalStateException: This segment is already closed" - 8268630: ProblemList serviceability/jvmti/CompiledMethodLoad/Zombie.java on linux-aarch64 - 8268470: CDS dynamic dump asserts with JFR RecordingStream - 8268093: Manual Testcase: "sun/security/krb5/config/native/TestDynamicStore.java" Fails with NPE - 8268602: a couple runtime/os tests don't check exit code - 8268555: Update HttpClient tests that use ITestContext to jtreg 6+1 - 8268580: runtime/memory/LargePages/TestLargePagesFlags.java should be run in driver mode - 8268565: runtime/records/RedefineRecord.java should be run in driver mode - 8268576: jdk/jfr/event/gc/collection/TestSystemGc.java fails - ... and 3 more: https://git.openjdk.java.net/jdk/compare/74007890...b3185354 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.java.net/jdk/pull/4484/files Stats: 786 lines in 57 files changed: 593 ins; 73 del; 120 mod Patch: https://git.openjdk.java.net/jdk/pull/4484.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4484/head:pull/4484 PR: https://git.openjdk.java.net/jdk/pull/4484 From kvn at openjdk.java.net Mon Jun 14 14:53:57 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 14 Jun 2021 14:53:57 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v4] In-Reply-To: References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: <3aq9Uw1UnI8rqBy4TpTujY9jYdxmX14vgVmiPrfk4IU=.867966b2-0cb9-4974-aac5-bc352d82b11f@github.com> On Mon, 14 Jun 2021 14:17:17 GMT, Nils Eliasson wrote: >> Hi, >> >> This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. >> >> In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. >> >> Please review, >> Best regards, >> Nils Eliasson > > Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: > > Remove whitespace Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4359 From kbarrett at openjdk.java.net Mon Jun 14 14:55:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 14 Jun 2021 14:55:46 GMT Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v2] In-Reply-To: References: Message-ID: > Please review this change to the LockFreeQueue utility class. > > The LockFreeQueue originated as an implementation detail of > G1DirtyCardQueueSet, and was recently refactored into a public utility > class. 
In that refactoring it retained some limitations that were > acceptable in its original context, but may be problematic as a general > utility. > > In particular, under some conditions a thread was not able to pop the > last element in the queue, due to interference by a concurrent operation. > And this state will persist, so retrying the pop operation won't help until > the interfering thread has made sufficient progress. This was mitigated by > making the API more complex to provide notice to the client that the queue > may be in this state. > > But it turns out we can do somewhat better, eliminating one of the > limitations, which is the point of this change. We introduce a > pseudo-object used as an end of queue marker. We can use the transition of > the last element's next value from the end marker to NULL by a pop operation > as a claim on the element, allowing the losing thread to recognize, retry, > and make progress. > > This queue still has the limitation that an in-progress push/append may > prevent popping elements. Because of this, the class is being renamed to > NonblockingQueue. The old name suggests stronger guarantees than actually > provided. > > The PR has two commits, the first for the functional changes, the second for > the renaming. The github diffs don't seem to be recognizing the renaming of > the source files as a rename, instead treating the old files as deleted and > the new files as added. The first commit by itself is probably more useful > for reviewing the functional changes. > > Testing: > mach5 tier1-5 Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: - Merge branch 'master' into lfqueue - rename - use end marker to improve pop ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4379/files - new: https://git.openjdk.java.net/jdk/pull/4379/files/8fe607c3..0adb5954 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4379&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4379&range=00-01 Stats: 45413 lines in 744 files changed: 37010 ins; 4383 del; 4020 mod Patch: https://git.openjdk.java.net/jdk/pull/4379.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4379/head:pull/4379 PR: https://git.openjdk.java.net/jdk/pull/4379 From dcubed at openjdk.java.net Mon Jun 14 15:49:59 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 14 Jun 2021 15:49:59 GMT Subject: RFR: Merge jdk17 In-Reply-To: References: Message-ID: On Mon, 14 Jun 2021 14:28:33 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 17 -> JDK 18 Thumbs up! Thanks for doing this sync forward. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4484 From jwilhelm at openjdk.java.net Mon Jun 14 16:02:24 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Mon, 14 Jun 2021 16:02:24 GMT Subject: RFR: Merge jdk17 [v2] In-Reply-To: References: Message-ID: > Forwardport JDK 17 -> JDK 18 Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4484/files - new: https://git.openjdk.java.net/jdk/pull/4484/files/b3185354..b3185354 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4484&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4484&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4484.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4484/head:pull/4484 PR: https://git.openjdk.java.net/jdk/pull/4484 From dfuchs at openjdk.java.net Mon Jun 14 16:02:28 2021 From: dfuchs at openjdk.java.net (Daniel Fuchs) Date: Mon, 14 Jun 2021 16:02:28 GMT Subject: RFR: Merge jdk17 [v2] In-Reply-To: References: Message-ID: <2daAOoP8hnt5nqsZJasI5g1_QUovwyTwY3JutixTGik=.0bf9118f-8348-4f28-ba59-220c51a8732e@github.com> On Mon, 14 Jun 2021 15:58:15 GMT, Jesper Wilhelmsson wrote: >> Forwardport JDK 17 -> JDK 18 > > Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. Looked at these two changesets and they were fine. - 8268342: java/foreign/channels/TestAsyncSocketChannels.java fails with "IllegalStateException: This segment is already - 8268555: Update HttpClient tests that use ITestContext to jtreg 6+1 ------------- Marked as reviewed by dfuchs (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4484 From jwilhelm at openjdk.java.net Mon Jun 14 16:02:33 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Mon, 14 Jun 2021 16:02:33 GMT Subject: Integrated: Merge jdk17 In-Reply-To: References: Message-ID: On Mon, 14 Jun 2021 14:28:33 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 17 -> JDK 18 This pull request has now been integrated.
Changeset: 17295b1b Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/17295b1bb02b2121978f1459b2e75c5e1031e7ea Stats: 721 lines in 30 files changed: 573 ins; 73 del; 75 mod Merge Reviewed-by: dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/4484 From xliu at openjdk.java.net Mon Jun 14 17:33:51 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Mon, 14 Jun 2021 17:33:51 GMT Subject: [jdk17] RFR: 8266614: update manpage for -Xlog:async In-Reply-To: References: Message-ID: On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > Please review this update to the java manpage to describe the new -Xlog:async flag > > There are two places where the text is changed: > > 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async > 2. A new subsection "-Xlog Output Mode" that explains async mode > > The committed file is the java.1 nroff version which is not very readable, so I've included a commit that also contains an HTML version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered HTML via this link: > > https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html > > Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the HTML file. > > Thanks, > David Marked as reviewed by xliu (no project role).
------------- PR: https://git.openjdk.java.net/jdk17/pull/16 From vlivanov at openjdk.java.net Mon Jun 14 19:07:56 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 14 Jun 2021 19:07:56 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v4] In-Reply-To: References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: <4uFY93gZQQt8wveDZTLQ0Zwyr2eqIv38fTm4SP_cgcg=.81fd425e-cf3d-40be-b107-420358da0e46@github.com> On Mon, 14 Jun 2021 14:17:17 GMT, Nils Eliasson wrote: >> Hi, >> >> This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. >> >> In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. >> >> Please review, >> Best regards, >> Nils Eliasson > > Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: > > Remove whitespace Looks good. src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp line 257: > 255: } > 256: > 257: #define XTOP LP64_ONLY(COMMA phase->top()) I'm curious whether `XTOP` should be undefined at the end of the scope (here and in other places). ------------- Marked as reviewed by vlivanov (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 19:49:20 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 19:49:20 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v4] In-Reply-To: <4uFY93gZQQt8wveDZTLQ0Zwyr2eqIv38fTm4SP_cgcg=.81fd425e-cf3d-40be-b107-420358da0e46@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> <4uFY93gZQQt8wveDZTLQ0Zwyr2eqIv38fTm4SP_cgcg=.81fd425e-cf3d-40be-b107-420358da0e46@github.com> Message-ID: On Mon, 14 Jun 2021 19:04:51 GMT, Vladimir Ivanov wrote: >> Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove whitespace > > src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp line 257: > >> 255: } >> 256: >> 257: #define XTOP LP64_ONLY(COMMA phase->top()) > > I'm curious whether `XTOP` should be undefined at the end of the scope (here and in other places). Added an `#undef` in all four files that have it defined. ------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 19:49:19 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 19:49:19 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v5] In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is.
In this way I don't have to change anything for the other collectors. > > Please review, > Best regards, > Nils Eliasson Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: Fix undef XTOP ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4359/files - new: https://git.openjdk.java.net/jdk/pull/4359/files/8439f915..b18679a9 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=03-04 Stats: 8 lines in 4 files changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From neliasso at openjdk.java.net Mon Jun 14 19:53:16 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Mon, 14 Jun 2021 19:53:16 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v6] In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. 
> > Please review, > Best regards, > Nils Eliasson Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: added newline at eof ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4359/files - new: https://git.openjdk.java.net/jdk/pull/4359/files/b18679a9..d973a617 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4359&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4359.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4359/head:pull/4359 PR: https://git.openjdk.java.net/jdk/pull/4359 From vlivanov at openjdk.java.net Mon Jun 14 21:01:49 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Mon, 14 Jun 2021 21:01:49 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v6] In-Reply-To: References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: On Mon, 14 Jun 2021 19:53:16 GMT, Nils Eliasson wrote: >> Hi, >> >> This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC. I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. >> >> In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. >> >> Please review, >> Best regards, >> Nils Eliasson > > Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: > > added newline at eof Marked as reviewed by vlivanov (Reviewer). test/hotspot/jtreg/compiler/arraycopy/TestObjectArrayClone.java line 29: > 27: * @summary Test Object.clone() intrinsic if ReduceInitialCardMarks is disabled. 
> 28: * > 29: * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:-ReduceInitialCardMarks Considering the complications `-XX:+StressReflectiveCode` introduces, does it make sense to add a configuration with the flag explicitly enabled? ------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From dholmes at openjdk.java.net Mon Jun 14 23:06:42 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 14 Jun 2021 23:06:42 GMT Subject: [jdk17] Integrated: 8266614: update manpage for -Xlog:async In-Reply-To: References: Message-ID: On Fri, 11 Jun 2021 04:40:09 GMT, David Holmes wrote: > Please review this update to the java manpage to describe the new -Xlog:async flag > > There are two places where the text is changed: > > 1. At the start where the -Xlog synopsis is given it now shows that `-Xlog:directive` is an allowed form where directive can be one of: help, disable, async > 2. A new subsection "-Xlog Output Mode" that explains async mode > > The committed file is the java.1 nroff version which is not very readable, so I've included a commit that also contains an html version with the changed text flagged by "START NEW TEXT" and "END NEW TEXT". You can view that in rendered html via this link: > > https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/8dcf544dfd2e19a3a49cce98d2d9abd9d2756538/java.html > > Note that because the nroff file has not been updated for a while it also contains changes unrelated to this PR, the source changes for which have already been reviewed and approved. So just ignore those bits and look at the html file. > > Thanks, > David This pull request has now been integrated.
Changeset: a5bf5e0e Author: David Holmes URL: https://git.openjdk.java.net/jdk17/commit/a5bf5e0e5f6c18b51e398ab81ed9d0a29bf31b6f Stats: 120 lines in 1 file changed: 38 ins; 78 del; 4 mod 8266614: update manpage for -Xlog:async Reviewed-by: hseigel, xliu ------------- PR: https://git.openjdk.java.net/jdk17/pull/16 From yyang at openjdk.java.net Tue Jun 15 02:39:47 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 15 Jun 2021 02:39:47 GMT Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13] In-Reply-To: <9ixWD6Ea4OUcXAzU5s2Y68URPHJvq8RO5g-go6k1aMw=.e52e0571-e07d-4c5f-8afb-6b2408efd4a0@github.com> References: <9ixWD6Ea4OUcXAzU5s2Y68URPHJvq8RO5g-go6k1aMw=.e52e0571-e07d-4c5f-8afb-6b2408efd4a0@github.com> Message-ID: On Sat, 12 Jun 2021 08:22:32 GMT, Thomas Stuefe wrote: > Hi Yi, > > you may need to add the option to the obsolete-flags-table though as described in arguments.cpp: > > https://github.com/openjdk/jdk/blob/5cee23a9ed0b7fe2657be7492d9c1f78fcd02ebf/src/hotspot/share/runtime/arguments.cpp#L489-L490 > > I think the point is to give a customer a grace period where the option is still accepted on the command line. I am not sure if that step is optional though, if one is reasonably sure that the option is unused. Maybe @dholmes-ora can chime in. > > Cheers, Thomas Hi Thomas, I think what you said is right. It does not take too much time to do this but it can give users a smooth transition for unavailable options! I will create a new PR to do this stuff if there are no objections. 
Thanks, Yang ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From github.com+20216587+miao-zheng at openjdk.java.net Tue Jun 15 03:27:08 2021 From: github.com+20216587+miao-zheng at openjdk.java.net (Miao Zheng) Date: Tue, 15 Jun 2021 03:27:08 GMT Subject: RFR: 8268727: Remove unused slowpath locking method in OptoRuntime Message-ID: 8268727: Remove unused slowpath locking method in OptoRuntime ------------- Commit messages: - 8268727: Remove unused slowpath locking method in OptoRuntime Changes: https://git.openjdk.java.net/jdk/pull/4490/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4490&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268727 Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/4490.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4490/head:pull/4490 PR: https://git.openjdk.java.net/jdk/pull/4490 From ngasson at openjdk.java.net Tue Jun 15 03:57:48 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Tue, 15 Jun 2021 03:57:48 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Wed, 9 Jun 2021 03:10:45 GMT, Wang Huang wrote: > Dear all, > Could you give me a favor to review this patch? It improves the performance of the intrinsic of `String.equals` on Neon backend of Aarch64. 
> We profile the performance by using this JMH case:
> 
> 
> ```java
> package com.huawei.string;
> import java.util.*;
> import java.util.concurrent.TimeUnit;
> 
> import org.openjdk.jmh.annotations.CompilerControl;
> import org.openjdk.jmh.annotations.Benchmark;
> import org.openjdk.jmh.annotations.Level;
> import org.openjdk.jmh.annotations.OutputTimeUnit;
> import org.openjdk.jmh.annotations.Param;
> import org.openjdk.jmh.annotations.Scope;
> import org.openjdk.jmh.annotations.Setup;
> import org.openjdk.jmh.annotations.State;
> import org.openjdk.jmh.annotations.Fork;
> import org.openjdk.jmh.infra.Blackhole;
> 
> @State(Scope.Thread)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
> public class StringEqual {
>     @Param({"8", "64", "4096"})
>     int size;
> 
>     String str1;
>     String str2;
> 
>     @Setup(Level.Trial)
>     public void init() {
>         str1 = newString(size, 'c', '1');
>         str2 = newString(size, 'c', '2');
>     }
> 
>     public String newString(int length, char charToFill, char lastChar) {
>         if (length > 0) {
>             char[] array = new char[length];
>             Arrays.fill(array, charToFill);
>             array[length - 1] = lastChar;
>             return new String(array);
>         }
>         return "";
>     }
> 
>     @Benchmark
>     @CompilerControl(CompilerControl.Mode.DONT_INLINE)
>     public boolean EqualString() {
>         return str1.equals(str2);
>     }
> }
> 
> ```
> The results are listed below (Linux aarch64 with 128 cores):
> 
> Benchmark | (size) | Mode | Cnt | Score | Error | Units
> ----------------------------------|-------|---------|-------|------------|------------|----------
> StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms
> StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms
> StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms
> StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms
> StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms
> StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ± 15.589 | ops/ms
> 
> Yours,
> WANG Huang

With this change the size of the `string_equals` intrinsic increases by ~60% from 120 bytes to 196 bytes and this gets expanded at every `String.equals` call site. It looks good on a micro-benchmark but I wonder if on a larger program this improvement is outweighed by the negative effects of methods taking up more space in the icache.

src/hotspot/cpu/aarch64/aarch64.ad line 16676:

> 16674: format %{ "String Equals $str1,$str2,$cnt -> $result" %}
> 16675: ins_encode %{
> 16676: // Count is in 8-bit bytes; non-Compact chars are 8 bits.

This change is a bit confusing: non-compact chars are still 16 bits, it's just at this point we know the string contains only 8-bit Latin characters. I think it's better to instead delete everything after the ";" (or leave it as it is).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4423

From iveresov at openjdk.java.net Tue Jun 15 04:12:46 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Tue, 15 Jun 2021 04:12:46 GMT
Subject: RFR: 8265518: C1: Intrinsic support for Preconditions.checkIndex [v13]
In-Reply-To:
References:
Message-ID:

On Wed, 9 Jun 2021 08:53:40 GMT, Yi Yang wrote:

>> The JDK codebase re-created many variants of checkIndex (`grep -I -r 'checkIndex' jdk/`). A notable variant is java.nio.Buffer.checkIndex, which is annotated with @IntrinsicCandidate and only has a corresponding C1 intrinsic version.
>> 
>> In fact, there is a utility method `jdk.internal.util.Preconditions.checkIndex` (wrapped by java.lang.Objects.checkIndex) that behaves the same as these variants of checkIndex. We can replace these re-created variants of checkIndex with Objects.checkIndex; it would significantly reduce duplicated code and improve performance, because Preconditions.checkIndex is @IntrinsicCandidate and has a corresponding intrinsic method in HotSpot.
>> 
>> But, the problem is currently HotSpot only implements the C2 version of Preconditions.checkIndex.
To reuse it global-widely in JDK code, I think we can firstly implement its C1 counterpart. There are also a few kinds of stuff we can do later: >> >> 1. Replace all variants of checkIndex by Objects.checkIndex in the whole JDK codebase. >> 2. Remove Buffer.checkIndex and obsolete/deprecate InlineNIOCheckIndex flag >> >> Testing: cds, compiler and jdk > > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > more comment It's up to you, we can of course print a warning that the flag has been removed. In my opinion, we shouldn't waste time on this, that was an obscure non-product flag. ------------- PR: https://git.openjdk.java.net/jdk/pull/3615 From neliasso at openjdk.java.net Tue Jun 15 07:04:41 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 15 Jun 2021 07:04:41 GMT Subject: RFR: 8268125: ZGC: Clone oop array gets wrong acopy stub [v6] In-Reply-To: References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: On Mon, 14 Jun 2021 20:59:02 GMT, Vladimir Ivanov wrote: >> Nils Eliasson has updated the pull request incrementally with one additional commit since the last revision: >> >> added newline at eof > > test/hotspot/jtreg/compiler/arraycopy/TestObjectArrayClone.java line 29: > >> 27: * @summary Test Object.clone() intrinsic if ReduceInitialCardMarks is disabled. >> 28: * >> 29: * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:-ReduceInitialCardMarks > > Considering the complications `-XX:+StressReflectiveCode` introduces, does it make sense to add a configuration with the flag explicitly enabled? We have this test: compiler.arraycopy.TestEliminateArrayCopy (and there are some additional variants) that reproduces the problem, it uses the StressReflectiveCode flag. 
------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From aph at openjdk.java.net Tue Jun 15 08:55:42 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 15 Jun 2021 08:55:42 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 03:54:18 GMT, Nick Gasson wrote: > With this change the size of the`string_equals` intrinsic increases by ~60% from 120 bytes to 196 bytes and this gets expanded at every `String.equals` call site. It looks good on a micro-benchmark but I wonder if on a larger program this improvement is outweighed by the negative effects of methods taking up more space in the icache. That's an excellent point. There's no need at all for the Neon part to be expanded inline: it could be a subroutine. We'd have to use fixed Neon registers at the call site. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From neliasso at openjdk.java.net Tue Jun 15 08:56:43 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 15 Jun 2021 08:56:43 GMT Subject: Integrated: 8268125: ZGC: Clone oop array gets wrong acopy stub In-Reply-To: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> References: <3dwEUfcz0HHi3v5A7vXB4pLCwCn1nmLmEJjkvQImxlo=.e3acfeb7-d115-4ad0-a2f8-5bd9258dbda3@github.com> Message-ID: On Fri, 4 Jun 2021 11:43:26 GMT, Nils Eliasson wrote: > Hi, > > This fixes a problem I introduced with JDK-8267726. With that change clone oop array is treated as normal clone arrays with ZGC.
I missed that a case was missing in zBarrierSetC2::clone_at_expansion - which caused clone_oop-arrays to get the wrong array copy stub. > > In this fix I move the entire leaf call creation inside zBarrierSetC2, and leave BarrierSetC2 as is. In this way I don't have to change anything for the other collectors. > > Please review, > Best regards, > Nils Eliasson This pull request has now been integrated. Changeset: d3840932 Author: Nils Eliasson URL: https://git.openjdk.java.net/jdk/commit/d384093289561015c69b684a9e21a8c4c1851c4c Stats: 339 lines in 5 files changed: 332 ins; 0 del; 7 mod 8268125: ZGC: Clone oop array gets wrong acopy stub Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/4359 From tschatzl at openjdk.java.net Tue Jun 15 10:11:04 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 15 Jun 2021 10:11:04 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v12] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that significantly refactors the remembered set for more scalability. > > The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. > > Over time many problems with performance and in particular memory usage have been observed: > > * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). 
> > * there is a substantial memory overhead for managing the data structures: examples are > * using separate (hash) tables for the three different types of card containers > * there is significant unnecessary preallocation of memory for some of the card set containers > * Containers store redundant information > > * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. > Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. > > * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. > > * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container. > > * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. > > The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. 
memory usage difference when exceeding a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to fine that is able to hold 65k entries using 8kB is significant. > For these reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage. With extremely bad consequences in pause times. > > Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). > > This change is effectively a rewrite of the Java heap card based part of a region's remembered set. > > This initial fully working change can be roughly described with the following properties: > > * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. > > * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. > > * there are now four different container types and one meta-container type. These four actual containers are: > * inline pointer: the change store a few (3-5) cards in the CHT node directly and uses no extra memory. > * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. 
> * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory > * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. > * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504. > > * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specifying them. They have been designed with future enhancements in mind. > > In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now. > > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Merge branch 'submit/8017163-refactor-remembered-set' of gh:tschatzl/jdk into 8017163-refactor-remembered-set - Merge branch 'master' into submit/8017163-refactor-remembered-set - Update obsoletion/removal versions - Merge branch 'master' into 8017163-refactor-remembered-set - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) - Improved documentation - Improve comment - Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand - sjohanss-review 3 - ...
and 9 more: https://git.openjdk.java.net/jdk/compare/d3840932...86f7484e ------------- Changes: https://git.openjdk.java.net/jdk/pull/4116/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=11 Stats: 6131 lines in 64 files changed: 4558 ins; 1315 del; 258 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From tschatzl at openjdk.java.net Tue Jun 15 10:20:18 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 15 Jun 2021 10:20:18 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v13] In-Reply-To: References: Message-ID: > Hi all, > > can I have reviews for this change that significantly refactors the remembered set for more scalability. > > The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. > > Over time many problems with performance and in particular memory usage have been observed: > > * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). 
> > * there is a substantial memory overhead for managing the data structures: examples are > * using separate (hash) tables for the three different types of card containers > * there is significant unnecessary preallocation of memory for some of the card set containers > * Containers store redundant information > > * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. > Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. > > * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. > > * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container. > > * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. > > The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. 
memory usage difference when exceeding a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to fine that is able to hold 65k entries using 8kB is significant. > For these reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage. With extremely bad consequences in pause times. > > Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). > > This change is effectively a rewrite of the Java heap card based part of a region's remembered set. > > This initial fully working change can be roughly described with the following properties: > > * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. > > * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. > > * there are now four different container types and one meta-container type. These four actual containers are: > * inline pointer: the change store a few (3-5) cards in the CHT node directly and uses no extra memory. > * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. 
> * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory > * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. > * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504. > > * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specifying them. They have been designed with future enhancements in mind. > > In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
> > Testing: tier1-8 many times, manual and automated perf testing Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: sjohanss review - remove debug code ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4116/files - new: https://git.openjdk.java.net/jdk/pull/4116/files/86f7484e..3dd0e7af Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=11-12 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4116.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116 PR: https://git.openjdk.java.net/jdk/pull/4116 From tschatzl at openjdk.java.net Tue Jun 15 10:20:25 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 15 Jun 2021 10:20:25 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v11] In-Reply-To: References: Message-ID: On Mon, 14 Jun 2021 10:51:36 GMT, Stefan Johansson wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: >> >> - Merge branch 'master' into submit/8017163-refactor-remembered-set >> - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set >> - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory) >> - Improved documentation >> - Improve comment >> - Rename G1CardSetContainerOnHeap to G1CardSetContainer on popular demand >> - sjohanss-review 3 >> - Merge branch 'master' of gh:openjdk/jdk into 8017163-refactor-remembered-set >> - More cleanup after sjohanss comments >> - Rename FOUND >> - ... 
and 6 more: https://git.openjdk.java.net/jdk/compare/4d1cf51b...338b4829
> 
> test/hotspot/gtest/gc/g1/test_g1CardSet.cpp line 421:
> 
>> 419: const uint CardsPerRegion = 16384;
>> 420: const double FullCardSetThreshold = 1.0;
>> 421: const uint BitmapCoarsenThreshold = 1.0;
> 
> Would it make sense to run this test with a few different config thresholds? To test the different levels of the card-set. If I understand those thresholds correctly this card-set will never consider a region to be coarsened or full. I get that the accounting might turn into everything being "found" rather than added, but might be worth testing.

Since that test randomly adds cards, it is very hard to calculate the expected number of cards for verification if we do not know where the coarsening exactly happens. This is very complicated to test, and other tests already test the coarsening, although not in an MT context, so I would like to not spend the time for either a brittle or useless test.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From ddong at openjdk.java.net Tue Jun 15 17:29:51 2021
From: ddong at openjdk.java.net (Denghui Dong)
Date: Tue, 15 Jun 2021 17:29:51 GMT
Subject: RFR: 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated '
Message-ID:

Hi,

Could I have a review of this small fix that adds a line feed for the message 'eliminated '?

When we run the following code and run `jstack `

    public static void main(String[] args) {
        for (int i = 0; i < 200000000; i++) {
            try {
                test();
            } catch (Exception e) {
            }
        }
    }

    private static void test() throws Exception {
        Object obj = new Object();
        synchronized (obj) {
            throw new Exception();
        }
    }

We can see that a frame does not wrap properly.

    "main" #1 prio=5 os_prio=0 cpu=53202.88ms elapsed=54.10s tid=0x00007f8c2c022550 nid=0x4743 runnable [0x00007f8c35ac2000]
       java.lang.Thread.State: RUNNABLE
            at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Native Method)
            at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Throwable.java:798)
            - locked <0x00000000f3b00340> (a java.lang.Exception)
            at java.lang.Throwable.(java.base at 17-internal/Throwable.java:256)
            at java.lang.Exception.(java.base at 17-internal/Exception.java:55)
            at Test.test(Test.java:14)
            - eliminated (a java.lang.Object) at Test.main(Test.java:5)

Also, I noticed that this message has a correct line feed in the implementation of JavaVFrame.java.

-------------

Commit messages:
 - 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated '

Changes: https://git.openjdk.java.net/jdk/pull/4495/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4495&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8268780
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/4495.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4495/head:pull/4495

PR: https://git.openjdk.java.net/jdk/pull/4495

From cjplummer at openjdk.java.net Tue Jun 15 18:32:42 2021
From: cjplummer at openjdk.java.net (Chris Plummer)
Date: Tue, 15 Jun 2021 18:32:42 GMT
Subject: RFR: 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated '
In-Reply-To:
References:
Message-ID:

On Tue, 15 Jun 2021 17:19:26 GMT, Denghui Dong wrote:

> Hi,
> 
> Could I have a review of this small fix that adds a line feed for the message 'eliminated '?
> 
> When we run the following code and run `jstack `
> 
> 
>     public static void main(String[] args) {
>         for (int i = 0; i < 200000000; i++) {
>             try {
>                 test();
>             } catch (Exception e) {
>             }
>         }
>     }
> 
>     private static void test() throws Exception {
>         Object obj = new Object();
>         synchronized (obj) {
>             throw new Exception();
>         }
>     }
> 
> 
> We can see that a frame does not wrap properly.
> 
>     "main" #1 prio=5 os_prio=0 cpu=53202.88ms elapsed=54.10s tid=0x00007f8c2c022550 nid=0x4743 runnable [0x00007f8c35ac2000]
>        java.lang.Thread.State: RUNNABLE
>             at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Native Method)
>             at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Throwable.java:798)
>             - locked <0x00000000f3b00340> (a java.lang.Exception)
>             at java.lang.Throwable.(java.base at 17-internal/Throwable.java:256)
>             at java.lang.Exception.(java.base at 17-internal/Exception.java:55)
>             at Test.test(Test.java:14)
>             - eliminated (a java.lang.Object) at Test.main(Test.java:5)
> 
> 
> Also, I noticed that this message has a correct line feed in the implementation of JavaVFrame.java.

Marked as reviewed by cjplummer (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4495

From zgu at openjdk.java.net Tue Jun 15 18:40:47 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Tue, 15 Jun 2021 18:40:47 GMT
Subject: RFR: 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated '
In-Reply-To:
References:
Message-ID: <2GJdbs9vA92HgK37Fi8ZVFW_TNYssEH5huPlpisOcNg=.b682b429-d15b-4cb4-9bcd-7689d9aeb9a4@github.com>

On Tue, 15 Jun 2021 17:19:26 GMT, Denghui Dong wrote:

> Hi,
> 
> Could I have a review of this small fix that adds a line feed for the message 'eliminated '?
> > When we run the following code and run `jstack ` > > > public static void main(String[] args) { > for (int i = 0; i < 200000000; i++) { > try { > test(); > } catch (Exception e) { > } > } > } > > private static void test() throws Exception { > Object obj = new Object(); > synchronized (obj) { > throw new Exception(); > } > } > > > We could find that a frame that does not wrap properly. > > "main" #1 prio=5 os_prio=0 cpu=53202.88ms elapsed=54.10s tid=0x00007f8c2c022550 nid=0x4743 runnable [0x00007f8c35ac2000] > java.lang.Thread.State: RUNNABLE > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Native Method) > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Throwable.java:798) > - locked <0x00000000f3b00340> (a java.lang.Exception) > at java.lang.Throwable.(java.base at 17-internal/Throwable.java:256) > at java.lang.Exception.(java.base at 17-internal/Exception.java:55) > at Test.test(Test.java:14) > - eliminated (a java.lang.Object) at Test.main(Test.java:5) > > > Also, I noticed that this message has a correct line feed in the implementation of JavaVFrame.java. Looks good and trivial ------------- Marked as reviewed by zgu (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4495 From dcubed at openjdk.java.net Tue Jun 15 19:04:54 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 15 Jun 2021 19:04:54 GMT Subject: RFR: 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated ' In-Reply-To: References: Message-ID: <-rG5KW7myyYsTcle_m3nl9TIiLEsxGzCSBMgCtk-Qiw=.6b94d4f0-e7b9-41b2-93c0-fd4a87d70ef5@github.com> On Tue, 15 Jun 2021 17:19:26 GMT, Denghui Dong wrote: > Hi, > > Cound I have a review of this small fix that adds a line feed for the message 'eliminated '. 
> > When we run the following code and run `jstack ` > > > public static void main(String[] args) { > for (int i = 0; i < 200000000; i++) { > try { > test(); > } catch (Exception e) { > } > } > } > > private static void test() throws Exception { > Object obj = new Object(); > synchronized (obj) { > throw new Exception(); > } > } > > > We could find that a frame that does not wrap properly. > > "main" #1 prio=5 os_prio=0 cpu=53202.88ms elapsed=54.10s tid=0x00007f8c2c022550 nid=0x4743 runnable [0x00007f8c35ac2000] > java.lang.Thread.State: RUNNABLE > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Native Method) > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Throwable.java:798) > - locked <0x00000000f3b00340> (a java.lang.Exception) > at java.lang.Throwable.(java.base at 17-internal/Throwable.java:256) > at java.lang.Exception.(java.base at 17-internal/Exception.java:55) > at Test.test(Test.java:14) > - eliminated (a java.lang.Object) at Test.main(Test.java:5) > > > Also, I noticed that this message has a correct line feed in the implementation of JavaVFrame.java. Marked as reviewed by dcubed (Reviewer). 
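To make the effect of this one-line fix concrete, here is a minimal C++ sketch. `MiniStream`, `dump_frames`, and `count_newlines` are hypothetical stand-ins written for illustration; they are not HotSpot's `outputStream` or its thread-dump code. The only assumption taken from the thread is that `print` writes text as-is while `print_cr` also ends the line.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Hypothetical miniature of a stream offering print/print_cr helpers.
class MiniStream {
  std::ostringstream _out;
public:
  void print(const std::string& s)    { _out << s; }          // no line feed
  void print_cr(const std::string& s) { _out << s << '\n'; }  // appends a line feed
  std::string str() const { return _out.str(); }
};

// Emits three logical lines of a stack dump. The lock line is written with
// either print or print_cr, depending on the flag.
std::string dump_frames(bool use_print_cr) {
  MiniStream st;
  st.print_cr("\tat Test.test(Test.java:14)");
  const std::string lock_line = "\t- eliminated (a java.lang.Object)";
  if (use_print_cr) {
    st.print_cr(lock_line);
  } else {
    st.print(lock_line);  // the next frame ends up on the same line
  }
  st.print_cr("\tat Test.main(Test.java:5)");
  return st.str();
}

// Counts line feeds so the two variants can be compared.
long count_newlines(const std::string& s) {
  return std::count(s.begin(), s.end(), '\n');
}
```

With `use_print_cr` set to false the lock line and the following frame share one line, which matches the unwrapped output quoted in the report; with it set to true each of the three entries gets its own line.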
------------- PR: https://git.openjdk.java.net/jdk/pull/4495 From jwilhelm at openjdk.java.net Tue Jun 15 22:01:09 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Tue, 15 Jun 2021 22:01:09 GMT Subject: RFR: Merge jdk17 Message-ID: Forwardport JDK 17 -> JDK 18 ------------- Commit messages: - Merge jdk17 - 8268768: idea.sh has been updated in surprising and incompatible ways - 8268828: ProblemList compiler/intrinsics/VectorizedMismatchTest.java on win-x64 - 8268723: Problem list SA core file tests on OSX when using ZGC - 8268736: Use apiNote in AutoCloseable.close javadoc - 8263321: Regression 8% in javadoc-steady in 17-b11 - 8268125: ZGC: Clone oop array gets wrong acopy stub - 8268663: Crash when guards contain boolean expression - 8268347: C2: nested locks optimization may create unbalanced monitor enter/exit code - 8268643: SVML lib shouldn't be generated when C2 is absent - ... and 7 more: https://git.openjdk.java.net/jdk/compare/0b09129f...e748b877 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=4499&range=00.0 - jdk17: https://webrevs.openjdk.java.net/?repo=jdk&pr=4499&range=00.1 Changes: https://git.openjdk.java.net/jdk/pull/4499/files Stats: 1606 lines in 62 files changed: 1180 ins; 181 del; 245 mod Patch: https://git.openjdk.java.net/jdk/pull/4499.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4499/head:pull/4499 PR: https://git.openjdk.java.net/jdk/pull/4499 From coleenp at openjdk.java.net Tue Jun 15 22:42:56 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 15 Jun 2021 22:42:56 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries Message-ID: Add a free_entry iteration to the destructor of ~KVHashtables. Tested with tier1-3. 
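As a rough illustration of what "a free_entry iteration in the destructor" means, here is a minimal, self-contained C++ sketch. `Entry`, `ChainedTable`, and the live-entry counter are hypothetical names invented for this example; this is not the HotSpot `KVHashtable` code, only the same bucket-walking pattern under simplified assumptions (entries allocated with `new`, a fixed bucket count).

```cpp
#include <cstddef>

// Hypothetical chained-hashtable entry; not HotSpot's KVHashtableEntry.
struct Entry {
  int key;
  int value;
  Entry* next;
};

class ChainedTable {
  static constexpr std::size_t kBuckets = 8;
  Entry* _buckets[kBuckets];
  static int _live_entries;  // counts allocated entries so a leak is observable

public:
  ChainedTable() {
    for (std::size_t i = 0; i < kBuckets; i++) _buckets[i] = nullptr;
  }
  // Copying would make two tables own the same chains, so forbid it.
  ChainedTable(const ChainedTable&) = delete;
  ChainedTable& operator=(const ChainedTable&) = delete;

  void add(int key, int value) {
    std::size_t index = static_cast<std::size_t>(key) % kBuckets;
    _buckets[index] = new Entry{key, value, _buckets[index]};  // push at chain head
    ++_live_entries;
  }

  // Walk every bucket, unlink the head of the chain, and free it, until the
  // chain is empty. Without this loop every entry would leak.
  ~ChainedTable() {
    for (std::size_t index = 0; index < kBuckets; index++) {
      for (Entry** p = &_buckets[index]; *p != nullptr; ) {
        Entry* probe = *p;
        *p = probe->next;  // unlink before freeing
        delete probe;
        --_live_entries;
      }
    }
  }

  static int live_entries() { return _live_entries; }
};

int ChainedTable::_live_entries = 0;

// Number of live entries while a table holding n entries is still in scope.
int live_inside_scope(int n) {
  ChainedTable t;
  for (int i = 0; i < n; i++) t.add(i, i * 2);
  return ChainedTable::live_entries();
}

// Number of live entries after such a table has been destroyed.
int live_after_scope(int n) {
  {
    ChainedTable t;
    for (int i = 0; i < n; i++) t.add(i, i * 2);
  }
  return ChainedTable::live_entries();
}
```

The counter is only there to make the leak visible in a test; a real table would more likely be checked with native memory tracking or a leak detector.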
------------- Commit messages: - 8267752: KVHashtable doesn't deallocate entries Changes: https://git.openjdk.java.net/jdk/pull/4501/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4501&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8267752 Stats: 20 lines in 1 file changed: 20 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4501.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4501/head:pull/4501 PR: https://git.openjdk.java.net/jdk/pull/4501 From jwilhelm at openjdk.java.net Tue Jun 15 22:49:40 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Tue, 15 Jun 2021 22:49:40 GMT Subject: RFR: Merge jdk17 [v2] In-Reply-To: References: Message-ID: > Forwardport JDK 17 -> JDK 18 Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 25 additional commits since the last revision: - Merge jdk17 - 8268620: InfiniteLoopException test may fail on x86 platforms Reviewed-by: prr, trebari, azvegint - 8268125: ZGC: Clone oop array gets wrong acopy stub Reviewed-by: kvn, vlivanov - 8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name Co-authored-by: Markus GaisBauer Reviewed-by: stuefe, luhenry - 8268626: Remove native pre-jdk9 support for jtreg failure handler Reviewed-by: erikj - 8268699: Shenandoah: Add test for JDK-8268127 Reviewed-by: rkennke - Merge Reviewed-by: dcubed - 8262731: [macOS] Exception from "Printable.print" is swallowed during "PrinterJob.print" Reviewed-by: prr - 8267579: Thread::cooked_allocated_bytes() hits assert(left >= right) failed: avoid underflow Reviewed-by: dcubed, stefank, kbarrett - 8266791: Annotation property which is compiled as an array property but changed to a single element throws NullPointerException Reviewed-by: darcy, jfranck - ... 
and 15 more: https://git.openjdk.java.net/jdk/compare/6da37cd0...e748b877 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4499/files - new: https://git.openjdk.java.net/jdk/pull/4499/files/e748b877..e748b877 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4499&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4499&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4499.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4499/head:pull/4499 PR: https://git.openjdk.java.net/jdk/pull/4499 From jwilhelm at openjdk.java.net Tue Jun 15 22:49:41 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Tue, 15 Jun 2021 22:49:41 GMT Subject: Integrated: Merge jdk17 In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 21:51:33 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 17 -> JDK 18 This pull request has now been integrated. Changeset: e0f6f70d Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/e0f6f70d3f9e748d2bc53f371beca487e9343d4a Stats: 1606 lines in 62 files changed: 1180 ins; 181 del; 245 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/4499 From iklam at openjdk.java.net Wed Jun 16 01:25:34 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Jun 2021 01:25:34 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: <_45HQYm5Dzb2S9cSyvCEB5CRP7eXSJDfnfcN1qK9HMQ=.005f7c73-e2ef-45bb-aa14-06324c5e21f7@github.com> On Tue, 15 Jun 2021 22:34:40 GMT, Coleen Phillimore wrote: > Add a free_entry iteration to the destructor of ~KVHashtables. > Tested with tier1-3. Changes requested by iklam (Reviewer). 
src/hotspot/share/utilities/hashtable.hpp line 268: > 266: for (KVHashtableEntry** p = bucket_addr(index); *p != NULL; ) { > 267: probe = *p; > 268: *p = probe->next(); Should we also call the destructor (something like `probe->_value.~V()` , not sure about the syntax). I.e., similar to `GrowableArray`: https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/utilities/growableArray.hpp#L499-L501 This way, we can get rid of this code, and move `delete ref()` into `SourceObjInfo::~SourceObjInfo()` https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.hpp#L180-L186 ------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From ddong at openjdk.java.net Wed Jun 16 02:11:35 2021 From: ddong at openjdk.java.net (Denghui Dong) Date: Wed, 16 Jun 2021 02:11:35 GMT Subject: Integrated: 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated ' In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 17:19:26 GMT, Denghui Dong wrote: > Hi, > > Cound I have a review of this small fix that adds a line feed for the message 'eliminated '. > > When we run the following code and run `jstack ` > > > public static void main(String[] args) { > for (int i = 0; i < 200000000; i++) { > try { > test(); > } catch (Exception e) { > } > } > } > > private static void test() throws Exception { > Object obj = new Object(); > synchronized (obj) { > throw new Exception(); > } > } > > > We could find that a frame that does not wrap properly. 
> > "main" #1 prio=5 os_prio=0 cpu=53202.88ms elapsed=54.10s tid=0x00007f8c2c022550 nid=0x4743 runnable [0x00007f8c35ac2000] > java.lang.Thread.State: RUNNABLE > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Native Method) > at java.lang.Throwable.fillInStackTrace(java.base at 17-internal/Throwable.java:798) > - locked <0x00000000f3b00340> (a java.lang.Exception) > at java.lang.Throwable.(java.base at 17-internal/Throwable.java:256) > at java.lang.Exception.(java.base at 17-internal/Exception.java:55) > at Test.test(Test.java:14) > - eliminated (a java.lang.Object) at Test.main(Test.java:5) > > > Also, I noticed that this message has a correct line feed in the implementation of JavaVFrame.java. This pull request has now been integrated. Changeset: 48d45628 Author: Denghui Dong Committer: Yi Yang URL: https://git.openjdk.java.net/jdk/commit/48d456281ea73e22eaaae6a082bb43610647d660 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated ' Reviewed-by: cjplummer, zgu, dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/4495 From coleenp at openjdk.java.net Wed Jun 16 03:11:34 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 03:11:34 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: <_45HQYm5Dzb2S9cSyvCEB5CRP7eXSJDfnfcN1qK9HMQ=.005f7c73-e2ef-45bb-aa14-06324c5e21f7@github.com> References: <_45HQYm5Dzb2S9cSyvCEB5CRP7eXSJDfnfcN1qK9HMQ=.005f7c73-e2ef-45bb-aa14-06324c5e21f7@github.com> Message-ID: On Wed, 16 Jun 2021 01:22:45 GMT, Ioi Lam wrote: >> Add a free_entry iteration to the destructor of ~KVHashtables. >> Tested with tier1-3. > > src/hotspot/share/utilities/hashtable.hpp line 268: > >> 266: for (KVHashtableEntry** p = bucket_addr(index); *p != NULL; ) { >> 267: probe = *p; >> 268: *p = probe->next(); > > Should we also call the destructor (something like `probe->_value.~V()` , not sure about the syntax). 
I.e., similar to `GrowableArray`: > > https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/utilities/growableArray.hpp#L499-L501 > > This way, we can get rid of this code, and move `delete ref()` into `SourceObjInfo::~SourceObjInfo()` > > https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.hpp#L180-L186 I wanted to do that (and even had a version that did), but unfortunately for this code, adding a destructor to SrcObjRef destroys the ref pointer in this scope, so we'd need to have a copy constructor etc to keep the pointer alive. https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.cpp#L464 So I thought we should wait until we replace this table to make it better. I was going to add a comment here. ------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From iklam at openjdk.java.net Wed Jun 16 04:27:35 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Jun 2021 04:27:35 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 22:34:40 GMT, Coleen Phillimore wrote: > Add a free_entry iteration to the destructor of ~KVHashtables. > Tested with tier1-3. Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From iklam at openjdk.java.net Wed Jun 16 04:27:36 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 16 Jun 2021 04:27:36 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: <_45HQYm5Dzb2S9cSyvCEB5CRP7eXSJDfnfcN1qK9HMQ=.005f7c73-e2ef-45bb-aa14-06324c5e21f7@github.com> Message-ID: On Wed, 16 Jun 2021 03:08:35 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/hashtable.hpp line 268: >> >>> 266: for (KVHashtableEntry** p = bucket_addr(index); *p != NULL; ) { >>> 267: probe = *p; >>> 268: *p = probe->next(); >> >> Should we also call the destructor (something like `probe->_value.~V()` , not sure about the syntax). I.e., similar to `GrowableArray`: >> >> https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/utilities/growableArray.hpp#L499-L501 >> >> This way, we can get rid of this code, and move `delete ref()` into `SourceObjInfo::~SourceObjInfo()` >> >> https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.hpp#L180-L186 > > I wanted to do that (and even had a version that did), but unfortunately for this code, adding a destructor to SrcObjRef destroys the ref pointer in this scope, so we'd need to have a copy constructor etc to keep the pointer alive. > > https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.cpp#L464 > > So I thought we should wait until we replace this table to make it better. I was going to add a comment here. Sounds good. The current code is a bit clumsy but there's no memory leak anymore. We should eventually move all functionalities of KVHashTable into ResourceHashtable so we don't have 2 different partially working hashtable types! 
------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From stuefe at openjdk.java.net Wed Jun 16 06:10:32 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 16 Jun 2021 06:10:32 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 22:34:40 GMT, Coleen Phillimore wrote: > Add a free_entry iteration to the destructor of ~KVHashtables. > Tested with tier1-3. Hi Coleen, seems to be right. I did a little test (filling a KV hashtable with 500 million entries and releasing it) and was confused when RSS did not go down as expected; but this seems to be a glibc issue (repeating the test did not allocate twice as much memory, so the second round of mallocs reused the first frees). See my question inline about unlink(). Otherwise this is fine. This was only ever used in CDS dumping, right? Cheers, Thomas src/hotspot/share/utilities/hashtable.hpp line 269: > 267: probe = *p; > 268: *p = probe->next(); > 269: free_entry(probe); I tried to understand `BasicHashTable::free_entry()` and `BasicHashTable::unlink_entry()`. I may be wrong, but the latter looks wrong. I expected some linked list splicing, but all it does is set the next ptr of the unlinked entry to NULL. So it would orphan follow-up entries, and leave a dangling pointer to itself at its chain predecessor. I may be missing something here. Your code still seems to be correct since you manually walk the whole entry chain. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4501 From eliu at openjdk.java.net Wed Jun 16 08:53:50 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Wed, 16 Jun 2021 08:53:50 GMT Subject: [jdk17] RFR: 8268739: AArch64: Build failure after JDK-8267663 Message-ID: The failure is caused by the build option "--with-jvm-variants=client". In client mode, BoolTest[1] is used by "neon_compare"[2] but not declared in macroAssembler_aarch64.hpp[3].
Since "neon_compare" is c2 specific, this patch moves it to c2_MacroAssembler_aarch64.cpp. [1] https://github.com/openjdk/jdk17/blob/master/src/hotspot/share/opto/subnode.hpp#L308 [2] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5342 [3] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L58 ------------- Commit messages: - 8268739: AArch64: Build failure after JDK-8267663 Changes: https://git.openjdk.java.net/jdk17/pull/73/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=73&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268739 Stats: 94 lines in 4 files changed: 47 ins; 45 del; 2 mod Patch: https://git.openjdk.java.net/jdk17/pull/73.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/73/head:pull/73 PR: https://git.openjdk.java.net/jdk17/pull/73 From jvernee at openjdk.java.net Wed Jun 16 11:25:58 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Wed, 16 Jun 2021 11:25:58 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails Message-ID: Upstream a critical fix from the panama-foreign repo. See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 Testing: tier 1-2, local run of run-test-jdk_foreign. 
------------- Commit messages: - Fix a couple of CI build problems - Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails Changes: https://git.openjdk.java.net/jdk17/pull/76/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=76&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8268717 Stats: 197 lines in 12 files changed: 196 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk17/pull/76.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/76/head:pull/76 PR: https://git.openjdk.java.net/jdk17/pull/76 From coleenp at openjdk.java.net Wed Jun 16 12:30:33 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 12:30:33 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: <_45HQYm5Dzb2S9cSyvCEB5CRP7eXSJDfnfcN1qK9HMQ=.005f7c73-e2ef-45bb-aa14-06324c5e21f7@github.com> Message-ID: On Wed, 16 Jun 2021 04:24:04 GMT, Ioi Lam wrote: >> I wanted to do that (and even had a version that did), but unfortunately for this code, adding a destructor to SrcObjRef destroys the ref pointer in this scope, so we'd need to have a copy constructor etc to keep the pointer alive. >> >> https://github.com/openjdk/jdk/blob/e0f6f70d3f9e748d2bc53f371beca487e9343d4a/src/hotspot/share/cds/archiveBuilder.cpp#L464 >> >> So I thought we should wait until we replace this table to make it better. I was going to add a comment here. > > Sounds good. The current code is a bit clumsy but there's no memory leak anymore. We should eventually move all functionalities of KVHashTable into ResourceHashtable so we don't have 2 different partially working hashtable types! Yes, I agree. This src_obj_table will still need some surgery, since the replacement table should call destructors on all the elements so maybe the table shouldn't contain a copy of SourceObjectInfo. 
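The concern raised in this exchange, that giving the value type an owning destructor breaks as long as the table stores a copy of it, can be sketched in a few lines of C++. `Ref`, `Info`, and `release_counts` are hypothetical names for illustration only, and the destructor merely counts releases instead of calling `delete`, so the example stays free of undefined behavior. It is an assumption about the general pattern, not a reproduction of the `ArchiveBuilder` code.

```cpp
#include <utility>

// Hypothetical referent; the counter stands in for "this object was deleted".
struct Ref {
  int release_count = 0;
};

// Hypothetical value type that "owns" a Ref through a raw pointer. It has a
// destructor but relies on the compiler-generated copy constructor, so a copy
// shares the same pointer.
struct Info {
  Ref* ref;
  explicit Info(Ref* r) : ref(r) {}
  ~Info() { ++ref->release_count; }  // stands in for `delete ref`
};

// Returns {releases seen while the stored copy is still live, total releases}.
std::pair<int, int> release_counts() {
  Ref target;
  Info* stored = nullptr;
  {
    Info temp(&target);
    stored = new Info(temp);  // a table storing its value by copy would do this
  }                           // temp's destructor runs here
  int while_copy_live = target.release_count;  // stored copy already points at a "released" Ref
  delete stored;                               // second release of the same Ref
  return std::make_pair(while_copy_live, target.release_count);
}
```

With a real `delete` in the destructor, the first release would leave the stored copy dangling and the second would be a double free. That is why the thread talks about needing a copy constructor, or a table that does not hold a copy of the value at all.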
------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From coleenp at openjdk.java.net Wed Jun 16 12:36:36 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 12:36:36 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 05:58:05 GMT, Thomas Stuefe wrote: >> Add a free_entry iteration to the destructor of ~KVHashtables. >> Tested with tier1-3. > > src/hotspot/share/utilities/hashtable.hpp line 269: > >> 267: probe = *p; >> 268: *p = probe->next(); >> 269: free_entry(probe); > > I tried to understand `BasicHashTable::free_entry()` and `BasicHashTable::unlink_entry()`. I may be wrong, but the latter looks wrong. I expected some linked list splicing, but all it does is set the next ptr of the unlinked entry to NULL. So it would orphan follow up entries, and leave a dangling pointer to itself at its chain predecessor. I may be missing something here. > > You code still seems to be correct since you manually walk the whole entry chain. I hope unlink_entry isn't wrong because it's used for all the BasicHashtables. It does orphan the further entries, but this entry has already been removed from the list. So setting the next isn't really necessary here, but decrementing the number of elements is for printing and logging. ------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From aph at openjdk.java.net Wed Jun 16 12:36:34 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 16 Jun 2021 12:36:34 GMT Subject: [jdk17] RFR: 8268739: AArch64: Build failure after JDK-8267663 In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 08:47:46 GMT, Eric Liu wrote: > The failure is cased by the build option "--with-jvm-variants=client". > In client mode, BoolTest[1] is used by "neon_compare"[2] but not > declared in macroAssembler_aarch64.hpp[3]. > > Since "neon_compare" is c2 specific, this patch moves it to > c2_MacroAssembler_aarch64.cpp. 
> > [1] https://github.com/openjdk/jdk17/blob/master/src/hotspot/share/opto/subnode.hpp#L308 > [2] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5342 > [3] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L58 Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk17/pull/73 From coleenp at openjdk.java.net Wed Jun 16 12:48:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 12:48:42 GMT Subject: RFR: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 22:34:40 GMT, Coleen Phillimore wrote: > Add a free_entry iteration to the destructor of ~KVHashtables. > Tested with tier1-3. Thanks Ioi and Thomas for the code reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From coleenp at openjdk.java.net Wed Jun 16 12:48:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 12:48:43 GMT Subject: Integrated: 8267752: KVHashtable doesn't deallocate entries In-Reply-To: References: Message-ID: <2hG0vM8y20SBqAyLBVriufbS8FCNzVc6fbQzOkEkedg=.6447dbac-8b12-424e-a84f-801bbaf98b09@github.com> On Tue, 15 Jun 2021 22:34:40 GMT, Coleen Phillimore wrote: > Add a free_entry iteration to the destructor of ~KVHashtables. > Tested with tier1-3. This pull request has now been integrated. 
Changeset: 72b3b0af Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/72b3b0af08136342e54e1cdea0c48d64172e8870 Stats: 20 lines in 1 file changed: 20 ins; 0 del; 0 mod 8267752: KVHashtable doesn't deallocate entries Reviewed-by: iklam, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/4501 From coleenp at openjdk.java.net Wed Jun 16 13:01:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 13:01:46 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method Message-ID: This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. Tested with tier1-6. 
------------- Commit messages: - 8264941: Remove CodeCache::mark_for_evol_deoptimization() method Changes: https://git.openjdk.java.net/jdk/pull/4509/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4509&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264941 Stats: 78 lines in 7 files changed: 0 ins; 73 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/4509.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4509/head:pull/4509 PR: https://git.openjdk.java.net/jdk/pull/4509 From coleenp at openjdk.java.net Wed Jun 16 13:10:35 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 13:10:35 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore wrote: > This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. > The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. > Tested with tier1-6. Hi @iwanowww can you have a look? 
------------- PR: https://git.openjdk.java.net/jdk/pull/4509 From github.com+42899633+eastig at openjdk.java.net Wed Jun 16 13:13:32 2021 From: github.com+42899633+eastig at openjdk.java.net (Evgeny Astigeevich) Date: Wed, 16 Jun 2021 13:13:32 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore wrote: > This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. > The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. > Tested with tier1-6. LGTM ------------- PR: https://git.openjdk.java.net/jdk/pull/4509 From coleenp at openjdk.java.net Wed Jun 16 14:18:35 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 16 Jun 2021 14:18:35 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 13:10:55 GMT, Evgeny Astigeevich wrote: >> This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. 
If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. >> The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. >> Tested with tier1-6. > > LGTM Thanks @eastig . ------------- PR: https://git.openjdk.java.net/jdk/pull/4509 From mcimadamore at openjdk.java.net Wed Jun 16 15:57:44 2021 From: mcimadamore at openjdk.java.net (Maurizio Cimadamore) Date: Wed, 16 Jun 2021 15:57:44 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 11:19:37 GMT, Jorn Vernee wrote: > Upstream a critical fix from the panama-foreign repo. > > See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 > > Testing: tier 1-2, local run of run-test-jdk_foreign. I've approved a similar changeset on panama-dev. Looks still good :-) ------------- Marked as reviewed by mcimadamore (Reviewer). PR: https://git.openjdk.java.net/jdk17/pull/76 From erikj at openjdk.java.net Wed Jun 16 18:31:17 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Wed, 16 Jun 2021 18:31:17 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails In-Reply-To: References: Message-ID: <3W5BzWLZfdLQsyUogaHRLbensa4yVdcYJoIIHRMjfyU=.840abb1d-d52e-4941-af51-a7b04f853563@github.com> On Wed, 16 Jun 2021 11:19:37 GMT, Jorn Vernee wrote: > Upstream a critical fix from the panama-foreign repo. > > See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 > > Testing: tier 1-2, local run of run-test-jdk_foreign. Build changes look good. 
------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk17/pull/76 From vitaly.provodin at jetbrains.com Wed Jun 16 23:47:49 2021 From: vitaly.provodin at jetbrains.com (Vitaly Provodin) Date: Thu, 17 Jun 2021 06:47:49 +0700 Subject: link error: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *) Message-ID: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> Hi all, Building OpenJDK on Windows, I hit the following link error: ---------------------------8<--------------------------- os_windows.obj : error LNK2019: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *)" (?convert_to_unicode@@YAHPEBDPEAPEA_W@Z) referenced in function "public: static void __cdecl os::set_native_thread_name(char const *)" (?set_native_thread_name@os@@SAXPEBD@Z) c:\buildagent\work\d0555747f6bd5c6\build\windows-x86_64-server-release\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 1 unresolved externals make[3]: *** [lib/CompileJvm.gmk:144: /cygdrive/c/buildagent/work/d0555747f6bd5c6/build/windows-x86_64-server-release/support/modules_libs/java.base/server/jvm.dll] Error 1 make[2]: *** [make/Main.gmk:252: hotspot-server-libs] Error 2 make[2]: *** Waiting for unfinished jobs.... ERROR: Build failed for targets 'clean images test-image' in configuration 'windows-x86_64-server-release' (exit code 2) ---------------------------8<--------------------------- The issue was introduced by the patch https://github.com/openjdk/jdk/commit/9f3c7e74ff00a7550742ed8b9d6671c2d4bb6041 that fixes https://bugs.openjdk.java.net/browse/JDK-8238649 Call new Win32 API SetThreadDescription in os::set_native_thread_name. After reverting this commit, the build completes successfully. Note: Visual Studio 2019 Developer Command Prompt v16.8.5 is used for building. Is this issue specific to my setup?
- could not find any mentions about it (in maillists, JBS) Could you please advice how it can be resolved? Thanks in advance, Vitaly From dholmes at openjdk.java.net Thu Jun 17 00:27:22 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 17 Jun 2021 00:27:22 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 11:19:37 GMT, Jorn Vernee wrote: > Upstream a critical fix from the panama-foreign repo. > > See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 > > Testing: tier 1-2, local run of run-test-jdk_foreign. Hi Jorn, Seems okay but I have one query below. Thanks, David src/hotspot/share/runtime/frame.inline.hpp line 54: > 52: inline bool frame::is_first_frame() const { > 53: return (is_entry_frame() && entry_frame_is_first()) > 54: || (is_optimized_entry_frame() && optimized_entry_frame_is_first()); Given `optimized_entry_frame_is_first` is only defined on a couple of platforms, it is far from obvious that this call can never happen on the other platforms. A comment explaining this would be useful. ------------- PR: https://git.openjdk.java.net/jdk17/pull/76 From dholmes at openjdk.java.net Thu Jun 17 00:34:13 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 17 Jun 2021 00:34:13 GMT Subject: [jdk17] RFR: 8268739: AArch64: Build failure after JDK-8267663 In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 08:47:46 GMT, Eric Liu wrote: > The failure is cased by the build option "--with-jvm-variants=client". > In client mode, BoolTest[1] is used by "neon_compare"[2] but not > declared in macroAssembler_aarch64.hpp[3]. > > Since "neon_compare" is c2 specific, this patch moves it to > c2_MacroAssembler_aarch64.cpp. 
> > [1] https://github.com/openjdk/jdk17/blob/master/src/hotspot/share/opto/subnode.hpp#L308 > [2] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5342 > [3] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L58 Seems fine. I assume the processing of the ad file will already check both MacroAssembler and C2_MacroAssembler for the definition? Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk17/pull/73 From jwilhelm at openjdk.java.net Thu Jun 17 00:57:15 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 17 Jun 2021 00:57:15 GMT Subject: RFR: Merge jdk17 Message-ID: Forwardport JDK 17 -> JDK 18 ------------- Commit messages: - Merge - 8260194: Update the documentation for -Xcheck:jni - 8268863: ProblemList serviceability/sa/TestJmapCoreMetaspace.java on linux-x64 with ZGC - 8268909: ProblemList jdk/jfr/api/consumer/streaming/TestLatestEvent.java on win-x64 - 8259338: Add expiry exception for identrustdstx3 alias to VerifyCACerts.java test - 8268774: Residual logging output written to STDOUT, not STDERR - 8268714: [macos-aarch64] 7 java/net/httpclient/websocket tests failed - 8268901: JDK-8268768 missed removing two files - 8256934: C2: assert(C->live_nodes() <= C->max_node_limit()) failed: Live Node limit exceeded limit - 8268861: Disable Windows-Aarch64 build in GitHub Actions - ... and 4 more: https://git.openjdk.java.net/jdk/compare/02c9bf08...c47ba95e The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.java.net/jdk/pull/4514/files Stats: 659 lines in 33 files changed: 450 ins; 121 del; 88 mod Patch: https://git.openjdk.java.net/jdk/pull/4514.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4514/head:pull/4514 PR: https://git.openjdk.java.net/jdk/pull/4514 From david.holmes at oracle.com Thu Jun 17 01:04:56 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Jun 2021 11:04:56 +1000 Subject: link error: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *) In-Reply-To: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> References: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> Message-ID: Hi Vitaly, On 17/06/2021 9:47 am, Vitaly Provodin wrote: > Hi all, > > Building OpenJDK on Windows I am faced with the following error > > ---------------------------8<--------------------------- > os_windows.obj : error LNK2019: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *)" (?convert_to_unicode@@YAHPEBDPEAPEA_W@Z) referenced in function "public: static void __cdecl os::set_native_thread_name(char const *)" (?set_native_thread_name@os@@SAXPEBD@Z) That is strange. convert_to_unicode is a static function in os_windows.cpp so there is no reason for the linker to have any issue as far as I can see - and we have not seen any build issues locally. ??? I wonder if the forward declaration / prototype also needs to state static? can you try this change: diff --git a/src/hotspot/os/windows/os_windows.cpp b/src/hotspot/os/windows/os_windows.cpp index 6e996b11993..affe8a10265 100644 --- a/src/hotspot/os/windows/os_windows.cpp +++ b/src/hotspot/os/windows/os_windows.cpp @@ -892,7 +892,7 @@ static SetThreadDescriptionFnPtr _SetThreadDescription = NULL; DEBUG_ONLY(static GetThreadDescriptionFnPtr _GetThreadDescription = NULL;) // forward decl.
-errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); +static errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); void os::set_native_thread_name(const char *name) { Otherwise the simple fix would be to move the definition of convert_to_unicode to be ahead of set_native_thread_name() and get rid of the prototype. But again I have no idea why we would not have seen this build problem locally. I will file a bug. Thanks, David ----- > c:\buildagent\work\d0555747f6bd5c6\build\windows-x86_64-server-release\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 1 unresolved externals > make[3]: *** [lib/CompileJvm.gmk:144: /cygdrive/c/buildagent/work/d0555747f6bd5c6/build/windows-x86_64-server-release/support/modules_libs/java.base/server/jvm.dll] Error 1 > make[2]: *** [make/Main.gmk:252: hotspot-server-libs] Error 2 > make[2]: *** Waiting for unfinished jobs.... > > ERROR: Build failed for targets 'clean images test-image' in configuration 'windows-x86_64-server-release' (exit code 2) > ---------------------------8<--------------------------- > > The issue was integrated with the patch https://github.com/openjdk/jdk/commit/9f3c7e74ff00a7550742ed8b9d6671c2d4bb6041 that fixes https://bugs.openjdk.java.net/browse/JDK-8238649 Call new Win32 API SetThreadDescription in os::set_native_thread_name > After reverting this commit the build completes successfully > > Note Visual Studio 2019 Developer Command Prompt v16.8.5 is used for building > > Is this issue actual for me only? - could not find any mentions about it (in maillists, JBS) > Could you please advice how it can be resolved? 
> > Thanks in advance, > Vitaly > From jwilhelm at openjdk.java.net Thu Jun 17 01:11:25 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 17 Jun 2021 01:11:25 GMT Subject: RFR: Merge jdk17 [v2] In-Reply-To: References: Message-ID: > Forwardport JDK 17 -> JDK 18 Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision: - Merge - 8268852: AsyncLogWriter should not overide is_Named_thread() Reviewed-by: dholmes, ysuenaga - 8259338: Add expiry exception for identrustdstx3 alias to VerifyCACerts.java test Reviewed-by: xuelei - 8259066: Obsolete -XX:+AlwaysLockClassLoader Reviewed-by: hseigel - 8268778: CDS check_excluded_classes needs DumpTimeTable_lock Reviewed-by: ccheung, minqi - 8267752: KVHashtable doesn't deallocate entries Reviewed-by: iklam, stuefe - 8267870: Remove unnecessary char_converter during class loading Reviewed-by: dholmes, iklam - 8268078: ClassListParser::_interfaces should be freed Reviewed-by: minqi, iklam, coleenp - 8268780: Use 'print_cr' instead of 'print' for the message 'eliminated ' Reviewed-by: cjplummer, zgu, dcubed - Merge - ... 
and 25 more: https://git.openjdk.java.net/jdk/compare/fdaabfed...c47ba95e ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/4514/files - new: https://git.openjdk.java.net/jdk/pull/4514/files/c47ba95e..c47ba95e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4514&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4514&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/4514.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4514/head:pull/4514 PR: https://git.openjdk.java.net/jdk/pull/4514 From jwilhelm at openjdk.java.net Thu Jun 17 01:11:26 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 17 Jun 2021 01:11:26 GMT Subject: Integrated: Merge jdk17 In-Reply-To: References: Message-ID: On Thu, 17 Jun 2021 00:49:27 GMT, Jesper Wilhelmsson wrote: > Forwardport JDK 17 -> JDK 18 This pull request has now been integrated. Changeset: 3637e50b Author: Jesper Wilhelmsson URL: https://git.openjdk.java.net/jdk/commit/3637e50b30e92538510c1a8e8893cedc3bd4ccd5 Stats: 659 lines in 33 files changed: 450 ins; 121 del; 88 mod Merge ------------- PR: https://git.openjdk.java.net/jdk/pull/4514 From suenaga at oss.nttdata.com Thu Jun 17 01:28:37 2021 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 17 Jun 2021 10:28:37 +0900 Subject: link error: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *) In-Reply-To: References: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> Message-ID: <73be84f3-8c37-0d3e-69e7-8440711c1869@oss.nttdata.com> Hi, I can build current HEAD of upstream (02c9bf087e5) successfully both fastdebug and release build. I use VS 2019 (16.10.1) on WSL 1 Ubuntu 20.04 . 
Thanks, Yasumasa On 2021/06/17 10:04, David Holmes wrote: > Hi Vitaly, > > On 17/06/2021 9:47 am, Vitaly Provodin wrote: >> Hi all, >> >> Building OpenJDK on Windows I am faced with the following error >> >> ---------------------------8<--------------------------- >> os_windows.obj : error LNK2019: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *)" (?convert_to_unicode@@YAHPEBDPEAPEA_W@Z) referenced in function "public: static void __cdecl os::set_native_thread_name(char const *)" (?set_native_thread_name@os@@SAXPEBD@Z) > > That is strange. convert_to_unicode is a static function in os_windows.cpp so there is no reason for the linker to have any issue as far as I can see - and we have not seen any build issues locally. ??? > > I wonder if the forward declaration / prototype also needs to state static? can you try this change: > > diff --git a/src/hotspot/os/windows/os_windows.cpp b/src/hotspot/os/windows/os_windows.cpp > index 6e996b11993..affe8a10265 100644 > --- a/src/hotspot/os/windows/os_windows.cpp > +++ b/src/hotspot/os/windows/os_windows.cpp > @@ -892,7 +892,7 @@ static SetThreadDescriptionFnPtr _SetThreadDescription = NULL; >  DEBUG_ONLY(static GetThreadDescriptionFnPtr _GetThreadDescription = NULL;) > >  // forward decl. > -errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); > +static errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); > >  void os::set_native_thread_name(const char *name) { > > Otherwise the simple fix would be to move the definition of convert_to_unicode to be ahead of set_native_thread_name() and get rid of the prototype. > > But again I have no idea why we would not have seen this build problem locally. I will file a bug.
> > Thanks, > David > ----- > >> c:\buildagent\work\d0555747f6bd5c6\build\windows-x86_64-server-release\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 1 unresolved externals >> make[3]: *** [lib/CompileJvm.gmk:144: /cygdrive/c/buildagent/work/d0555747f6bd5c6/build/windows-x86_64-server-release/support/modules_libs/java.base/server/jvm.dll] Error 1 >> make[2]: *** [make/Main.gmk:252: hotspot-server-libs] Error 2 >> make[2]: *** Waiting for unfinished jobs.... >> ERROR: Build failed for targets 'clean images test-image' in configuration 'windows-x86_64-server-release' (exit code 2) >> ---------------------------8<--------------------------- >> >> The issue was integrated with the patch https://github.com/openjdk/jdk/commit/9f3c7e74ff00a7550742ed8b9d6671c2d4bb6041 that fixes https://bugs.openjdk.java.net/browse/JDK-8238649 Call new Win32 API SetThreadDescription in os::set_native_thread_name >> After reverting this commit the build completes successfully >> >> Note Visual Studio 2019 Developer Command Prompt v16.8.5 is used for building >> >> Is this issue actual for me only? - could not find any mentions about it (in maillists, JBS) >> Could you please advice how it can be resolved? >> >> Thanks in advance, >> Vitaly >> From david.holmes at oracle.com Thu Jun 17 01:31:37 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Jun 2021 11:31:37 +1000 Subject: link error: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *) In-Reply-To: <73be84f3-8c37-0d3e-69e7-8440711c1869@oss.nttdata.com> References: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> <73be84f3-8c37-0d3e-69e7-8440711c1869@oss.nttdata.com> Message-ID: <15748e6c-ab37-827a-e5d7-7c60b0cd243d@oracle.com> Hi Yasumasa, On 17/06/2021 11:28 am, Yasumasa Suenaga wrote: > Hi, > > I can build current HEAD of upstream (02c9bf087e5) successfully both > fastdebug and release build. > I use VS 2019 (16.10.1) on WSL 1 Ubuntu 20.04 . 
We use 16.9.3 with no problem, but it does seem that the prototype is missing the static storage class modifier to match the definition (otherwise it is assumed external). I filed: https://bugs.openjdk.java.net/browse/JDK-8268927 and am testing a trivial fix to add 'static' which will hopefully fix Vitaly's problem. Thanks, David > > Thanks, > > Yasumasa > > > On 2021/06/17 10:04, David Holmes wrote: >> Hi Vitaly, >> >> On 17/06/2021 9:47 am, Vitaly Provodin wrote: >>> Hi all, >>> >>> Building OpenJDK on Windows I am faced with the following error >>> >>> ---------------------------8<--------------------------- >>> os_windows.obj : error LNK2019: unresolved external symbol "int >>> __cdecl convert_to_unicode(char const *,wchar_t * *)" >>> (?convert_to_unicode@@YAHPEBDPEAPEA_W@Z) referenced in function >>> "public: static void __cdecl os::set_native_thread_name(char const >>> *)" (?set_native_thread_name@os@@SAXPEBD@Z) >> >> That is strange. convert_to_unicode is a static function in >> os_windows.cpp so there is no reason for the linker to have any issue >> as far as I can see - and we have not seen any build issues locally. ??? >> >> I wonder if the forward declaration / prototype also needs to state >> static? can you try this change: >> >> diff --git a/src/hotspot/os/windows/os_windows.cpp >> b/src/hotspot/os/windows/os_windows.cpp >> index 6e996b11993..affe8a10265 100644 >> --- a/src/hotspot/os/windows/os_windows.cpp >> +++ b/src/hotspot/os/windows/os_windows.cpp >> @@ -892,7 +892,7 @@ static SetThreadDescriptionFnPtr >> _SetThreadDescription = NULL; >>  DEBUG_ONLY(static GetThreadDescriptionFnPtr _GetThreadDescription = >> NULL;) >> >>  // forward decl.
>> -errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); >> +static errno_t convert_to_unicode(char const* char_path, LPWSTR* >> unicode_path); >> >>  void os::set_native_thread_name(const char *name) { >> >> Otherwise the simple fix would be to move the definition of >> convert_to_unicode to be ahead of set_native_thread_name() and get rid >> of the prototype. >> >> But again I have no idea why we would not have seen this build problem >> locally. I will file a bug. >> >> Thanks, >> David >> ----- >> >>> c:\buildagent\work\d0555747f6bd5c6\build\windows-x86_64-server-release\support\modules_libs\java.base\server\jvm.dll >>> : fatal error LNK1120: 1 unresolved externals >>> make[3]: *** [lib/CompileJvm.gmk:144: >>> /cygdrive/c/buildagent/work/d0555747f6bd5c6/build/windows-x86_64-server-release/support/modules_libs/java.base/server/jvm.dll] >>> Error 1 >>> make[2]: *** [make/Main.gmk:252: hotspot-server-libs] Error 2 >>> make[2]: *** Waiting for unfinished jobs.... >>> ERROR: Build failed for targets 'clean images test-image' in >>> configuration 'windows-x86_64-server-release' (exit code 2) >>> ---------------------------8<--------------------------- >>> >>> The issue was integrated with the patch >>> https://github.com/openjdk/jdk/commit/9f3c7e74ff00a7550742ed8b9d6671c2d4bb6041 >>> that fixes https://bugs.openjdk.java.net/browse/JDK-8238649 Call new >>> Win32 API SetThreadDescription in os::set_native_thread_name >>> After reverting this commit the build completes successfully >>> >>> Note Visual Studio 2019 Developer Command Prompt v16.8.5 is used for >>> building >>> >>> Is this issue actual for me only? - could not find any mentions about >>> it (in maillists, JBS) >>> Could you please advice how it can be resolved?
>>> >>> Thanks in advance, >>> Vitaly >>> From suenaga at oss.nttdata.com Thu Jun 17 01:37:40 2021 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 17 Jun 2021 10:37:40 +0900 Subject: link error: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *) In-Reply-To: <15748e6c-ab37-827a-e5d7-7c60b0cd243d@oracle.com> References: <546CCF15-1F0A-40A2-9261-AFD0654172A8@jetbrains.com> <73be84f3-8c37-0d3e-69e7-8440711c1869@oss.nttdata.com> <15748e6c-ab37-827a-e5d7-7c60b0cd243d@oracle.com> Message-ID: On 2021/06/17 10:31, David Holmes wrote: > Hi Yasumasa, > > On 17/06/2021 11:28 am, Yasumasa Suenaga wrote: >> Hi, >> >> I can build current HEAD of upstream (02c9bf087e5) successfully both fastdebug and release build. >> I use VS 2019 (16.10.1) on WSL 1 Ubuntu 20.04 . > > We use 16.9.3 with no problem, but it does seem that the prototype is missing the static storage class modifier to match the definition (otherwise it is assumed external). I'm not sure, but I guess the problem may go away if we remove the build/ directory before running configure. (Vitaly seems to run `make clean` instead of `rm -fR build`) I've encountered similar issues after the HotSpot source changed, and removing the directory solved most of them. Thanks, Yasumasa > I filed: > > https://bugs.openjdk.java.net/browse/JDK-8268927 > > and am testing a trivial fix to add 'static' which will hopefully fix Vitaly's problem.
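The linkage mismatch discussed in this thread can be illustrated with a minimal, self-contained C++ sketch. It assumes nothing about the real HotSpot sources: `widen_ascii` and `widened_length` are hypothetical stand-ins for `convert_to_unicode` and `os::set_native_thread_name`, and the conversion is a plain ASCII widening rather than the Win32 call used in os_windows.cpp. The only point is that the forward declaration repeats `static`, so the prototype and the later definition agree on internal linkage.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for convert_to_unicode(); not the HotSpot code.
// The forward declaration carries 'static' so that it agrees with the
// definition further down (both have internal linkage).
static int widen_ascii(const char* in, std::wstring* out);

// Hypothetical caller placed before the definition, mirroring how
// os::set_native_thread_name() precedes convert_to_unicode() in the thread.
std::size_t widened_length(const char* name) {
  std::wstring wide;
  if (widen_ascii(name, &wide) != 0) {
    return 0;  // conversion failed
  }
  return wide.size();
}

// Definition: same storage class as the prototype above.
static int widen_ascii(const char* in, std::wstring* out) {
  if (in == nullptr || out == nullptr) {
    return -1;  // reject null arguments
  }
  out->clear();
  for (const char* p = in; *p != '\0'; ++p) {
    // Widen byte by byte; adequate for ASCII input only.
    out->push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
  }
  return 0;
}
```

The alternative mentioned in the thread, moving the definition ahead of its first use and dropping the prototype, avoids the question of matching storage classes altogether.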
> > Thanks, > David > >> >> Thanks, >> >> Yasumasa >> >> >> On 2021/06/17 10:04, David Holmes wrote: >>> Hi Vitaly, >>> >>> On 17/06/2021 9:47 am, Vitaly Provodin wrote: >>>> Hi all, >>>> >>>> Building OpenJDK on Windows I am faced with the following error >>>> >>>> ---------------------------8<--------------------------- >>>> os_windows.obj : error LNK2019: unresolved external symbol "int __cdecl convert_to_unicode(char const *,wchar_t * *)" (?convert_to_unicode@@YAHPEBDPEAPEA_W@Z) referenced in function "public: static void __cdecl os::set_native_thread_name(char const *)" (?set_native_thread_name@os@@SAXPEBD@Z) >>> >>> That is strange. convert_to_unicode is a static function in os_windows.cpp so there is no reason for the linker to have any issue as far as I can see - and we have not seen any build issues locally. ??? >>> >>> I wonder if the forward declaration / prototype also needs to state static? can you try this change: >>> >>> diff --git a/src/hotspot/os/windows/os_windows.cpp b/src/hotspot/os/windows/os_windows.cpp >>> index 6e996b11993..affe8a10265 100644 >>> --- a/src/hotspot/os/windows/os_windows.cpp >>> +++ b/src/hotspot/os/windows/os_windows.cpp >>> @@ -892,7 +892,7 @@ static SetThreadDescriptionFnPtr _SetThreadDescription = NULL; >>>  DEBUG_ONLY(static GetThreadDescriptionFnPtr _GetThreadDescription = NULL;) >>> >>>  // forward decl. >>> -errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); >>> +static errno_t convert_to_unicode(char const* char_path, LPWSTR* unicode_path); >>> >>>  void os::set_native_thread_name(const char *name) { >>> >>> Otherwise the simple fix would be to move the definition of convert_to_unicode to be ahead of set_native_thread_name() and get rid of the prototype. >>> >>> But again I have no idea why we would not have seen this build problem locally. I will file a bug.
>>> >>> Thanks, >>> David >>> ----- >>> >>>> c:\buildagent\work\d0555747f6bd5c6\build\windows-x86_64-server-release\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 1 unresolved externals >>>> make[3]: *** [lib/CompileJvm.gmk:144: /cygdrive/c/buildagent/work/d0555747f6bd5c6/build/windows-x86_64-server-release/support/modules_libs/java.base/server/jvm.dll] Error 1 >>>> make[2]: *** [make/Main.gmk:252: hotspot-server-libs] Error 2 >>>> make[2]: *** Waiting for unfinished jobs.... >>>> ERROR: Build failed for targets 'clean images test-image' in configuration 'windows-x86_64-server-release' (exit code 2) >>>> ---------------------------8<--------------------------- >>>> >>>> The issue was integrated with the patch https://github.com/openjdk/jdk/commit/9f3c7e74ff00a7550742ed8b9d6671c2d4bb6041 that fixes https://bugs.openjdk.java.net/browse/JDK-8238649 Call new Win32 API SetThreadDescription in os::set_native_thread_name >>>> After reverting this commit the build completes successfully >>>> >>>> Note Visual Studio 2019 Developer Command Prompt v16.8.5 is used for building >>>> >>>> Is this issue actual for me only? - could not find any mentions about it (in maillists, JBS) >>>> Could you please advice how it can be resolved? >>>> >>>> Thanks in advance, >>>> Vitaly >>>> From whuang at openjdk.java.net Thu Jun 17 01:58:13 2021 From: whuang at openjdk.java.net (Wang Huang) Date: Thu, 17 Jun 2021 01:58:13 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> References: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> Message-ID: On Thu, 10 Jun 2021 05:57:35 GMT, Dong Bo wrote: >> ... or maybe do the OR in the vector unit? 
> > I guess it can be done with: > > umaxv(v1, T4S, v0); > mov(tmp1, v1, T4S, 0); > cbnz(tmp1, DONE0); I have tested @dgbo 's suggestion and found that the performance degradation happened by using `umaxv`. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From eliu at openjdk.java.net Thu Jun 17 02:42:14 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Thu, 17 Jun 2021 02:42:14 GMT Subject: [jdk17] RFR: 8268739: AArch64: Build failure after JDK-8267663 In-Reply-To: References: Message-ID: On Thu, 17 Jun 2021 00:31:19 GMT, David Holmes wrote: > Seems fine. I assume the processing of the ad file will already check both MacroAssembler and C2_MacroAssembler for the definition? > > Thanks, > David Thanks for your review. Yes, ad will check both. But it's more reasonable to move c2 specific code to C2_MacroAssembler. Besides, in this case BoolTest is not declared in client mode since related head files are c2 specific. Please refer to the links in commit messages. ------------- PR: https://git.openjdk.java.net/jdk17/pull/73 From eliu at openjdk.java.net Thu Jun 17 02:50:14 2021 From: eliu at openjdk.java.net (Eric Liu) Date: Thu, 17 Jun 2021 02:50:14 GMT Subject: [jdk17] Integrated: 8268739: AArch64: Build failure after JDK-8267663 In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 08:47:46 GMT, Eric Liu wrote: > The failure is cased by the build option "--with-jvm-variants=client". > In client mode, BoolTest[1] is used by "neon_compare"[2] but not > declared in macroAssembler_aarch64.hpp[3]. > > Since "neon_compare" is c2 specific, this patch moves it to > c2_MacroAssembler_aarch64.cpp. > > [1] https://github.com/openjdk/jdk17/blob/master/src/hotspot/share/opto/subnode.hpp#L308 > [2] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5342 > [3] https://github.com/openjdk/jdk17/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L58 This pull request has now been integrated. 
Changeset: 4c9aefdb Author: Eric Liu Committer: Pengfei Li URL: https://git.openjdk.java.net/jdk17/commit/4c9aefdb6193f754bfac3ae022f08a76b0cae718 Stats: 94 lines in 4 files changed: 47 ins; 45 del; 2 mod 8268739: AArch64: Build failure after JDK-8267663 Reviewed-by: aph, dholmes ------------- PR: https://git.openjdk.java.net/jdk17/pull/73 From aph at openjdk.java.net Thu Jun 17 09:14:15 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 17 Jun 2021 09:14:15 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: <398R8QgS4XD7EwoHFUrkjOAJgu2597DAhk2NoFDyWDI=.f80583fc-2c47-46b3-9ae2-1ef1bb8aca8e@github.com> Message-ID: <83ycoo43K2gufLK6gRTZftn0mGIkvEx4_ZzgU1pxwBk=.f2db5fa3-ceb3-4692-a077-7e396ebe39f4@github.com> On Thu, 17 Jun 2021 01:55:17 GMT, Wang Huang wrote: >> I guess it can be done with: >> >> umaxv(v1, T4S, v0); >> mov(tmp1, v1, T4S, 0); >> cbnz(tmp1, DONE0); > > I have tested @dgbo 's suggestion and found that the performance degradation happened by using `umaxv`. I guess I'm not surprised it's slower: even Firestorm has a 3-cycle latency for UMAX, and its output is used immediately. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From aph at openjdk.java.net Thu Jun 17 09:31:11 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 17 Jun 2021 09:31:11 GMT Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 08:52:16 GMT, Andrew Haley wrote: > > With this change the size of the`string_equals` intrinsic increases by ~60% from 120 bytes to 196 bytes and this gets expanded at every `String.equals` call site. It looks good on a micro-benchmark but I wonder if on a larger program this improvement is outweighed by the negative effects of methods taking up more space in the icache. > > That's an excellent point. There's no need at all for the Neon part to be expanded inline: it could be a subroutine. 
We'd have to use fixed Neon registers at the call site. Thinking some more,we could use this opportunity to move as much of the bulk comparison code as we can out of line, hopefully achieving a reduction in footprint as well as an improvement in performance. ------------- PR: https://git.openjdk.java.net/jdk/pull/4423 From jbachorik at openjdk.java.net Thu Jun 17 11:17:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Thu, 17 Jun 2021 11:17:12 GMT Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks [v2] In-Reply-To: References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com> Message-ID: On Thu, 10 Jun 2021 07:50:57 GMT, Ludovic Henry wrote: >> When the signal sent for AsyncGetCallTrace or JFR would land on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, it wouldn't be able to detect the sender (caller) frame for multiple reasons. This patch fixes these cases through adding CodeBlob-specific frame parser which are in the best position to know how a frame is setup. >> >> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`. 
>> >> # `Prof1` >> >> public class Prof1 { >> >> public static void main(String[] args) { >> StringBuilder sb = new StringBuilder(); >> for (int i = 0; i < 1000000; i++) { >> sb.append("ab"); >> sb.delete(0, 1); >> } >> System.out.println(sb.length()); >> } >> } >> >> >> - Baseline: >> >> Flat Profile (by method): >> (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5] >> (t 0.5,s 0.2) Prof1::main >> (t 0.2,s 0.2) java.lang.AbstractStringBuilder::append >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::shift >> (t 0.0,s 0.0) java.lang.String::getBytes >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt >> (t 0.0,s 0.0) java.lang.StringBuilder::delete >> (t 0.2,s 0.0) java.lang.StringBuilder::append >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::delete >> (t 0.0,s 0.0) java.lang.AbstractStringBuilder::putStringAt >> >> - With `StubRoutinesBlob::FrameParser`: >> >> Flat Profile (by method): >> (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal >> (t 0.9,s 0.9) java.lang.AbstractStringBuilder::delete >> (t 99.8,s 0.2) Prof1::main >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) AGCT::Unknown Java[ERR=-5] >> (t 98.8,s 0.0) java.lang.AbstractStringBuilder::append >> (t 98.8,s 0.0) java.lang.StringBuilder::append >> (t 0.9,s 0.0) java.lang.StringBuilder::delete >> >> >> # `Prof2` >> >> import java.util.function.Supplier; >> >> public class Prof2 { >> >> public static void main(String[] args) { >> var rand = new java.util.Random(0); >> Supplier[] suppliers = { >> () -> 0, >> () -> 1, >> () -> 2, >> () -> 3, >> }; >> >> long sum = 0; >> for (int i = 0; i >= 0; i++) { >> sum += (int)suppliers[i % suppliers.length].get(); >> } >> } >> } >> >> >> - Baseline: >> >> Flat Profile (by method): >> (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5] >> (t 39.2,s 35.2) Prof2::main >> (t 1.4,s 1.4) Prof2::lambda$main$3 >> (t 1.0,s 
1.0) Prof2::lambda$main$2 >> (t 0.9,s 0.9) Prof2::lambda$main$1 >> (t 0.7,s 0.7) Prof2::lambda$main$0 >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] >> (t 0.0,s 0.0) java.lang.Thread::exit >> (t 0.9,s 0.0) Prof2$$Lambda$2.0x0000000800c00c28::get >> (t 1.0,s 0.0) Prof2$$Lambda$3.0x0000000800c01000::get >> (t 1.4,s 0.0) Prof2$$Lambda$4.0x0000000800c01220::get >> (t 0.7,s 0.0) Prof2$$Lambda$1.0x0000000800c00a08::get >> >> >> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`: >> >> Flat Profile (by method): >> (t 74.1,s 70.3) Prof2::main >> (t 6.5,s 5.5) Prof2$$Lambda$29.0x0000000800081220::get >> (t 6.6,s 5.4) Prof2$$Lambda$28.0x0000000800081000::get >> (t 5.7,s 5.0) Prof2$$Lambda$26.0x0000000800080a08::get >> (t 5.9,s 5.0) Prof2$$Lambda$27.0x0000000800080c28::get >> (t 4.9,s 4.9) AGCT::Unknown Java[ERR=-5] >> (t 1.2,s 1.2) Prof2::lambda$main$2 >> (t 0.9,s 0.9) Prof2::lambda$main$3 >> (t 0.9,s 0.9) Prof2::lambda$main$1 >> (t 0.7,s 0.7) Prof2::lambda$main$0 >> (t 0.1,s 0.1) AGCT::Unknown not Java[ERR=-3] > > Ludovic Henry has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. I have done a sanity check to make sure that the code was not functionally modified while moving around. No problems found there. src/hotspot/cpu/aarch64/codeBlob_aarch64.cpp line 47: > 45: > 46: if (check && !_cb->is_frame_complete_at(pc)) { > 47: if (_cb->is_adapter_blob()) { Please, update the comment at L41-44 to correspond to the modified condition, ------------- PR: https://git.openjdk.java.net/jdk/pull/4436 From jvernee at openjdk.java.net Thu Jun 17 11:28:54 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 11:28:54 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v2] In-Reply-To: References: Message-ID: > Upstream a critical fix from the panama-foreign repo. 
> > See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 > > Testing: tier 1-2, local run of run-test-jdk_foreign. Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: Add comment about optimized entry frames only being generated on x86_64 ------------- Changes: - all: https://git.openjdk.java.net/jdk17/pull/76/files - new: https://git.openjdk.java.net/jdk17/pull/76/files/97fe0555..d2110fa4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk17&pr=76&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk17&pr=76&range=00-01 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk17/pull/76.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/76/head:pull/76 PR: https://git.openjdk.java.net/jdk17/pull/76 From jvernee at openjdk.java.net Thu Jun 17 11:28:59 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 11:28:59 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jun 2021 00:23:19 GMT, David Holmes wrote: >> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: >> >> Add comment about optimized entry frames only being generated on x86_64 > > src/hotspot/share/runtime/frame.inline.hpp line 54: > >> 52: inline bool frame::is_first_frame() const { >> 53: return (is_entry_frame() && entry_frame_is_first()) >> 54: || (is_optimized_entry_frame() && optimized_entry_frame_is_first()); > > Given `optimized_entry_frame_is_first` is only defined on a couple of platforms, it is far from obvious that this call can never happen on the other platforms. A comment explaining this would be useful. 
Thanks, I've added the following comment:

```C++
inline bool frame::is_first_frame() const {
  return (is_entry_frame() && entry_frame_is_first())
      // optimized_entry_frame_is_first is currently only implemented on x86_64.
      // This is okay since optimized entry frames are only generated on x86_64
      // as well (see ProgrammableUpcallHandler::generate_optimized_upcall_stub
      // in universalUpcallHandler_x86_64.cpp), so is_optimized_entry_frame will
      // always return false on platforms where optimized_entry_frame_is_first
      // is not implemented.
      || (is_optimized_entry_frame() && optimized_entry_frame_is_first());
}
```

-------------

PR: https://git.openjdk.java.net/jdk17/pull/76

From david.holmes at oracle.com  Thu Jun 17 12:29:09 2021
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 17 Jun 2021 22:29:09 +1000
Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v2]
In-Reply-To: 
References: 
Message-ID: <66467767-84b2-0d42-4910-ad95610e4e65@oracle.com>

Hi Jorn,

On 17/06/2021 9:28 pm, Jorn Vernee wrote:
> On Thu, 17 Jun 2021 00:23:19 GMT, David Holmes wrote:
> 
>>> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
>>>
>>>   Add comment about optimized entry frames only being generated on x86_64
>>
>> src/hotspot/share/runtime/frame.inline.hpp line 54:
>>
>>> 52: inline bool frame::is_first_frame() const {
>>> 53:   return (is_entry_frame() && entry_frame_is_first())
>>> 54:     || (is_optimized_entry_frame() && optimized_entry_frame_is_first());
>>
>> Given `optimized_entry_frame_is_first` is only defined on a couple of platforms, it is far from obvious that this call can never happen on the other platforms. A comment explaining this would be useful.
> 
> Thanks, I've added the following comment:
> 
> ```C++
> inline bool frame::is_first_frame() const {
>   return (is_entry_frame() && entry_frame_is_first())
>       // optimized_entry_frame_is_first is currently only implemented on x86_64.
>       // This is okay since optimized entry frames are only generated on x86_64
>       // as well (see ProgrammableUpcallHandler::generate_optimized_upcall_stub
>       // in universalUpcallHandler_x86_64.cpp), so is_optimized_entry_frame will
>       // always return false on platforms where optimized_entry_frame_is_first
>       // is not implemented.
>       || (is_optimized_entry_frame() && optimized_entry_frame_is_first());
> }
> ```

Now that you have explained it I think a much simpler comment will suffice :)

   return (is_entry_frame() && entry_frame_is_first()) ||
          // Optimized entry frames are only present on certain platforms
          (is_optimized_entry_frame() && optimized_entry_frame_is_first());

Cheers,
David

> -------------
> 
> PR: https://git.openjdk.java.net/jdk17/pull/76
> 

From mbaesken at openjdk.java.net  Thu Jun 17 12:34:35 2021
From: mbaesken at openjdk.java.net (Matthias Baesken)
Date: Thu, 17 Jun 2021 12:34:35 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids controller of cgroups
Message-ID: 

Hello, please review this PR; it extends the OSContainer API in order to also support the pids controller of cgroups.
I noticed that unlike the other controllers "cpu", "cpuset", "cpuacct", "memory" on some older Linux distros (SLES 12.1, RHEL 7.1) the pids controller might not be there (or not fully supported) so it was added as optional , see the coding if (!cg_infos[PIDS_IDX]._data_complete) { log_debug(os, container)("Optional cgroup v1 pids subsystem not found"); // keep the other controller info, pids is optional } ------------- Commit messages: - JDK-8266490 Changes: https://git.openjdk.java.net/jdk/pull/4518/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4518&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8266490 Stats: 203 lines in 16 files changed: 159 ins; 2 del; 42 mod Patch: https://git.openjdk.java.net/jdk/pull/4518.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4518/head:pull/4518 PR: https://git.openjdk.java.net/jdk/pull/4518 From jvernee at openjdk.java.net Thu Jun 17 12:53:49 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 12:53:49 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v3] In-Reply-To: References: Message-ID: On Thu, 17 Jun 2021 12:50:10 GMT, Jorn Vernee wrote: >> Upstream a critical fix from the panama-foreign repo. >> >> See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 >> >> Testing: tier 1-2, local run of run-test-jdk_foreign. 
> > Jorn Vernee has updated the pull request incrementally with two additional commits since the last revision: > > - Remove whitespace > - Simplify comment src/hotspot/share/runtime/frame.inline.hpp line 54: > 52: inline bool frame::is_first_frame() const { > 53: return (is_entry_frame() && entry_frame_is_first()) > 54: // Optimized entry frames are only present on certain platforms Suggestion: // Optimized entry frames are only present on certain platforms ------------- PR: https://git.openjdk.java.net/jdk17/pull/76 From jvernee at openjdk.java.net Thu Jun 17 12:53:45 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 12:53:45 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v3] In-Reply-To: References: Message-ID: > Upstream a critical fix from the panama-foreign repo. > > See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 > > Testing: tier 1-2, local run of run-test-jdk_foreign. 
Jorn Vernee has updated the pull request incrementally with two additional commits since the last revision: - Remove whitespace - Simplify comment ------------- Changes: - all: https://git.openjdk.java.net/jdk17/pull/76/files - new: https://git.openjdk.java.net/jdk17/pull/76/files/d2110fa4..ce4acfd5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk17&pr=76&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk17&pr=76&range=01-02 Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod Patch: https://git.openjdk.java.net/jdk17/pull/76.diff Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/76/head:pull/76 PR: https://git.openjdk.java.net/jdk17/pull/76 From jvernee at openjdk.java.net Thu Jun 17 12:53:46 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 12:53:46 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails In-Reply-To: <66467767-84b2-0d42-4910-ad95610e4e65@oracle.com> References: <66467767-84b2-0d42-4910-ad95610e4e65@oracle.com> Message-ID: <2k1Yx7J3mSHnupWgANXEIvxCWKdtL6Qy8MzRfQczj5k=.bd0242a0-51d3-4bda-ae21-8d381b73a4c5@github.com> On Thu, 17 Jun 2021 12:30:46 GMT, David Holmes wrote: > Now that you have explained it I think a much simpler comment will suffice :) Ok, I've shortened the comment. Thanks :) ------------- PR: https://git.openjdk.java.net/jdk17/pull/76 From jvernee at openjdk.java.net Thu Jun 17 12:53:55 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Thu, 17 Jun 2021 12:53:55 GMT Subject: [jdk17] RFR: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails [v2] In-Reply-To: References: Message-ID: <4G3ERV8f0_IAsg6vSR3CBUCFT2uoLKljWd0PqQ77mz0=.a7ac5bbb-868e-484b-803d-fa10a6c6a9d1@github.com> On Thu, 17 Jun 2021 11:28:54 GMT, Jorn Vernee wrote: >> Upstream a critical fix from the panama-foreign repo. 
>> >> See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558 >> >> Testing: tier 1-2, local run of run-test-jdk_foreign. > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > Add comment about optimized entry frames only being generated on x86_64 src/hotspot/share/runtime/frame.inline.hpp line 59: > 57: // in universalUpcallHandler_x86_64.cpp), so is_optimized_entry_frame will > 58: // always return false on platforms where optimized_entry_frame_is_first > 59: // is not implemented. Suggestion: // Optimized entry frames are only present on certain platforms ------------- PR: https://git.openjdk.java.net/jdk17/pull/76 From sgehwolf at openjdk.java.net Thu Jun 17 14:39:31 2021 From: sgehwolf at openjdk.java.net (Severin Gehwolf) Date: Thu, 17 Jun 2021 14:39:31 GMT Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids controller of cgroups In-Reply-To: References: Message-ID: On Thu, 17 Jun 2021 12:27:25 GMT, Matthias Baesken wrote: > Hello, please review this PR; it extend the OSContainer API in order to also support the pids controller of cgroups. > > I noticed that unlike the other controllers "cpu", "cpuset", "cpuacct", "memory" on some older Linux distros (SLES 12.1, RHEL 7.1) the pids controller might not be there (or not fully supported) so it was added as optional , see the coding > > > if (!cg_infos[PIDS_IDX]._data_complete) { > log_debug(os, container)("Optional cgroup v1 pids subsystem not found"); > // keep the other controller info, pids is optional > } Thanks for this work. How did you test this? Did you run container tests on a cgroups v1 and cgroups v2 system? src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 136: > 134: char *p; > 135: bool is_cgroupsV2; > 136: // true iff all required controllers, memory, cpu, cpuset, cpuacct enabled *are* enabled, please. 
src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 193: > 191: all_required_controllers_enabled = true; > 192: for (int i = 0; i < CG_INFO_LENGTH; i++) { > 193: // the pids controller is not there on older Linux distros Suggestion: Change the code comment to `// pids controller is optional. All other controllers are required` src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 198: > 196: all_required_controllers_enabled = all_required_controllers_enabled && cg_infos[i]._enabled; > 197: } > 198: if (! cg_infos[i]._enabled) { This if is only present for debug logging and should be guarded to that effect. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 424: > 422: return false; > 423: } > 424: if (!cg_infos[PIDS_IDX]._data_complete) { Same here, this if should be guarded with debug logging being enabled. src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 252: > 250: * maximum number of tasks > 251: * -1 for no setup > 252: * -3 for "max" (special value) I'd suggest to use: -1 if unlimited OSCONTAINER_ERROR for not supported src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 261: > 259: "Maximum number of tasks is: " JLONG_FORMAT, JLONG_FORMAT, pidsmax); > 260: if (pidsmax < 0) { > 261: // check for potential special value It would be clearer if this comment mentioned that the value might be `max` and, thus, wouldn't be parseable with `GET_CONTAINER_INFO`. src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 266: > 264: err2 = subsystem_file_line_contents(_pids, "/pids.max", NULL, "%1023s", myline); > 265: if (err2 != 0) { > 266: if (strncmp(myline, "max", 3) == 0) return -3; We use `-1` for "unlimited" elsewhere and should probably do the same here. 
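
To illustrate the `-1`-for-unlimited convention requested above, here is a minimal standalone sketch. It is not the code from the patch: the function name `limit_from_text` and the constants `kUnlimited` and `kNotSupported` are made up for this example, and the value chosen for the "not supported" sentinel is an assumption. It only shows how the textual content of a limit file such as `pids.max` could be mapped to a number, with the literal `max` treated as unlimited.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical sentinel values for this sketch only.
constexpr long long kUnlimited = -1;     // the limit file contains "max"
constexpr long long kNotSupported = -2;  // the content could not be interpreted

// Converts the text read from a limit file (for example pids.max) to a number.
// "max" maps to kUnlimited, a decimal number maps to its value, and anything
// else maps to kNotSupported.
long long limit_from_text(const char* text) {
  assert(text != nullptr && "caller must pass the file contents");
  if (std::strncmp(text, "max", 3) == 0) {
    return kUnlimited;
  }
  char* end = nullptr;
  long long value = std::strtoll(text, &end, 10);
  if (end == text) {  // no digits were consumed
    return kNotSupported;
  }
  return value;
}
```

With a helper of this shape, callers only ever see one "unlimited" value, so no additional magic constant such as `-3` is needed.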
src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 266: > 264: err2 = subsystem_file_line_contents(_pids, "/pids.max", NULL, "%1023s", myline); > 265: if (err2 != 0) { > 266: if (strncmp(myline, "max", 3) == 0) return -3; This looks like it should use `GET_CONTAINER_INFO_CPTR` macro and then `limit_from_str` from cgroups v2 code. Perhaps move `limit_from_str` method to the base class. src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 260: > 258: } > 259: } > 260: return pidsmax; We have this pattern of needing to handle `max` elsewhere in cgroups v2 code. See for example: `CgroupV2Subsystem::cpu_quota()`. We should handle it similarly here. src/hotspot/os/linux/os_linux.cpp line 2319: > 2317: st->print_cr("max"); > 2318: } else { > 2319: st->print_cr("%s", j == OSCONTAINER_ERROR ? "not supported" : "unlimited"); We should treat the unlimited case similar to how we handle them elsewhere. I'm not sure this magic constant of `-3` gives us any more info that we'd get with `-1` that we use elsewhere. src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 415: > 413: ****************************************************************/ > 414: public long getPidsMax() { > 415: return CgroupV1SubsystemController.longValOrUnlimited(getLongValue(pids, "pids.max")); Since this value may be `max` we should use the same logic than for v2. I.e.: String pidsMaxStr = CgroupSubsystemController.getStringValue(pids, "pids.max"); return CgroupSubsystemController.limitFromString(pidsMaxStr); test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java line 172: > 170: "net_prio 5 1 1\n" + > 171: "hugetlb 6 1 1\n" + > 172: "pids 9 80 1"; // the 3 did not match 9 This comment leaves the reader none the wiser. I think you are alluding to controller id matching between `/proc/cgroups` and `/proc/self/cgroup`. If so, please use that info. ------------- Changes requested by sgehwolf (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4518 From sjohanss at openjdk.java.net Thu Jun 17 15:52:33 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 17 Jun 2021 15:52:33 GMT Subject: RFR: 8017163: G1: Refactor remembered sets [v13] In-Reply-To: References: Message-ID: On Tue, 15 Jun 2021 10:20:18 GMT, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that significantly refactors the remembered set for more scalability. >> >> The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago. >> >> Over time many problems with performance and in particular memory usage have been observed: >> >> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc). >> >> * there is a substantial memory overhead for managing the data structures: examples are >> * using separate (hash) tables for the three different types of card containers >> * there is significant unnecessary preallocation of memory for some of the card set containers >> * Containers store redundant information >> >> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that. 
>> Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator. >> >> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these. >> >> * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container. >> >> * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC. >> >> The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. memory usage difference when exceeding a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to fine that is able to hold 65k entries using 8kB is significant. >> For these reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage. With extremely bad consequences in pause times. >> >> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108). >> >> This change is effectively a rewrite of the Java heap card based part of a region's remembered set. 
>> >> This initial fully working change can be roughly described with the following properties: >> >> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259. >> >> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management. >> >> * there are now four different container types and one meta-container type. These four actual containers are: >> * inline pointer: the change store a few (3-5) cards in the CHT node directly and uses no extra memory. >> * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste. >> * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory >> * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory. >> * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504. >> >> * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specify them. They have been designed with future enhancements in mind. 
>> >> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long or shorter than before; most changes affecting pause times have been extracted earlier. Individiual affected phases are generally shorter now. >> >> Testing: tier1-8 many times, manual and automated perf testing > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > sjohanss review - remove debug code Looked through the changes again and I think they are good. As we have all of JDK 18 to test, polish it and fix any potential problems I see no reason to not approve this now. I found a few unused functions, please remove them unless you have some future plans for any of them. src/hotspot/share/gc/g1/g1CardSet.cpp line 146: > 144: } > 145: } > 146: Unused from what I can tell. Suggestion: src/hotspot/share/gc/g1/g1CardSet.cpp line 839: > 837: return occupied() <= limit; > 838: } > 839: Unused. Suggestion: src/hotspot/share/gc/g1/g1CardSet.hpp line 138: > 136: void reset(); > 137: > 138: void add(G1CardSetCoarsenStats& other); Same as above. Suggestion: src/hotspot/share/gc/g1/g1CardSet.hpp line 145: > 143: void record_coarsening(uint tag, bool collision); > 144: > 145: size_t num_coarsening(uint tag) const { return Atomic::load(&_coarsen_from[tag]); } Unused. Suggestion: src/hotspot/share/gc/g1/g1CardSet.hpp line 309: > 307: // Returns whether this remembered set (and all sub-sets) have an occupancy > 308: // that is less or equal to the given occupancy. > 309: bool occupancy_less_or_equal_to(size_t limit) const; Suggestion: src/hotspot/share/gc/g1/heapRegionRemSet.hpp line 85: > 83: static G1CardSetCoarsenStats coarsen_stats() { return G1CardSet::coarsen_stats(); } > 84: > 85: G1CardSetConfiguration* config() const { return _card_set.config(); } Never used. 
Suggestion: src/hotspot/share/gc/g1/heapRegionRemSet.hpp line 140: > 138: // Returns the memory occupancy of all free_list data structures associated > 139: // with remembered sets. > 140: static size_t free_list_mem_size(); Please also remove the comment. Suggestion: ------------- Marked as reviewed by sjohanss (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4116 From kvn at openjdk.java.net Thu Jun 17 16:57:31 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 17 Jun 2021 16:57:31 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore wrote: > This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. > The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. > Tested with tier1-6. Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/4509 From pchilanomate at openjdk.java.net Thu Jun 17 19:52:00 2021 From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo) Date: Thu, 17 Jun 2021 19:52:00 GMT Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 Message-ID: Hi all, Please review the following patch which handles the removal of biased locking code. The third least significant bit of the markword is now always unused. 
I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though. For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed. There are some tests that were only meaningful when run with biased locking enabled so I removed them. Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones. 
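
For readers less familiar with the mark word, the bit mentioned at the top of this mail can be pictured with the following simplified sketch. It assumes the usual 64-bit layout in which the two lowest bits hold the lock state, the next bit was the biased-lock bit, and the four bits above it hold the age. All constant and function names are invented for the example and are not the identifiers used in HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Simplified view of the low bits of a 64-bit mark word (illustrative names):
//   bits 0-1: lock state
//   bit  2  : formerly biased_lock, now always unused
//   bits 3-6: age
constexpr int kLockBits = 2;
constexpr int kFormerBiasedLockBits = 1;
constexpr int kAgeBits = 4;

constexpr int kLockShift = 0;
constexpr int kFormerBiasedLockShift = kLockShift + kLockBits;
constexpr int kAgeShift = kFormerBiasedLockShift + kFormerBiasedLockBits;

constexpr uint64_t kLockMask = ((uint64_t{1} << kLockBits) - 1) << kLockShift;
constexpr uint64_t kFormerBiasedLockMask = uint64_t{1} << kFormerBiasedLockShift;
constexpr uint64_t kAgeMask = ((uint64_t{1} << kAgeBits) - 1) << kAgeShift;

// Extracts the lock-state bits.
inline unsigned lock_bits(uint64_t mark) {
  return static_cast<unsigned>((mark & kLockMask) >> kLockShift);
}

// Extracts the age field.
inline unsigned age(uint64_t mark) {
  return static_cast<unsigned>((mark & kAgeMask) >> kAgeShift);
}

// Returns a copy of the mark word with the age field replaced.
inline uint64_t with_age(uint64_t mark, unsigned new_age) {
  assert(new_age < (1u << kAgeBits) && "age must fit in its field");
  return (mark & ~kAgeMask) | (static_cast<uint64_t>(new_age) << kAgeShift);
}
```

In this picture the age field keeps its shift of 3, so the bit at position 2 simply stays reserved instead of being folded back into the age.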
Thanks, Patricio ------------- Commit messages: - Update java manpage - 8256425: Obsolete Biased Locking in JDK 18 Changes: https://git.openjdk.java.net/jdk/pull/4522/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8256425 Stats: 5283 lines in 163 files changed: 66 ins; 4994 del; 223 mod Patch: https://git.openjdk.java.net/jdk/pull/4522.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522 PR: https://git.openjdk.java.net/jdk/pull/4522 From vlivanov at openjdk.java.net Thu Jun 17 21:08:30 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 17 Jun 2021 21:08:30 GMT Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization() method In-Reply-To: References: Message-ID: On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore wrote: > This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded. Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies. If the dependencies weren't yet recorded, we had to deoptimize all of the methods. A long time ago, we had a customer who was unhappy with the pause for this when they had late attach. Now we don't have this problem. > The evol_method dependencies are still used by the compiler to check for old methods during compilation. I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too. > Tested with tier1-6. Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/4509 From jwilhelm at openjdk.java.net Thu Jun 17 23:34:36 2021 From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson) Date: Thu, 17 Jun 2021 23:34:36 GMT Subject: RFR: Merge jdk17 Message-ID: Forwardport JDK 17 -> JDK 18 ------------- Commit messages: - Merge - 8268371: C2: assert(_gvn.type(obj)->higher_equal(tjp)) failed: cast_up is no longer needed - 8268676: assert(!ik->is_interface() && !ik->has_subklass()) failed: inconsistent klass hierarchy - 8268265: MutableSpaceUsedHelper::take_sample() hits assert(left >= right) failed: avoid overflow - 8268971: ProblemList tools/jpackage/windows/WinInstallerIconTest.java on win-x64 - 8264843: Javac crashes with NullPointerException when finding unencoded XML in
 <pre> tag
 - 8265297: javax/net/ssl/SSLSession/TestEnabledProtocols.java failed with "RuntimeException: java.net.SocketException: Connection reset"
 - 8268353: Test libsvml.so is and is not present in jdk image
 - 8249899: jdk/javadoc/tool/InlineTagsWithBraces.java uses @ignore w/o bug-id
 - 8268776: Test `ADatagramSocket.java` missing /othervm from @run tag
 - ... and 3 more: https://git.openjdk.java.net/jdk/compare/bb24fa65...a3951c44

The merge commit only contains trivial merges, so no merge-specific webrevs have been generated.

Changes: https://git.openjdk.java.net/jdk/pull/4525/files
  Stats: 845 lines in 26 files changed: 408 ins; 384 del; 53 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4525.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4525/head:pull/4525

PR: https://git.openjdk.java.net/jdk/pull/4525

From jwilhelm at openjdk.java.net  Fri Jun 18 00:56:32 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Fri, 18 Jun 2021 00:56:32 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: 
References: 
Message-ID: 

On Thu, 17 Jun 2021 23:26:26 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: a051e735
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/a051e735cda0d5ee5cb6ce0738aa549a7319a28c
Stats:     845 lines in 26 files changed: 408 ins; 384 del; 53 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4525

From kvn at openjdk.java.net  Fri Jun 18 01:56:27 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 18 Jun 2021 01:56:27 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: 
References: 
Message-ID: <6SkM2mxuRj_IXbux_5Ip5sdYc1aR-UGM-uHcma3PXM0=.f512b6e0-6ada-415d-8a93-b62a1aeb71db@github.com>

On Thu, 17 Jun 2021 15:37:40 GMT, Patricio Chilano Mateo  wrote:

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Very nice clean up. Thank you. I have a small nitpick and a question about the BiasedLocking flags' deprecation. The obsolete flags [table](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L537) says: ```{ "UseBiasedLocking",             JDK_Version::jdk(15), JDK_Version::jdk(18), JDK_Version::jdk(19) },```

It means that in JDK 18 the JVM has to accept the flags on the command line but issue a warning.
Maybe I am mistaken, but it means you cannot remove the flags' declarations.
You can remove the corresponding code.

src/hotspot/cpu/ppc/vm_version_ppc.cpp line 382:

> 380:   if (UseRTMLocking) {
> 381:     // If CPU or OS do not support TM:
> 382:     // Can't continue because UseRTMLocking affects UseBiasedLocking flag

Can you fix `TM` -> `RTM` in the previous line?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From david.holmes at oracle.com  Fri Jun 18 03:17:15 2021
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 18 Jun 2021 13:17:15 +1000
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: <6SkM2mxuRj_IXbux_5Ip5sdYc1aR-UGM-uHcma3PXM0=.f512b6e0-6ada-415d-8a93-b62a1aeb71db@github.com>
References: 
 <6SkM2mxuRj_IXbux_5Ip5sdYc1aR-UGM-uHcma3PXM0=.f512b6e0-6ada-415d-8a93-b62a1aeb71db@github.com>
Message-ID: 

On 18/06/2021 11:56 am, Vladimir Kozlov wrote:
> 
> Very nice clean up. Thank you. I have a small nitpick and a question about the BiasedLocking flags' deprecation. The obsolete flags [table](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/arguments.cpp#L537) says: ```{ "UseBiasedLocking",             JDK_Version::jdk(15), JDK_Version::jdk(18), JDK_Version::jdk(19) },```
> 
> It means that in JDK 18 the JVM has to accept the flags on the command line but issue a warning.

Correct.

> Maybe I am mistaken, but it means you cannot remove the flags' declarations.

You can remove the flag (and must). Obsolete flags are handled purely by 
lookup in the obsolete flag table. Once a flag is obsoleted there should 
only be one occurrence left of it in the source code - in that table. :)
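
The table-driven handling described above can be pictured with a small model. This is only a sketch under stated assumptions: the class `SpecialFlagSketch`, its `handle` method and the returned messages are hypothetical and are not HotSpot code; only the `UseBiasedLocking` row (deprecated in 15, obsolete in 18, expired in 19) comes from the table quoted in this thread.

```java
import java.util.Map;

public class SpecialFlagSketch {
    // Hypothetical model of one table row: the JDK versions in which a flag
    // becomes deprecated, obsolete and expired.
    static final class SpecialFlag {
        final int deprecatedIn;
        final int obsoleteIn;
        final int expiredIn;

        SpecialFlag(int deprecatedIn, int obsoleteIn, int expiredIn) {
            this.deprecatedIn = deprecatedIn;
            this.obsoleteIn = obsoleteIn;
            this.expiredIn = expiredIn;
        }
    }

    // Only the row quoted in the thread; a real table has many more entries.
    static final Map<String, SpecialFlag> TABLE =
            Map.of("UseBiasedLocking", new SpecialFlag(15, 18, 19));

    // Classify a flag purely by table lookup. No flag declaration is needed,
    // which is why the declaration itself can be deleted once a flag is obsolete.
    public static String handle(String flag, int jdkVersion) {
        SpecialFlag f = TABLE.get(flag);
        if (f == null) {
            return "not in table";
        }
        if (jdkVersion >= f.expiredIn) {
            return "rejected as unrecognized";
        }
        if (jdkVersion >= f.obsoleteIn) {
            return "accepted and ignored, with a warning";
        }
        if (jdkVersion >= f.deprecatedIn) {
            return "accepted, with a deprecation warning";
        }
        return "accepted";
    }

    public static void main(String[] args) {
        for (int jdk = 14; jdk <= 19; jdk++) {
            System.out.println("JDK " + jdk + ": " + handle("UseBiasedLocking", jdk));
        }
    }
}
```

In this model the JDK 18 case falls into the "accepted and ignored, with a warning" branch, which matches the behaviour confirmed above.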

Cheers,
David
-----

From dholmes at openjdk.java.net  Fri Jun 18 06:01:29 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 18 Jun 2021 06:01:29 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: 
References: 
Message-ID: 

On Thu, 17 Jun 2021 15:37:40 GMT, Patricio Chilano Mateo  wrote:

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Hi Patricio,

Huge cleanup! Looks great to see so much red. :)

I looked through everything and have a few minor comments below. Can't comment in detail on JIT changes (or whether further improvements are possible) but it all looks okay from an "eradicate biased-locking" perspective.

Thanks,
David

src/hotspot/share/code/nmethod.hpp line 281:

> 279:   // will never cause Class instances to be biased but this code
> 280:   // handles the static synchronized case as well.
> 281:   // JVMTI's GetLocalInstance() also uses these offsets to find the receiver

Not obvious that this entire comment is no longer relevant. The basic description of the use of the offsets seems applicable even if not actually needed for revoking the bias.

src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotVMConfig.java line 156:

> 154:     long prototypeMarkWord() {
> 155:         return markWordNoHashInPlace | markWordNoLockInPlace;
> 156:     }

It is not immediately obvious that this is correct for all object types. Does this match what the initial_mark() is now?
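
To make the question concrete, here is a minimal sketch of how such a prototype value is composed from "no hash" and "not locked" bit patterns. The bit widths and the shift below are assumptions chosen for illustration and are not read from the VM; the class `MarkWordSketch` is hypothetical.

```java
public class MarkWordSketch {
    // Assumed layout of the low mark word bits (illustrative values only).
    static final int LOCK_BITS = 2;        // two lowest bits hold the lock state
    static final int UNUSED_GAP_BITS = 1;  // third bit, unused once biased locking is gone
    static final int AGE_BITS = 4;
    static final int HASH_SHIFT = 8;       // assumed position of the identity hash

    static final long UNLOCKED_VALUE = 0b01; // lock bits of a neutral, unlocked object
    static final long NO_HASH = 0;           // hash not yet computed

    static final long NO_LOCK_IN_PLACE = UNLOCKED_VALUE;
    static final long NO_HASH_IN_PLACE = NO_HASH << HASH_SHIFT;

    // Same shape as prototypeMarkWord() in the quoted JVMCI code.
    public static long prototypeMarkWord() {
        return NO_HASH_IN_PLACE | NO_LOCK_IN_PLACE;
    }

    public static void main(String[] args) {
        long proto = prototypeMarkWord();
        long lockMask = (1L << LOCK_BITS) - 1;
        System.out.println("prototype = 0b" + Long.toBinaryString(proto));
        System.out.println("lock bits = " + (proto & lockMask));
        System.out.println("gap bit   = " + ((proto >> LOCK_BITS) & 1));
    }
}
```

Under these assumptions the prototype has only the lowest bit set, and the third bit stays zero for every object type, which is the property the question is asking about.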

test/hotspot/jtreg/compiler/c2/Test8062950.java line 2:

> 1: /*
> 2:  * Copyright (c) 2018, 2021, Oracle and/or its affiliates. All rights reserved.

I'd argue this test serves no purpose now.

test/hotspot/jtreg/runtime/handshake/HandshakeDirectTest.java line 65:

> 63:                 // Inflate locks[handshakee] if possible
> 64:                 System.identityHashCode(locks[handshakee]);
> 65:                 walked = wb.handshakeReadMonitors(workingThreads[handshakee]);

It is not at all obvious that this revised test and the new WB routine actually test what was previously being tested. Do we actually need to involve monitors here or is that just something that has been picked to examine while in a handshake?

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From yyang at openjdk.java.net  Fri Jun 18 06:17:30 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Fri, 18 Jun 2021 06:17:30 GMT
Subject: RFR: 8268425: Show integer nid of OSThread instead of hex format
 one
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: 

On Thu, 10 Jun 2021 02:07:36 GMT, Yi Yang  wrote:

> From the user's perspective, an integer nid lets us find the corresponding OS thread in top directly; otherwise we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases debugging, but it would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun

Do you think this would facilitate the debugging process? And is it acceptable? Any feedback is appreciated!
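
For reference, the manual conversion that the current hex format forces on users looks roughly like the sketch below. The class and method names are hypothetical helpers, not part of the JDK or of this patch; the sample values are taken from the jstack output quoted above.

```java
public class NidConverter {
    // Convert an nid as printed by the current jstack format (e.g. "0x12e67")
    // into the decimal thread id that tools such as top display.
    public static long hexNidToDecimal(String nid) {
        String digits = (nid.startsWith("0x") || nid.startsWith("0X"))
                ? nid.substring(2)
                : nid;
        return Long.parseLong(digits, 16);
    }

    // The reverse direction, for going from a decimal thread id back to
    // the hex form used in older thread dumps.
    public static String decimalToHexNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    public static void main(String[] args) {
        System.out.println(hexNidToDecimal("0x12e67")); // prints 77415
        System.out.println(decimalToHexNid(117707));    // prints 0x1cbcb
    }
}
```

Tools that parse existing thread dumps would need the same kind of branch on the `0x` prefix to accept both formats.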

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From tschatzl at openjdk.java.net  Fri Jun 18 08:56:34 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Fri, 18 Jun 2021 08:56:34 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v13]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Thu, 17 Jun 2021 15:34:21 GMT, Stefan Johansson  wrote:

>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   sjohanss review - remove debug code
>
> src/hotspot/share/gc/g1/g1CardSet.cpp line 146:
> 
>> 144:   }
>> 145: }
>> 146: 
> 
> Unused from what I can tell.
> Suggestion:

Yes, after the change to add `subtract_from`...

> src/hotspot/share/gc/g1/g1CardSet.cpp line 839:
> 
>> 837:   return occupied() <= limit;
>> 838: }
>> 839: 
> 
> Unused.
> Suggestion:

I forgot to forward to this method in `HeapRegionRemSet::occupancy_less_or_equal_than()`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From luhenry at openjdk.java.net  Fri Jun 18 08:56:32 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Fri, 18 Jun 2021 08:56:32 GMT
Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java
 stacks [v3]
In-Reply-To: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com>
References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com>
Message-ID: 

> When the signal sent for AsyncGetCallTrace or JFR lands on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, the profiler cannot detect the sender (caller) frame, for several reasons. This patch fixes these cases by adding CodeBlob-specific frame parsers, which are in the best position to know how a frame is set up.
> 
> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`.
> 
> # `Prof1`
> 
> public class Prof1 {
> 
>     public static void main(String[] args) {
>         StringBuilder sb = new StringBuilder();
>         for (int i = 0; i < 1000000; i++) {
>             sb.append("ab");
>             sb.delete(0, 1);
>         }
>         System.out.println(sb.length());
>     }
> }
> 
> 
> - Baseline:
> 
> Flat Profile (by method):
>         (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>         (t  0.5,s  0.2) Prof1::main
>         (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal
>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>         (t  0.0,s  0.0) java.lang.String::getBytes
>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>         (t  0.0,s  0.0) java.lang.StringBuilder::delete
>         (t  0.2,s  0.0) java.lang.StringBuilder::append
>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
> 
> - With `StubRoutinesBlob::FrameParser`:
> 
> Flat Profile (by method):
>         (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal
>         (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>         (t 99.8,s  0.2) Prof1::main
>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>         (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>         (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>         (t 98.8,s  0.0) java.lang.StringBuilder::append
>         (t  0.9,s  0.0) java.lang.StringBuilder::delete
> 
> 
> # `Prof2`
> 
> import java.util.function.Supplier;
> 
> public class Prof2 {
> 
>     public static void main(String[] args) {
>         var rand = new java.util.Random(0);
>         Supplier[] suppliers = {
>                 () -> 0,
>                 () -> 1,
>                 () -> 2,
>                 () -> 3,
>         };
> 
>         long sum = 0;
>         for (int i = 0; i >= 0; i++) {
>             sum += (int)suppliers[i % suppliers.length].get();
>         }
>     }
> }
> 
> 
> - Baseline:
> 
> Flat Profile (by method):
>         (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>         (t 39.2,s 35.2) Prof2::main
>         (t  1.4,s  1.4) Prof2::lambda$main$3
>         (t  1.0,s  1.0) Prof2::lambda$main$2
>         (t  0.9,s  0.9) Prof2::lambda$main$1
>         (t  0.7,s  0.7) Prof2::lambda$main$0
>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>         (t  0.0,s  0.0) java.lang.Thread::exit
>         (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>         (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>         (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>         (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
> 
> 
> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
> 
> Flat Profile (by method):
>         (t 74.1,s 70.3) Prof2::main
>         (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>         (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>         (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>         (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>         (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>         (t  1.2,s  1.2) Prof2::lambda$main$2
>         (t  0.9,s  0.9) Prof2::lambda$main$3
>         (t  0.9,s  0.9) Prof2::lambda$main$1
>         (t  0.7,s  0.7) Prof2::lambda$main$0
>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]

Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision:

  Fix comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4436/files
  - new: https://git.openjdk.java.net/jdk/pull/4436/files/85f218c8..b1d8611c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4436&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4436&range=01-02

  Stats: 34 lines in 5 files changed: 14 ins; 20 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4436.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4436/head:pull/4436

PR: https://git.openjdk.java.net/jdk/pull/4436

From tschatzl at openjdk.java.net  Fri Jun 18 09:37:54 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Fri, 18 Jun 2021 09:37:54 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v14]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
>   can I have reviews for this change that significantly refactors the remembered set for more scalability.
> 
>  The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago.
> 
> Over time many problems with performance and in particular memory usage have been observed:
> 
> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. the amount of cards from the refinement buffers to be merged into the card table and then scanned during GC).
> 
> * there is a substantial memory overhead for managing the data structures: examples are
>     * using separate (hash) tables for the three different types of card containers
>     * there is significant unnecessary preallocation of memory for some of the card set containers
>     * Containers store redundant information
> 
> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that.
> Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator.
> 
> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these.
> 
> * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, the implementation and handling are different for every container.
> 
> * the existing granularity of containers is unbalanced: currently there exist three tiers: "sparse", "fine" and "full". "Sparse" is an array of cards holding maybe hundreds of entries, "fine" is a bitmap covering a whole region, and "full" is a bit indicating that the region should be scanned completely during GC.
> 
> The problem is that there is nothing between "no card at all" and "sparse", and in particular that the difference in capacity between "sparse" and "fine" is large. I.e. the memory usage difference when going from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to a "fine" bitmap that is able to hold 65k entries using 8kB is significant.
> For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times.
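
The figures in the paragraph above can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that the description does not state explicitly: a card covers 512 bytes of heap, and a "sparse" entry takes 2 bytes (inferred from 128 entries taking ~256 bytes). The class name is hypothetical.

```java
public class RemSetSizeSketch {
    static final long REGION_BYTES = 32L * 1024 * 1024; // 32M region, from the text
    static final int CARD_BYTES = 512;                  // assumed card size
    static final int SPARSE_ENTRIES = 128;              // from the text
    static final int SPARSE_ENTRY_BYTES = 2;            // inferred from ~256 bytes

    // Number of cards needed to cover one region.
    public static long cardsPerRegion() {
        return REGION_BYTES / CARD_BYTES;
    }

    // A "fine" bitmap needs one bit per card of the region.
    public static long fineBitmapBytes() {
        return cardsPerRegion() / 8;
    }

    // A "sparse" array stores one small index per remembered card.
    public static long sparseArrayBytes() {
        return (long) SPARSE_ENTRIES * SPARSE_ENTRY_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("cards per region:  " + cardsPerRegion());   // 65536
        System.out.println("fine bitmap bytes: " + fineBitmapBytes());  // 8192
        System.out.println("sparse bytes:      " + sparseArrayBytes()); // 256
        System.out.println("jump factor:       " + fineBitmapBytes() / sparseArrayBytes()); // 32
    }
}
```

With these assumptions, adding the 129th card to a region's "sparse" array costs a 32-fold increase in memory for that region, which is the imbalance the new intermediate containers are meant to smooth out.
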
> 
> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108).
> 
> This change is effectively a rewrite of the Java heap card based part of a region's remembered set.
> 
> This initial fully working change can be roughly described with the following properties:
> 
> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259.
> 
> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management.
> 
> * there are now four different container types and one meta-container type. The containers are:
>   * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory.
>   * array of cards: similar to the "sparse" container, an array of cards with a configurable number of entries. However, bulk allocation of memory is now managed at a lower level so there is much less waste.
>   * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory.
>   * full: same as the old "full", indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory.
>   * howl (the meta-container): the Howl container subdivides a given memory range into subranges, each of which may be described by any of the other containers. This is somewhat similar to the idea suggested in JDK-8048504.
> 
> * care has been taken to minimize container memory usage, e.g. by not adding redundant information and in general by carefully specifying the containers. They have been designed with future enhancements in mind.
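
The lock-free replacement of a container inside a table node can be sketched as a compare-and-swap retry loop. This is an illustration of the general technique only, not the code in `g1CardSet.cpp`: all names are hypothetical, the capacity of 4 is arbitrary, and the containers here are immutable snapshots for simplicity, whereas a real implementation would update containers in place.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.concurrent.atomic.AtomicReference;

public class CoarsenSketch {
    static final int ARRAY_LIMIT = 4; // hypothetical capacity before coarsening

    interface Container {
        boolean contains(int card);
        Container withCard(int card); // a container that also holds this card
    }

    static final class ArrayContainer implements Container {
        final int[] cards;
        ArrayContainer(int[] cards) { this.cards = cards; }

        public boolean contains(int card) {
            for (int c : cards) {
                if (c == card) return true;
            }
            return false;
        }

        public Container withCard(int card) {
            if (contains(card)) return this;
            if (cards.length < ARRAY_LIMIT) {
                int[] copy = Arrays.copyOf(cards, cards.length + 1);
                copy[cards.length] = card;
                return new ArrayContainer(copy);
            }
            // Overflow: coarsen to a bitmap holding the old cards plus the new one.
            BitSet bits = new BitSet();
            for (int c : cards) bits.set(c);
            bits.set(card);
            return new BitmapContainer(bits);
        }
    }

    static final class BitmapContainer implements Container {
        final BitSet bits; // never modified after construction
        BitmapContainer(BitSet bits) { this.bits = bits; }

        public boolean contains(int card) { return bits.get(card); }

        public Container withCard(int card) {
            if (bits.get(card)) return this;
            BitSet copy = (BitSet) bits.clone();
            copy.set(card);
            return new BitmapContainer(copy);
        }
    }

    // Stands in for the container slot inside one hash table node.
    private final AtomicReference<Container> slot =
            new AtomicReference<>(new ArrayContainer(new int[0]));

    public void add(int card) {
        while (true) {
            Container current = slot.get();
            Container next = current.withCard(card);
            // Publish with a single CAS; if another thread won the race, retry.
            if (next == current || slot.compareAndSet(current, next)) return;
        }
    }

    public boolean contains(int card) { return slot.get().contains(card); }

    public String kind() { return slot.get().getClass().getSimpleName(); }

    public static void main(String[] args) {
        CoarsenSketch set = new CoarsenSketch();
        for (int card : new int[] {10, 20, 30, 40}) set.add(card);
        System.out.println(set.kind()); // ArrayContainer
        set.add(50);                    // exceeds ARRAY_LIMIT, so the slot is coarsened
        System.out.println(set.kind()); // BitmapContainer
    }
}
```

Because readers only ever see a fully built container through the slot, no lock is needed around the switch from one container type to the next.
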
> 
> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are at most as long as before, or shorter; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
> 
> Testing: tier1-8 many times, manual and automated perf testing

Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits:

 - sjohanss review - more removal of dead code
 - Merge branch 'master' into 8017163-refactor-remembered-set
 - sjohanss review - remove debug code
 - Merge branch 'submit/8017163-refactor-remembered-set' of gh:tschatzl/jdk into 8017163-refactor-remembered-set
 - Merge branch 'master' into submit/8017163-refactor-remembered-set
 - Update obsoletion/removal versions
 - Merge branch 'master' into 8017163-refactor-remembered-set
 - Merge branch 'master' of gh:openjdk/jdk into tschatzl:submit/8017163-refactor-remembered-set
 - Always have power-of-2 Howl buckets to avoid memory waste (these entries have never been used before, just taking a small amount of memory)
 - Improved documentation
 - ... and 12 more: https://git.openjdk.java.net/jdk/compare/a051e735...6df9cf35

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4116/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=13
  Stats: 6114 lines in 64 files changed: 4539 ins; 1317 del; 258 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4116.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116

PR: https://git.openjdk.java.net/jdk/pull/4116

From iwalulya at openjdk.java.net  Fri Jun 18 10:36:34 2021
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Fri, 18 Jun 2021 10:36:34 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v14]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 09:37:54 GMT, Thomas Schatzl  wrote:

Looks good!

A few minor nits.

src/hotspot/share/gc/g1/g1CardSet.cpp line 282:

> 280: 
> 281:   void grow() {
> 282:     // Just double for now.

Remove the comment, or make it clearer what the "for now" indicates.

src/hotspot/share/gc/g1/g1CardSet.cpp line 457:

> 455:       break;
> 456:     }
> 457:     // Card set has overflown. Coarsen and retry.

Suggestion:

    // Card set has overflown. Coarsen or retry.

src/hotspot/share/gc/g1/g1CardSet.cpp line 582:

> 580: 
> 581:   // We only need to transfer from anything below CardSetHowl. "Full" contains
> 582:   // all elements anyway.

These two comments should be merged; they seem to repeat the same information.

src/hotspot/share/gc/g1/g1CardSet.cpp line 681:

> 679:       break;
> 680:     }
> 681:     // Card set has overflown. Coarsen and retry.

Suggestion:

    // Card set has overflown. Coarsen or retry.

-------------

Marked as reviewed by iwalulya (Committer).

PR: https://git.openjdk.java.net/jdk/pull/4116

From tschatzl at openjdk.java.net  Fri Jun 18 11:05:00 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Fri, 18 Jun 2021 11:05:00 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v15]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
>   can I have reviews for this change that significantly refactors the remembered set for more scalability.
> 
>  The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago.
> 
> Over time many problems with performance and in particular memory usage have been observed:
> 
> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads so does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc).
> 
> * there is a substantial memory overhead for managing the data structures: examples are
>     * using separate (hash) tables for the three different types of card containers
>     * there is significant unnecessary preallocation of memory for some of the card set containers
>     * Containers store redundant information
> 
> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that.
> Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator.
> 
> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these.
> 
> * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, these implementations and handling is different for every container.
> 
> * the existing granularity of containers are unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC.
> 
> The problem is that there is nothing between "no card at all" and "sparse" and in particular the difference between the capability to hold entries of "sparse" and "fine". I.e. memory usage difference when exceeding a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to fine that is able to hold 65k entries using 8kB is significant.
> For these reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage. With extremely bad consequences in pause times.
> 
> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108).
> 
> This change is effectively a rewrite of the Java heap card based part of a region's remembered set.
> 
> This initial fully working change can be roughly described with the following properties:
> 
> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259.
> 
> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by arena-style bump-pointer allocation buffers, kept per container type and per remembered set. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager, but this is abstracted away and could be replaced by manual page memory management.
> 
> * there are now four different container types and one meta-container type. These four actual containers are:
>   * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory.
>   * array of cards: similar to the "sparse" container, an array of cards with a configurable number of entries. However, bulk allocation of memory is now managed at a lower level so there is much less waste.
>   * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory
>   * full: same as the old "full", indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory.
>   * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504.
> 
> * care has been taken to minimize container memory usage, e.g. by not adding redundant information and by specifying the containers carefully. They have been designed with future enhancements in mind.
> 
> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are the same as or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
> 
> Testing: tier1-8 many times, manual and automated perf testing
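
The size gap between the old "sparse" and "fine" containers quoted above can be reproduced with a little arithmetic. What follows is only a sketch under stated assumptions: the 512-byte card size and the 2-byte sparse entry are not given in the description, and `pick_container` with its thresholds is a hypothetical illustration of the new tiering, not the code in the pull request.

```cpp
#include <cassert>
#include <cstddef>

// Assumptions (not stated in the description): 512-byte cards, and a
// "sparse" entry that stores a 16-bit card index.
constexpr std::size_t region_bytes       = 32u * 1024 * 1024;             // 32M region
constexpr std::size_t card_bytes         = 512;
constexpr std::size_t cards_per_region   = region_bytes / card_bytes;     // 65536 cards
constexpr std::size_t sparse_entries     = 128;
constexpr std::size_t sparse_entry_bytes = 2;
constexpr std::size_t sparse_bytes       = sparse_entries * sparse_entry_bytes;  // 256 bytes
constexpr std::size_t fine_bytes         = cards_per_region / 8;          // one bit per card: 8192 bytes

// Hypothetical tiering for one (sub-)range of the heap. The names mirror the
// container types listed in the description; the thresholds are parameters
// here because the real values are configurable and not given in the mail.
enum class Container { InlinePointer, ArrayOfCards, Bitmap, Full };

constexpr Container pick_container(std::size_t cards,
                                   std::size_t inline_max,
                                   std::size_t array_max,
                                   std::size_t range_cards) {
  return cards <= inline_max  ? Container::InlinePointer  // fits in the CHT node itself
       : cards <= array_max   ? Container::ArrayOfCards   // small explicit list of cards
       : cards <  range_cards ? Container::Bitmap         // one bit per card in the range
       :                        Container::Full;          // every card is set: scan everything
}
```

Under these assumptions, moving from a full "sparse" array to a "fine" bitmap is a 32x jump in memory (256 bytes to 8 kB), which is the imbalance the additional container tiers are meant to smooth out.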

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  iwalulya review - comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4116/files
  - new: https://git.openjdk.java.net/jdk/pull/4116/files/6df9cf35..b8e67fe8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=14
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4116&range=13-14

  Stats: 7 lines in 1 file changed: 0 ins; 4 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4116.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4116/head:pull/4116

PR: https://git.openjdk.java.net/jdk/pull/4116

From kvn at openjdk.java.net  Fri Jun 18 13:03:30 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 18 Jun 2021 13:03:30 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: 
References: 
Message-ID: 

On Thu, 17 Jun 2021 15:37:40 GMT, Patricio Chilano Mateo  wrote:

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Thank you, David, for the explanation.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 15:04:28 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 15:04:28 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:

 - remove test Test8062950.java + fix commments
 - fix comment in vm_version_ppc.cpp

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4522/files
  - new: https://git.openjdk.java.net/jdk/pull/4522/files/cba01d01..5f844d36

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=00-01

  Stats: 57 lines in 3 files changed: 6 ins; 48 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 15:04:32 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 15:04:32 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: <6SkM2mxuRj_IXbux_5Ip5sdYc1aR-UGM-uHcma3PXM0=.f512b6e0-6ada-415d-8a93-b62a1aeb71db@github.com>
References: 
 <6SkM2mxuRj_IXbux_5Ip5sdYc1aR-UGM-uHcma3PXM0=.f512b6e0-6ada-415d-8a93-b62a1aeb71db@github.com>
Message-ID: 

On Thu, 17 Jun 2021 21:39:25 GMT, Vladimir Kozlov  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - remove test Test8062950.java + fix commments
>>  - fix comment in vm_version_ppc.cpp
>
> src/hotspot/cpu/ppc/vm_version_ppc.cpp line 382:
> 
>> 380:   if (UseRTMLocking) {
>> 381:     // If CPU or OS do not support TM:
>> 382:     // Can't continue because UseRTMLocking affects UseBiasedLocking flag
> 
> Can you fix in previous line `TM` -> `RTM`

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 15:04:42 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 15:04:42 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: 
References: 
 
Message-ID: <9mKT_9Ndf7h1pCC4qwLh3SZpjWmhkp_Icd3-y2kZqLI=.cb497443-0753-407b-bc7d-fe843f190a69@github.com>

On Thu, 17 Jun 2021 23:48:57 GMT, David Holmes  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - remove test Test8062950.java + fix commments
>>  - fix comment in vm_version_ppc.cpp
>
> src/hotspot/share/code/nmethod.hpp line 281:
> 
>> 279:   nmethod(Method* method,
>> 280:           CompilerType type,
>> 281:           int nmethod_size,
> 
> Not obvious that this entire comment is no longer relevant. The basic description of the use of the offsets seems applicable even if not actually needed for revoking the bias.

Fixed. I restored most of the comment but removed the biased locking references.

> src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotVMConfig.java line 156:
> 
>> 154:     long prototypeMarkWord() {
>> 155:         return markWordNoHashInPlace | markWordNoLockInPlace;
>> 156:     }
> 
> It is not immediately obvious that this is correct for all object types. Does this match what the initial_mark() is now?

Yes, with biased locking gone there is now only one prototype for the markword. This should match markWord::prototype().
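
As a rough illustration of why a single prototype is enough, the low bits of the mark word can be modelled as below. This is a sketch, not the HotSpot source: the bit widths and lock encodings are assumptions based on the usual 64-bit layout, and the `markword_sketch` namespace and its helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the low mark word bits once biased locking is gone:
// [ ... hash ... | age:4 | unused:1 | lock:2 ]
namespace markword_sketch {

constexpr int lock_bits       = 2;
constexpr int unused_gap_bits = 1;  // the bit formerly used for biased locking
constexpr int age_bits        = 4;
constexpr int lock_shift      = 0;
constexpr int age_shift       = lock_bits + unused_gap_bits;

constexpr std::uintptr_t lock_mask =
    (std::uintptr_t(1) << lock_bits) - 1;                 // 0b11
constexpr std::uintptr_t age_mask_in_place =
    ((std::uintptr_t(1) << age_bits) - 1) << age_shift;   // 0b1111000

constexpr std::uintptr_t unlocked_value = 1;  // lock bits 0b01 (assumed encoding)
constexpr std::uintptr_t no_hash        = 0;  // no identity hash installed yet

// "In place" means shifted to the field's position; a zero hash stays zero
// wherever the hash field sits, so its shift does not matter here.
constexpr std::uintptr_t no_hash_in_place = no_hash;
constexpr std::uintptr_t no_lock_in_place = unlocked_value << lock_shift;

// The single prototype: unlocked, no hash, age 0, unused bit clear.
constexpr std::uintptr_t prototype() { return no_hash_in_place | no_lock_in_place; }

constexpr bool is_unlocked(std::uintptr_t mark) {
  return (mark & lock_mask) == unlocked_value;
}

constexpr unsigned age(std::uintptr_t mark) {
  return static_cast<unsigned>((mark & age_mask_in_place) >> age_shift);
}

}  // namespace markword_sketch
```

With biased locking, a class could hand out a second, "biasable" prototype with the third bit set; in this model that bit is always zero, so every object starts from the same value.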

> test/hotspot/jtreg/compiler/c2/Test8062950.java line 2:
> 
>> 1: /*
>> 2:  * Copyright (c) 2018, 2021, Oracle and/or its affiliates. All rights reserved.
> 
> I'd argue this test serves no purpose now.

Right, removed.

> test/hotspot/jtreg/runtime/handshake/HandshakeDirectTest.java line 65:
> 
>> 63:                 // Inflate locks[handshakee] if possible
>> 64:                 System.identityHashCode(locks[handshakee]);
>> 65:                 walked = wb.handshakeReadMonitors(workingThreads[handshakee]);
> 
> It is not at all obvious that this revised test and the new WB routine actually test what was previously being tested. Do we actually need to involve monitors here or is that just something that has been picked to examine while in a handshake?

So the purpose of this test was to exercise direct handshakes between threads. Back then I used biased locking because it was the only feature that used them (and I didn't think about using whitebox). Since the test has proven good at uncovering bugs, I tried to keep the same elements. Walking the stack looking for monitors is one. Then I added System.identityHashCode() to force transitions from stack-locked to inflated whenever possible, so that we also change the state of the lock as we did with biased locking (from biased to stack-locked).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 15:06:13 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 15:06:13 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: 
References: 
Message-ID: 

On Thu, 17 Jun 2021 15:37:40 GMT, Patricio Chilano Mateo  wrote:

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Thanks for the reviews Vladimir and David!

Patricio

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From dcubed at openjdk.java.net  Fri Jun 18 16:40:50 2021
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 18 Jun 2021 16:40:50 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 15:04:28 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - remove test Test8062950.java + fix commments
>  - fix comment in vm_version_ppc.cpp

Wow! 163 files touched... Biased Locking was certainly more
widespread than I imagined/remembered.

Thumbs up! I only spotted very minor nits.

Thanks for persisting on this patch. Keeping it up to date over a
couple of releases is hard with a 163 file footprint...

src/hotspot/cpu/arm/arm.ad line 5457:

> 5455:     __ b(loop, eq);
> 5456:     __ teq($tmp$$Register, 0);
> 5457:     __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::LoadStore | MacroAssembler::LoadLoad), noreg);

Does the comment you deleted mean that storeXConditional() is only used
by biased locking, or that biased locking's use of storeXConditional() is
the only caller that needed the membar?

src/hotspot/cpu/arm/vm_version_arm_32.cpp line 362:

> 360:   // Therefore the Biased Locking is enabled on ARMv5 and ARM MP only.
> 361:   //
> 362:   return (!os::is_MP() && (arm_arch() > 5)) ? false : true;

Wow. The gory details are amazing...

src/hotspot/cpu/ppc/ppc.ad line 12140:

> 12138:                                  _rtm_counters, _stack_rtm_counters,
> 12139:                                  ((Method*)(ra_->C->method()->constant_encoding()))->method_data(),
> 12140:                                  /*TM*/ true, ra_->C->profile_rtm());

Not your bug, but that "TM" should be "RTM".

src/hotspot/cpu/ppc/ppc.ad line 12174:

> 12172:     __ compiler_fast_unlock_object($crx$$CondRegister, $oop$$Register, $box$$Register,
> 12173:                                    $tmp1$$Register, $tmp2$$Register, $tmp3$$Register,
> 12174:                                    /*TM*/ true);

Not your bug, but that "TM" should be "RTM".

src/hotspot/cpu/x86/templateTable_x86.cpp line 4026:

> 4024:     // initialize object header only.
> 4025:     __ bind(initialize_header);
> 4026:     __ movptr(Address(rax, oopDesc::mark_offset_in_bytes ()),

Not your bug, but can you delete the space before `()`?

src/hotspot/share/runtime/deoptimization.cpp line 1440:

> 1438:           if (mark.has_locker() && fr.sp() > (intptr_t*)mark.locker()) {
> 1439:             // With exec_mode == Unpack_none obj may be thread local and locked in
> 1440:             // a callee frame. // Make the lock in the callee a recursive lock and restore the displaced header.

Please delete the embedded `//`.

-------------

Marked as reviewed by dcubed (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From cjplummer at openjdk.java.net  Fri Jun 18 18:15:36 2021
From: cjplummer at openjdk.java.net (Chris Plummer)
Date: Fri, 18 Jun 2021 18:15:36 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 15:04:28 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - remove test Test8062950.java + fix commments
>  - fix comment in vm_version_ppc.cpp

test/jdk/com/sun/jdi/EATests.java line 52:

> 50:  *                 -XX:+WhiteBoxAPI
> 51:  *                 -Xbatch
> 52:  *                 -XX:+DoEscapeAnalysis -XX:+EliminateAllocations -XX:-EliminateLocks -XX:+EliminateNestedLocks -XX:+UseBiasedLocking -XX:-UseOptoBiasInlining

I don't see this combination of flags in the new diff. I think the approach should be to remove the biased locking flags, and then remove any duplicate test runs that result from doing that.
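
The suggested approach (drop the biased locking flags, then drop any runs that have become identical) can be sketched as follows. This is only an illustration: `FlagList`, `is_biased_locking_flag` and `strip_and_dedupe` are hypothetical helpers, and the real change is a manual edit of the `@run` lines in the test header.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One test run is modelled as its ordered list of VM flags.
using FlagList = std::vector<std::string>;

// True for either polarity (+/-) of the two biased locking flags.
inline bool is_biased_locking_flag(const std::string& flag) {
  return flag.find("UseBiasedLocking") != std::string::npos ||
         flag.find("UseOptoBiasInlining") != std::string::npos;
}

// Remove the biased locking flags from every run, then keep only the first
// occurrence of each remaining flag combination, preserving run order.
inline std::vector<FlagList> strip_and_dedupe(const std::vector<FlagList>& runs) {
  std::vector<FlagList> result;
  std::set<FlagList> seen;
  for (FlagList run : runs) {  // copy on purpose: each run is edited locally
    run.erase(std::remove_if(run.begin(), run.end(), is_biased_locking_flag),
              run.end());
    if (seen.insert(run).second) {
      result.push_back(run);
    }
  }
  return result;
}
```

Two runs that differed only in `-XX:+UseBiasedLocking` versus `-XX:-UseBiasedLocking` collapse into one, while a run whose remaining flags are unique survives.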

test/jdk/com/sun/jdi/EATests.java line 235:

> 233:         // Relocking test cases
> 234:         new EARelockingSimpleTarget()                                                       .run();
> 235:         new EARelockingSimple_2Target()                                                     .run();

I know all the tests that were removed mention biased locking in the comments, but do they require biased locking to function properly? I'm just wondering if we might get better EA test coverage if they are left in place.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 18:59:32 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 18:59:32 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v3]
In-Reply-To: 
References: 
Message-ID: <_-xnCSNtYtcahG9u3XdGmXBkDOMJFx0pSgTw8KJmd7U=.b295c374-e99e-4d3b-9e14-83935a1ea1dd@github.com>

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:

  Dan's comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4522/files
  - new: https://git.openjdk.java.net/jdk/pull/4522/files/5f844d36..cb3b5e22

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=01-02

  Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 18:59:43 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 18:59:43 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v2]
In-Reply-To: 
References: 
 
 
Message-ID: <9MAr08guijfyHDWKY9z_brRhegZz_N1u0gAGklZIwFU=.840ced6d-01db-4679-8e86-d78392017e50@github.com>

On Fri, 18 Jun 2021 15:34:07 GMT, Daniel D. Daugherty  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - remove test Test8062950.java + fix commments
>>  - fix comment in vm_version_ppc.cpp
>
> src/hotspot/cpu/arm/arm.ad line 5457:
> 
>> 5455:     __ b(loop, eq);
>> 5456:     __ teq($tmp$$Register, 0);
>> 5457:     __ membar(MacroAssembler::Membar_mask_bits(MacroAssembler::LoadStore | MacroAssembler::LoadLoad), noreg);
> 
> Does the comment you deleted mean that storeXConditional() is only used
> by biased locking, or that biased locking's use of storeXConditional() is
> the only caller that needed the membar?

If I grep for "new StoreIConditional" I don't find anything, same with StoreLConditional so it seems they are not used outside of biased locking (?). StoreIConditional was actually introduced for biased locking (6462850). I searched back in history and StoreLConditional appears first in opto/classes.hpp in 2002. Maybe @vnkozlov could confirm if we should keep them or remove them?

> src/hotspot/cpu/ppc/ppc.ad line 12140:
> 
>> 12138:                                  _rtm_counters, _stack_rtm_counters,
>> 12139:                                  ((Method*)(ra_->C->method()->constant_encoding()))->method_data(),
>> 12140:                                  /*TM*/ true, ra_->C->profile_rtm());
> 
> Not your bug, but that "TM" should be "RTM".

Fixed.

> src/hotspot/cpu/ppc/ppc.ad line 12174:
> 
>> 12172:     __ compiler_fast_unlock_object($crx$$CondRegister, $oop$$Register, $box$$Register,
>> 12173:                                    $tmp1$$Register, $tmp2$$Register, $tmp3$$Register,
>> 12174:                                    /*TM*/ true);
> 
> Not your bug, but that "TM" should be "RTM".

Fixed.

> src/hotspot/cpu/x86/templateTable_x86.cpp line 4026:
> 
>> 4024:     // initialize object header only.
>> 4025:     __ bind(initialize_header);
>> 4026:     __ movptr(Address(rax, oopDesc::mark_offset_in_bytes ()),
> 
> Not your bug, but can you delete the space before `()`?

Fixed.

> src/hotspot/share/runtime/deoptimization.cpp line 1440:
> 
>> 1438:           if (mark.has_locker() && fr.sp() > (intptr_t*)mark.locker()) {
>> 1439:             // With exec_mode == Unpack_none obj may be thread local and locked in
>> 1440:             // a callee frame. // Make the lock in the callee a recursive lock and restore the displaced header.
> 
> Please delete the embedded `//`.

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From dcubed at openjdk.java.net  Fri Jun 18 19:11:46 2021
From: dcubed at openjdk.java.net (Daniel D.Daugherty)
Date: Fri, 18 Jun 2021 19:11:46 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v3]
In-Reply-To: <_-xnCSNtYtcahG9u3XdGmXBkDOMJFx0pSgTw8KJmd7U=.b295c374-e99e-4d3b-9e14-83935a1ea1dd@github.com>
References: 
 <_-xnCSNtYtcahG9u3XdGmXBkDOMJFx0pSgTw8KJmd7U=.b295c374-e99e-4d3b-9e14-83935a1ea1dd@github.com>
Message-ID: 

On Fri, 18 Jun 2021 18:59:32 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Dan's comments

Re-reviewed the v02 incremental webrev.
Still thumbs up!

-------------

Marked as reviewed by dcubed (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 19:23:25 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 19:23:25 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio

Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:

  restore run in EATests.java

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4522/files
  - new: https://git.openjdk.java.net/jdk/pull/4522/files/cb3b5e22..215e46b8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=02-03

  Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 19:23:26 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 19:23:26 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v3]
In-Reply-To: 
References: 
 <_-xnCSNtYtcahG9u3XdGmXBkDOMJFx0pSgTw8KJmd7U=.b295c374-e99e-4d3b-9e14-83935a1ea1dd@github.com>
 
Message-ID: 

On Fri, 18 Jun 2021 19:08:40 GMT, Daniel D. Daugherty  wrote:

> Re-reviewed the v02 incremental webrev.
> Still thumbs up!
Thanks for the review Dan!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Fri Jun 18 19:23:31 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Fri, 18 Jun 2021 19:23:31 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
 
Message-ID: <7Kh5bbLKYK3eeJKt5zDWevHMRlSSmVryy3OuCBHZvIQ=.155672c0-cf1e-4258-bfd2-489bd6aa7058@github.com>

On Fri, 18 Jun 2021 18:00:59 GMT, Chris Plummer  wrote:

>> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   restore run in EATests.java
>
> test/jdk/com/sun/jdi/EATests.java line 52:
> 
>> 50:  *                 -XX:+WhiteBoxAPI
>> 51:  *                 -Xbatch
>> 52:  *                 -XX:+DoEscapeAnalysis -XX:+EliminateAllocations -XX:-EliminateLocks -XX:+EliminateNestedLocks -XX:+UseBiasedLocking -XX:-UseOptoBiasInlining
> 
> I don't see this combination of flags in the new diff. I think the approach should be to remove the biased locking flags, and then remove any duplicate test runs that result from doing that.

Sorry I restored that run. I must have confused the +/- with the previous one.

> test/jdk/com/sun/jdi/EATests.java line 235:
> 
>> 233:         // Relocking test cases
>> 234:         new EARelockingSimpleTarget()                                                       .run();
>> 235:         new EARelockingSimple_2Target()                                                     .run();
> 
> I know all the tests that were removed mention biased locking in the comments, but do they require biased locking to function properly? I'm just wondering if we might get better EA test coverage if they are left in place.

They are trying to exercise some biased locking specific paths, but maybe @reinrich can comment on whether it is worth keeping them since he wrote the tests.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From github.com+6704669+asgibbons at openjdk.java.net  Fri Jun 18 22:12:11 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Fri, 18 Jun 2021 22:12:11 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v5]
In-Reply-To: 
References: 
Message-ID: 

> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
> 
> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
> 
> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
> 
> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
> 
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
> 
> 
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Added comments.  Streamlined flow for decode.
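
As a rough illustration of the block structure described above (256 bytes per loop iteration, then 64-byte chunks, then end fixup), the following sketch only does the arithmetic of splitting an input length. It is not the stub's control flow: the real intrinsic may apply minimum-length thresholds or other entry conditions that are not visible here, and the class and method names are hypothetical.

```java
// Arithmetic-only sketch: how an input length divides into 256-byte
// blocks, 64-byte chunks and a tail left for the end fixup.
final class DecodeChunks {
    // Returns {blocks of 256, chunks of 64, remaining tail bytes}.
    static int[] split(int length) {
        int blocks256 = length / 256;
        int rest = length % 256;
        int chunks64 = rest / 64;
        int tail = rest % 64;
        return new int[] {blocks256, chunks64, tail};
    }

    // Base64 maps 4 input characters to 3 output bytes, so a full
    // 256-character block yields 192 bytes and a 64-character chunk 48.
    static int decodedBytesFromFullChunks(int length) {
        int[] s = split(length);
        return s[0] * 192 + s[1] * 48;
    }

    public static void main(String[] args) {
        int[] s = split(1000);
        System.out.println(s[0] + " blocks, " + s[1] + " chunks, " + s[2] + " tail bytes");
        System.out.println(decodedBytesFromFullChunks(1000) + " bytes from full chunks");
    }
}
```

For an input of 1000 characters this gives 3 blocks, 3 chunks and a 40-byte tail.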

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4368/files
  - new: https://git.openjdk.java.net/jdk/pull/4368/files/247f2245..bb73df6c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=03-04

  Stats: 44 lines in 1 file changed: 18 ins; 10 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4368.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368

PR: https://git.openjdk.java.net/jdk/pull/4368
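
For context on the "invalid Base64 characters every 76 characters or so" remark in the description above, here is a small sketch using only the public java.util.Base64 API. It shows that the MIME encoder inserts a CR LF after every 76 output characters, that the basic decoder rejects those characters, and that the MIME decoder skips them. It says nothing about how the intrinsic handles them internally; the class and helper names are made up for the example.

```java
import java.util.Arrays;
import java.util.Base64;

// Demonstrates why MIME input contains non-alphabet characters at
// regular intervals, using only the public java.util.Base64 API.
final class MimeLineBreaks {
    static String mimeEncode(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    static byte[] mimeDecode(String s) {
        return Base64.getMimeDecoder().decode(s);
    }

    // The basic decoder throws IllegalArgumentException on characters
    // outside the Base64 alphabet, including CR and LF.
    static boolean basicDecoderAccepts(String s) {
        try {
            Base64.getDecoder().decode(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        String mime = mimeEncode(data);
        System.out.println("first line break at index " + mime.indexOf("\r\n"));
        System.out.println("basic decoder accepts: " + basicDecoderAccepts(mime));
        System.out.println("MIME round trip ok: " + Arrays.equals(data, mimeDecode(mime)));
        // Padding means the decoded length need not be a multiple of 3.
        System.out.println("decoded length of QQ==: " + Base64.getDecoder().decode("QQ==").length);
    }
}
```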

From jwilhelm at openjdk.java.net  Fri Jun 18 22:26:34 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Fri, 18 Jun 2021 22:26:34 GMT
Subject: RFR: Merge jdk17
Message-ID: 

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8268316: Typo in JFR jdk.Deserialization event
 - 8268638: semaphores of AsyncLogWriter may be broken when JVM is exiting.
 - 8264775: ClhsdbFindPC still fails with java.lang.RuntimeException: 'In java stack' missing from stdout/stderr
 - 8265073: XML transformation and indentation when using xml:space
 - 8269025: jsig/Testjsig.java doesn't check exit code
 - 8266518: Refactor and expand scatter/gather tests
 - 8268903: JFR: RecordingStream::dump is missing @since
 - 8265369: [macos-aarch64] java/net/MulticastSocket/Promiscuous.java failed with "SocketException: Cannot allocate memory"
 - 8268564: mark hotspot serviceability/attach tests which ignore external VM flags
 - ... and 13 more: https://git.openjdk.java.net/jdk/compare/8f2456e5...ed622f4b

The merge commit only contains trivial merges, so no merge-specific webrevs have been generated.

Changes: https://git.openjdk.java.net/jdk/pull/4533/files
  Stats: 12229 lines in 119 files changed: 6768 ins; 5337 del; 124 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4533.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4533/head:pull/4533

PR: https://git.openjdk.java.net/jdk/pull/4533

From jwilhelm at openjdk.java.net  Fri Jun 18 23:08:32 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Fri, 18 Jun 2021 23:08:32 GMT
Subject: RFR: Merge jdk17 [v2]
In-Reply-To: 
References: 
Message-ID: 

> Forwardport JDK 17 -> JDK 18

Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 46 commits:

 - Merge
 - 8267042: bug in monitor locking/unlocking on ARM32 C1 due to uninitialized BasicObjectLock::_displaced_header
   
   Co-authored-by: Chris Cole 
   Reviewed-by: dsamersoff
 - 8268964: Remove unused ReferenceProcessorAtomicMutator
   
   Reviewed-by: tschatzl, pliden
 - 8268900: com/sun/net/httpserver/Headers.java: Fix indentation and whitespace
   
   Reviewed-by: dfuchs, chegar, michaelm
 - Merge
 - 8268678: LetsEncryptCA.java test fails as Let's Encrypt Authority X3 is retired
   
   Reviewed-by: xuelei
 - 8267189: Remove duplicated unregistered classes from dynamic archive
   
   Reviewed-by: ccheung, minqi
 - 8268638: semaphores of AsyncLogWriter may be broken when JVM is exiting.
   
   Reviewed-by: dholmes, phh
 - 8268556: Use bitmap for storing regions that failed evacuation
   
   Reviewed-by: kbarrett, iwalulya, sjohanss
 - 8268294: Reusing HttpClient in a WebSocket.Listener hangs.
   
   Reviewed-by: dfuchs
 - ... and 36 more: https://git.openjdk.java.net/jdk/compare/b8f073be...ed622f4b

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4533/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4533&range=01
  Stats: 8681 lines in 159 files changed: 7992 ins; 386 del; 303 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4533.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4533/head:pull/4533

PR: https://git.openjdk.java.net/jdk/pull/4533

From jwilhelm at openjdk.java.net  Fri Jun 18 23:08:36 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Fri, 18 Jun 2021 23:08:36 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: 
References: 
Message-ID: <2lNbBvRChZKSawaU5THoDLfbazhJcqIyt1V_Ew-3kNY=.f7fb1cfd-ecda-4fd3-874d-311a6d8362b9@github.com>

On Fri, 18 Jun 2021 22:17:41 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: b7d78a5b
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/b7d78a5b661e2b00f271298db3b6cc873cf754e7
Stats:     12229 lines in 119 files changed: 6768 ins; 5337 del; 124 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4533

From sviswanathan at openjdk.java.net  Sat Jun 19 00:49:38 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Sat, 19 Jun 2021 00:49:38 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v5]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons  wrote:

> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added comments.  Streamlined flow for decode.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6004:

> 6002:       __ BIND(L_continue);
> 6003: 
> 6004:       __ vpxor(errorvec, errorvec, errorvec, Assembler::AVX_512bit);

Why is clearing errorvec needed here?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6023:

> 6021:       __ evmovdquq(tmp16_op3, pack16_op, Assembler::AVX_512bit);
> 6022:       __ evmovdquq(tmp16_op2, pack16_op, Assembler::AVX_512bit);
> 6023:       __ evmovdquq(tmp16_op1, pack16_op, Assembler::AVX_512bit);

Why do we need 3 additional copies of pack16_op?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6026:

> 6024:       __ evmovdquq(tmp32_op3, pack32_op, Assembler::AVX_512bit);
> 6025:       __ evmovdquq(tmp32_op2, pack32_op, Assembler::AVX_512bit);
> 6026:       __ evmovdquq(tmp32_op1, pack32_op, Assembler::AVX_512bit);

Why do we need 3 additional copies of pack32_op?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6051:

> 6049:       __ vpternlogd(t0, 0xfe, input1, input2, Assembler::AVX_512bit);
> 6050: 
> 6051:       __ vpternlogd(t1, 0xfe, translated0, translated1, Assembler::AVX_512bit);

The comment here could be something like below for easy understanding:
// OR all of the inputs and translations together ...
// Here t0 has input0 and t1 has input3

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6055:

> 6053:       __ vpternlogd(t2, 0xfe, translated3, t0, Assembler::AVX_512bit);
> 6054:       __ evmovdquq(errorvec, t0, Assembler::AVX_512bit);
> 6055:       __ vpternlogd(errorvec, 0xfe, t1, t2, Assembler::AVX_512bit);

Could this be simplified as:
__ vpternlogd(t0, 0xfe, translated2, translated3, Assembler::AVX_512bit);
__ vpor(errorvec, t0, t1, Assembler::AVX_512bit);
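
For readers unfamiliar with the 0xfe immediate: vpternlogd uses, for each bit position, the three input bits (destination, first source, second source) as an index into the 8-bit immediate. With 0xfe only index 0 maps to 0, so the result is the OR of all three operands. The sketch below models this on 32-bit ints; it is a plain Java model with hypothetical names, not assembler code.

```java
// Bitwise model of a three-input ternary logic operation: the bits of
// a (destination), b and c select one bit of imm8 per bit position.
final class TernaryLogic {
    static int ternlog(int imm8, int a, int b, int c) {
        int result = 0;
        for (int bit = 0; bit < 32; bit++) {
            int idx = (((a >>> bit) & 1) << 2)
                    | (((b >>> bit) & 1) << 1)
                    | ((c >>> bit) & 1);
            result |= ((imm8 >>> idx) & 1) << bit;
        }
        return result;
    }

    public static void main(String[] args) {
        int a = 0x0F00, b = 0x00F0, c = 0x000F;
        // 0xFE is a three-way OR.
        System.out.println(ternlog(0xFE, a, b, c) == (a | b | c));
        // With a zeroed destination it degenerates to a two-way OR.
        System.out.println(ternlog(0xFE, 0, b, c) == (b | c));
    }
}
```

The second check is the reason a three-operand OR can be replaced by a plain two-input OR only when the destination does not carry bits that must be preserved.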

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6060:

> 6058:       __ evpmovb2m(k3, errorvec, Assembler::AVX_512bit);
> 6059:       __ kortestql(k3, k3);
> 6060:       __ vpxor(errorvec, errorvec, errorvec, Assembler::AVX_512bit);

Why is clearing errorvec needed here? It seems unnecessary for the 256-byte processing loop.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6069:

> 6067:       __ vpmaddubsw(merge_ab_bc0, translated0, tmp16_op3, Assembler::AVX_512bit);
> 6068:       __ vpmaddubsw(merge_ab_bc1, translated1, tmp16_op2, Assembler::AVX_512bit);
> 6069:       __ vpmaddubsw(merge_ab_bc2, translated2, tmp16_op1, Assembler::AVX_512bit);

Could we not use pack16_op directly here instead of its copies tmp16_*?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6078:

> 6076:       __ vpmaddwd(merged0, merge_ab_bc0, tmp32_op2, Assembler::AVX_512bit);
> 6077:       __ vpmaddwd(merged1, merge_ab_bc1, tmp32_op1, Assembler::AVX_512bit);
> 6078:       __ vpmaddwd(merged2, merge_ab_bc2, tmp32_op3, Assembler::AVX_512bit);

Could we not use pack32_op directly here instead of its copies tmp32_*?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6086:

> 6084:       __ evpermt2b(arr01, join01, merged1, Assembler::AVX_512bit);
> 6085:       __ evpermt2b(arr12, join12, merged2, Assembler::AVX_512bit);
> 6086:       __ evpermt2b(arr23, join23, merged3, Assembler::AVX_512bit);

arr01 is the same as merged0, arr12 is the same as merged1, and arr23 is the same as merged2.
So the above would be easier to understand if coded as below:
__ evpermt2b(merged0, join01, merged1, Assembler::AVX_512bit);
__ evpermt2b(merged1, join12, merged2, Assembler::AVX_512bit);
__ evpermt2b(merged2, join23, merged3, Assembler::AVX_512bit);

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6091:

> 6089:       __ evmovdquq(Address(dest, dp, Address::times_1, 0x00), arr01, Assembler::AVX_512bit);
> 6090:       __ evmovdquq(Address(dest, dp, Address::times_1, 0x40), arr12, Assembler::AVX_512bit);
> 6091:       __ evmovdquq(Address(dest, dp, Address::times_1, 0x80), arr23, Assembler::AVX_512bit);

Here you can directly use merged0, merged1 and merged2 in place of arr01, arr12 and arr23 respectively.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6122:

> 6120:       __ evpermt2b(translated0, input0, lookup_hi, Assembler::AVX_512bit);
> 6121: 
> 6122:       __ vpternlogd(errorvec, 0xfe, translated0, input0, Assembler::AVX_512bit);

This could be a simple vpor:
__ vpor(errorvec, translated0, input0, Assembler::AVX_512bit);

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From sviswanathan at openjdk.java.net  Sat Jun 19 20:33:31 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Sat, 19 Jun 2021 20:33:31 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v5]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons  wrote:

> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added comments.  Streamlined flow for decode.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6155:

> 6153:       __ subl(output_size, length);
> 6154:       __ movq(rax, -1);
> 6155:       __ shrxq(rax, rax, output_size);    // Input mask in rax

I think this could also be implemented as:
__ movq(rax, -1);
__ bzhiq(rax, rax, length);

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6173:

> 6171:       __ movq(rax, 64);
> 6172:       __ subq(rax, output_size);
> 6173:       __ shrxq(output_mask, output_mask, rax);

The output mask can also be computed using bzhiq:
__ movq(output_mask, -1);
__ bzhiq(output_mask, output_mask, output_size);
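
To compare the two ways of building an n-bit mask, the sketch below models both instructions in Java. It assumes the shift-based sequences compute all-ones shifted right by (64 - n), which the second quoted snippet shows directly and the first only implies. The instruction models follow their documented behaviour: SHRX masks its count to 6 bits for 64-bit operands, and BZHI leaves the source unchanged when the index exceeds 63. The class and method names are hypothetical.

```java
// Java models of two BMI2 instructions, used to compare mask construction.
final class MaskSketch {
    // BZHI (64-bit): clear all bits at positions >= index. The index is
    // taken from the low 8 bits; if it exceeds 63 the source is returned.
    static long bzhi(long src, long index) {
        int n = (int) (index & 0xFF);
        return n > 63 ? src : src & ((1L << n) - 1);
    }

    // SHRX (64-bit): logical right shift with the count masked to 6 bits.
    static long shrx(long src, long count) {
        return src >>> (count & 63);
    }

    static long maskByShift(int n) {
        return shrx(-1L, 64 - n);
    }

    static long maskByBzhi(int n) {
        return bzhi(-1L, n);
    }

    public static void main(String[] args) {
        boolean same = true;
        for (int n = 1; n <= 64; n++) {
            same &= maskByShift(n) == maskByBzhi(n);
        }
        System.out.println("identical for n in 1..64: " + same);
        System.out.println("n = 0: shift gives " + Long.toHexString(maskByShift(0))
                + ", bzhi gives " + Long.toHexString(maskByBzhi(0)));
    }
}
```

In this model the two forms agree for n from 1 to 64 and differ only at n = 0, where the shift count wraps to zero and yields all ones while bzhi yields an empty mask. Whether a zero length can reach these instructions is not visible in the quoted snippets.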

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6243:

> 6241: 
> 6242:       __ BIND(L_padding);
> 6243:       __ decrementq(r13, 1);

It will be good to use output_size here instead of r13.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6249:

> 6247:       __ jcc(Assembler::notEqual, L_donePadding);
> 6248: 
> 6249:       __ decrementq(r13, 1);

It will be good to use output_size here instead of r13.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6320:

> 6318:     __ BIND(L_bottomLoop);
> 6319:     __ load_signed_byte(byte1, Address(source, start_offset, Address::times_1, 0x00));
> 6320:     __ load_signed_byte(byte2, Address(source, start_offset, Address::times_1, 0x01));

This should be unsigned_byte.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6324:

> 6322:     __ load_signed_byte(byte2, Address(decode_table, byte2));
> 6323:     __ load_signed_byte(byte3, Address(source, start_offset, Address::times_1, 0x02));
> 6324:     __ load_signed_byte(byte4, Address(source, start_offset, Address::times_1, 0x03));

This should be unsigned_byte.
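
The signed versus unsigned distinction matters when the loaded byte is then used as a table index, as the quoted lines appear to do with decode_table. A Java sketch of the effect follows, assuming a 256-entry table indexed by the input byte; the class and method names are made up, and whether bytes of 0x80 and above can reach this path in the stub is not visible from the snippet.

```java
// Shows how sign extension turns input bytes >= 0x80 into negative indices.
final class ByteIndex {
    // Sign-extending load: 0x80..0xFF become -128..-1.
    static int signedIndex(byte b) {
        return b;
    }

    // Zero-extending load: 0x80..0xFF stay 128..255.
    static int unsignedIndex(byte b) {
        return Byte.toUnsignedInt(b);
    }

    // Lookup that is safe for every possible byte value.
    static int lookup(int[] table256, byte b) {
        return table256[unsignedIndex(b)];
    }

    public static void main(String[] args) {
        byte b = (byte) 0xC3;
        System.out.println("signed index:   " + signedIndex(b));
        System.out.println("unsigned index: " + unsignedIndex(b));
    }
}
```

In Java a negative index throws ArrayIndexOutOfBoundsException; in generated assembly the same sign-extended value would address memory before the start of the table.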

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From aph at openjdk.java.net  Sun Jun 20 14:15:34 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Sun, 20 Jun 2021 14:15:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 19:23:25 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore run in EATests.java

AArch64 changes look good.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From rrich at openjdk.java.net  Sun Jun 20 23:08:31 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Sun, 20 Jun 2021 23:08:31 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: <7Kh5bbLKYK3eeJKt5zDWevHMRlSSmVryy3OuCBHZvIQ=.155672c0-cf1e-4258-bfd2-489bd6aa7058@github.com>
References: 
 
 
 <7Kh5bbLKYK3eeJKt5zDWevHMRlSSmVryy3OuCBHZvIQ=.155672c0-cf1e-4258-bfd2-489bd6aa7058@github.com>
Message-ID: 

On Fri, 18 Jun 2021 19:19:25 GMT, Patricio Chilano Mateo  wrote:

>> test/jdk/com/sun/jdi/EATests.java line 235:
>> 
>>> 233:         // Relocking test cases
>>> 234:         new EARelockingSimpleTarget()                                                       .run();
>>> 235:         new EARelockingSimple_2Target()                                                     .run();
>> 
>> I know all the tests that were removed mention biased locking in the comments, but do they require biased locking to function properly? I'm just wondering if we might get better EA test coverage if they are left in place.
>
> They are trying to exercise some biased locking specific paths, but maybe @reinrich can comment on whether it is worth keeping them since he wrote the tests.

The test cases are very specific to biased locking. It is not worth keeping them.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From rrich at openjdk.java.net  Sun Jun 20 23:27:37 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Sun, 20 Jun 2021 23:27:37 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 19:23:25 GMT, Patricio Chilano Mateo  wrote:

> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore run in EATests.java

Hi Patricio,

the part that's related to JDK-8227745 (40f847e2) looks good to me. Thanks for taking care of it!

Tests are pending...

Cheers, Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From dholmes at openjdk.java.net  Sun Jun 20 23:56:37 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Sun, 20 Jun 2021 23:56:37 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 19:23:25 GMT, Patricio Chilano Mateo  wrote:

> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore run in EATests.java

Updates look fine.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From github.com+20216587+miao-zheng at openjdk.java.net  Mon Jun 21 02:27:31 2021
From: github.com+20216587+miao-zheng at openjdk.java.net (Miao Zheng)
Date: Mon, 21 Jun 2021 02:27:31 GMT
Subject: RFR: 8268727: Remove unused slowpath locking method in OptoRuntime
In-Reply-To: 
References: 
Message-ID: 

On Tue, 15 Jun 2021 03:18:27 GMT, Miao Zheng  wrote:

> 8268727: Remove unused slowpath locking method in OptoRuntime

Could someone help to review this change? Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4490

From iklam at openjdk.java.net  Mon Jun 21 05:01:59 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 21 Jun 2021 05:01:59 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable
Message-ID: 

In HotSpot we have (at least) two hashtable designs in the C++ code:

- share/utilities/hashtable.hpp
- share/utilities/resourceHash.hpp

Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).

This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 

Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
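
As a rough illustration of the note above (a sketch only, not the code in this PR; `FixedSize`, `RuntimeSize`, `TableBase`, `FixedTable` and `RuntimeSizedTable` are made-up names), the size can be supplied by a policy base class so that `size()` is resolved at compile time instead of through a vtable:

```cpp
#include <cassert>
#include <vector>

// Size known at compile time: size() is a constant the optimizer can fold.
template <unsigned SIZE>
struct FixedSize {
  unsigned size() const { return SIZE; }
};

// Size chosen at run time.
struct RuntimeSize {
  unsigned _size;
  explicit RuntimeSize(unsigned n) : _size(n) {}
  unsigned size() const { return _size; }
};

// Shared logic. size() is inherited from the policy and called
// non-virtually, so no vtable lookup sits on the hash-to-index path.
template <typename SizePolicy>
class TableBase : public SizePolicy {
 public:
  explicit TableBase(SizePolicy policy = SizePolicy())
      : SizePolicy(policy), _buckets(this->size()) {
    assert(this->size() > 0);
  }

  unsigned index_for(unsigned hash) const { return hash % this->size(); }

  // Stores a key once; the key doubles as its own hash in this sketch.
  void put(unsigned key) {
    if (!contains(key)) _buckets[index_for(key)].push_back(key);
  }

  bool contains(unsigned key) const {
    for (unsigned k : _buckets[index_for(key)]) {
      if (k == key) return true;
    }
    return false;
  }

 private:
  std::vector<std::vector<unsigned>> _buckets;
};

template <unsigned SIZE>
using FixedTable = TableBase<FixedSize<SIZE>>;
using RuntimeSizedTable = TableBase<RuntimeSize>;
```

With `FixedTable<16>` the modulo in `index_for()` is by a constant the compiler can see; with `RuntimeSizedTable` it is an ordinary run-time modulo. Actual resizing (rehashing into a larger bucket array) is left out of the sketch.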

Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:

*before*
ResourceHashtable: 2.70 sec

*after*
ResourceHashtable: 2.72 sec
ResizableResourceHashtable: 5.29 sec

To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`

-------------

Commit messages:
 - cleanup
 - step4 - implemented resizing
 - step3
 - step2
 - step1

Changes: https://git.openjdk.java.net/jdk/pull/4536/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269004
  Stats: 183 lines in 7 files changed: 148 ins; 6 del; 29 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4536.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4536/head:pull/4536

PR: https://git.openjdk.java.net/jdk/pull/4536

From xliu at openjdk.java.net  Mon Jun 21 07:10:50 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Mon, 21 Jun 2021 07:10:50 GMT
Subject: RFR: 8269064: Dropped messages of AsyncLogWriter cause memleak
Message-ID: 

Free the C-strings of the dropped messages.

This patch adds KVHashtable::remove(K), which allows users to remove the
useless entries from the hashtable. The corresponding entry is removed from
AsyncLogWriter::_stats when the output is about to be deleted.
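
For readers unfamiliar with the pattern, here is a minimal sketch of removing an entry from a chained hashtable and freeing the C-string it owns. It is not the KVHashtable code from this patch; `StringTable`, its bucket count and its method names are hypothetical, and allocation failure is not handled.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical chained hashtable mapping an unsigned key to an owned C-string.
class StringTable {
  struct Entry {
    unsigned key;
    char* value;   // owned by the entry, released in free_entry()
    Entry* next;
  };
  static constexpr unsigned kBuckets = 8;
  Entry* _buckets[kBuckets];

  static unsigned index_for(unsigned key) { return key % kBuckets; }

  static void free_entry(Entry* e) {
    std::free(e->value);   // without this the string would leak
    delete e;
  }

 public:
  StringTable() {
    for (unsigned i = 0; i < kBuckets; i++) _buckets[i] = nullptr;
  }
  StringTable(const StringTable&) = delete;
  StringTable& operator=(const StringTable&) = delete;

  // Frees every remaining entry when the table goes away.
  ~StringTable() {
    for (unsigned i = 0; i < kBuckets; i++) {
      while (_buckets[i] != nullptr) {
        Entry* e = _buckets[i];
        _buckets[i] = e->next;
        free_entry(e);
      }
    }
  }

  // Stores a private copy of msg, replacing any previous value for key.
  void put(unsigned key, const char* msg) {
    assert(msg != nullptr);
    char* copy = static_cast<char*>(std::malloc(std::strlen(msg) + 1));
    std::strcpy(copy, msg);
    for (Entry* e = _buckets[index_for(key)]; e != nullptr; e = e->next) {
      if (e->key == key) {
        std::free(e->value);
        e->value = copy;
        return;
      }
    }
    _buckets[index_for(key)] = new Entry{key, copy, _buckets[index_for(key)]};
  }

  const char* get(unsigned key) const {
    for (Entry* e = _buckets[index_for(key)]; e != nullptr; e = e->next) {
      if (e->key == key) return e->value;
    }
    return nullptr;
  }

  // Unlinks the entry for key and frees it; returns false if absent.
  bool remove(unsigned key) {
    Entry** link = &_buckets[index_for(key)];
    while (*link != nullptr) {
      Entry* e = *link;
      if (e->key == key) {
        *link = e->next;
        free_entry(e);
        return true;
      }
      link = &e->next;
    }
    return false;
  }
};
```

The pointer-to-link walk in `remove()` avoids a special case for the bucket head, and the destructor loop covers entries that were never removed explicitly.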

-------------

Commit messages:
 - Remove the entry of output in AsyncLogWriter when delete the output.
 - 8269064: Dropped messages of AsyncLogWriter cause memleak

Changes: https://git.openjdk.java.net/jdk/pull/4537/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4537&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269064
  Stats: 143 lines in 6 files changed: 142 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4537.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4537/head:pull/4537

PR: https://git.openjdk.java.net/jdk/pull/4537

From david.holmes at oracle.com  Mon Jun 21 07:25:55 2021
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 21 Jun 2021 17:25:55 +1000
Subject: RFR: 8269064: Dropped messages of AsyncLogWriter cause memleak
In-Reply-To: 
References: 
Message-ID: 

Hi Xin,

I think this needs to be fixed in 17 so please create a PR against 17 
and change the fix version of the JBS issue to 17.

Thanks,
David

On 21/06/2021 5:10 pm, Xin Liu wrote:
> free c-strings of the dropped messages.
> 
> This patch added KVHashtable::remove(K), which allows user to remove the
> useless entries from the hashtable. Remove the corresponding entry from
> AsyncLogWriter::_stats when the output is about to delete.
> 
> -------------
> 
> Commit messages:
>   - Remove the entry of output in AsyncLogWriter when delete the output.
>   - 8269064: Dropped messages of AsyncLogWriter cause memleak
> 
> Changes: https://git.openjdk.java.net/jdk/pull/4537/files
>   Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4537&range=00
>    Issue: https://bugs.openjdk.java.net/browse/JDK-8269064
>    Stats: 143 lines in 6 files changed: 142 ins; 0 del; 1 mod
>    Patch: https://git.openjdk.java.net/jdk/pull/4537.diff
>    Fetch: git fetch https://git.openjdk.java.net/jdk pull/4537/head:pull/4537
> 
> PR: https://git.openjdk.java.net/jdk/pull/4537
> 

From xliu at openjdk.java.net  Mon Jun 21 08:05:32 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Mon, 21 Jun 2021 08:05:32 GMT
Subject: RFR: 8269064: Dropped messages of AsyncLogWriter cause memleak
In-Reply-To: 
References: 
Message-ID: <1e7whwXc7I9MFChajQsQU7Xd7fYqCw9d8ubxxdgUuOE=.1bd7f1df-366c-43e2-bc98-4184351609a6@github.com>

On Mon, 21 Jun 2021 07:02:09 GMT, Xin Liu  wrote:

> free c-strings of the dropped messages.
> 
> This patch added KVHashtable::remove(K), which allows user to remove the 
> useless entries from the hashtable. Remove the corresponding entry from
> AsyncLogWriter::_stats when the output is about to delete.

Hi David,

I changed "Fix Version/s" of JDK-8269064 to 17.

Do you mean I need to close this PR and create a PR from jdk17 instead?
Now jdk is pointing to 18. Shouldn't I work on jdk first and then backport it to jdk17?

thanks,
--lx

-------------

PR: https://git.openjdk.java.net/jdk/pull/4537

From tschatzl at openjdk.java.net  Mon Jun 21 08:27:35 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Mon, 21 Jun 2021 08:27:35 GMT
Subject: RFR: 8017163: G1: Refactor remembered sets [v13]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Thu, 17 Jun 2021 15:49:29 GMT, Stefan Johansson  wrote:

>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   sjohanss review - remove debug code
>
> Looked through the changes again and I think they are good. As we have all of JDK 18 to test, polish it and fix any potential problems I see no reason to not approve this now. 
> 
> I found a few unused functions, please remove them unless you have some future plans for any of them.

Thanks @kstefanj @walulyai for your reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From david.holmes at oracle.com  Mon Jun 21 09:40:08 2021
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 21 Jun 2021 19:40:08 +1000
Subject: RFR: 8269064: Dropped messages of AsyncLogWriter cause memleak
In-Reply-To: <1e7whwXc7I9MFChajQsQU7Xd7fYqCw9d8ubxxdgUuOE=.1bd7f1df-366c-43e2-bc98-4184351609a6@github.com>
References: 
 <1e7whwXc7I9MFChajQsQU7Xd7fYqCw9d8ubxxdgUuOE=.1bd7f1df-366c-43e2-bc98-4184351609a6@github.com>
Message-ID: 

Hi Xin,

On 21/06/2021 6:05 pm, Xin Liu wrote:
> On Mon, 21 Jun 2021 07:02:09 GMT, Xin Liu  wrote:
> 
>> free c-strings of the dropped messages.
>>
>> This patch added KVHashtable::remove(K), which allows user to remove the
>> useless entries from the hashtable. Remove the corresponding entry from
>> AsyncLogWriter::_stats when the output is about to delete.
> 
> hi, David,
> 
> I change "Fix Version/s" of JDK-8269064  to 17.
> 
> Do you mean I need close this PR and create a PR from jdk17 instead?

Yes.

> Now jdk is pointing to 18. Shouldn't I work on jdk first and backport it to jdk17 then?

Not during RDP1. If you want a fix to be in 17 then put it into 17 first 
and it will be automatically forward-ported to 18, but only during RDP1.

Thanks,
David

> thanks,
> --lx
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4537
> 

From tschatzl at openjdk.java.net  Mon Jun 21 10:09:33 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Mon, 21 Jun 2021 10:09:33 GMT
Subject: Integrated: 8017163: G1: Refactor remembered sets
In-Reply-To: 
References: 
Message-ID: <4kSZyh4_LibUjKHC1_6_qiVx8ZNR9LK4H8UlLzYeuSc=.cf5857ef-6a64-490d-be28-e9664906bf76@github.com>

On Wed, 19 May 2021 15:23:00 GMT, Thomas Schatzl  wrote:

> Hi all,
> 
>   can I have reviews for this change that significantly refactors the remembered set for more scalability.
> 
>  The current G1 remembered set implementation has been designed for use cases and Java heaps and applications from 20 years ago.
> 
> Over time many problems with performance and in particular memory usage have been observed:
> 
> * adding elements to the lowest tier data structure takes a per-remembered set global lock. Measurements have shown that the applications can wait thousands of seconds acquiring these locks. While the affected threads are in most cases refinement threads, so this does not directly affect the application, it can still affect the ability of G1 to meet some goals needed for keeping pause times (i.e. amount of cards from the refinement buffers to be merged into the card table and then scanned during gc).
> 
> * there is a substantial memory overhead for managing the data structures: examples are
>     * using separate (hash) tables for the three different types of card containers
>     * there is significant unnecessary preallocation of memory for some of the card set containers
>     * Containers store redundant information
> 
> * inflexibility when reusing memory: in the current implementation the different containers use different approaches to manage memory. Most use the C heap directly, some the C heap with some internal global memory pool. This in practice makes it very difficult to implement anything other than giving back memory in the collection pause. The corresponding "Free Collection Set" pause can take a significant amount of time because of that.
> Also memory reuse is limited and preallocating arenas is limited (or would have to be reimplemented multiple times), stressing the C heap allocator.
> 
> * inability to support additional use cases: over time interesting ideas (e.g. JDK-8058803) came up for improving performance of remembered set management. Mostly due to redundant information everywhere and completely different handling of various aspects in the containers it is in practice impossible to implement these.
> 
> * (partial) inability to give back memory to the OS. While some of the containers use the C heap allocator, and so in some way give back memory, the implementation and handling are different for every container.
> 
> * the existing granularity of containers is unbalanced: currently there exist three tiers: "sparse", "fine" and "full". Sparse is an array of cards ranging in the hundreds maybe, "fine" is a bitmap covering a whole region and full is a bit indicating that that region should be scanned completely during GC.
> 
> The problem is that there is nothing between "no card at all" and "sparse", and in particular that the gap in capacity between "sparse" and "fine" is large. I.e. the memory usage difference when moving from a "sparse" array (holding 128 entries at 32M regions, taking ~256 bytes) to a "fine" bitmap that is able to hold 65k entries using 8kB is significant.
> For this reason there is even a dedicated option to stop allocating more "fine" containers and just give up and use "full" instead to avoid excessive memory usage, with extremely bad consequences for pause times.
> 
> Over time some of these issues have been fixed or in many cases band-aided, and some of these fixes and ideas were the result of working on this change (e.g. JDK-8262185, JDK-8233919, JDK-8213108).
> 
> This change is effectively a rewrite of the Java heap card based part of a region's remembered set.
> 
> This initial fully working change can be roughly described with the following properties:
> 
> * use a single `ConcurrentHashTable` for the card containers of a given region. The container in use is replaced (coarsened) on the fly within the CHT node, completely lock-free. This implements JDK-6949259.
> 
> * memory for a given region's remembered set for all containers (and the CHT nodes) is backed by per container type and per remembered set arena style bump-pointer allocation buffers. In this change, in the pause, memory is given back to free lists only. The implementation gives back memory to the OS concurrently to the application. Memory is still managed using the C heap memory manager though, but abstracted away and could be replaced by manual page memory management.
> 
> * there are now four different container types and one meta-container type. These four actual containers are:
>   * inline pointer: the change stores a few (3-5) cards in the CHT node directly and uses no extra memory.
>   * array of cards: similar to the "sparse" container, an array of cards with a configurable amount of entries. However bulk allocation of memory is now managed at a lower level so there is much less waste.
>   * bitmap: similar to "fine", a bitmap spanning a (sub-)range of memory
>   * full: same as full, indicating for a (sub-)range of memory that all cards are to be looked at during scan. Similar to inline pointers, this uses no extra memory.
>   * howl: the Howl container subdivides a given memory range into subranges where any of the other containers describing that sub-range of the heap may be stored in. This is somewhat similar to the idea suggested in JDK-8048504.
> 
> * care has been taken to minimize container memory usage, e.g. by not adding redundant information there and in general carefully specify them. They have been designed with future enhancements in mind.
> 
> In some benchmarks (where there is significant remembered set memory usage) we are seeing memory reduction to 25% of JDK 16 levels with this change. Garbage collection times are the same as or shorter than before; most changes affecting pause times have been extracted earlier. Individual affected phases are generally shorter now.
> 
> Testing: tier1-8 many times, manual and automated perf testing
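
The "sparse" versus "fine" numbers quoted above can be checked with a little arithmetic. The sketch below assumes a 512-byte card and a 2-byte card index per sparse entry; both are assumptions chosen to match the quoted figures, not values taken from this change, and all constant names are made up.

```cpp
#include <cassert>
#include <cstddef>

// 32M region, as in the quoted description.
constexpr std::size_t region_bytes = 32u * 1024 * 1024;
// Assumption: each card covers 512 bytes of heap.
constexpr std::size_t card_bytes = 512;
constexpr std::size_t cards_per_region = region_bytes / card_bytes;

// "sparse": an array of 128 card indices.
constexpr std::size_t sparse_entries = 128;
// Assumption: a 16-bit index is enough to name one of 65536 cards.
constexpr std::size_t sparse_entry_bytes = 2;
constexpr std::size_t sparse_bytes = sparse_entries * sparse_entry_bytes;

// "fine": one bit per card in the region.
constexpr std::size_t fine_bitmap_bytes = cards_per_region / 8;

static_assert(cards_per_region == 65536, "65k cards per 32M region");
static_assert(sparse_bytes == 256, "sparse array takes ~256 bytes");
static_assert(fine_bitmap_bytes == 8 * 1024, "fine bitmap takes 8kB");
static_assert(fine_bitmap_bytes / sparse_bytes == 32,
              "coarsening sparse to fine costs 32x the memory");
```

Under these assumptions, adding a 129th card to a full sparse array means going from 256 bytes to 8kB, a 32x jump in memory for that one extra card.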

This pull request has now been integrated.

Changeset: 1692fd2e
Author:    Thomas Schatzl 
URL:       https://git.openjdk.java.net/jdk/commit/1692fd2eba7164ebd11fce1c02696a9053d131af
Stats:     6110 lines in 64 files changed: 4535 ins; 1317 del; 258 mod

8017163: G1: Refactor remembered sets
8048504: G1: Investigate replacing the coarse and fine grained data structures in the remembered sets
6949259: G1: Merge sparse and fine remembered set hash tables

Co-authored-by: Ivan Walulya 
Co-authored-by: Thomas Schatzl 
Reviewed-by: sjohanss, iwalulya

-------------

PR: https://git.openjdk.java.net/jdk/pull/4116

From jvernee at openjdk.java.net  Mon Jun 21 12:10:43 2021
From: jvernee at openjdk.java.net (Jorn Vernee)
Date: Mon, 21 Jun 2021 12:10:43 GMT
Subject: [jdk17] Integrated: 8268717: Upstream: 8268673: Stack walk across
 optimized entry frame on fresh native thread fails
In-Reply-To: 
References: 
Message-ID: <48y3dj2CbgnTSFbw905vP2NC7Z-V6xQCZegUZ7Ot744=.a30d849f-2826-40f0-ace4-4164866aad87@github.com>

On Wed, 16 Jun 2021 11:19:37 GMT, Jorn Vernee  wrote:

> Upstream a critical fix from the panama-foreign repo.
> 
> See the prior review thread here: https://github.com/openjdk/panama-foreign/pull/558
> 
> Testing: tier 1-2, local run of run-test-jdk_foreign.

This pull request has now been integrated.

Changeset: f25e7197
Author:    Jorn Vernee 
URL:       https://git.openjdk.java.net/jdk17/commit/f25e7197fef76cc87a15da7cc96a42b84d69bbfe
Stats:     198 lines in 12 files changed: 197 ins; 0 del; 1 mod

8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails

Reviewed-by: mcimadamore, erikj

-------------

PR: https://git.openjdk.java.net/jdk17/pull/76

From coleenp at openjdk.java.net  Mon Jun 21 13:03:32 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Mon, 21 Jun 2021 13:03:32 GMT
Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization()
 method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore  wrote:

> This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded.  Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies.  If the dependencies weren't yet recorded, we had to deoptimize all of the methods.  A long time ago, we had a customer who was unhappy with the pause for this when they had late attach.  Now we don't have this problem.
> The evol_method dependencies are still used by the compiler to check for old methods during compilation.  I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too.
> Tested with tier1-6.

Thank you both Vladimir.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4509

From rrich at openjdk.java.net  Mon Jun 21 13:04:34 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Mon, 21 Jun 2021 13:04:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 18 Jun 2021 19:23:25 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore run in EATests.java

No issues from my own testing. Broader test coverage on all platforms is expected tomorrow.

Best regards,
Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Mon Jun 21 14:01:33 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Mon, 21 Jun 2021 14:01:33 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Sun, 20 Jun 2021 14:12:08 GMT, Andrew Haley  wrote:

> AArch64 changes look good.

Thanks for checking the AArch64 changes @theRealAph!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Mon Jun 21 14:01:33 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Mon, 21 Jun 2021 14:01:33 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 21 Jun 2021 13:01:39 GMT, Richard Reingruber  wrote:

> No issues from my own testing. Broader test coverage on all platforms is expected tomorrow.

Great, I'll wait for that. Thanks for all the testing Richard!

Patricio

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Mon Jun 21 14:49:20 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Mon, 21 Jun 2021 14:49:20 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v5]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio
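
To make the remark about the mark word concrete, here is a small sketch of the low bits as described above: two lock bits, then the now unused bit that biased locking occupied, then the age field. The 4-bit age width and all names are assumptions for illustration; this is not the markWord.hpp code.

```cpp
#include <cassert>
#include <cstdint>

constexpr int lock_bits = 2;        // two least significant bits
constexpr int unused_gap_bits = 1;  // third bit, formerly the biased_lock bit
constexpr int age_bits = 4;         // assumption: age field width
constexpr int age_shift = lock_bits + unused_gap_bits;

constexpr std::uint64_t lock_mask = (std::uint64_t(1) << lock_bits) - 1;
constexpr std::uint64_t gap_mask = std::uint64_t(1) << lock_bits;
constexpr std::uint64_t age_mask =
    ((std::uint64_t(1) << age_bits) - 1) << age_shift;

// Extracts the age field from a mark word value.
constexpr unsigned age_of(std::uint64_t mark) {
  return static_cast<unsigned>((mark & age_mask) >> age_shift);
}

// Returns mark with its age field replaced; other bits are untouched.
constexpr std::uint64_t with_age(std::uint64_t mark, unsigned age) {
  return (mark & ~age_mask) |
         ((static_cast<std::uint64_t>(age) << age_shift) & age_mask);
}

static_assert(age_shift == 3, "age still starts above the unused bit");
static_assert((lock_mask & gap_mask) == 0, "fields do not overlap");
static_assert((gap_mask & age_mask) == 0, "fields do not overlap");
```

Because the gap bit is kept rather than folded into the age field, the age shift stays the same, which fits the stated intent of leaving the bit free for other projects.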

Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:

  Un-ProblemList serviceability tests (8268574 and 8268644)

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4522/files
  - new: https://git.openjdk.java.net/jdk/pull/4522/files/215e46b8..c6015171

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=03-04

  Stats: 7 lines in 2 files changed: 0 ins; 6 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From iveresov at openjdk.java.net  Mon Jun 21 15:44:33 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Mon, 21 Jun 2021 15:44:33 GMT
Subject: RFR: 8267657: Add missing PrintC1Statistics before incrementing
 counters
In-Reply-To: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
References: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
Message-ID: 

On Tue, 25 May 2021 03:07:15 GMT, Yi Yang  wrote:

> Trivial change to add missing PrintC1Statistics before incrementing counters.

Looks good.

-------------

Marked as reviewed by iveresov (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4178

From xliu at openjdk.java.net  Mon Jun 21 17:02:39 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Mon, 21 Jun 2021 17:02:39 GMT
Subject: Withdrawn: 8269064: Dropped messages of AsyncLogWriter cause memleak
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 07:02:09 GMT, Xin Liu  wrote:

> free c-strings of the dropped messages.
> 
> This patch added KVHashtable::remove(K), which allows user to remove the 
> useless entries from the hashtable. Remove the corresponding entry from
> AsyncLogWriter::_stats when the output is about to delete.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4537

From xliu at openjdk.java.net  Mon Jun 21 18:17:46 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Mon, 21 Jun 2021 18:17:46 GMT
Subject: [jdk17] RFR: 8267752: KVHashtable doesn't deallocate entries
Message-ID: 

8267752: KVHashtable doesn't deallocate entries

-------------

Commit messages:
 - Backport 72b3b0af08136342e54e1cdea0c48d64172e8870

Changes: https://git.openjdk.java.net/jdk17/pull/110/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=110&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8267752
  Stats: 20 lines in 1 file changed: 20 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk17/pull/110.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/110/head:pull/110

PR: https://git.openjdk.java.net/jdk17/pull/110

From xliu at openjdk.java.net  Mon Jun 21 18:42:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Mon, 21 Jun 2021 18:42:26 GMT
Subject: [jdk17] RFR: 8267752: KVHashtable doesn't deallocate entries
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 18:08:48 GMT, Xin Liu  wrote:

> Add a free_entry iteration to the destructor of KVHashtable.
> Tested with tier1-3.

hi, @coleenp 

[JDK-8269064](https://bugs.openjdk.java.net/browse/JDK-8269064) depends on [JDK-8267752](https://bugs.openjdk.java.net/browse/JDK-8267752). I think JDK-8267752 itself is also worth having in jdk17.

ctx:
https://github.com/openjdk/jdk/pull/4537#event-4918066602 

Could you take a look at it?

-------------

PR: https://git.openjdk.java.net/jdk17/pull/110

From sspitsyn at openjdk.java.net  Mon Jun 21 18:54:34 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Mon, 21 Jun 2021 18:54:34 GMT
Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization()
 method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore  wrote:

> This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded.  Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies.  If the dependencies weren't yet recorded, we had to deoptimize all of the methods.  A long time ago, we had a customer who was unhappy with the pause for this when they had late attach.  Now we don't have this problem.
> The evol_method dependencies are still used by the compiler to check for old methods during compilation.  I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too.
> Tested with tier1-6.

Marked as reviewed by sspitsyn (Reviewer).

Hi Coleen,
LGTM.
Thanks,
Serguei

-------------

PR: https://git.openjdk.java.net/jdk/pull/4509

From jwilhelm at openjdk.java.net  Mon Jun 21 22:11:54 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 21 Jun 2021 22:11:54 GMT
Subject: RFR: Merge jdk17
Message-ID: <4D0KACEvY4TGf4sVl6GUP8Z7byPVOfOy7iCrCtz1Z1o=.712c74cf-1b4e-42be-b7eb-319cce468aa3@github.com>

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8268672: C2: assert(!loop->is_member(u_loop)) failed: can be in outer loop or out of both loops only
 - 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails
 - 8268362: [REDO] C2 crash when compile negative Arrays.copyOf length after loop
 - 8268702: JFR diagnostic commands lack argument descriptors when viewed using Platform MBean Server
 - 8267042: bug in monitor locking/unlocking on ARM32 C1 due to uninitialized BasicObjectLock::_displaced_header
 - 8269063: Build failure due to VerifyReceiverTypes was not declared after JDK-8268405

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=4545&range=00.0
 - jdk17: https://webrevs.openjdk.java.net/?repo=jdk&pr=4545&range=00.1

Changes: https://git.openjdk.java.net/jdk/pull/4545/files
  Stats: 608 lines in 25 files changed: 578 ins; 17 del; 13 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4545.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4545/head:pull/4545

PR: https://git.openjdk.java.net/jdk/pull/4545

From jwilhelm at openjdk.java.net  Mon Jun 21 23:13:34 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 21 Jun 2021 23:13:34 GMT
Subject: RFR: Merge jdk17 [v2]
In-Reply-To: <4D0KACEvY4TGf4sVl6GUP8Z7byPVOfOy7iCrCtz1Z1o=.712c74cf-1b4e-42be-b7eb-319cce468aa3@github.com>
References: <4D0KACEvY4TGf4sVl6GUP8Z7byPVOfOy7iCrCtz1Z1o=.712c74cf-1b4e-42be-b7eb-319cce468aa3@github.com>
Message-ID: <6_vIbsJ6HFVVhVo491H1wJI-2gof1AIDJZHJ-OpfE8Q=.af0d1f7f-6381-4b10-9e80-22ce54b777c5@github.com>

> Forwardport JDK 17 -> JDK 18

Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 52 commits:

 - Merge
 - 8268458: Add verification type for evacuation failures
   
   Reviewed-by: kbarrett, iwalulya
 - 8268952: Automatically update heap sizes in G1MonitoringScope
   
   Reviewed-by: kbarrett, iwalulya
 - 8269029: compiler/codegen/TestCharVect2.java fails for client VMs
   
   Reviewed-by: kvn, jiefu
 - 8017163: G1: Refactor remembered sets
   8048504: G1: Investigate replacing the coarse and fine grained data structures in the remembered sets
   6949259: G1: Merge sparse and fine remembered set hash tables
   
   Co-authored-by: Ivan Walulya 
   Co-authored-by: Thomas Schatzl 
   Reviewed-by: sjohanss, iwalulya
 - 8266082: AssertionError in Annotate.fromAnnotations with -Xdoclint
   
   Reviewed-by: vromero
 - Merge
 - 8267042: bug in monitor locking/unlocking on ARM32 C1 due to uninitialized BasicObjectLock::_displaced_header
   
   Co-authored-by: Chris Cole 
   Reviewed-by: dsamersoff
 - 8268964: Remove unused ReferenceProcessorAtomicMutator
   
   Reviewed-by: tschatzl, pliden
 - 8268900: com/sun/net/httpserver/Headers.java: Fix indentation and whitespace
   
   Reviewed-by: dfuchs, chegar, michaelm
 - ... and 42 more: https://git.openjdk.java.net/jdk/compare/d3ad8cd3...40c550db

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4545/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4545&range=01
  Stats: 14796 lines in 230 files changed: 12511 ins; 1719 del; 566 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4545.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4545/head:pull/4545

PR: https://git.openjdk.java.net/jdk/pull/4545

From jwilhelm at openjdk.java.net  Mon Jun 21 23:13:35 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 21 Jun 2021 23:13:35 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: <4D0KACEvY4TGf4sVl6GUP8Z7byPVOfOy7iCrCtz1Z1o=.712c74cf-1b4e-42be-b7eb-319cce468aa3@github.com>
References: <4D0KACEvY4TGf4sVl6GUP8Z7byPVOfOy7iCrCtz1Z1o=.712c74cf-1b4e-42be-b7eb-319cce468aa3@github.com>
Message-ID: <0DZcV1V15Eo794hocKR8fbsedv13NPolxXHTnDwma4c=.a0e9754f-5f23-4f6e-a040-0d8f82461708@github.com>

On Mon, 21 Jun 2021 22:03:59 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: 0458113c
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/0458113c6b1cf500ffdf049c1e3a698b16ce12ce
Stats:     608 lines in 25 files changed: 578 ins; 17 del; 13 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4545

From dholmes at openjdk.java.net  Mon Jun 21 23:43:28 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 21 Jun 2021 23:43:28 GMT
Subject: [jdk17] RFR: 8267752: KVHashtable doesn't deallocate entries
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 18:08:48 GMT, Xin Liu  wrote:

> Add a free_entry iteration to the `KVHashtable` destructor.
> Tested with tier1-3.

As a P4 bug this is not permitted in JDK 17 under RDP1. You would need to change to a P3 with suitable justification.

-------------

PR: https://git.openjdk.java.net/jdk17/pull/110

From gli at openjdk.java.net  Tue Jun 22 01:22:46 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 01:22:46 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions
Message-ID: 

Hi all,

Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.

This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
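
As an illustration only, the shape of the two static methods described above could look roughly like the following. The `Thread` and `JavaThread` classes here are simplified stand-ins, not the real HotSpot declarations, and the assertion messages are assumptions.

```cpp
#include <cassert>

// Simplified stand-ins for the HotSpot classes; only what the sketch needs.
class Thread {
 public:
  virtual ~Thread() = default;
  virtual bool is_Java_thread() const { return false; }
};

class JavaThread : public Thread {
 public:
  bool is_Java_thread() const override { return true; }

  // Checked downcast: the caller must already know that `t` is a JavaThread.
  static JavaThread* cast(Thread* t) {
    assert(t->is_Java_thread() && "incorrect cast to JavaThread");
    return static_cast<JavaThread*>(t);
  }

  // Same conversion for const pointers.
  static const JavaThread* cast(const Thread* t) {
    assert(t->is_Java_thread() && "incorrect cast to const JavaThread");
    return static_cast<const JavaThread*>(t);
  }
};
```

With this shape, `Thread` no longer needs a member that names its subtype; a call site that used `t->as_Java_thread()` would write `JavaThread::cast(t)` instead.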

Test:
tier1 passed locally.

Thanks for taking the time to review.

Best Regards,
-- Guoxiong

-------------

Commit messages:
 - 8268368: Adopt cast notation for JavaThread conversions

Changes: https://git.openjdk.java.net/jdk/pull/4546/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4546&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8268368
  Stats: 159 lines in 64 files changed: 13 ins; 19 del; 127 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4546.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4546/head:pull/4546

PR: https://git.openjdk.java.net/jdk/pull/4546

From dholmes at openjdk.java.net  Tue Jun 22 02:02:28 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Jun 2021 02:02:28 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions
In-Reply-To: 
References: 
Message-ID: <3jEWcoPqG9EMr-wmTMV66lnEqEZBbwR45E5gmwZhbOk=.0cbbc64d-07ac-4667-9037-b136c279188c@github.com>

On Tue, 22 Jun 2021 01:11:30 GMT, Guoxiong Li  wrote:

> Hi all,
> 
> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
> 
> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
> 
> Test:
> tier1 passed locally.
> 
> Thanks for taking the time to review.
> 
> Best Regards,
> -- Guoxiong

Hi Guoxiong,

Thanks for picking up this enhancement request.

I wasn't sure if this would be worth the churn/disruption to the source code, but I think it is ok and preferable to use the cast notation.

The changes look good except for one mistake flagged below.

Note you need at least two reviewers before integrating this.

Thanks,
David

src/hotspot/share/gc/z/zFuture.inline.hpp line 49:

> 47:   // Wait for notification
> 48:   Thread* const thread = Thread::current();
> 49:   if (JavaThread::cast(thread)) {

This is wrong - we still need the is_Java_thread() query; and cast is not a boolean operator.
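
A minimal sketch of the distinction, using simplified stand-in classes rather than the real HotSpot ones (`Path` and `choose_path` are hypothetical names): the type query decides the branch, and the cast is applied only once the query has succeeded. Writing `if (JavaThread::cast(thread))` would test a pointer that is never null, and in a debug build the assert inside `cast` would fire for a non-Java thread.

```cpp
#include <cassert>

// Simplified stand-ins for the HotSpot classes.
class Thread {
 public:
  virtual ~Thread() = default;
  virtual bool is_Java_thread() const { return false; }
};

class JavaThread : public Thread {
 public:
  bool is_Java_thread() const override { return true; }
  static JavaThread* cast(Thread* t) {
    assert(t->is_Java_thread() && "incorrect cast to JavaThread");
    return static_cast<JavaThread*>(t);
  }
};

enum class Path { Java, NonJava };

// Hypothetical helper: query the type first, convert second.
inline Path choose_path(Thread* thread) {
  if (thread->is_Java_thread()) {
    JavaThread* const jt = JavaThread::cast(thread);  // safe: type already checked
    (void)jt;  // a real caller would pass `jt` on to JavaThread-only code
    return Path::Java;
  }
  return Path::NonJava;
}
```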

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4546

From yyang at openjdk.java.net  Tue Jun 22 02:51:30 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Tue, 22 Jun 2021 02:51:30 GMT
Subject: RFR: 8267657: Add missing PrintC1Statistics before incrementing
 counters
In-Reply-To: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
References: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
Message-ID: 

On Tue, 25 May 2021 03:07:15 GMT, Yi Yang  wrote:

> Trivial change to add missing PrintC1Statistics before incrementing counters.

Thank you Igor for the review!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4178

From gli at openjdk.java.net  Tue Jun 22 07:17:00 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 07:17:00 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: 
References: 
Message-ID: <7Q4mj03cn9mF86H_Cr44sWib8Za6dZyindyQUaVAl40=.c2422fbd-4ba8-46e3-818a-c3139d930d0f@github.com>

> Hi all,
> 
> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
> 
> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
> 
> Test:
> tier1 passed locally.
> 
> Thanks for taking the time to review.
> 
> Best Regards,
> -- Guoxiong

Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:

  Fix incorrect use of the method cast

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4546/files
  - new: https://git.openjdk.java.net/jdk/pull/4546/files/3c9f6dbe..259730be

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4546&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4546&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4546.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4546/head:pull/4546

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 07:24:29 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 07:24:29 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: <3jEWcoPqG9EMr-wmTMV66lnEqEZBbwR45E5gmwZhbOk=.0cbbc64d-07ac-4667-9037-b136c279188c@github.com>
References: 
 <3jEWcoPqG9EMr-wmTMV66lnEqEZBbwR45E5gmwZhbOk=.0cbbc64d-07ac-4667-9037-b136c279188c@github.com>
Message-ID: 

On Tue, 22 Jun 2021 01:59:17 GMT, David Holmes  wrote:

>> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix incorrect use of the method cast
>
> Hi Guoxiong,
> 
> Thanks for picking up this enhancement request.
> 
> I wasn't sure if this would be worth the churn/disruption to the source code, but I think it is ok and preferable to use the cast notation.
> 
> The changes look good except for one mistake flagged below.
> 
> Note you need at least two reviewers before integrating this.
> 
> Thanks,
> David

@dholmes-ora Thanks for your review. I updated the code just now.

I am surprised that `tier1` (both locally and in the `Pre-submit tests`) didn't catch the mistake you pointed out.
Maybe we can improve `tier1` or the `Pre-submit tests` in the future.

> src/hotspot/share/gc/z/zFuture.inline.hpp line 49:
> 
>> 47:   // Wait for notification
>> 48:   Thread* const thread = Thread::current();
>> 49:   if (JavaThread::cast(thread)) {
> 
> This is wrong - we still need the is_Java_thread() query; and cast is not a boolean operator.

Fixed. It was an incorrect use of the method `cast`. Thanks for finding it. I re-read my patch to check for similar mistakes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From yyang at openjdk.java.net  Tue Jun 22 08:35:33 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Tue, 22 Jun 2021 08:35:33 GMT
Subject: Integrated: 8267657: Add missing PrintC1Statistics before
 incrementing counters
In-Reply-To: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
References: <3y_kn9nUEynQogBERppD9dKS5rx7An1LH01suSXUJho=.e3ffc36b-fa8a-45a5-b957-91ef009148d1@github.com>
Message-ID: 

On Tue, 25 May 2021 03:07:15 GMT, Yi Yang  wrote:

> Trivial change to add missing PrintC1Statistics before incrementing counters.

This pull request has now been integrated.

Changeset: 2e639dd3
Author:    Yi Yang 
URL:       https://git.openjdk.java.net/jdk/commit/2e639dd34a4342de6e1b9470448d66ef89c4bd52
Stats:     96 lines in 7 files changed: 61 ins; 6 del; 29 mod

8267657: Add missing PrintC1Statistics before incrementing counters

Reviewed-by: iveresov

-------------

PR: https://git.openjdk.java.net/jdk/pull/4178

From tschatzl at openjdk.java.net  Tue Jun 22 09:15:34 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Tue, 22 Jun 2021 09:15:34 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 14 Jun 2021 14:55:46 GMT, Kim Barrett  wrote:

>> Please review this change to the LockFreeQueue utility class.
>> 
>> The LockFreeQueue originated as an implementation detail of
>> G1DirtyCardQueueSet, and was recently refactored into a public utility
>> class.  In that refactoring it retained some limitations that were
>> acceptable in its original context, but may be problematic as a general
>> utility.
>> 
>> In particular, under some conditions a thread was not able to pop the
>> last element in the queue, due to interference by a concurrent operation.
>> And this state would persist, so retrying the pop operation wouldn't help until
>> the interfering thread had made sufficient progress. This was mitigated by
>> making the API more complex to provide notice to the client that the queue
>> may be in this state.
>> 
>> But it turns out we can do somewhat better, eliminating one of the
>> limitations, which is the point of this change.  We introduce a
>> pseudo-object used as an end of queue marker.  We can use the transition of
>> the last element's next value from the end marker to NULL by a pop operation
>> as a claim on the element, allowing the losing thread to recognize, retry,
>> and make progress.
>> 
>> This queue still has the limitation that an in-progress push/append may
>> prevent popping elements.  Because of this, the class is being renamed to
>> NonblockingQueue.  The old name suggests stronger guarantees than actually
>> provided.
>> 
>> The PR has two commits, the first for the functional changes, the second for
>> the renaming.  The github diffs don't seem to be recognizing the renaming of
>> the source files as a rename, instead treating the old files as deleted and
>> the new files as added.  The first commit by itself is probably more useful
>> for reviewing the functional changes.
>> 
>> Testing:
>> mach5 tier1-5
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge branch 'master' into lfqueue
>  - rename
>  - use end marker to improve pop
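
To illustrate only the claim step described in the quoted text, here is a minimal sketch. It is not the NonblockingQueue implementation; `Node`, `end_marker`, and `try_claim_last` are hypothetical names, and the rest of the queue (push, append, head and tail updates) is deliberately left out.

```cpp
#include <atomic>
#include <cassert>

struct Node {
  std::atomic<Node*> next{nullptr};
  int value = 0;
};

// Hypothetical end-of-queue marker: a distinguished address that is only
// ever compared against, never dereferenced as a queue element.
inline Node* end_marker() {
  static Node marker;
  return &marker;
}

// Claim the last element by moving its `next` from the end marker to null.
// Exactly one caller can succeed; any other caller sees the compare-exchange
// fail and so learns that the element has already been claimed.
inline bool try_claim_last(Node* last) {
  Node* expected = end_marker();
  return last->next.compare_exchange_strong(expected, nullptr);
}
```

The point of the marker is that "last element, unclaimed" (`next == end_marker()`) and "last element, claimed" (`next == nullptr`) become distinguishable states, so a thread that loses the race can recognize what happened and retry instead of being stuck.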

I think this is good.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4379

From gli at openjdk.java.net  Tue Jun 22 10:13:27 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 10:13:27 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: 
References: 
 <3jEWcoPqG9EMr-wmTMV66lnEqEZBbwR45E5gmwZhbOk=.0cbbc64d-07ac-4667-9037-b136c279188c@github.com>
 
Message-ID: 

On Tue, 22 Jun 2021 07:21:29 GMT, Guoxiong Li  wrote:

> Maybe we can improve the tier1 or the Pre-submit tests in the future.

I meant to improve the test coverage.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From stefank at openjdk.java.net  Tue Jun 22 10:24:31 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Tue, 22 Jun 2021 10:24:31 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: <7Q4mj03cn9mF86H_Cr44sWib8Za6dZyindyQUaVAl40=.c2422fbd-4ba8-46e3-818a-c3139d930d0f@github.com>
References: 
 <7Q4mj03cn9mF86H_Cr44sWib8Za6dZyindyQUaVAl40=.c2422fbd-4ba8-46e3-818a-c3139d930d0f@github.com>
Message-ID: 

On Tue, 22 Jun 2021 07:17:00 GMT, Guoxiong Li  wrote:

>> Hi all,
>> 
>> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
>> 
>> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
>> 
>> Test:
>> tier1 passed locally.
>> 
>> Thanks for taking the time to review.
>> 
>> Best Regards,
>> -- Guoxiong
>
> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix incorrect use of the method cast

Marked as reviewed by stefank (Reviewer).

src/hotspot/share/runtime/thread.hpp line 1432:

> 1430:     assert(t->is_Java_thread(), "incorrect cast to const JavaThread");
> 1431:     return static_cast<const JavaThread*>(t);
> 1432:   }

Now that you've written the code in-place, you could remove the `inline` specifier.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 10:53:06 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 10:53:06 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v3]
In-Reply-To: 
References: 
Message-ID: <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>

> Hi all,
> 
> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
> 
> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
> 
> Test:
> tier1 passed locally.
> 
> Thanks for taking the time to review.
> 
> Best Regards,
> -- Guoxiong

Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:

  Remove inline specifier

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4546/files
  - new: https://git.openjdk.java.net/jdk/pull/4546/files/259730be..c0347b6c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4546&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4546&range=01-02

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4546.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4546/head:pull/4546

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 10:56:25 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 10:56:25 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: 
References: 
 <7Q4mj03cn9mF86H_Cr44sWib8Za6dZyindyQUaVAl40=.c2422fbd-4ba8-46e3-818a-c3139d930d0f@github.com>
 
Message-ID: 

On Tue, 22 Jun 2021 10:21:45 GMT, Stefan Karlsson  wrote:

>> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix incorrect use of the method cast
>
> src/hotspot/share/runtime/thread.hpp line 1432:
> 
>> 1430:     assert(t->is_Java_thread(), "incorrect cast to const JavaThread");
>> 1431:     return static_cast<const JavaThread*>(t);
>> 1432:   }
> 
> Now that you've written the code in-place, you could remove the `inline` specifier.

Fixed. Thanks @stefank for your review.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From david.holmes at oracle.com  Tue Jun 22 11:33:57 2021
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 22 Jun 2021 21:33:57 +1000
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v2]
In-Reply-To: 
References: 
 <3jEWcoPqG9EMr-wmTMV66lnEqEZBbwR45E5gmwZhbOk=.0cbbc64d-07ac-4667-9037-b136c279188c@github.com>
 
Message-ID: <77ae4ca8-47eb-12b2-96e4-743adb0d3987@oracle.com>

On 22/06/2021 5:24 pm, Guoxiong Li wrote:
> On Tue, 22 Jun 2021 01:59:17 GMT, David Holmes  wrote:
> 
>>> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
>>>
>>>    Fix incorrect use of the method cast
>>
>> Hi Guoxiong,
>>
>> Thanks for picking up this enhancement request.
>>
>> I wasn't sure if this would be worth the churn/disruption to the source code, but I think it is ok and preferable to use the cast notation.
>>
>> The changes look good except for one mistake flagged below.
>>
>> Note you need at least two reviewers before integrating this.
>>
>> Thanks,
>> David
> 
> @dholmes-ora Thanks for your review. I updated the code just now.
> 
> I am surprised that `tier1` (both locally and in the `Pre-submit tests`) didn't catch the mistake you pointed out.
> Maybe we can improve `tier1` or the `Pre-submit tests` in the future.

The error is in ZGC code and ZGC is not tested in tier-1 or pre-submit.

Cheers,
David

>> src/hotspot/share/gc/z/zFuture.inline.hpp line 49:
>>
>>> 47:   // Wait for notification
>>> 48:   Thread* const thread = Thread::current();
>>> 49:   if (JavaThread::cast(thread)) {
>>
>> This is wrong - we still need the is_Java_thread() query; and cast is not a boolean operator.
> 
> Fixed. It was an incorrect use of the method `cast`. Thanks for finding it. I re-read my patch to check for similar mistakes.
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4546
> 

From dholmes at openjdk.java.net  Tue Jun 22 11:37:28 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Jun 2021 11:37:28 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v3]
In-Reply-To: <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
References: 
 <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
Message-ID: 

On Tue, 22 Jun 2021 10:53:06 GMT, Guoxiong Li  wrote:

>> Hi all,
>> 
>> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
>> 
>> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
>> 
>> Test:
>> tier1 passed locally.
>> 
>> Thanks for taking the time to review.
>> 
>> Best Regards,
>> -- Guoxiong
>
> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove inline specifier

Updates look fine.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 12:10:30 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 12:10:30 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v3]
In-Reply-To: 
References: 
 <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
 
Message-ID: 

On Tue, 22 Jun 2021 11:34:58 GMT, David Holmes  wrote:

>> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Remove inline specifier
>
> Updates look fine.
> 
> Thanks,
> David

@dholmes-ora @stefank Thanks for your reviews. Could I get your help to sponsor this patch?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 12:10:30 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 12:10:30 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions
In-Reply-To: <77ae4ca8-47eb-12b2-96e4-743adb0d3987@oracle.com>
References: 
 <77ae4ca8-47eb-12b2-96e4-743adb0d3987@oracle.com>
Message-ID: 

On Tue, 22 Jun 2021 11:36:02 GMT, David Holmes  wrote:

> The error is in ZGC code and ZGC is not tested in tier-1 or pre-submit.

Got it. Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From dholmes at openjdk.java.net  Tue Jun 22 12:59:35 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 22 Jun 2021 12:59:35 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v3]
In-Reply-To: <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
References: 
 <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
Message-ID: 

On Tue, 22 Jun 2021 10:53:06 GMT, Guoxiong Li  wrote:

>> Hi all,
>> 
>> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
>> 
>> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
>> 
>> Test:
>> tier1 passed locally.
>> 
>> Thanks for taking the time to review.
>> 
>> Best Regards,
>> -- Guoxiong
>
> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove inline specifier

I can sponsor, but as per hotspot integration guidelines we should wait at least 24 hours to give people in different timezones a chance to comment.

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From gli at openjdk.java.net  Tue Jun 22 13:03:32 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Tue, 22 Jun 2021 13:03:32 GMT
Subject: RFR: 8268368: Adopt cast notation for JavaThread conversions [v3]
In-Reply-To: 
References: 
 <6cHOFbz1wh819kKFRB-yKKniSrwGL-oNF4iiJqZrh1s=.0974ec65-e8ee-4609-b999-c44acadb58cb@github.com>
 
Message-ID: 

On Tue, 22 Jun 2021 12:56:27 GMT, David Holmes  wrote:

> I can sponsor, but as per hotspot integration guidelines we should wait at least 24 hours to give people in different timezones a chance to comment.

Got it. Thanks a lot!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From suenaga at oss.nttdata.com  Tue Jun 22 14:51:59 2021
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 22 Jun 2021 23:51:59 +0900
Subject: SpinPause() should be inlined?
Message-ID: <9ab1d5fe8c58f5a5cf71c6692a605d3b@oss.nttdata.com>

Hi all,

I saw lock contention in SecureRandom. When I was analyzing it, I had a 
question.

ObjectMonitor::TrySpin() calls SpinPause(). SpinPause() would issue 
PAUSE, but it is not inlined, as the following disassembly shows:

```
  1587e90:       f0 4d 0f b1 2e          lock cmpxchg %r13,(%r14)
  1587e95:       48 85 c0                test   %rax,%rax
  1587e98:       74 16                   je     1587eb0 

  1587e9a:       e8 31 7b d4 ff          callq  12cf9d0 
  1587e9f:       83 eb 01                sub    $0x1,%ebx
  1587ea2:       72 da                   jb     1587e7e 

```

I found the following comment about it in os.hpp. It suggests that 
SpinPause() could be inlined.

```
// Note that "PAUSE" is almost always used with synchronization
// so arguably we should provide Atomic::SpinPause() instead
// of the global SpinPause() with C linkage.
// It'd also be eligible for inlining on many platforms.
```

According to the Intel Software Developer's Manual, PAUSE seems to need 
to be inlined, but I'm not sure whether issuing it through a function 
call is acceptable.
I changed it to be inlined on Linux x64 [1] and benchmarked it with [2]. 
I saw some advantage from the inlined PAUSE on my Core i3-8145U, but it 
may be within the margin of error.
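
For readers who want to see what "inlined" means here, the following is a minimal sketch, not the patch in [1]. It assumes a compiler that provides `_mm_pause()` via `<immintrin.h>` on x86-64; `spin_pause_inline` and `try_spin` are hypothetical names, and the loop is a simplified stand-in for a spin-then-CAS pattern.

```cpp
#include <atomic>
#include <cassert>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#endif

// Emits PAUSE directly at the call site on x86-64; elsewhere it does nothing.
// Returns 1 if a pause instruction was issued, 0 otherwise (an assumption
// made for this sketch).
static inline int spin_pause_inline() {
#if defined(__x86_64__) || defined(_M_X64)
  _mm_pause();
  return 1;
#else
  return 0;
#endif
}

// Try to take ownership by compare-exchange, pausing between attempts.
inline bool try_spin(std::atomic<void*>& owner, void* self, int spins) {
  for (int i = 0; i < spins; ++i) {
    void* expected = nullptr;
    if (owner.compare_exchange_strong(expected, self)) {
      return true;
    }
    spin_pause_inline();  // no call instruction when the compiler inlines this
  }
  return false;
}
```

Whether this makes a measurable difference depends on the workload and the CPU, so it would need benchmarking.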

* original
Benchmark                       (algo)  (bytes)   Mode  Cnt       Score        Error  Units
RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  510141.578 ± 138543.261  ops/s

* with inlined PAUSE
Benchmark                       (algo)  (bytes)   Mode  Cnt       Score       Error  Units
RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  531589.958 ± 66942.549  ops/s


Should PAUSE be inlined as `Atomic::SpinPause()`?


Thanks,

Yasumasa


[1] 
https://github.com/YaSuenag/jdk/commit/66b31d35fac6fb0537bb9d85957157342fec564d
[2] 
https://github.com/YaSuenag/hwrand/blob/master/benchmark/src/main/java/com/yasuenag/hwrand/benchmark/RandomBenchmark.java#L59-L64

From shade at openjdk.java.net  Tue Jun 22 15:10:37 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 22 Jun 2021 15:10:37 GMT
Subject: RFR: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
Message-ID: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>

See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.

Additional testing:
 - [x] Linux x86_64 builds
 - [x] Loom repository builds Linux x86_64 with this patch cherry-picked
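
A rough illustration of the include convention mentioned above, collapsed into one translation unit so it compiles on its own. The file names in the comments and the `Widget` class are hypothetical and unrelated to the files touched by this change.

```cpp
#include <cassert>

// ---- would live in widget.hpp: declarations only, includes other .hpp files ----
class Widget {
  int _value;
 public:
  explicit Widget(int v) : _value(v) {}
  inline int doubled() const;  // body is provided by widget.inline.hpp
};

// ---- would live in widget.inline.hpp: includes widget.hpp ----
inline int Widget::doubled() const { return _value * 2; }

// ---- would live in user.cpp: the only place that includes widget.inline.hpp ----
int use_widget() {
  Widget w(21);
  return w.doubled();
}
```

Keeping `.inline.hpp` includes out of plain `.hpp` headers means a header never pulls in inline bodies that in turn need other headers' inline bodies, which is how circular inclusion tends to arise.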

-------------

Commit messages:
 - 8269138: Move typeArrayOop.inline.hpp include to vectorSupport.cpp

Changes: https://git.openjdk.java.net/jdk/pull/4559/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4559&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269138
  Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4559.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4559/head:pull/4559

PR: https://git.openjdk.java.net/jdk/pull/4559

From rrich at openjdk.java.net  Tue Jun 22 15:16:34 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Tue, 22 Jun 2021 15:16:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v4]
In-Reply-To: 
References: 
 
 
 
Message-ID: <6ro5movKH_OguGNNPICzLLJYy7Q_IHA169XBbTdWXdI=.92b036de-fbc2-4845-bd13-410c9e68f745@github.com>

On Mon, 21 Jun 2021 13:58:32 GMT, Patricio Chilano Mateo  wrote:

> 
> 
> > No issues from my own testing. Broader test coverage on all platforms is expected tomorrow.
>
>  Great, I'll wait for that. Thanks for all the testing Richard!

Tests look fine, also on s390. Unfortunately, on ppc64le the tests didn't succeed because of another change. I'd suggest waiting one more day, if you don't mind.

Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From stefank at openjdk.java.net  Tue Jun 22 15:27:31 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Tue, 22 Jun 2021 15:27:31 GMT
Subject: RFR: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
In-Reply-To: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
References: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
Message-ID: <2wvZ4uHijJMlMM-_P6zWq3AClkUDd37A64EKiFIHHQ4=.865dee6c-d4d9-40a4-b93a-a453f75efa2f@github.com>

On Tue, 22 Jun 2021 15:02:58 GMT, Aleksey Shipilev  wrote:

> See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.
> 
> Additional testing:
>  - [x] Linux x86_64 builds
>  - [x] Loom repository builds Linux x86_64 with this patch cherry-picked

Marked as reviewed by stefank (Reviewer).

Yes, trivial.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4559

From shade at openjdk.java.net  Tue Jun 22 15:27:31 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 22 Jun 2021 15:27:31 GMT
Subject: RFR: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
In-Reply-To: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
References: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
Message-ID: <-_mM_TRUtBrUWjhtNvyJSrlq-so5_TcrrYr0fmR4HUU=.5e3858e3-9b05-4be4-a04b-d09d349dfafc@github.com>

On Tue, 22 Jun 2021 15:02:58 GMT, Aleksey Shipilev  wrote:

> See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.
> 
> Additional testing:
>  - [x] Linux x86_64 builds
>  - [x] Loom repository builds Linux x86_64 with this patch cherry-picked

Thank you. Trivial?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4559

From aph at redhat.com  Tue Jun 22 15:32:39 2021
From: aph at redhat.com (Andrew Haley)
Date: Tue, 22 Jun 2021 16:32:39 +0100
Subject: S/390 builds failing in CI
Message-ID: 

Like this:

https://github.com/theRealAph/jdk/runs/2885812815

Run sudo apt-get install gcc-10-s390x-linux-gnu=10.2.0-5ubuntu1~20.04cross1 g++-10-s390x-linux-gnu=10.2.0-5ubuntu1~20.04cross1

The following packages have unmet dependencies:
 g++-10-s390x-linux-gnu : Depends: gcc-10-s390x-linux-gnu-base (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
 gcc-10-s390x-linux-gnu : Depends: cpp-10-s390x-linux-gnu (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
                          Depends: gcc-10-s390x-linux-gnu-base (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
E: Unable to correct problems, you have held broken packages.
Error: Process completed with exit code 100.

Nothing to do with my changes, I think.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From vlivanov at openjdk.java.net  Tue Jun 22 15:40:29 2021
From: vlivanov at openjdk.java.net (Vladimir Ivanov)
Date: Tue, 22 Jun 2021 15:40:29 GMT
Subject: RFR: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
In-Reply-To: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
References: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
Message-ID: 

On Tue, 22 Jun 2021 15:02:58 GMT, Aleksey Shipilev  wrote:

> See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.
> 
> Additional testing:
>  - [x] Linux x86_64 builds
>  - [x] Loom repository builds Linux x86_64 with this patch cherry-picked

Looks good and trivial.

-------------

Marked as reviewed by vlivanov (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4559

From pchilanomate at openjdk.java.net  Tue Jun 22 15:52:34 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Tue, 22 Jun 2021 15:52:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v5]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 21 Jun 2021 14:49:20 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Un-ProblemList serviceability tests (8268574 and 8268644)

Hi Richard,
> 
> Tests look fine, also on s390. Unfortunately, on ppc64le the tests didn't succeed because of another change. I'd suggest waiting one more day, if you don't mind.

No problem, let me know when tests complete successfully.

Thanks again!

Patricio

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From daniel.daugherty at oracle.com  Tue Jun 22 15:55:14 2021
From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com)
Date: Tue, 22 Jun 2021 11:55:14 -0400
Subject: SpinPause() should be inlined?
In-Reply-To: <9ab1d5fe8c58f5a5cf71c6692a605d3b@oss.nttdata.com>
References: <9ab1d5fe8c58f5a5cf71c6692a605d3b@oss.nttdata.com>
Message-ID: <4072d3f0-d0a9-cd08-f81f-f5f9c96dd60f@oracle.com>

Here's the RFE where that was discussed back in 2018:

    JDK-8208458 Simplify and inline os::SpinPause() for non-Windows OS on X86
    https://bugs.openjdk.java.net/browse/JDK-8208458

Dan

On 6/22/21 10:51 AM, Yasumasa Suenaga wrote:
> Hi all,
>
> I saw lock contention in SecureRandom. When I was analyzing it, I had 
> a question.
>
> ObjectMonitor::TrySpin() calls SpinPause(). SpinPause() would issue
> PAUSE, but it is not inlined, as the following disassembly shows:
>
> ```
>  1587e90:       f0 4d 0f b1 2e          lock cmpxchg %r13,(%r14)
>  1587e95:       48 85 c0                test   %rax,%rax
>  1587e98:       74 16                   je     1587eb0
>  1587e9a:       e8 31 7b d4 ff          callq  12cf9d0
>  1587e9f:       83 eb 01                sub    $0x1,%ebx
>  1587ea2:       72 da                   jb     1587e7e
> ```
>
> I found the following comment about it in os.hpp. It says SpinPause()
> should be inlined.
>
> ```
> // Note that "PAUSE" is almost always used with synchronization
> // so arguably we should provide Atomic::SpinPause() instead
> // of the global SpinPause() with C linkage.
> // It'd also be eligible for inlining on many platforms.
> ```
>
> According to the Intel Software Developer's Manual, PAUSE seems to need to
> be inlined, but I'm not sure whether it is allowed to be issued through a
> function call. I've changed it to be inlined on Linux x64 [1] and
> benchmarked it with [2]. I saw some advantage from the inlined PAUSE on
> my Core i3-8145U, but it may be within the margin of error.
>
> * original
> Benchmark                       (algo)  (bytes)   Mode  Cnt       Score        Error  Units
> RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  510141.578 ± 138543.261  ops/s
>
> * with inlined PAUSE
> Benchmark                       (algo)  (bytes)   Mode  Cnt       Score        Error  Units
> RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  531589.958 ±  66942.549  ops/s
>
>
> Should PAUSE be inlined as `Atomic::SpinPause()`?
>
>
> Thanks,
>
> Yasumasa
>
>
> [1] 
> https://github.com/YaSuenag/jdk/commit/66b31d35fac6fb0537bb9d85957157342fec564d
> [2] 
> https://github.com/YaSuenag/hwrand/blob/master/benchmark/src/main/java/com/yasuenag/hwrand/benchmark/RandomBenchmark.java#L59-L64
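
For readers following the question above, here is a minimal sketch of what an inlined pause hint inside a bounded spin loop could look like. It is not the HotSpot code or the patch in [1]: the names `spin_pause_inline` and `try_spin_lock` are hypothetical, and the sketch assumes GCC or Clang on x86, where the `__builtin_ia32_pause()` builtin emits PAUSE directly at the call site. On other compilers or architectures the sketch simply issues no hint.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical inline pause hint (illustration only, not HotSpot's SpinPause()).
// Returns 1 if a PAUSE was issued, 0 if this sketch has no hint for the target.
static inline int spin_pause_inline() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_ia32_pause();  // emitted in the caller's loop, no call/ret pair
  return 1;
#else
  return 0;
#endif
}

// Bounded spin on a simple test-and-set lock: try to take the lock,
// and issue the pause hint between failed attempts.
static inline bool try_spin_lock(std::atomic<bool>& locked, int max_spins) {
  for (int i = 0; i < max_spins; i++) {
    bool expected = false;
    if (locked.compare_exchange_strong(expected, true,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
      return true;  // lock acquired
    }
    spin_pause_inline();
  }
  return false;  // gave up after max_spins attempts
}
```

Whether inlining gives a measurable gain is exactly what the benchmark above tries to answer; the sketch only shows the shape of the code.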


From coleenp at openjdk.java.net  Tue Jun 22 16:12:34 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 22 Jun 2021 16:12:34 GMT
Subject: Integrated: 8264941: Remove CodeCache::mark_for_evol_deoptimization()
 method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore  wrote:

> This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded.  Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies.  If the dependencies weren't yet recorded, we had to deoptimize all of the methods.  A long time ago, we had a customer who was unhappy with the pause for this when they had late attach.  Now we don't have this problem.
> The evol_method dependencies are still used by the compiler to check for old methods during compilation.  I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too.
> Tested with tier1-6.

This pull request has now been integrated.

Changeset: 33c23a1c
Author:    Coleen Phillimore 
URL:       https://git.openjdk.java.net/jdk/commit/33c23a1cf2aa81551eee4a2acf271edf573558aa
Stats:     78 lines in 7 files changed: 0 ins; 73 del; 5 mod

8264941: Remove CodeCache::mark_for_evol_deoptimization() method

Reviewed-by: kvn, vlivanov, sspitsyn

-------------

PR: https://git.openjdk.java.net/jdk/pull/4509

From coleenp at openjdk.java.net  Tue Jun 22 16:12:34 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 22 Jun 2021 16:12:34 GMT
Subject: RFR: 8264941: Remove CodeCache::mark_for_evol_deoptimization()
 method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 16 Jun 2021 12:52:46 GMT, Coleen Phillimore  wrote:

> This change removes the mark_for_evol_deoptimization method and removes the flag that all dependencies are recorded.  Before the change to walk the entire nmethod looking for "old" (redefined) methods with metadata_do(), we used to find methods in the code cache to deoptimize based on evol_method dependencies.  If the dependencies weren't yet recorded, we had to deoptimize all of the methods.  A long time ago, we had a customer who was unhappy with the pause for this when they had late attach.  Now we don't have this problem.
> The evol_method dependencies are still used by the compiler to check for old methods during compilation.  I didn't change this but it might be something someone who knows the compiler better can do differently and remove these dependencies too.
> Tested with tier1-6.

Thanks Serguei.  Thanks for the code reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4509

From kbarrett at openjdk.java.net  Tue Jun 22 16:48:31 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 22 Jun 2021 16:48:31 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v2]
In-Reply-To: 
References: 
 
Message-ID: <8doyCtCRAkS0NzKJ8mO3DS115UAokxpK6JSExxsfPuI=.58ca15b7-646e-45f8-8e5b-5c137301bb71@github.com>

On Mon, 7 Jun 2021 09:05:06 GMT, Ivan Walulya  wrote:

>> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - Merge branch 'master' into lfqueue
>>  - rename
>>  - use end marker to improve pop
>
> lgtm!

Thanks @walulyai and @tschatzl  for reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4379

From shade at redhat.com  Tue Jun 22 17:15:42 2021
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 22 Jun 2021 19:15:42 +0200
Subject: S/390 builds failing in CI
In-Reply-To: 
References: 
Message-ID: 

On 6/22/21 5:32 PM, Andrew Haley wrote:
> Like this:
> 
> https://github.com/theRealAph/jdk/runs/2885812815
> 
> Run sudo apt-get install gcc-10-s390x-linux-gnu=10.2.0-5ubuntu1~20.04cross1 g++-10-s390x-linux-gnu=10.2.0-5ubuntu1~20.04cross1
> 
> The following packages have unmet dependencies:
>   g++-10-s390x-linux-gnu : Depends: gcc-10-s390x-linux-gnu-base (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
>   gcc-10-s390x-linux-gnu : Depends: cpp-10-s390x-linux-gnu (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
>                            Depends: gcc-10-s390x-linux-gnu-base (= 10.2.0-5ubuntu1~20.04cross1) but 10.3.0-1ubuntu1~20.04cross1 is to be installed
> E: Unable to correct problems, you have held broken packages.
> Error: Process completed with exit code 100.
> 
> Nothing to do with my changes, I think.

I think we just need to update the minor version to 10.3.0-1ubuntu1~20.04cross1:
   https://bugs.openjdk.java.net/browse/JDK-8269148

-- 
Thanks,
-Aleksey


From kbarrett at openjdk.java.net  Tue Jun 22 17:47:10 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 22 Jun 2021 17:47:10 GMT
Subject: Integrated: 8268290: Improve LockFreeQueue<> utility
In-Reply-To: 
References: 
Message-ID: 

On Sun, 6 Jun 2021 16:17:40 GMT, Kim Barrett  wrote:

> Please review this change to the LockFreeQueue utility class.
> 
> The LockFreeQueue originated as an implementation detail of
> G1DirtyCardQueueSet, and was recently refactored into a public utility
> class.  In that refactoring it retained some limitations that were
> acceptable in its original context, but may be problematic as a general
> utility.
> 
> In particular, under some conditions a thread was not able to pop the
> last element in the queue, due to interference by a concurrent operation.
> And this state will persist, so retrying the pop operation won't help until
> the interfering thread has made sufficient progress. This was mitigated by
> making the API more complex to provide notice to the client that the queue
> may be in this state.
> 
> But it turns out we can do somewhat better, eliminating one of the
> limitations, which is the point of this change.  We introduce a
> pseudo-object used as an end of queue marker.  We can use the transition of
> the last element's next value from the end marker to NULL by a pop operation
> as a claim on the element, allowing the losing thread to recognize, retry,
> and make progress.
> 
> This queue still has the limitation that an in-progress push/append may
> prevent popping elements.  Because of this, the class is being renamed to
> NonblockingQueue.  The old name suggests stronger guarantees than actually
> provided.
> 
> The PR has two commits, the first for the functional changes, the second for
> the renaming.  The github diffs don't seem to be recognizing the renaming of
> the source files as a rename, instead treating the old files as deleted and
> the new files as added.  The first commit by itself is probably more useful
> for reviewing the functional changes.
> 
> Testing:
> mach5 tier1-5

This pull request has now been integrated.

Changeset: 0c693e2f
Author:    Kim Barrett 
URL:       https://git.openjdk.java.net/jdk/commit/0c693e2f03b1adef0e946ebc32827ac09192f5f0
Stats:     1229 lines in 8 files changed: 619 ins; 601 del; 9 mod

8268290: Improve LockFreeQueue<> utility

Reviewed-by: iwalulya, tschatzl

-------------

PR: https://git.openjdk.java.net/jdk/pull/4379
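
The "claim" step described in the PR text above can be sketched in isolation. This is not the NonblockingQueue implementation: `Node`, `kEndMarker`, and `try_claim_last` are hypothetical names, and the sketch assumes only what the description states, namely that the last element's next field holds an end marker and that a pop claims the element by switching that field to NULL with a compare-and-swap.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type for illustration.
struct Node {
  std::atomic<Node*> next{nullptr};
  int value{0};
};

// Hypothetical end-of-queue marker: a pseudo-object whose address means
// "this is the last element", as distinct from nullptr ("claimed").
static Node end_marker_storage;
static Node* const kEndMarker = &end_marker_storage;

// A popping thread tries to claim the last element by moving its next
// field from the end marker to nullptr. Exactly one thread can succeed;
// a loser sees the CAS fail and knows it should retry the pop.
static inline bool try_claim_last(Node* last) {
  Node* expected = kEndMarker;
  return last->next.compare_exchange_strong(expected, nullptr);
}
```

A second attempt on the same node fails because the field no longer holds the marker, which is what lets the losing thread recognize the situation instead of being stuck.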

From kbarrett at openjdk.java.net  Tue Jun 22 17:47:08 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Tue, 22 Jun 2021 17:47:08 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v3]
In-Reply-To: 
References: 
Message-ID: <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>

> Please review this change to the LockFreeQueue utility class.
> 
> The LockFreeQueue originated as an implementation detail of
> G1DirtyCardQueueSet, and was recently refactored into a public utility
> class.  In that refactoring it retained some limitations that were
> acceptable in its original context, but may be problematic as a general
> utility.
> 
> In particular, under some conditions a thread was not able to pop the
> last element in the queue, due to interference by a concurrent operation.
> And this state will persist, so retrying the pop operation won't help until
> the interfering thread has made sufficient progress. This was mitigated by
> making the API more complex to provide notice to the client that the queue
> may be in this state.
> 
> But it turns out we can do somewhat better, eliminating one of the
> limitations, which is the point of this change.  We introduce a
> pseudo-object used as an end of queue marker.  We can use the transition of
> the last element's next value from the end marker to NULL by a pop operation
> as a claim on the element, allowing the losing thread to recognize, retry,
> and make progress.
> 
> This queue still has the limitation that an in-progress push/append may
> prevent popping elements.  Because of this, the class is being renamed to
> NonblockingQueue.  The old name suggests stronger guarantees than actually
> provided.
> 
> The PR has two commits, the first for the functional changes, the second for
> the renaming.  The github diffs don't seem to be recognizing the renaming of
> the source files as a rename, instead treating the old files as deleted and
> the new files as added.  The first commit by itself is probably more useful
> for reviewing the functional changes.
> 
> Testing:
> mach5 tier1-5

Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:

 - Merge branch 'master' into lfqueue
 - Merge branch 'master' into lfqueue
 - rename
 - use end marker to improve pop

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4379/files
  - new: https://git.openjdk.java.net/jdk/pull/4379/files/0adb5954..3710810d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4379&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4379&range=01-02

  Stats: 25472 lines in 434 files changed: 16406 ins; 7854 del; 1212 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4379.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4379/head:pull/4379

PR: https://git.openjdk.java.net/jdk/pull/4379

From shade at openjdk.java.net  Tue Jun 22 17:59:38 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Tue, 22 Jun 2021 17:59:38 GMT
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v3]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 16 Feb 2021 10:26:06 GMT, Aleksey Shipilev  wrote:

>> Shenandoah carries forwardee information in object's mark word. Installing the new mark word is effectively "releasing" the object copy, and reading from the new mark word is "acquiring" that object copy.
>> 
>> For the forwardee update side, Hotspot's default for atomic operations is memory_order_conservative, which emits two-way memory fences around the CASes at least on AArch64 and PPC64. This seems to be excessive for Shenandoah forwardee updates, and "release" is enough.
>> 
>> For the forwardee load side, we need to guarantee "acquire". We do not do it now, reading the markword without memory semantics. It does not seem to pose a practical problem today, because GC does not access the object contents in the new copy, and mutators get this from the JRT-called stub that separates the fwdptr access and object contents access by a lot. It still should be cleaner to "acquire" the mark on load to avoid surprises.
>> 
>> Additional testing:
>>  - [x] Linux x86_64 `hotspot_gc_shenandoah`
>>  - [x] Linux AArch64 `hotspot_gc_shenandoah`
>>  - [x] Linux AArch64 `tier1` with Shenandoah
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
> 
>  - A few minor touchups
>  - Add a blurb to x86 code as well
>  - Use implicit "consume" in AArch64, add more notes.
>  - Merge branch 'master' into JDK-8261492-shenandoah-forwardee-memord
>  - Make sure to access fwdptr with acquire semantics in assembler code
>  - 8261492: Shenandoah: reconsider forwardee accesses memory ordering

Not now, bot. Still waiting.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2496
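
The release/acquire pairing discussed in the quoted description can be illustrated with standard C++ atomics. This is only a sketch, not Shenandoah code: the `Obj` layout, the convention that a mark of 0 means "not forwarded", and the function names are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object with a mark word; 0 stands for "not forwarded" here.
struct Obj {
  std::atomic<std::uintptr_t> mark{0};
  int payload{0};
};

// Load side: "acquire" the forwardee so that the contents of the copy,
// written before the installing CAS, are visible to this thread.
static inline Obj* forwardee(Obj* from) {
  std::uintptr_t m = from->mark.load(std::memory_order_acquire);
  return m == 0 ? from : reinterpret_cast<Obj*>(m);
}

// Update side: installing the forwardee "releases" the copy. A release CAS
// is used instead of a full two-way fence. On failure another thread won,
// so the winner's copy is re-read with acquire semantics.
static inline Obj* try_install_forwardee(Obj* from, Obj* copy) {
  std::uintptr_t expected = 0;
  if (from->mark.compare_exchange_strong(
          expected, reinterpret_cast<std::uintptr_t>(copy),
          std::memory_order_release, std::memory_order_relaxed)) {
    return copy;
  }
  return forwardee(from);
}
```

The sketch uses a relaxed failure ordering followed by a separate acquire load so that it stays valid under older C++ standards, which restricted the failure ordering of a compare-exchange.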

From manc at openjdk.java.net  Tue Jun 22 19:43:31 2021
From: manc at openjdk.java.net (Man Cao)
Date: Tue, 22 Jun 2021 19:43:31 GMT
Subject: Withdrawn: 8267079: Support async handshakes that can be executed by
 a remote thread
In-Reply-To: 
References: 
Message-ID: 

On Thu, 13 May 2021 02:05:30 GMT, Man Cao  wrote:

> Hi all,
> 
> Can I have reviews for this small refactoring change? It resolves a pending concern from [JDK-8238761](https://bugs.openjdk.java.net/browse/JDK-8238761), clarifies the code and allows more use cases of async handshakes. See [JDK-8267079](https://bugs.openjdk.java.net/browse/JDK-8267079) for a detailed description.
> 
> -Man

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4005

From manc at openjdk.java.net  Tue Jun 22 19:43:31 2021
From: manc at openjdk.java.net (Man Cao)
Date: Tue, 22 Jun 2021 19:43:31 GMT
Subject: RFR: 8267079: Support async handshakes that can be executed by a
 remote thread [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 18 May 2021 19:08:08 GMT, Man Cao  wrote:

>> Hi all,
>> 
>> Can I have reviews for this small refactoring change? It resolves a pending concern from [JDK-8238761](https://bugs.openjdk.java.net/browse/JDK-8238761), clarifies the code and allows more use cases of async handshakes. See [JDK-8267079](https://bugs.openjdk.java.net/browse/JDK-8267079) for a detailed description.
>> 
>> -Man
>
> Man Cao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added missing deallocation and renamed "remote" to "non-self".

Closing this PR as the "arm-the-poll-only approach" seems to work, although there are some performance problems to resolve.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4005

From github.com+6704669+asgibbons at openjdk.java.net  Tue Jun 22 20:47:55 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Tue, 22 Jun 2021 20:47:55 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v6]
In-Reply-To: 
References: 
Message-ID: 

> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
> 
> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
> 
> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
> 
> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
> 
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
> 
> 
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Addressing review comments.
  
  1. Changed errorvec handling
  2. Removed unnecessary register copies and aliasing
  3. Streamlined mask generation

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4368/files
  - new: https://git.openjdk.java.net/jdk/pull/4368/files/bb73df6c..e1b4af9e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=04-05

  Stats: 55 lines in 1 file changed: 0 ins; 29 del; 26 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4368.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368

PR: https://git.openjdk.java.net/jdk/pull/4368
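
The tiering described in the PR text (256-byte blocks, then 64-byte chunks, then end fixup) can be sketched as a simple partition of the input length. This is not the intrinsic itself, which is written in assembly; `DecodePlan` and `plan_decode` are hypothetical names, and the sketch assumes the three tiers are applied strictly in that order to whatever input remains.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of how an input of a given length would be split.
struct DecodePlan {
  std::size_t blocks256;  // iterations of the 256-byte main loop
  std::size_t chunks64;   // 64-byte chunks handled after the main loop
  std::size_t tail;       // bytes left for the end fixup path
};

static inline DecodePlan plan_decode(std::size_t len) {
  DecodePlan p;
  p.blocks256 = len / 256;
  std::size_t rest = len % 256;
  p.chunks64 = rest / 64;
  p.tail = rest % 64;
  return p;
}
```

For example, a 1000-byte input would take three 256-byte iterations (768 bytes), three 64-byte chunks (192 bytes), and leave 40 bytes for the fixup path.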

From xliu at openjdk.java.net  Tue Jun 22 21:02:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Tue, 22 Jun 2021 21:02:26 GMT
Subject: [jdk17] RFR: 8267752: KVHashtable doesn't deallocate entries
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 18:08:48 GMT, Xin Liu  wrote:

> Add a free_entry iteration to the ~KVHashtable destructor.
> Tested with tier1-3.

I have asked Coleen on JBS whether there's any risk in bringing her patch to jdk17.
Meanwhile, I split JDK-8269064 into an urgent one (https://github.com/openjdk/jdk17/pull/121) and a not-so-urgent one.  The latter depends on this PR.

-------------

PR: https://git.openjdk.java.net/jdk17/pull/110
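
The kind of fix described above, walking every bucket in the destructor and freeing each entry, can be sketched with a small chained hashtable. This is not the KVHashtable code: `TinyTable`, its `free_entry`, and the `g_live_entries` counter are hypothetical and exist only to show the pattern and make the leak observable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical counter so the example can check that nothing is leaked.
static int g_live_entries = 0;

template <typename K, typename V>
class TinyTable {
  struct Entry { K key; V value; Entry* next; };
  static const std::size_t kBuckets = 16;
  Entry* _buckets[kBuckets];

  void free_entry(Entry* e) { delete e; --g_live_entries; }

 public:
  TinyTable() { for (std::size_t i = 0; i < kBuckets; i++) _buckets[i] = nullptr; }
  TinyTable(const TinyTable&) = delete;
  TinyTable& operator=(const TinyTable&) = delete;

  // The destructor iterates over all buckets and frees each chained entry.
  // Without this loop the entries would outlive the table.
  ~TinyTable() {
    for (std::size_t i = 0; i < kBuckets; i++) {
      Entry* e = _buckets[i];
      while (e != nullptr) {
        Entry* next = e->next;  // read next before freeing e
        free_entry(e);
        e = next;
      }
      _buckets[i] = nullptr;
    }
  }

  void add(const K& k, const V& v) {
    std::size_t i = std::hash<K>()(k) % kBuckets;
    _buckets[i] = new Entry{k, v, _buckets[i]};  // push on the bucket chain
    ++g_live_entries;
  }
};
```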

From sviswanathan at openjdk.java.net  Tue Jun 22 21:07:30 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Tue, 22 Jun 2021 21:07:30 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v6]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons  wrote:

>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
>> 
>> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
>> 
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>> 
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
>> 
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>> 
>> 
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Addressing review comments.
>   
>   1. Changed errorvec handling
>   2. Removed unnecessary register copies and aliasing
>   3. Streamlined mask generation

Marked as reviewed by sviswanathan (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From sviswanathan at openjdk.java.net  Tue Jun 22 21:11:36 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Tue, 22 Jun 2021 21:11:36 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v6]
In-Reply-To: 
References:
Message-ID: <-YJZMeuxz5By5y7uZnipoRkIB2A9Ha9cDADsb6MRR4M=.3e941c7c-2c4c-4524-9230-782001e8fc28@github.com>

On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons  wrote:

> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Addressing review comments.
>   
>   1. Changed errorvec handling
>   2. Removed unnecessary register copies and aliasing
>   3. Streamlined mask generation

@asgibbons The patch looks good to me.

@vnkozlov We need one more review for this patch. Could you please help?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From xliu at openjdk.java.net  Tue Jun 22 23:33:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Tue, 22 Jun 2021 23:33:26 GMT
Subject: [jdk17] Withdrawn: 8267752: KVHashtable doesn't deallocate entries
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 18:08:48 GMT, Xin Liu  wrote:

> Add a free_entry iteration to the destructor of ~KVHashtables.
> Tested with tier1-3.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk17/pull/110

From xliu at openjdk.java.net  Tue Jun 22 23:33:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Tue, 22 Jun 2021 23:33:26 GMT
Subject: [jdk17] RFR: 8267752: KVHashtable doesn't deallocate entries
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 18:08:48 GMT, Xin Liu  wrote:

> Add a free_entry iteration to the destructor of ~KVHashtables.
> Tested with tier1-3.

Closing this PR because JDK-8267752 is a P4, which does not fit RDP1.

-------------

PR: https://git.openjdk.java.net/jdk17/pull/110

From jwilhelm at openjdk.java.net  Wed Jun 23 00:31:30 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Wed, 23 Jun 2021 00:31:30 GMT
Subject: RFR: Merge jdk17
Message-ID: 

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8268404: [TESTBUG] tools/jpackage/windows/WinInstallerIconTest.java failed "AssertionError: Failed: Check icon"
 - 8267652: c2 loop unrolling by 8 results in reading memory past array
 - 8267399: C2: java/text/Normalizer/ConformanceTest.java test failed with assertion
 - 8268888: Upstream 8268230: Foreign Linker API & Windows user32/kernel32: String conversion seems broken
 - 8268524: nmethod::post_compiled_method_load_event racingly called on zombie
 - 8266631: StandardJavaFileManager: getJavaFileObjects() impl violates the spec
 - 8267421: j.l.constant.DirectMethodHandleDesc.Kind.valueOf(int) implementation doesn't conform to the spec regarding REF_invokeInterface handling
 - 8268349: Provide clear run-time warnings about Security Manager deprecation
 - 8268293: VectorAPI cast operation on mask and shuffle is broken
 - ... and 1 more: https://git.openjdk.java.net/jdk/compare/0c693e2f...7bf4b35f

The merge commit only contains trivial merges, so no merge-specific webrevs have been generated.

Changes: https://git.openjdk.java.net/jdk/pull/4562/files
  Stats: 1931 lines in 60 files changed: 653 ins; 1061 del; 217 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4562.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4562/head:pull/4562

PR: https://git.openjdk.java.net/jdk/pull/4562

From github.com+6704669+asgibbons at openjdk.java.net  Wed Jun 23 00:31:55 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Wed, 23 Jun 2021 00:31:55 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
Message-ID: 

> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
> 
> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
> 
> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
> 
> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
> 
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
> 
> 
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26
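
For readers following the thread, the contract described above (handle the pad characters at the end of the input and return the actual number of bytes decoded) can be sketched in plain scalar C++. This is only an illustration under stated assumptions: it is neither the intrinsic nor the Java DecodeBlock, it uses no AVX-512, and the names `b64_value` and `decode_block_sketch` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: 6-bit value of a Base64 character, or -1 if invalid.
static int b64_value(unsigned char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Decodes src into dst and returns the number of bytes written.
// Stops at the first character that is not valid Base64; a trailing
// "xx==" or "xxx=" group is treated as the end fixup.
static size_t decode_block_sketch(const std::string& src,
                                  std::vector<uint8_t>& dst) {
  size_t written = 0;
  for (size_t i = 0; i + 4 <= src.size(); i += 4) {
    int v[4];
    int valid = 0;
    while (valid < 4) {
      v[valid] = b64_value(static_cast<unsigned char>(src[i + valid]));
      if (v[valid] < 0) break;
      ++valid;
    }
    const bool pad2 = (valid == 2 && src[i + 2] == '=' && src[i + 3] == '=');
    const bool pad1 = (valid == 3 && src[i + 3] == '=');
    if (valid < 4 && !pad1 && !pad2) break;  // invalid input: stop here

    dst.push_back(static_cast<uint8_t>((v[0] << 2) | (v[1] >> 4)));
    ++written;
    if (pad2) break;                         // "xx==" yields one byte
    dst.push_back(static_cast<uint8_t>(((v[1] & 0xF) << 4) | (v[2] >> 2)));
    ++written;
    if (pad1) break;                         // "xxx=" yields two bytes
    dst.push_back(static_cast<uint8_t>(((v[2] & 0x3) << 6) | v[3]));
    ++written;
  }
  return written;
}
```

Because the return value is the count of bytes actually produced, a caller can resume after an invalid character (for example a MIME line separator) without requiring the count to be a multiple of 3.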

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Fixing Windows build warnings

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4368/files
  - new: https://git.openjdk.java.net/jdk/pull/4368/files/e1b4af9e..58461b80

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=05-06

  Stats: 24 lines in 1 file changed: 8 ins; 0 del; 16 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4368.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368

PR: https://git.openjdk.java.net/jdk/pull/4368

From coleenp at openjdk.java.net  Wed Jun 23 01:04:41 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 01:04:41 GMT
Subject: RFR: 8269188: [BACKOUT] Remove
 CodeCache::mark_for_evol_deoptimization() method
Message-ID: 

This reverts commit 33c23a1cf2aa81551eee4a2acf271edf573558aa.

Building locally.

See bug for details.

-------------

Commit messages:
 - Revert "8264941: Remove CodeCache::mark_for_evol_deoptimization() method"

Changes: https://git.openjdk.java.net/jdk/pull/4563/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4563&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269188
  Stats: 78 lines in 7 files changed: 73 ins; 0 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4563.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4563/head:pull/4563

PR: https://git.openjdk.java.net/jdk/pull/4563

From jwilhelm at openjdk.java.net  Wed Jun 23 01:09:32 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Wed, 23 Jun 2021 01:09:32 GMT
Subject: RFR: Merge jdk17 [v2]
In-Reply-To: 
References: 
Message-ID: 

> Forwardport JDK 17 -> JDK 18

Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 59 commits:

 - Merge
 - 8268290: Improve LockFreeQueue<> utility
   
   Reviewed-by: iwalulya, tschatzl
 - 8264941: Remove CodeCache::mark_for_evol_deoptimization() method
   
   Reviewed-by: kvn, vlivanov, sspitsyn
 - 8269031: linux x86_64 check for binutils 2.25 or higher after 8265783
   
   Reviewed-by: ihse, erikj
 - 8267657: Add missing PrintC1Statistics before incrementing counters
   
   Reviewed-by: iveresov
 - 8268857: Merge VM_PrintJNI and VM_PrintThreads and remove the unused field 'is_deadlock' of DeadlockCycle
   
   Reviewed-by: dholmes
 - 8269077: TestSystemGC uses "require vm.gc.G1" for large pages subtest
   
   Reviewed-by: tschatzl, kbarrett
 - Merge
 - 8268458: Add verification type for evacuation failures
   
   Reviewed-by: kbarrett, iwalulya
 - 8268952: Automatically update heap sizes in G1MonitoringScope
   
   Reviewed-by: kbarrett, iwalulya
 - ... and 49 more: https://git.openjdk.java.net/jdk/compare/35e4c272...7bf4b35f

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4562/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4562&range=01
  Stats: 16267 lines in 261 files changed: 13210 ins; 2433 del; 624 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4562.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4562/head:pull/4562

PR: https://git.openjdk.java.net/jdk/pull/4562

From jwilhelm at openjdk.java.net  Wed Jun 23 01:09:33 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Wed, 23 Jun 2021 01:09:33 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: 
References: 
Message-ID: <_z9pXnmkit5l3vbQ1LT3mAP67fTQ2wh8P8g7ScaZVGw=.91f3374b-1abd-416d-a80b-d0a540f6889e@github.com>

On Wed, 23 Jun 2021 00:21:57 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: b6cfca8a
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/b6cfca8a89810c7ed63ebc34ed9855b66ebcb5d9
Stats:     1931 lines in 60 files changed: 653 ins; 1061 del; 217 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4562

From coleenp at openjdk.java.net  Wed Jun 23 01:44:29 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 01:44:29 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable
In-Reply-To: 
References: 
Message-ID: 

On Mon, 21 Jun 2021 04:31:42 GMT, Ioi Lam  wrote:

> In HotSpot we have (at least) two hashtable designs in the C++ code:
> 
> - share/utilities/hashtable.hpp
> - share/utilities/resourceHash.hpp
> 
> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
> 
> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
> 
> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
> 
> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
> 
> *before*
> ResourceHashtable: 2.70 sec
> 
> *after*
> ResourceHashtable: 2.72 sec
> ResizableResourceHashtable: 5.29 sec
> 
> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`
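
The point about keeping `size()` non-virtual can be illustrated with a small sketch. It assumes a CRTP-style base, which is one possible way to get a statically dispatched `size()`; the PR may structure this differently, and all class names below are hypothetical. Only the hash-to-index computation is shown; a real resize would also rehash the entries.

```cpp
#include <cassert>
#include <cstddef>

// Base class: computes the bucket index through a non-virtual call into
// the subclass, so a compile-time SIZE can still be folded by the compiler.
template <typename Derived>
class HashtableBaseSketch {
 public:
  size_t index_for(unsigned hash) const {
    return hash % static_cast<const Derived*>(this)->size();
  }
};

// Size is a compile-time constant.
template <size_t SIZE>
class FixedTableSketch : public HashtableBaseSketch<FixedTableSketch<SIZE>> {
 public:
  constexpr size_t size() const { return SIZE; }
};

// Size is chosen at run time and may change later.
class ResizableTableSketch : public HashtableBaseSketch<ResizableTableSketch> {
 public:
  explicit ResizableTableSketch(size_t size) : _size(size) {
    assert(size > 0);
  }
  size_t size() const { return _size; }
  void resize(size_t new_size) {
    assert(new_size > 0);
    _size = new_size;  // entries would need rehashing in a real table
  }

 private:
  size_t _size;
};
```

With this shape, `FixedTableSketch<16>` sees `hash % 16` at compile time, while `ResizableTableSketch` pays for a run-time modulo, which is consistent with the direction of the benchmark numbers quoted above.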

This looks really good to me.

src/hotspot/share/cds/classListParser.hpp line 37:

> 35: 
> 36: class constantPoolHandle;
> 37: class Thread;

I was looking for the use of constantPoolHandle in the header and I know why the forward declaration is needed.  Shouldn't this declaration use a const reference so that the handle code doesn't create an unnecessary copy?

bool ClassListParser::is_matching_cp_entry(constantPoolHandle &pool, int cp_index, TRAPS) {

src/hotspot/share/utilities/resourceHash.hpp line 252:

> 250:     // http://stackoverflow.com/questions/8532961/template-argument-of-type-that-is-defined-by-inner-typedef-from-other-template-c
> 251:     //typename ResourceHashtableFns::hash_fn   HASH   = primitive_hash,
> 252:     //typename ResourceHashtableFns::equals_fn EQUALS = primitive_equals,

Can you remove this xlC comment?  Not sure why we care.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4536

From dholmes at openjdk.java.net  Wed Jun 23 02:20:32 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Jun 2021 02:20:32 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References:
Message-ID: 

On Wed, 23 Jun 2021 00:31:55 GMT, Scott Gibbons  wrote:

> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixing Windows build warnings

What testing has been done for this change? I do not see that the Github Actions have been run for this PR. Has this been tested on a range of x86 systems with differing AVX capabilities?

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From gli at openjdk.java.net  Wed Jun 23 02:24:37 2021
From: gli at openjdk.java.net (Guoxiong Li)
Date: Wed, 23 Jun 2021 02:24:37 GMT
Subject: Integrated: 8268368: Adopt cast notation for JavaThread conversions
In-Reply-To: 
References: 
Message-ID: 

On Tue, 22 Jun 2021 01:11:30 GMT, Guoxiong Li  wrote:

> Hi all,
> 
> Considering the consistency of `JavaThread` and other threads, such as WorkerThread and CompilerThread, `JavaThread` could use a method named `cast` to replace the method `Thread::as_Java_thread()`. It can reduce the Thread's knowledge about the subtypes.
> 
> This patch removes two methods, `JavaThread* Thread::as_Java_thread()` and `const JavaThread* Thread::as_Java_thread() const`, of the class `Thread` and adds two static methods, `JavaThread* cast(Thread* t)` and `const JavaThread* cast(const Thread* t)`, to the class `JavaThread`. Correspondingly, the code of the method `JavaThread::current()` needs to be adjusted, and the many places where the method `Thread::as_Java_thread()` is used need to use `JavaThread::cast` instead.
> 
> Test:
> tier1 passed locally.
> 
> Thanks for taking the time to review.
> 
> Best Regards,
> -- Guoxiong
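
The cast notation described above can be sketched roughly as follows. This is a simplified stand-in, not the HotSpot classes: `ThreadSketch` and `JavaThreadSketch` are hypothetical names, and the check inside `cast` is an assumption about how such a helper would typically guard the downcast.

```cpp
#include <cassert>

class ThreadSketch {
 public:
  virtual ~ThreadSketch() = default;
  virtual bool is_Java_thread() const { return false; }
};

class JavaThreadSketch : public ThreadSketch {
 public:
  bool is_Java_thread() const override { return true; }

  // The subclass owns the conversion, so the base class does not need
  // to know about this subtype.
  static JavaThreadSketch* cast(ThreadSketch* t) {
    assert(t != nullptr && t->is_Java_thread());
    return static_cast<JavaThreadSketch*>(t);
  }

  static const JavaThreadSketch* cast(const ThreadSketch* t) {
    assert(t != nullptr && t->is_Java_thread());
    return static_cast<const JavaThreadSketch*>(t);
  }
};
```

A call site that previously read `t->as_Java_thread()` would then read `JavaThreadSketch::cast(t)` in this sketch.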

This pull request has now been integrated.

Changeset: cd678a38
Author:    Guoxiong Li 
Committer: David Holmes 
URL:       https://git.openjdk.java.net/jdk/commit/cd678a383f7b23ea40132b207ddfc041394ba4c1
Stats:     158 lines in 64 files changed: 13 ins; 19 del; 126 mod

8268368: Adopt cast notation for JavaThread conversions

Reviewed-by: dholmes, stefank

-------------

PR: https://git.openjdk.java.net/jdk/pull/4546

From shade at openjdk.java.net  Wed Jun 23 06:30:28 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 06:30:28 GMT
Subject: Integrated: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
In-Reply-To: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
References: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
Message-ID: <9TBWX8-JqcYj6c2TgiQRSHF2RAuA4axzoHuovIPlacI=.5276d853-670e-4d37-8cdc-5295ac8ed9c1@github.com>

On Tue, 22 Jun 2021 15:02:58 GMT, Aleksey Shipilev  wrote:

> See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.
> 
> Additional testing:
>  - [x] Linux x86_64 builds
>  - [x] Loom repository builds Linux x86_64 with this patch cherry-picked

This pull request has now been integrated.

Changeset: 17daf32a
Author:    Aleksey Shipilev 
URL:       https://git.openjdk.java.net/jdk/commit/17daf32a073bc4f12602b4872ce708e09c453ced
Stats:     2 lines in 2 files changed: 1 ins; 0 del; 1 mod

8269138: Move typeArrayOop.inline.hpp include to vectorSupport.cpp

Reviewed-by: stefank, vlivanov

-------------

PR: https://git.openjdk.java.net/jdk/pull/4559

From shade at openjdk.java.net  Wed Jun 23 06:30:27 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 06:30:27 GMT
Subject: RFR: 8269138: Move typeArrayOop.inline.hpp include to
 vectorSupport.cpp
In-Reply-To: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
References: <4dtXj293cS81fbEC-HmGdzcrVfI4dEf3u0HTY2tjhxM=.737f4023-a810-4953-9b50-80993ae2e480@github.com>
Message-ID: 

On Tue, 22 Jun 2021 15:02:58 GMT, Aleksey Shipilev  wrote:

> See the bug report for inclusion circularity that breaks current Loom workspace. Including stuff properly, `.hpp` in `.hpp`, and `.inline.hpp` in `.cpp` resolves this.
> 
> Additional testing:
>  - [x] Linux x86_64 builds
>  - [x] Loom repository builds Linux x86_64 with this patch cherry-picked

There are GHA failures on Linux additional builds. I believe these are due to JDK-8269148. Since that fix would go through jdk17 -> jdk sync much later, I would not wait for it here. I'll integrate this simple fix now.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4559

From dholmes at openjdk.java.net  Wed Jun 23 07:10:25 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Jun 2021 07:10:25 GMT
Subject: RFR: 8269188: [BACKOUT] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 00:57:33 GMT, Coleen Phillimore  wrote:

> This reverts commit 33c23a1cf2aa81551eee4a2acf271edf573558aa.
> 
> Built locally linux-x64-debug and ran vmTestbase/nsk/jvmti tests.
> 
> See bug for details.

Looks like an accurate backout.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4563

From aph at redhat.com  Wed Jun 23 09:22:30 2021
From: aph at redhat.com (Andrew Haley)
Date: Wed, 23 Jun 2021 10:22:30 +0100
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v2]
In-Reply-To: 
References:
Message-ID: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>

On 2/15/21 2:06 PM, Andrew Haley wrote:
> On 15/02/2021 12:00, Martin Doerr wrote:
>> I'd prefer using load-consume with comment in assembly code and acquire in C++ code. That would be consistent with other code. But that's just my opinion. I'll leave the aarch64 maintainers free to decide.
> 
> That sounds right to me too. One day we'll get memory_order_consume
> for HotSpot C++ code, but until then acquire will have to do.

Sorry, I missed this. Is it just acquire and release you need?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Jun 23 09:35:44 2021
From: aph at redhat.com (Andrew Haley)
Date: Wed, 23 Jun 2021 10:35:44 +0100
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v2]
In-Reply-To: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>
References: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>
Message-ID: 

On 6/23/21 10:22 AM, Andrew Haley wrote:
> On 2/15/21 2:06 PM, Andrew Haley wrote:
>> On 15/02/2021 12:00, Martin Doerr wrote:
>>> I'd prefer using load-consume with comment in assembly code and acquire in C++ code. That would be consistent with other code. But that's just my opinion. I'll leave the aarch64 maintainers free to decide.
>>
>> That sounds right to me too. One day we'll get memory_order_consume
>> for HotSpot C++ code, but until then acquire will have to do.
> 
> Sorry, I missed this. Is it just acquire and release you need?

Just to be clear: in a CAS you can have all 4 combinations: none,
acq, rel, acq_rel. I don't want to populate all possibilities if they
won't be used.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From shade at redhat.com  Wed Jun 23 09:39:37 2021
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 11:39:37 +0200
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v2]
In-Reply-To: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>
References: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>
Message-ID: <9b37092c-1676-751e-f6d1-e73ff624a263@redhat.com>

On 6/23/21 11:22 AM, Andrew Haley wrote:
> On 2/15/21 2:06 PM, Andrew Haley wrote:
>> On 15/02/2021 12:00, Martin Doerr wrote:
>>> I'd prefer using load-consume with comment in assembly code and acquire in C++ code. That would be consistent with other code. But that's just my opinion. I'll leave the aarch64 maintainers free to decide.
>>
>> That sounds right to me too. One day we'll get memory_order_consume
>> for HotSpot C++ code, but until then acquire will have to do.
> 
> Sorry, I missed this. Is it just acquire and release you need?

No problem. Yes, current patches need {acquire, release, acquire_release}.

-- 
Thanks,
-Aleksey


From shade at redhat.com  Wed Jun 23 09:43:51 2021
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 11:43:51 +0200
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v2]
In-Reply-To: 
References: <5b9365a6-b915-1ac7-299f-0142c0150589@redhat.com>
Message-ID: 

On 6/23/21 11:35 AM, Andrew Haley wrote:
> On 6/23/21 10:22 AM, Andrew Haley wrote:
>> On 2/15/21 2:06 PM, Andrew Haley wrote:
>>> On 15/02/2021 12:00, Martin Doerr wrote:
>>>> I'd prefer using load-consume with comment in assembly code and acquire in C++ code. That would be consistent with other code. But that's just my opinion. I'll leave the aarch64 maintainers free to decide.
>>>
>>> That sounds right to me too. One day we'll get memory_order_consume
>>> for HotSpot C++ code, but until then acquire will have to do.
>>
>> Sorry, I missed this. Is it just acquire and release you need?
> 
> Just to be clear: in a CAS you can have all 4 combinations: none,
> acq, rel, acq_rel. I don't want to populate all possibilities if they
> won't be used.

The only CAS in those patches uses memory_order_acq_rel.
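
As an illustration of what an acq_rel CAS looks like, here is a minimal sketch using standard C++ `std::atomic` rather than HotSpot's own Atomic API. The forwarding-slot scenario and the names `ObjSketch` and `try_forward` are assumptions made for this example, not code from the patches.

```cpp
#include <atomic>
#include <cassert>

struct ObjSketch {
  int payload;
};

// Tries to install `copy` into an empty forwarding slot and returns
// whichever pointer ends up in the slot (ours or a competitor's).
inline ObjSketch* try_forward(std::atomic<ObjSketch*>& fwd_slot,
                              ObjSketch* copy) {
  ObjSketch* expected = nullptr;
  if (fwd_slot.compare_exchange_strong(expected, copy,
                                       std::memory_order_acq_rel,    // on success
                                       std::memory_order_acquire)) { // on failure
    return copy;
  }
  // On failure, `expected` now holds the value another thread installed.
  return expected;
}
```

The release half publishes the contents of the installed copy to other threads, and the acquire half lets the losing thread safely read the winner's copy.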

-- 
Thanks,
-Aleksey


From sspitsyn at openjdk.java.net  Wed Jun 23 09:45:29 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Wed, 23 Jun 2021 09:45:29 GMT
Subject: RFR: 8269188: [BACKOUT] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 00:57:33 GMT, Coleen Phillimore  wrote:

> This reverts commit 33c23a1cf2aa81551eee4a2acf271edf573558aa.
> 
> Built locally linux-x64-debug and ran vmTestbase/nsk/jvmti tests.
> 
> See bug for details.

Hi Coleen,
This looks like a correct backout to me too.
Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4563

From dholmes at openjdk.java.net  Wed Jun 23 12:59:49 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 23 Jun 2021 12:59:49 GMT
Subject: RFR: 8268855: Cleanup name handling in the Thread class and subclasses
Message-ID: 

Please review this small cleanup item.

We can simplify and clean up name() management:

- make name() return "const char *" and only cast away constness at API boundaries when essential
- add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
- Rename JavaThread::get_thread_name() to name() (no need for the extra indirection)

There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.

  0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]

is now:

  0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]

but this shouldn't affect anything and makes things more consistent.

Notes: 

1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
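
For illustration only, the name()/type_name() split described above can be sketched as below. This is not the patch itself: the classes are simplified stand-ins for the HotSpot ones, describe() is a hypothetical helper, and the default strings are placeholders. In keeping with notes 1 and 2, the sketch repeats "virtual" and does not use "override".

```cpp
#include <string>

// Simplified stand-in for the HotSpot Thread class (assumption: only the
// two naming methods matter for this sketch).
class Thread {
 public:
  virtual ~Thread() = default;
  // The instance name; returned as const so callers cannot modify it.
  virtual const char* name() const { return "Unknown thread"; }
  // The kind of thread; replaces chains of is_XXX_Thread() checks.
  virtual const char* type_name() const { return "Thread"; }
};

class WatcherThread : public Thread {
 public:
  virtual const char* name() const { return "VM Periodic Task Thread"; }
  virtual const char* type_name() const { return "WatcherThread"; }
};

// Hypothetical helper: formats a thread as <type> "<name>", the shape of the
// new hs_err line shown above (without the address, stack and id fields).
inline std::string describe(const Thread& t) {
  return std::string(t.type_name()) + " \"" + t.name() + "\"";
}
```

With these definitions, describe(WatcherThread()) yields the text WatcherThread "VM Periodic Task Thread", and no caller needs to test the thread's type before printing.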

Testing (in progress):
 - All builds in tiers 1-5
 - GHA
 - tiers 1-3 as a sanity test

Thanks,
David

-------------

Commit messages:
 - Missed ShenandoahControlThread
 - 8268855: Cleanup name handling in the Thread class and subclasses

Changes: https://git.openjdk.java.net/jdk/pull/4569/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4569&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8268855
  Stats: 85 lines in 16 files changed: 34 ins; 14 del; 37 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4569.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4569/head:pull/4569

PR: https://git.openjdk.java.net/jdk/pull/4569

From mbaesken at openjdk.java.net  Wed Jun 23 13:37:59 2021
From: mbaesken at openjdk.java.net (Matthias Baesken)
Date: Wed, 23 Jun 2021 13:37:59 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids
 controller of cgroups [v2]
In-Reply-To: 
References: 
Message-ID: 

> Hello, please review this PR; it extends the OSContainer API in order to also support the pids controller of cgroups.
> 
> I noticed that unlike the other controllers "cpu", "cpuset", "cpuacct", "memory", on some older Linux distros (SLES 12.1, RHEL 7.1) the pids controller might not be there (or not fully supported), so it was added as optional; see the code:
> 
> 
>   if (!cg_infos[PIDS_IDX]._data_complete) {
>     log_debug(os, container)("Optional cgroup v1 pids subsystem not found");
>     // keep the other controller info, pids is optional
>   }

Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:

  Adjustments following Severins comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4518/files
  - new: https://git.openjdk.java.net/jdk/pull/4518/files/0e6ecb8e..afd7bf61

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4518&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4518&range=00-01

  Stats: 125 lines in 11 files changed: 56 ins; 48 del; 21 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4518.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4518/head:pull/4518

PR: https://git.openjdk.java.net/jdk/pull/4518

From mbaesken at openjdk.java.net  Wed Jun 23 13:41:28 2021
From: mbaesken at openjdk.java.net (Matthias Baesken)
Date: Wed, 23 Jun 2021 13:41:28 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids
 controller of cgroups [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Wed, 23 Jun 2021 13:37:59 GMT, Matthias Baesken  wrote:

>> Hello, please review this PR; it extends the OSContainer API in order to also support the pids controller of cgroups.
>> 
>> I noticed that unlike the other controllers "cpu", "cpuset", "cpuacct", "memory", on some older Linux distros (SLES 12.1, RHEL 7.1) the pids controller might not be there (or not fully supported), so it was added as optional; see the code:
>> 
>> 
>>   if (!cg_infos[PIDS_IDX]._data_complete) {
>>     log_debug(os, container)("Optional cgroup v1 pids subsystem not found");
>>     // keep the other controller info, pids is optional
>>   }
>
> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Adjustments following Severins comments

Hi Severin, thanks for all the comments. I prepared a second version with those changes:
 - added a couple of log_is_enabled checks as you suggested
 - moved limit_from_str to CgroupSubsystem
 - added helpers pids_max_val() and switched to GET_CONTAINER_INFO_CPTR
 - pids_max() now returns -1 for unlimited/max, and the -3 is gone
 - moved the limitFromString Java code to src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystem.java
 - added a better comment to test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java about pids hierarchy values
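
As a rough sketch of the "-1 for unlimited/max" convention mentioned above: the function below is hypothetical (it is not limit_from_str or pids_max() from the patch), and the -2 error value is chosen for this sketch only. It assumes the input is the text of a pids.max file, which holds either the word "max" or a decimal number.

```cpp
#include <cstdlib>

// Hypothetical parser for the text of a cgroup pids.max file.
// Returns -1 for "max" (unlimited), -2 if the text cannot be parsed
// (error value chosen for this sketch), otherwise the numeric limit.
inline long parse_pids_max(const char* text) {
  if (text == nullptr) {
    return -2;
  }
  // Accept exactly "max", optionally followed by a newline.
  if (text[0] == 'm' && text[1] == 'a' && text[2] == 'x' &&
      (text[3] == '\0' || text[3] == '\n')) {
    return -1;
  }
  char* end = nullptr;
  long value = std::strtol(text, &end, 10);
  if (end == text || value < 0) {
    return -2;  // no digits consumed, or a negative value
  }
  return value;
}
```

Mapping "max" to a single sentinel keeps callers from having to distinguish several negative codes for what is, to them, the same "no limit" case.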

Regarding your questions about tests: I ran the existing docker/cgroup related tests, and also checked
the hs_err output (on SLES/Ubuntu) for the newly added "maximum number of tasks" (this is present because of systemd cgroup usage).
But I think that the testing needs to be enhanced (e.g. with some added docker tests?). Do you have some good suggestions
where I could look at existing (docker?) tests and adjust those for the new pids.max?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4518

From lfoltan at openjdk.java.net  Wed Jun 23 14:15:27 2021
From: lfoltan at openjdk.java.net (Lois Foltan)
Date: Wed, 23 Jun 2021 14:15:27 GMT
Subject: RFR: 8268855: Cleanup name handling in the Thread class and
 subclasses
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 06:21:43 GMT, David Holmes  wrote:

> Please review this small cleanup item.
> 
> We can simplify and clean up name() management:
> 
> - make name() return "const char *" and only cast away constness at API boundaries when essential
> - add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
> - Rename JavaThread::get_thread_name() to name() (no need for the extra indirection)
> 
> There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.
> 
>   0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> is now:
> 
>   0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> but this shouldn't affect anything and makes things more consistent.
> 
> Notes: 
> 
> 1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
> 2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
> 3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
> 
> Testing (in progress):
>  - All builds in tiers 1-5
>  - GHA
>  - tiers 1-3 as a sanity test
> 
> Thanks,
> David

Looks good!
Lois

-------------

Marked as reviewed by lfoltan (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4569

From coleenp at openjdk.java.net  Wed Jun 23 14:29:33 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 14:29:33 GMT
Subject: RFR: 8269188: [BACKOUT] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 00:57:33 GMT, Coleen Phillimore  wrote:

> This reverts commit 33c23a1cf2aa81551eee4a2acf271edf573558aa.
> 
> Built locally linux-x64-debug and ran vmTestbase/nsk/jvmti tests.
> 
> See bug for details.

Thanks David and Serguei.  git had no problem with the revert.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4563

From coleenp at openjdk.java.net  Wed Jun 23 14:29:33 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 14:29:33 GMT
Subject: Integrated: 8269188: [BACKOUT] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 00:57:33 GMT, Coleen Phillimore  wrote:

> This reverts commit 33c23a1cf2aa81551eee4a2acf271edf573558aa.
> 
> Built locally linux-x64-debug and ran vmTestbase/nsk/jvmti tests.
> 
> See bug for details.

This pull request has now been integrated.

Changeset: 52d5d1b3
Author:    Coleen Phillimore 
URL:       https://git.openjdk.java.net/jdk/commit/52d5d1b3617731bf312aa5813bf7e78ca4dacb00
Stats:     78 lines in 7 files changed: 73 ins; 0 del; 5 mod

8269188: [BACKOUT] Remove CodeCache::mark_for_evol_deoptimization() method

Reviewed-by: dholmes, sspitsyn

-------------

PR: https://git.openjdk.java.net/jdk/pull/4563

From sgehwolf at openjdk.java.net  Wed Jun 23 14:51:31 2021
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Wed, 23 Jun 2021 14:51:31 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids
 controller of cgroups [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Wed, 23 Jun 2021 13:38:58 GMT, Matthias Baesken  wrote:

> But I think that the testing needs to be enhanced (e.g. with some added docker tests?). Do you have some good suggestions
> where I could look at existing (docker?) tests and adjust those for the new pids.max ?

Have a look at `test/hotspot/jtreg/containers/docker/TestMisc.java` which already does some assertions on `print_container_info()` output. Either extend that test with some actual pid limits (`--pids-limit=` option) in place or write a similar one. That would cover the hotspot side.

Then consider adding the pids limit to the `-Xshowsettings:system` output (see `LauncherHelper.printSystemMetrics()`) using the Java API and add a docker test using that in `test/jdk/jdk/internal/platform/docker/`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4518

From shade at openjdk.java.net  Wed Jun 23 15:32:59 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 15:32:59 GMT
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v4]
In-Reply-To: 
References: 
Message-ID: <1LQKG5euzn56GFgSdFtQYO4B9Q2zDCNQpHz3Rl4QEoA=.78d8f5a1-e2ca-4e28-ad5b-ccc89b837be8@github.com>

> Shenandoah carries forwardee information in object's mark word. Installing the new mark word is effectively "releasing" the object copy, and reading from the new mark word is "acquiring" that object copy.
> 
> For the forwardee update side, Hotspot's default for atomic operations is memory_order_conservative, which emits two-way memory fences around the CASes at least on AArch64 and PPC64. This seems to be excessive for Shenandoah forwardee updates, and "release" is enough.
> 
> For the forwardee load side, we need to guarantee "acquire". We do not do it now, reading the markword without memory semantics. It does not seem to pose a practical problem today, because GC does not access the object contents in the new copy, and mutators get this from the JRT-called stub that separates the fwdptr access and object contents access by a lot. It still should be cleaner to "acquire" the mark on load to avoid surprises.
> 
> Additional testing:
>  - [x] Linux x86_64 `hotspot_gc_shenandoah`
>  - [x] Linux AArch64 `hotspot_gc_shenandoah`
>  - [x] Linux AArch64 `tier1` with Shenandoah

Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:

  8261492: Shenandoah: reconsider forwardee accesses memory ordering

-------------

Changes: https://git.openjdk.java.net/jdk/pull/2496/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2496&range=03
  Stats: 46 lines in 5 files changed: 39 ins; 0 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2496.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2496/head:pull/2496

PR: https://git.openjdk.java.net/jdk/pull/2496

From shade at openjdk.java.net  Wed Jun 23 16:37:03 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 23 Jun 2021 16:37:03 GMT
Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory
 ordering [v5]
In-Reply-To: 
References: 
Message-ID: <6FBSSKklcr_WaFfYouE8Pk3VajH5iw_x9cEFaIU9gnk=.063a6261-637f-419a-a1a8-ebdb0996a76c@github.com>

> Shenandoah carries forwardee information in object's mark word. Installing the new mark word is effectively "releasing" the object copy, and reading from the new mark word is "acquiring" that object copy.
> 
> For the forwardee update side, Hotspot's default for atomic operations is memory_order_conservative, which emits two-way memory fences around the CASes at least on AArch64 and PPC64. This seems to be excessive for Shenandoah forwardee updates, and "release" is enough.
> 
> For the forwardee load side, we need to guarantee "acquire". We do not do it now, reading the markword without memory semantics. It does not seem to pose a practical problem today, because GC does not access the object contents in the new copy, and mutators get this from the JRT-called stub that separates the fwdptr access and object contents access by a lot. It still should be cleaner to "acquire" the mark on load to avoid surprises.
> 
> Additional testing:
>  - [x] Linux x86_64 `hotspot_gc_shenandoah`
>  - [x] Linux AArch64 `hotspot_gc_shenandoah`
>  - [x] Linux AArch64 `tier1` with Shenandoah

Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision:

  8261492: Shenandoah: reconsider forwardee accesses memory ordering

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2496/files
  - new: https://git.openjdk.java.net/jdk/pull/2496/files/337b31c3..36e2da27

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2496&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2496&range=03-04

  Stats: 3686 lines in 163 files changed: 1466 ins; 1839 del; 381 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2496.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2496/head:pull/2496

PR: https://git.openjdk.java.net/jdk/pull/2496

From xliu at openjdk.java.net  Wed Jun 23 16:44:26 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Wed, 23 Jun 2021 16:44:26 GMT
Subject: RFR: 8268855: Cleanup name handling in the Thread class and
 subclasses
In-Reply-To: 
References: 
Message-ID: <0kvvnE-iqCKtFJ2BrscFpXUgF-dnJ8QHuVWjYKuYX_Y=.85ead74c-80c2-4446-9940-64f7ace7eebe@github.com>

On Wed, 23 Jun 2021 06:21:43 GMT, David Holmes  wrote:

> Please review this small cleanup item.
> 
> We can simplify and clean up name() management:
> 
> - make name() return "const char *" and only cast away constness at API boundaries when essential
> - add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
> - Rename JavaThread::get_thread_name() to name() (no need for the extra indirection)
> 
> There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.
> 
>   0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> is now:
> 
>   0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> but this shouldn't affect anything and makes things more consistent.
> 
> Notes: 
> 
> 1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
> 2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
> 3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
> 
> Testing (in progress):
>  - All builds in tiers 1-5
>  - GHA
>  - tiers 1-3 as a sanity test
> 
> Thanks,
> David

LGTM.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4569

From rrich at openjdk.java.net  Wed Jun 23 17:03:38 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Wed, 23 Jun 2021 17:03:38 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v5]
In-Reply-To: 
References: 
 
Message-ID: <18E520rbaxnUxCGlWCqkc_I5cHwA7dvlCVGIFNEY0Ds=.95cb0673-790a-43a2-836d-1417223df845@github.com>

On Mon, 21 Jun 2021 14:49:20 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Un-ProblemList serviceability tests (8268574 and 8268644)

Hi Patricio,

ppc64le test results are available now. There's no failure related to this change.

Thanks for your patience,
Richard.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From rrich at openjdk.java.net  Wed Jun 23 17:11:32 2021
From: rrich at openjdk.java.net (Richard Reingruber)
Date: Wed, 23 Jun 2021 17:11:32 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v5]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 21 Jun 2021 14:49:20 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Un-ProblemList serviceability tests (8268574 and 8268644)

Hi Patricio,

as stated before I've reviewed the part of this change that is related to JDK-8227745 and found it to be good.

Good thing to get rid of so much complex code!

Thanks, Richard.

-------------

Marked as reviewed by rrich (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4522

From scott.gibbons at intel.com  Wed Jun 23 17:31:06 2021
From: scott.gibbons at intel.com (Gibbons, Scott)
Date: Wed, 23 Jun 2021 17:31:06 +0000
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
 
Message-ID: 

Hi, David.  I don't have permissions to run tests in this repo.  I have tested on several x86 platforms (ICX, SKL) with several options.  I'll be running more tests today.

Thanks,
--Scott

-----Original Message-----
From: hotspot-dev  On Behalf Of David Holmes
Sent: Tuesday, June 22, 2021 7:21 PM
To: build-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v7]

On Wed, 23 Jun 2021 00:31:55 GMT, Scott Gibbons  wrote:

>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
>> 
>> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
>> 
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>> 
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
>> 
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>> 
>> 
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixing Windows build warnings

What testing has been done for this change? I do not see that the Github Actions have been run for this PR. Has this been tested on a range of x86 systems with differing AVX capabilities?

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From coleenp at openjdk.java.net  Wed Jun 23 17:34:55 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 17:34:55 GMT
Subject: RFR: 8269186: [REDO] Remove CodeCache::mark_for_evol_deoptimization()
 method
Message-ID: 

This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
Ran vmTestbase/nsk/jvmti tests.

-------------

Commit messages:
 - 8269186: [REDO] Remove CodeCache::mark_for_evol_deoptimization() method

Changes: https://git.openjdk.java.net/jdk/pull/4575/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4575&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269186
  Stats: 24 lines in 4 files changed: 0 ins; 23 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4575.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4575/head:pull/4575

PR: https://git.openjdk.java.net/jdk/pull/4575

From pchilanomate at openjdk.java.net  Wed Jun 23 17:41:31 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Wed, 23 Jun 2021 17:41:31 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v5]
In-Reply-To: 
References: 
 
 
Message-ID: <7JM-MMFwRNovuAzdpYoqrzXKUujvGbqaQDb4m-I12wk=.f1b26972-5f1f-4fa6-b29c-1cd8723d3b48@github.com>

On Wed, 23 Jun 2021 17:08:12 GMT, Richard Reingruber  wrote:

> Hi Patricio,
> 
> as stated before I've reviewed the part of this change that is related to JDK-8227745 and found it to be good.
> 
> Good thing to get rid of so much complex code!
> 
Great, thanks for reviewing and all the testing Richard!

Patricio

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Wed Jun 23 17:57:34 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Wed, 23 Jun 2021 17:57:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v6]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio
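
To make the remark about the third least significant bit concrete, here is a small sketch of the low bits of a mark word after this change. It is an illustration under assumptions, not a copy of markWord.hpp: it assumes two lock bits at the bottom, one unused bit where the biased-lock bit used to be, and a four-bit age field above that; all names are local to the sketch.

```cpp
#include <cstdint>

// Assumed layout of the low mark word bits (sketch only):
//   bits 0-1: lock state
//   bit  2  : unused (formerly the biased-lock bit)
//   bits 3-6: object age
constexpr int lock_bits       = 2;
constexpr int unused_gap_bits = 1;
constexpr int age_bits        = 4;

constexpr int unused_gap_shift = lock_bits;                    // 2
constexpr int age_shift        = lock_bits + unused_gap_bits;  // 3

constexpr std::uint64_t lock_mask       = (std::uint64_t{1} << lock_bits) - 1;
constexpr std::uint64_t unused_gap_mask = std::uint64_t{1} << unused_gap_shift;
constexpr std::uint64_t age_mask =
    ((std::uint64_t{1} << age_bits) - 1) << age_shift;

// Extract the lock state (low two bits).
inline unsigned lock_of(std::uint64_t mark) {
  return static_cast<unsigned>(mark & lock_mask);
}

// Extract the age field.
inline unsigned age_of(std::uint64_t mark) {
  return static_cast<unsigned>((mark & age_mask) >> age_shift);
}

// Return a copy of 'mark' with the age field replaced; other bits,
// including the unused gap bit, are left untouched.
inline std::uint64_t with_age(std::uint64_t mark, unsigned age) {
  return (mark & ~age_mask) | ((std::uint64_t{age} << age_shift) & age_mask);
}
```

Because the age field keeps its shift, leaving the gap bit unused means the age accessors do not have to change; handing the bit back to age would widen the field and move its mask.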

Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:

 - remove extra whitespace
 - Merge master
 - Un-ProblemList serviceability tests (8268574 and 8268644)
 - restore run in EATests.java
 - Dan's comments
 - remove test Test8062950.java + fix commments
 - fix comment in vm_version_ppc.cpp
 - Update java manpage
 - 8256425: Obsolete Biased Locking in JDK 18

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4522/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=05
  Stats: 5328 lines in 165 files changed: 66 ins; 5034 del; 228 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From hseigel at openjdk.java.net  Wed Jun 23 18:10:27 2021
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Wed, 23 Jun 2021 18:10:27 GMT
Subject: RFR: 8269186: [REDO] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 17:27:00 GMT, Coleen Phillimore  wrote:

> This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
> Ran vmTestbase/nsk/jvmti tests.

LGTM
Thanks, Harold

-------------

Marked as reviewed by hseigel (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4575

From pchilanomate at openjdk.java.net  Wed Jun 23 18:15:26 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Wed, 23 Jun 2021 18:15:26 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v7]
In-Reply-To: 
References: 
Message-ID: 

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio
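
The mark word layout described in the quoted text can be sketched as follows. This is only an illustration: the constant and function names are hypothetical, and it assumes the two lowest bits hold the lock state and the four bits above the now-unused bit hold the age.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants only; names are hypothetical, not copied from markWord.hpp.
const int      kLockBits   = 2;  // bits 0-1: lock state
const int      kUnusedBits = 1;  // bit 2: formerly the biased-lock bit, now unused
const int      kAgeBits    = 4;  // bits 3-6: object age (assumed position)
const int      kAgeShift   = kLockBits + kUnusedBits;
const uint64_t kLockMask   = (uint64_t(1) << kLockBits) - 1;
const uint64_t kAgeMask    = (uint64_t(1) << kAgeBits) - 1;
const uint64_t kUnusedMask = uint64_t(1) << kLockBits;

// Extract the lock state from the two lowest bits.
inline unsigned lock_state(uint64_t mark) { return unsigned(mark & kLockMask); }

// Extract the age field, skipping over the unused third bit.
inline unsigned age(uint64_t mark) { return unsigned((mark >> kAgeShift) & kAgeMask); }

// Return a copy of the mark with the age field replaced; other bits are untouched.
inline uint64_t with_age(uint64_t mark, unsigned new_age) {
  assert(new_age <= kAgeMask);
  return (mark & ~(kAgeMask << kAgeShift)) | (uint64_t(new_age) << kAgeShift);
}
```

Giving the unused bit back to the age field would amount to changing kAgeBits to 5 and kAgeShift to 2 in this sketch, which is the step the description says was deliberately not taken.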

Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:

  fix cast in added whitebox method after 8268368

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4522/files
  - new: https://git.openjdk.java.net/jdk/pull/4522/files/8d10c0e2..a1164afb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4522&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4522.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4522/head:pull/4522

PR: https://git.openjdk.java.net/jdk/pull/4522

From lfoltan at openjdk.java.net  Wed Jun 23 18:39:28 2021
From: lfoltan at openjdk.java.net (Lois Foltan)
Date: Wed, 23 Jun 2021 18:39:28 GMT
Subject: RFR: 8269186: [REDO] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 17:27:00 GMT, Coleen Phillimore  wrote:

> This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
> Ran vmTestbase/nsk/jvmti tests.

LGTM.

Thanks,
Lois

-------------

Marked as reviewed by lfoltan (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4575

From sspitsyn at openjdk.java.net  Wed Jun 23 20:34:27 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Wed, 23 Jun 2021 20:34:27 GMT
Subject: RFR: 8269186: [REDO] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 17:27:00 GMT, Coleen Phillimore  wrote:

> This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
> Ran vmTestbase/nsk/jvmti tests.

Looks good.
Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4575

From coleenp at openjdk.java.net  Wed Jun 23 21:14:29 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 21:14:29 GMT
Subject: RFR: 8269186: [REDO] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: <1o-JC2SM16KpuS4N4Qj6y0heMEU3JwQZ7ENNPMbeKRk=.a2df25dc-c945-46cf-8e0a-d94589ac1f9e@github.com>

On Wed, 23 Jun 2021 17:27:00 GMT, Coleen Phillimore  wrote:

> This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
> Ran vmTestbase/nsk/jvmti tests.

Thanks Harold, Lois and Serguei for reviewing this trivial change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4575

From coleenp at openjdk.java.net  Wed Jun 23 21:14:30 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 21:14:30 GMT
Subject: Integrated: 8269186: [REDO] Remove
 CodeCache::mark_for_evol_deoptimization() method
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 17:27:00 GMT, Coleen Phillimore  wrote:

> This is a somewhat trivial change to remove CodeCache::mark_for_evol_deoptimization() and its calling method, and nothing else this time.
> Ran vmTestbase/nsk/jvmti tests.

This pull request has now been integrated.

Changeset: f3759164
Author:    Coleen Phillimore 
URL:       https://git.openjdk.java.net/jdk/commit/f3759164179b2471d34df1225085deaf6c0f8fed
Stats:     24 lines in 4 files changed: 0 ins; 23 del; 1 mod

8269186: [REDO] Remove CodeCache::mark_for_evol_deoptimization() method

Reviewed-by: hseigel, lfoltan, sspitsyn

-------------

PR: https://git.openjdk.java.net/jdk/pull/4575

From coleenp at openjdk.java.net  Wed Jun 23 22:32:29 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 23 Jun 2021 22:32:29 GMT
Subject: RFR: 8268855: Cleanup name handling in the Thread class and
 subclasses
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 06:21:43 GMT, David Holmes  wrote:

> Please review this small cleanup item.
> 
> We can simplify and clean up name() management:
> 
> - make name() return "const char *" and only cast away constness at API boundaries when essential
> - add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
> - rename JavaThread::get_thread_name() to name() (no need for the extra indirection)
> 
> There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.
> 
>   0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> is now:
> 
>   0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> but this shouldn't affect anything and makes things more consistent.
> 
> Notes: 
> 
> 1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
> 2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
> 3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
> 
> Testing (in progress):
>  - All builds in tiers 1-5
>  - GHA
>  - tiers 1-3 as a sanity test
> 
> Thanks,
> David
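
The name()/type_name() split described in the quoted text might look roughly like the sketch below. The class names are made up for illustration and are not the actual HotSpot declarations; only the two example strings come from the hs_err lines quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Thread base class.
class ThreadSketch {
 public:
  virtual ~ThreadSketch() {}
  // Both accessors return const char*, so callers never need to cast.
  virtual const char* name() const { return "Unknown thread"; }
  virtual const char* type_name() const { return "Thread"; }
};

// Hypothetical stand-in for WatcherThread.
class WatcherThreadSketch : public ThreadSketch {
 public:
  virtual const char* name() const { return "VM Periodic Task Thread"; }
  virtual const char* type_name() const { return "WatcherThread"; }
};

// Builds the 'TypeName "name"' part of a log line without any is_X_Thread() checks.
std::string describe(const ThreadSketch* t) {
  assert(t != nullptr);
  return std::string(t->type_name()) + " \"" + t->name() + "\"";
}
```

With this shape, printing code asks the thread for its own type name instead of testing each thread kind in turn.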

This looks good.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4569

From jwilhelm at openjdk.java.net  Thu Jun 24 00:44:37 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Thu, 24 Jun 2021 00:44:37 GMT
Subject: RFR: Merge jdk17
Message-ID: 

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8266854: LibraryCallKit::inline_preconditions_checkIndex modifies control flow even if the intrinsic bailed out
 - 8254571: Erroneous generic type inference in a lambda expression with a checked exception
 - 8269125: Klass enqueue element size calculation wrong when traceid value cross compress limit
 - 8268961: Parenthesized pattern with guards does not work
 - 8269066: assert(ZAddress::is_marked(addr)) failed: Should be marked
 - 8269064: Dropped messages of AsyncLogWriter cause memleak
 - 8269148: Update minor GCC version in GitHub Actions pipeline
 - 8266885: [aarch64] Crash with 'Field too big for insn' for some tests under compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/

The merge commit only contains trivial merges, so no merge-specific webrevs have been generated.

Changes: https://git.openjdk.java.net/jdk/pull/4579/files
  Stats: 408 lines in 17 files changed: 347 ins; 32 del; 29 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4579.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4579/head:pull/4579

PR: https://git.openjdk.java.net/jdk/pull/4579

From suenaga at oss.nttdata.com  Thu Jun 24 00:55:28 2021
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 24 Jun 2021 09:55:28 +0900
Subject: SpinPause() should be inlined?
In-Reply-To: <4072d3f0-d0a9-cd08-f81f-f5f9c96dd60f@oracle.com>
References: <9ab1d5fe8c58f5a5cf71c6692a605d3b@oss.nttdata.com>
 <4072d3f0-d0a9-cd08-f81f-f5f9c96dd60f@oracle.com>
Message-ID: <5d6226d0-ead9-ca26-c842-80dd1cd7044c@oss.nttdata.com>

Thanks Dan for your information!

At a glance, I think JDK-8200697, which is mentioned in the comments there, would address the refactoring concern.
I wonder why SpinYield, which was introduced in JDK-8200697, has not been applied to all uses of SpinPause(). That RFE seems to aim at allowing future work without needing to rewrite every use of SpinPause(), which would help a lot with inlining!

OTOH I can't explain performance benefits clearly now...


Thanks,

Yasumasa


On 2021/06/23 0:55, daniel.daugherty at oracle.com wrote:
> Here's the RFE where that was discussed back in 2018:
> 
>      JDK-8208458 Simplify and inline os::SpinPause() for non-Windows OS on X86
>      https://bugs.openjdk.java.net/browse/JDK-8208458
> 
> Dan
> 
> On 6/22/21 10:51 AM, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> I saw lock contention in SecureRandom. When I was analyzing it, I had a question.
>>
>> ObjectMonitor::TrySpin() calls SpinPause(). SpinPause() issues PAUSE, but it is not inlined, as the following disassembly shows:
>>
>> ```
>>  1587e90:       f0 4d 0f b1 2e          lock cmpxchg %r13,(%r14)
>>  1587e95:       48 85 c0                test   %rax,%rax
>>  1587e98:       74 16                   je     1587eb0
>>  1587e9a:       e8 31 7b d4 ff          callq  12cf9d0
>>  1587e9f:       83 eb 01                sub    $0x1,%ebx
>>  1587ea2:       72 da                   jb     1587e7e
>> ```
>>
>> I found the following comment about it in os.hpp. It says SpinPause() should be inlined.
>>
>> ```
>> // Note that "PAUSE" is almost always used with synchronization
>> // so arguably we should provide Atomic::SpinPause() instead
>> // of the global SpinPause() with C linkage.
>> // It'd also be eligible for inlining on many platforms.
>> ```
>>
>> According to the Intel Software Developer's Manual, PAUSE seems to need to be inlined, but I'm not sure whether issuing it through a function call is allowed.
>> I changed it to be inlined on Linux x64 [1] and benchmarked it with [2]. I saw some advantage with the inlined PAUSE on my Core i3-8145U, but it may be within the margin of error.
>>
>> * original
>> Benchmark                       (algo)  (bytes)   Mode  Cnt       Score        Error  Units
>> RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  510141.578 ± 138543.261  ops/s
>>
>> * with inlined PAUSE
>> Benchmark                       (algo)  (bytes)   Mode  Cnt       Score        Error  Units
>> RandomBenchmark.fillRandomInMT    DRBG       16  thrpt    3  531589.958 ±  66942.549  ops/s
>>
>>
>> Should PAUSE be inlined as `Atomic::SpinPause()`?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://github.com/YaSuenag/jdk/commit/66b31d35fac6fb0537bb9d85957157342fec564d
>> [2] https://github.com/YaSuenag/hwrand/blob/master/benchmark/src/main/java/com/yasuenag/hwrand/benchmark/RandomBenchmark.java#L59-L64
> 
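
For readers following the thread, the idea under discussion (issuing the spin-wait hint inline instead of through an out-of-line SpinPause() call) can be sketched as below. This is not the patch in [1]: the helper names are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler for the x86 and AArch64 branches, falling back to no hint elsewhere.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical inline helper, not HotSpot code: emits the CPU spin-wait hint
// directly at the call site so no call/ret surrounds it.
static inline void spin_pause_inline() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_ia32_pause();        // x86 PAUSE instruction
#elif defined(__GNUC__) && defined(__aarch64__)
  __asm__ volatile("yield");     // AArch64 spin-wait hint
#else
  // No hint available in this sketch; the loop still works, just without it.
#endif
}

// Tries to take a simple test-and-set lock, pausing between failed attempts.
// Returns true if the lock was acquired within max_spins attempts.
static inline bool try_spin_acquire(std::atomic<bool>& locked, int max_spins) {
  assert(max_spins >= 0);
  for (int i = 0; i < max_spins; i++) {
    bool expected = false;
    if (locked.compare_exchange_strong(expected, true, std::memory_order_acquire)) {
      return true;
    }
    spin_pause_inline();
  }
  return false;
}
```

Whether the inlined form is measurably faster is exactly the open question in the message above; the sketch only shows where the hint sits in the loop.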

From dholmes at openjdk.java.net  Thu Jun 24 01:41:34 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Thu, 24 Jun 2021 01:41:34 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v7]
In-Reply-To: 
References: 
 
Message-ID: <9dvP65XCwHawwZxRIJaZKZdhtENfJuBS7GPr_0hez5E=.b5e5764e-7a00-4a04-8702-8db5235e14d8@github.com>

On Wed, 23 Jun 2021 18:15:26 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix cast in added whitebox method after 8268368

Marked as reviewed by dholmes (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From kvn at openjdk.java.net  Thu Jun 24 02:21:32 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 24 Jun 2021 02:21:32 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
Message-ID: 

On Wed, 23 Jun 2021 00:31:55 GMT, Scott Gibbons  wrote:

>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
>> 
>> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
>> 
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>> 
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
>> 
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>> 
>> 
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixing Windows build warnings
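
The processing order described in the quoted text (256-byte blocks, then 64-byte chunks, then end fixup) can be illustrated with a small sketch. It assumes the sizes refer to input length, and the struct and function names are hypothetical; it is not the intrinsic's actual control flow.

```cpp
#include <cassert>
#include <cstddef>

// How many iterations each stage would handle for a given input length.
struct DecodeSplit {
  size_t blocks256;  // iterations of the 256-byte main loop
  size_t chunks64;   // iterations of the 64-byte loop
  size_t tail;       // bytes left for the end fixup
};

DecodeSplit split_input(size_t len) {
  DecodeSplit s;
  s.blocks256 = len / 256;
  const size_t rest = len % 256;
  s.chunks64 = rest / 64;
  s.tail = rest % 64;
  // Every input byte is assigned to exactly one stage.
  assert(s.blocks256 * 256 + s.chunks64 * 64 + s.tail == len);
  return s;
}
```

Under these assumptions a 1000-byte input would go through three 256-byte blocks and three 64-byte chunks, with 40 bytes left for the fixup stage, while inputs shorter than 64 bytes would be handled entirely by the fixup.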

I will run our internal testing before approving this.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From kvn at openjdk.java.net  Thu Jun 24 06:11:36 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 24 Jun 2021 06:11:36 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
Message-ID: <-y7T9-mZJhhRfkSPoi-F70b1Y9O0_ELxBUmKeqWZRNA=.bfb60493-b5bc-4cf5-954b-6ecd01d76161@github.com>

On Wed, 23 Jun 2021 00:31:55 GMT, Scott Gibbons  wrote:

> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixing Windows build warnings

I hit a strange failure in the compiler/intrinsics/base64/TestBase64.java test on a Windows machine with an Intel 8167M CPU (AVX512).

#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff92bcbd99e, pid=24628, tid=6804
#
# Problematic frame:
# V  [jvm.dll+0xabd99e]  ObjectMonitor::object_peek+0xe
#

Current thread (0x0000016c923de2c0):  JavaThread "MainThread" [_thread_in_Java, id=6804, stack(0x00000060df600000,0x00000060df700000)]

Stack: [0x00000060df600000,0x00000060df700000],  sp=0x00000060df6fcb50,  free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0xabd99e]  ObjectMonitor::object_peek+0xe  (objectMonitor.cpp:304)
V  [jvm.dll+0xc48d5b]  ObjectSynchronizer::quick_enter+0x9b  (synchronizer.cpp:331)
V  [jvm.dll+0xb9b6f6]  SharedRuntime::monitor_enter_helper+0x36  (sharedRuntime.cpp:2112)
V  [jvm.dll+0x389894]  Runtime1::monitorenter+0x94  (c1_Runtime1.cpp:748)
C  0x0000016c99c4a757

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::monitorenter_nofpu Runtime1 stub
J 40 c1 java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object; java.base at 18-internal (432 bytes) @ 0x0000016c9a1801f8 [0x0000016c9a17e6a0+0x0000000000001b58]
J 43 c1 java.util.concurrent.ConcurrentHashMap.putIfAbsent(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; java.base at 18-internal (8 bytes) @ 0x0000016c9a181c34 [0x0000016c9a181bc0+0x0000000000000074]
j  java.lang.ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;+23 java.base at 18-internal
j  jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(Ljava/lang/String;Z)Ljava/lang/Class;+2 java.base at 18-internal
j  jdk.internal.loader.BuiltinClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+3 java.base at 18-internal
j  jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+36 java.base at 18-internal
j  java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;+3 java.base at 18-internal
v  ~StubRoutines::call_stub
j  compiler.intrinsics.base64.TestBase64.test0(Lcompiler/intrinsics/base64/TestBase64$FileType;Lcompiler/intrinsics/base64/TestBase64$Base64Type;Ljava/util/Base64$Encoder;Ljava/util/Base64$Decoder;Ljava/lang/String;Ljava/lang/String;I)V+25
j  compiler.intrinsics.base64.TestBase64.main([Ljava/lang/String;)V+116
v  ~StubRoutines::call_stub
j  jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0 java.base at 18-internal
j  jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+133 java.base at 18-internal
j  jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6 java.base at 18-internal
j  java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+59 java.base at 18-internal
j  com.sun.javatest.regtest.agent.MainWrapper$MainThread.run()V+172
j  java.lang.Thread.run()V+11 java.base at 18-internal
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x00000000000000bc


Register to memory mapping:

RIP=0x00007ff92bcbd99e jvm.dll::ObjectMonitor::object_peek + 0xe
RAX=0x00000000000000ac is an unknown value
RBX=0x00000000000000ac is an unknown value
RCX=0x00000000000000ac is an unknown value
RDX=0x0 is NULL
RSP=0x00000060df6fcb50 is pointing into the stack for thread: 0x0000016c923de2c0
RBP=0x00000060df6fd110 is pointing into the stack for thread: 0x0000016c923de2c0
RSI=0x0000016c923de2c0 is a thread
RDI=0x0000016c923de2c0 is a thread
R8 =0x00000060df6fd1f0 is pointing into the stack for thread: 0x0000016c923de2c0
R9 =0x00000000000002f8 is an unknown value
R10=0x00007ff92b589800 jvm.dll::Runtime1::monitorenter + 0x0
R11=0x00000060df6fcc78 is pointing into the stack for thread: 0x0000016c923de2c0
R12=0x0 is NULL
R13=0x0000000000000200 is an unknown value
R14=0x0000000000000396 is an unknown value
R15=0x0000016c923de2c0 is a thread


Registers:
RAX=0x00000000000000ac, RBX=0x00000000000000ac, RCX=0x00000000000000ac, RDX=0x0000000000000000
RSP=0x00000060df6fcb50, RBP=0x00000060df6fd110, RSI=0x0000016c923de2c0, RDI=0x0000016c923de2c0
R8 =0x00000060df6fd1f0, R9 =0x00000000000002f8, R10=0x00007ff92b589800, R11=0x00000060df6fcc78
R12=0x0000000000000000, R13=0x0000000000000200, R14=0x0000000000000396, R15=0x0000016c923de2c0
RIP=0x00007ff92bcbd99e, EFLAGS=0x0000000000010206

Top of Stack: (sp=0x00000060df6fcb50)
0x00000060df6fcb50:   0000016c923de2c0 0000000000000000
0x00000060df6fcb60:   0000000000000000 00007ff92b8980a0
0x00000060df6fcb70:   0000016c923de2c0 00007ff92be48d5b
0x00000060df6fcb80:   00000000000000ac 000000074bd727d0
0x00000060df6fcb90:   0000000000000000 0000000000000000
0x00000060df6fcba0:   0000000000000000 00007ff92c1de2b0
0x00000060df6fcbb0:   0000016c923de2c0 00007ff92b8980a0
0x00000060df6fcbc0:   00000060df6fd1f0 00007ff92bd9b6f6
0x00000060df6fcbd0:   000000074bd727d0 0000016c923de2c0
0x00000060df6fcbe0:   00000060df6fd1f0 0000016c923de2c0
0x00000060df6fcbf0:   0000000000000000 0000000000000000
0x00000060df6fcc00:   0000000000000000 0000000000000000
0x00000060df6fcc10:   0000000000000000 0000000000000000
0x00000060df6fcc20:   000000074bd727d0 00007ff92b589894
0x00000060df6fcc30:   000000074bd727d0 00000060df6fd1f0
0x00000060df6fcc40:   0000016c923de2c0 00007ff92b8980a0 

Instructions: (pc=0x00007ff92bcbd99e)
0x00007ff92bcbd89e:   ff 48 8b c8 48 8b d8 48 8b 10 ff 52 48 48 8b 13
0x00007ff92bcbd8ae:   48 8b cb 84 c0 0f 84 83 00 00 00 ff 52 48 84 c0
0x00007ff92bcbd8be:   75 24 4c 8d 0d f1 7b 2e 00 ba 91 05 00 00 4c 8d
0x00007ff92bcbd8ce:   05 05 7c 2e 00 48 8d 0d c6 8b 2d 00 e8 71 aa a0
0x00007ff92bcbd8de:   ff e8 3c c3 01 00 8b 83 88 03 00 00 83 c0 fa a9
0x00007ff92bcbd8ee:   fd ff ff ff 74 23 4c 8d 0d c5 25 4f 00 41 b8 05
0x00007ff92bcbd8fe:   01 00 00 48 8d 15 e0 25 4f 00 b9 00 00 00 e0 e8
0x00007ff92bcbd90e:   4e a7 a0 ff e8 09 c3 01 00 48 8b 03 48 8b cb ff
0x00007ff92bcbd91e:   90 b8 00 00 00 84 c0 75 40 4c 8d 0d fa 25 4f 00
0x00007ff92bcbd92e:   ba 07 01 00 00 4c 8d 05 0e 26 4f 00 eb 1a ff 52
0x00007ff92bcbd93e:   40 84 c0 75 24 4c 8d 0d 7e 86 2d 00 ba 0b 01 00
0x00007ff92bcbd94e:   00 4c 8d 05 22 26 4f 00 48 8d 0d 8b 25 4f 00 e8
0x00007ff92bcbd95e:   ee a9 a0 ff e8 b9 c2 01 00 48 8b 44 24 30 48 8b
0x00007ff92bcbd96e:   48 10 48 85 c9 75 08 33 c0 48 83 c4 20 5b c3 48
0x00007ff92bcbd97e:   83 c4 20 5b 48 ff 25 8f f4 6f 00 cc cc cc cc cc
0x00007ff92bcbd98e:   cc cc 48 89 4c 24 08 48 83 ec 28 48 8b 44 24 30
0x00007ff92bcbd99e:   48 8b 48 10 48 85 c9 75 07 33 c0 48 83 c4 28 c3
0x00007ff92bcbd9ae:   48 83 c4 28 48 ff 25 5f f5 6f 00 cc cc cc cc cc
0x00007ff92bcbd9be:   cc cc 48 89 5c 24 18 48 89 54 24 10 48 89 4c 24
0x00007ff92bcbd9ce:   08 57 48 83 ec 20 48 8b 5c 24 30 48 8d 15 68 37
0x00007ff92bcbd9de:   4f 00 48 8b 7c 24 38 4c 8b c3 48 8b cf e8 50 8f
0x00007ff92bcbd9ee:   02 00 4c 8b 43 08 48 8d 15 6d 37 4f 00 48 8b cf
0x00007ff92bcbd9fe:   e8 3d 8f 02 00 48 8b 4b 10 48 85 c9 75 04 33 c0
0x00007ff92bcbda0e:   eb 06 ff 15 02 f5 6f 00 4c 8b c0 48 8d 15 60 37
0x00007ff92bcbda1e:   4f 00 48 8b cf e8 18 8f 02 00 48 8d 15 69 37 4f
0x00007ff92bcbda2e:   00 48 8b cf e8 09 8f 02 00 48 8d 15 6a 37 4f 00
0x00007ff92bcbda3e:   48 8b cf e8 fa 8e 02 00 48 8d 15 6b 37 4f 00 48
0x00007ff92bcbda4e:   8b cf e8 eb 8e 02 00 41 b8 2f 00 00 00 48 8d 15
0x00007ff92bcbda5e:   5e 37 4f 00 48 8b cf e8 d6 8e 02 00 48 8d 15 5f
0x00007ff92bcbda6e:   37 4f 00 48 8b cf e8 c7 8e 02 00 4c 8b 43 48 48
0x00007ff92bcbda7e:   8d 15 54 37 4f 00 48 8b cf e8 b4 8e 02 00 4c 8b
0x00007ff92bcbda8e:   43 50 48 8d 15 59 37 4f 00 48 8b cf e8 a1 8e 02 

Stack slot to memory mapping:
stack at sp + 0 slots: 0x0000016c923de2c0 is a thread
stack at sp + 1 slots: 0x0 is NULL
stack at sp + 2 slots: 0x0 is NULL
stack at sp + 3 slots: 0x00007ff92b8980a0 jvm.dll::VMEntryWrapper::VMEntryWrapper + 0x110
stack at sp + 4 slots: 0x0000016c923de2c0 is a thread
stack at sp + 5 slots: 0x00007ff92be48d5b jvm.dll::ObjectSynchronizer::quick_enter + 0x9b
stack at sp + 6 slots: 0x00000000000000ac is an unknown value
stack at sp + 7 slots: 0x000000074bd727d0 is an oop: java.util.concurrent.ConcurrentHashMap$Node 
{0x000000074bd727d0} - klass: 'java/util/concurrent/ConcurrentHashMap$Node'
 - ---- fields (total size 4 words):
 - final 'hash' 'I' @12  683507634 (28bd7fb2)
 - final 'key' 'Ljava/lang/Object;' @16  "java.util.Base64"{0x000000074bd72788} (e97ae4f1)
 - volatile 'val' 'Ljava/lang/Object;' @20  a 'java/lang/Object'{0x000000074bd727c0} (e97ae4f8)
 - volatile 'next' 'Ljava/util/concurrent/ConcurrentHashMap$Node;' @24  NULL (0)

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From mbaesken at openjdk.java.net  Thu Jun 24 09:00:38 2021
From: mbaesken at openjdk.java.net (Matthias Baesken)
Date: Thu, 24 Jun 2021 09:00:38 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids
 controller of cgroups [v2]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Wed, 23 Jun 2021 14:48:22 GMT, Severin Gehwolf  wrote:

> > But I think that the testing needs to be enhanced (e.g. with some added docker tests?). Do you have some good suggestions
> > where I could look at existing (docker?) tests and adjust those for the new pids.max ?
> 
> Have a look at `test/hotspot/jtreg/containers/docker/TestMisc.java` which already does some assertions on `print_container_info()` output. Either extend that test with some actual pid limits (`--pids-limit=` option) in place or write a similar one. That would cover the hotspot side.
> 
> Then consider adding the pids limit to the `-Xshowsettings:system` output (see `LauncherHelper.printSystemMetrics()`) using the Java API and add a docker test using that in `test/jdk/jdk/internal/platform/docker/`.

Hi Severin, thanks for the suggestions.
I'll have a look.
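
For reference while extending the tests, here is a minimal Java sketch of how a `pids.max` value could be interpreted. It assumes the convention of the cgroups pids controller that the file holds either the literal `max` (no limit) or a decimal number. The class and method names are hypothetical and are not part of the OSContainer API or of this patch.

```java
class PidsLimitSketch {
    // Hypothetical sentinel for "no limit configured".
    static final long UNLIMITED = -1;

    // Hypothetical helper: interprets the raw contents of a pids.max file.
    // Assumes the file holds either "max" or a decimal number, possibly
    // followed by a newline.
    static long parsePidsMax(String raw) {
        String value = raw.trim();
        if (value.equals("max")) {
            return UNLIMITED;
        }
        return Long.parseLong(value);
    }

    public static void main(String[] args) {
        System.out.println(parsePidsMax("max\n"));
        System.out.println(parsePidsMax("2048\n"));
    }
}
```

A docker test run with `--pids-limit=` could then check that the reported value matches the limit passed on the command line.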

Best regards, Matthias

-------------

PR: https://git.openjdk.java.net/jdk/pull/4518

From whuang at openjdk.java.net  Thu Jun 24 09:02:58 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Thu, 24 Jun 2021 09:02:58 GMT
Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals
 [v2]
In-Reply-To: 
References: 
Message-ID: 

> Dear all, 
>      Could you please review this patch? It improves the performance of the `String.equals` intrinsic on AArch64 by using Neon.
>      We measured the performance with this JMH benchmark:
>  
> 
>    ```java
>     package com.huawei.string;
>     import java.util.*;
>     import java.util.concurrent.TimeUnit;
>     
>     import org.openjdk.jmh.annotations.CompilerControl;
>     import org.openjdk.jmh.annotations.Benchmark;
>     import org.openjdk.jmh.annotations.Level;
>     import org.openjdk.jmh.annotations.OutputTimeUnit;
>     import org.openjdk.jmh.annotations.Param;
>     import org.openjdk.jmh.annotations.Scope;
>     import org.openjdk.jmh.annotations.Setup;
>     import org.openjdk.jmh.annotations.State;
>     import org.openjdk.jmh.annotations.Fork;
>     import org.openjdk.jmh.infra.Blackhole;
>     
>     @State(Scope.Thread)
>     @OutputTimeUnit(TimeUnit.MILLISECONDS)
>     public class StringEqual {
>         @Param({"8", "64", "4096"})
>         int size;
>     
>         String str1;
>         String str2;
>     
>         @Setup(Level.Trial)
>         public void init() {
>             str1 = newString(size, 'c', '1');
>             str2 = newString(size, 'c', '2');
>         }
>     
>         public String newString(int length, char charToFill, char lastChar) {
>             if (length > 0) {
>                 char[] array = new char[length];
>                 Arrays.fill(array, charToFill);
>                 array[length - 1] = lastChar;
>                 return new String(array);
>             }
>             return "";
>         }
>     
>         @Benchmark
>         @CompilerControl(CompilerControl.Mode.DONT_INLINE)
>         public boolean EqualString() {
>             return str1.equals(str2);
>         }
>     }
> 
>    ```
> The results are listed below (Linux aarch64 with 128 cores):
> 
> Benchmark                       | (size) | Mode  | Cnt | Score      | Error      | Units
> --------------------------------|--------|-------|-----|------------|------------|-------
> StringEqual.EqualString         |      8 | thrpt |  10 | 123971.994 | ± 1462.131 | ops/ms
> StringEqual.EqualString         |     64 | thrpt |  10 |  56009.960 | ±  999.734 | ops/ms
> StringEqual.EqualString         |   4096 | thrpt |  10 |   1943.852 | ±    8.159 | ops/ms
> StringEqual.EqualStringWithNEON |      8 | thrpt |  10 | 120319.271 | ± 1392.185 | ops/ms
> StringEqual.EqualStringWithNEON |     64 | thrpt |  10 |  72914.767 | ± 1814.173 | ops/ms
> StringEqual.EqualStringWithNEON |   4096 | thrpt |  10 |   2579.155 | ±   15.589 | ops/ms
> 
> Yours, 
> WANG Huang
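
To read the table as relative speedups, the throughput scores can be divided pairwise. The sketch below does only that arithmetic on the scores reported above; the class and method names are hypothetical.

```java
class NeonSpeedupSketch {
    // Ratio of Neon throughput to baseline throughput (both in ops/ms).
    // A value above 1 means the Neon version is faster.
    static double speedup(double baselineOpsPerMs, double neonOpsPerMs) {
        return neonOpsPerMs / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        int[] sizes = {8, 64, 4096};
        // Scores copied from the EqualString and EqualStringWithNEON rows.
        double[] baseline = {123971.994, 56009.960, 1943.852};
        double[] neon = {120319.271, 72914.767, 2579.155};
        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("size %d: %.2fx%n", sizes[i], speedup(baseline[i], neon[i]));
        }
    }
}
```

With these scores the ratios come out at roughly 0.97x for size 8, 1.30x for size 64, and 1.33x for size 4096, so the gain appears only for the longer strings.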

Wang Huang has updated the pull request incrementally with one additional commit since the last revision:

  enhancement of string.equals

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4423/files
  - new: https://git.openjdk.java.net/jdk/pull/4423/files/c65431e7..4f02c00f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4423&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4423&range=00-01

  Stats: 271 lines in 8 files changed: 209 ins; 30 del; 32 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4423.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4423/head:pull/4423

PR: https://git.openjdk.java.net/jdk/pull/4423

From whuang at openjdk.java.net  Thu Jun 24 09:26:30 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Thu, 24 Jun 2021 09:26:30 GMT
Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals
 [v2]
In-Reply-To: 
References: 
 
 
 
Message-ID: <1Wj6NF9GkpRrLj6qfnt7G5FaFuXeijuXp8ErTnvLxLs=.68a4520f-6c51-4b51-a1c5-d5a7fd906382@github.com>

On Thu, 17 Jun 2021 09:28:19 GMT, Andrew Haley  wrote:

>>> With this change the size of the`string_equals` intrinsic increases by ~60% from 120 bytes to 196 bytes and this gets expanded at every `String.equals` call site. It looks good on a micro-benchmark but I wonder if on a larger program this improvement is outweighed by the negative effects of methods taking up more space in the icache.
>> 
>> That's an excellent point. There's no need at all for the Neon part to be expanded inline: it could be a subroutine. We'd have to use fixed Neon registers at the call site.
>
> Thinking some more, we could use this opportunity to move as much of the bulk comparison code as we can out of line, hopefully achieving a reduction in footprint as well as an improvement in performance.

Dear @theRealAph @dgbo @nick-arm @mdinacci,
   I have pushed my latest patch. In this commit:
   * I tested some cases as @theRealAph suggested and found the following:
      1)  We varied the position of the differing character in the strings and collected the data with Neon used in all cases:
![image](https://user-images.githubusercontent.com/73928571/123235128-2c095680-d50e-11eb-95cf-c32d2b58a634.png)
       Based on this result, we keep the old implementation for small strings. 
     2) The `8:64` result in this figure looked like a bug, and I fixed it by unrolling the loop:
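      ```c++
      bind(LOOP); {
        // Load the next word from each string and advance both pointers.
        ldr(tmp1, Address(post(a1, wordSize)));
        ldr(tmp2, Address(post(a2, wordSize)));
        subs(cnt1, cnt1, wordSize);
        // The XOR is non-zero iff the two words differ.
        eor(tmp1, tmp1, tmp2);
        cbnz(tmp1, DONE);
        // Flags still come from subs: leave the loop once the count goes negative.
        br(LT, SHORT);

        // Second copy of the loop body (unrolled by two).
        ldr(tmp1, Address(post(a1, wordSize)));
        ldr(tmp2, Address(post(a2, wordSize)));
        subs(cnt1, cnt1, wordSize);
        eor(tmp1, tmp1, tmp2);
        cbnz(tmp1, DONE);
      } br(GE, LOOP);
      ```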
      3)  `UseSimpleStringEquals` is added in this patch. If the option is `true`, the old implementation is used. 
 * The results of my JMH runs are listed here:
 
 **Diff position is in the LAST 2/3 of whole string**

Benchmark                      |(size) |Mode |Cnt|  Score|  Error  |Units
-------------------------------|-------|-----|---|-------|---------|-----
StringEquals.equalsLenT        |     8 |avgt | 10|  7.869|± 0.063 |ns/op
StringEquals.equalsLenT        |    16 |avgt | 10|  8.651|± 0.201 |ns/op
StringEquals.equalsLenT        |    32 |avgt | 10|  9.869|± 0.049 |ns/op
StringEquals.equalsLenT        |    64 |avgt | 10| 11.379|± 0.134 |ns/op
StringEquals.equalsLenT        |   128 |avgt | 10| 17.312|± 0.274 |ns/op
StringEquals.equalsLenT_simple |     8 |avgt | 10|  7.912|± 0.439 |ns/op
StringEquals.equalsLenT_simple |    16 |avgt | 10|  8.764|± 0.061 |ns/op
StringEquals.equalsLenT_simple |    32 |avgt | 10| 30.452|± 0.065 |ns/op
StringEquals.equalsLenT_simple |    64 |avgt | 10| 14.550|± 0.199 |ns/op
StringEquals.equalsLenT_simple |   128 |avgt | 10| 20.071|± 2.465 |ns/op

 **Diff position is in the FIRST 1/3 of whole string**

Benchmark                     | (size) |Mode |Cnt | Score|  Error  |Units
------------------------------|--------|-----|----|------|---------|-----
StringEquals.equalsLenH       |      8 |avgt | 10 | 7.822|± 0.148 |ns/op
StringEquals.equalsLenH       |     16 |avgt | 10 | 7.631|± 0.179 |ns/op
StringEquals.equalsLenH       |     32 |avgt | 10 | 8.553|± 0.064 |ns/op
StringEquals.equalsLenH       |     64 |avgt | 10 |11.944|± 0.554 |ns/op
StringEquals.equalsLenH       |    128 |avgt | 10 |12.691|± 0.091 |ns/op
StringEquals.equalsLenH_simple|      8 |avgt | 10 | 7.873|± 0.141 |ns/op
StringEquals.equalsLenH_simple|     16 |avgt | 10 | 7.972|± 0.556 |ns/op
StringEquals.equalsLenH_simple|     32 |avgt | 10 | 8.383|± 0.100 |ns/op
StringEquals.equalsLenH_simple|     64 |avgt | 10 |29.364|± 0.344 |ns/op
StringEquals.equalsLenH_simple|    128 |avgt | 10 |14.748|± 0.354 |ns/op
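
For readers less familiar with the assembler, the word-wise loop in point 2 compares eight bytes per step by XOR-ing the two loaded words and bailing out when the result is non-zero. A rough Java analogue follows, for illustration only. It is not the intrinsic's code: the class and method names are hypothetical, and the tail is handled byte by byte here rather than by the `SHORT` path.

```java
import java.nio.ByteBuffer;

class WordwiseEqualsSketch {
    // Hypothetical helper: compares two byte arrays eight bytes at a time,
    // two words per loop iteration, mirroring the unrolled loop in spirit.
    static boolean wordwiseEquals(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        ByteBuffer x = ByteBuffer.wrap(a);
        ByteBuffer y = ByteBuffer.wrap(b);
        int n = a.length;
        int i = 0;
        // Main loop, unrolled by two: each XOR is non-zero iff the words differ.
        while (n - i >= 16) {
            if ((x.getLong(i) ^ y.getLong(i)) != 0) return false;
            if ((x.getLong(i + 8) ^ y.getLong(i + 8)) != 0) return false;
            i += 16;
        }
        // At most one full word remains after the unrolled loop.
        if (n - i >= 8) {
            if ((x.getLong(i) ^ y.getLong(i)) != 0) return false;
            i += 8;
        }
        // Fewer than eight bytes left: compare them one by one.
        for (; i < n; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] s1 = "cccccccccccccccccccc1".getBytes();
        byte[] s2 = "cccccccccccccccccccc2".getBytes();
        System.out.println(wordwiseEquals(s1, s1.clone()));
        System.out.println(wordwiseEquals(s1, s2));
    }
}
```

Unrolling halves the number of backward branches per compared word, which is the effect the fix in point 2 relies on.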

-------------

PR: https://git.openjdk.java.net/jdk/pull/4423

From whuang at openjdk.java.net  Thu Jun 24 09:26:32 2021
From: whuang at openjdk.java.net (Wang Huang)
Date: Thu, 24 Jun 2021 09:26:32 GMT
Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals
 [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 15 Jun 2021 03:20:06 GMT, Nick Gasson  wrote:

>> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   enhancement of string.equals
>
> src/hotspot/cpu/aarch64/aarch64.ad line 16676:
> 
>> 16674:   format %{ "String Equals $str1,$str2,$cnt -> $result" %}
>> 16675:   ins_encode %{
>> 16676:     // Count is in 8-bit bytes; non-Compact chars are 8 bits.
> 
> This change is a bit confusing: non-compact chars are still 16 bits, it's just at this point we know the string contains only 8-bit Latin characters. I think it's better to instead delete everything after the ";" (or leave it as it is).

I have fixed this comment. Thank you for your suggestion.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4423

From kvn at openjdk.java.net  Thu Jun 24 14:53:32 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 24 Jun 2021 14:53:32 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
Message-ID: 

On Wed, 23 Jun 2021 00:31:55 GMT, Scott Gibbons  wrote:

>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
>> 
>> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
>> 
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>> 
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
>> 
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>> 
>> 
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixing Windows build warnings

The rest of the testing (hs-tier1-4 and xcomp) is finished and clean.
So this is the only failure. I attached the hs_err file to the RFE.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From github.com+6704669+asgibbons at openjdk.java.net  Thu Jun 24 17:02:03 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Thu, 24 Jun 2021 17:02:03 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v8]
In-Reply-To: 
References: 
Message-ID: 

> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
> 
> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
> 
> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
> 
> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
> 
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
> 
> 
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size   20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size   50000 | 1409.41 | 1114.16 | 1.26
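
The remark about invalid Base64 characters every 76 characters or so refers to the line separators that MIME encoding inserts. A small sketch using the public `java.util.Base64` API shows the effect; the class name and the helper method are made up for the example and are unrelated to the intrinsic itself.

```java
import java.util.Base64;

class MimeDecodeSketch {
    // Hypothetical helper: reports whether the basic (non-MIME) decoder
    // accepts the given text.
    static boolean basicDecoderAccepts(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        String mime = Base64.getMimeEncoder().encodeToString(data);
        // Index of the first CRLF line separator in the MIME output.
        System.out.println(mime.indexOf("\r\n"));
        // The basic decoder treats the separator as an illegal character.
        System.out.println(basicDecoderAccepts(mime));
        // The MIME decoder skips it and recovers all input bytes.
        System.out.println(Base64.getMimeDecoder().decode(mime).length);
    }
}
```

Because the separators are not part of the Base64 alphabet, a vectorized fast path that assumes long runs of valid characters is interrupted at every line break, which is why the intrinsic is told whether MIME decoding is in effect.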

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Fixed Windows register stomping.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4368/files
  - new: https://git.openjdk.java.net/jdk/pull/4368/files/58461b80..1729232c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4368&range=06-07

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4368.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4368/head:pull/4368

PR: https://git.openjdk.java.net/jdk/pull/4368

From github.com+6704669+asgibbons at openjdk.java.net  Thu Jun 24 17:17:43 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Thu, 24 Jun 2021 17:17:43 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Thu, 24 Jun 2021 14:50:01 GMT, Vladimir Kozlov  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fixing Windows build warnings
>
> The rest of testing hs-tier1-4 and xcomp is finished and clean.
> So this is the only failure. I attached hs_err file to RFE.

Hi, @vnkozlov.  I just pushed a change that fixes a register overwrite.  Can you please start the tests again?

Thanks

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From zgu at openjdk.java.net  Thu Jun 24 17:37:37 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 24 Jun 2021 17:37:37 GMT
Subject: RFR: 8269303: Remove unnecessary forward declaration of
 PSPromotionManager in cpCache.hpp
Message-ID: 

Please review this trivial change to remove the unnecessary forward declaration.

-------------

Commit messages:
 - 8269303: Remove unnecessary forward declaration of PSPromotionManager in cpCache.hpp

Changes: https://git.openjdk.java.net/jdk/pull/4585/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4585&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269303
  Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4585.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4585/head:pull/4585

PR: https://git.openjdk.java.net/jdk/pull/4585

From pchilanomate at openjdk.java.net  Thu Jun 24 18:58:39 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Thu, 24 Jun 2021 18:58:39 GMT
Subject: Integrated: 8256425: Obsolete Biased Locking in JDK 18
In-Reply-To: 
References: 
Message-ID: 

On Thu, 17 Jun 2021 15:37:40 GMT, Patricio Chilano Mateo  wrote:

> Hi all,
> 
> Please review the following patch which handles the removal of biased locking code. 
> 
> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
> 
> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
> 
> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
> 
> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
> 
> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
> 
> Thanks,
> Patricio
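
A minimal sketch of the low mark word bits discussed above, assuming the 64-bit layout in which the two lowest bits are the lock bits, the third bit is the one formerly used by biased locking, and the next four bits hold the age. The constants and method names are illustrative assumptions, not code from the patch.

```java
class MarkWordBitsSketch {
    static final int LOCK_BITS = 2;        // lowest two bits: lock state
    static final int UNUSED_GAP_BITS = 1;  // third bit: formerly the biased-lock bit (assumed layout)
    static final int AGE_BITS = 4;         // object age used by the GC
    static final int AGE_SHIFT = LOCK_BITS + UNUSED_GAP_BITS;

    static long lockBits(long mark) {
        return mark & ((1L << LOCK_BITS) - 1);
    }

    static long formerBiasBit(long mark) {
        return (mark >>> LOCK_BITS) & 1L;
    }

    static long age(long mark) {
        return (mark >>> AGE_SHIFT) & ((1L << AGE_BITS) - 1);
    }

    public static void main(String[] args) {
        // Example value: age 5, former bias bit clear, lock bits 01.
        long mark = (5L << AGE_SHIFT) | 0b01L;
        System.out.println(lockBits(mark) + " " + formerBiasBit(mark) + " " + age(mark));
    }
}
```

Leaving the gap in place keeps the age field at the same shift as before, so the freed bit stays available for other projects.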

This pull request has now been integrated.

Changeset: 2fd7943e
Author:    Patricio Chilano Mateo 
URL:       https://git.openjdk.java.net/jdk/commit/2fd7943ec191559bfb2778305daf82bcc4422028
Stats:     5328 lines in 165 files changed: 66 ins; 5034 del; 228 mod

8256425: Obsolete Biased Locking in JDK 18

Reviewed-by: kvn, dholmes, dcubed, rrich

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From pchilanomate at openjdk.java.net  Thu Jun 24 18:58:38 2021
From: pchilanomate at openjdk.java.net (Patricio Chilano Mateo)
Date: Thu, 24 Jun 2021 18:58:38 GMT
Subject: RFR: 8256425: Obsolete Biased Locking in JDK 18 [v7]
In-Reply-To: 
References: 
 
Message-ID: 

On Wed, 23 Jun 2021 18:15:26 GMT, Patricio Chilano Mateo  wrote:

>> Hi all,
>> 
>> Please review the following patch which handles the removal of biased locking code. 
>> 
>> The third least significant bit of the markword is now always unused. I didn't try to give it back to the age field as it was prior to biased locking introduction since it will likely be taken away by other projects (probably Valhalla). 
>> 
>> Regarding c1 changes, the scratch register passed to LIRGenerator::monitor_enter() was only used by biased locking code except in ppc, so in all other platforms I removed the scratch parameter from C1_MacroAssembler::lock_object() (except in s390 where it wasn't defined already). 
>> We could probably just always use R0 as a temp register in lock_object() for ppc, since we were already using it as temp in biased_locking_enter(), and remove the scratch parameter from there too. Then we could remove the scratch field from LIR_OpLock. I haven't done that in this patch though.
>> 
>> For c2, type.hpp defined XorXNode, StoreXConditionalNode, LoadXNode and StoreXNode as needed by UseOptoBiasInlining. I see that LoadXNode and StoreXNode are also used by shenandoahSupport so I kept those two defines. I removed only the biased locking comments from the storeIConditional/storeLConditional implementations in .ad files since I don't know if they might be needed.
>> 
>> There are some tests that were only meaningful when run with biased locking enabled so I removed them.
>> 
>> Tested in mach5 tiers 1-7. I tested it builds also on ppc, s390 and arm32 but can't run any tests on those platforms so it would be good if somebody can do some sanity check on those ones.
>> 
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fix cast in added whitebox method after 8268368

Thanks all for reviews and comments!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4522

From kbarrett at openjdk.java.net  Thu Jun 24 20:56:07 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Thu, 24 Jun 2021 20:56:07 GMT
Subject: RFR: 8269303: Remove unnecessary forward declaration of
 PSPromotionManager in cpCache.hpp
In-Reply-To: 
References: 
Message-ID: <8zydtAbgU58ZNO1FJaOu9bXC4qiwM6WAITrWJimuk8Y=.d879e491-abb3-4e95-b54a-43950b7128b7@github.com>

On Thu, 24 Jun 2021 14:44:43 GMT, Zhengyu Gu  wrote:

> Please review this trivial change to remove the unnecessary forward declaration.

Looks good, and trivial.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4585

From zgu at openjdk.java.net  Thu Jun 24 21:06:09 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 24 Jun 2021 21:06:09 GMT
Subject: RFR: 8269303: Remove unnecessary forward declaration of
 PSPromotionManager in cpCache.hpp
In-Reply-To: <8zydtAbgU58ZNO1FJaOu9bXC4qiwM6WAITrWJimuk8Y=.d879e491-abb3-4e95-b54a-43950b7128b7@github.com>
References: 
 <8zydtAbgU58ZNO1FJaOu9bXC4qiwM6WAITrWJimuk8Y=.d879e491-abb3-4e95-b54a-43950b7128b7@github.com>
Message-ID: 

On Thu, 24 Jun 2021 20:52:58 GMT, Kim Barrett  wrote:

> Looks good, and trivial.

Thanks @kimbarrett

-------------

PR: https://git.openjdk.java.net/jdk/pull/4585

From zgu at openjdk.java.net  Thu Jun 24 21:06:10 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Thu, 24 Jun 2021 21:06:10 GMT
Subject: Integrated: 8269303: Remove unnecessary forward declaration of
 PSPromotionManager in cpCache.hpp
In-Reply-To: 
References: 
Message-ID: 

On Thu, 24 Jun 2021 14:44:43 GMT, Zhengyu Gu  wrote:

> Please review this trivial change to remove the unnecessary forward declaration.

This pull request has now been integrated.

Changeset: c79034e0
Author:    Zhengyu Gu 
URL:       https://git.openjdk.java.net/jdk/commit/c79034e0c94a21a0ef3655e0d7da7629d7b40d8c
Stats:     3 lines in 1 file changed: 0 ins; 2 del; 1 mod

8269303: Remove unnecessary forward declaration of PSPromotionManager in cpCache.hpp

Reviewed-by: kbarrett

-------------

PR: https://git.openjdk.java.net/jdk/pull/4585

From kvn at openjdk.java.net  Thu Jun 24 23:00:05 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Thu, 24 Jun 2021 23:00:05 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v8]
In-Reply-To: 
References: 
 
Message-ID: 

On Thu, 24 Jun 2021 17:02:03 GMT, Scott Gibbons  wrote:

>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
>> 
>> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
>> 
>> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically.
>> 
>> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
>> 
>> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
>> 
>> 
>> Benchmark Name | Base Score | Optimized Score | Gain
>> -- | -- | -- | --
>> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
>> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
>> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
>> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
>> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
>> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
>> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
>> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
>> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
>> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
>> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
>> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
>> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
>> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
>> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
>> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
>> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
>> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
>> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
>> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
>> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
>> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
>> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
>> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
>> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
>> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
>> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
>> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
>> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
>> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
>> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
>> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
>> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
>> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
>> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
>> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed Windows register stomping.

Latest update fixed TestBase64.java test issue.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4368

From dholmes at openjdk.java.net  Fri Jun 25 00:02:11 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 25 Jun 2021 00:02:11 GMT
Subject: Integrated: 8268855: Cleanup name handling in the Thread class and
 subclasses
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 06:21:43 GMT, David Holmes  wrote:

> Please review this small cleanup item.
> 
> We can simplify and clean up name() management:
> 
> - make name() return "const char *" and only cast away constness at API boundaries when essential
> - add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
> - Rename JavaThread::get_thread_name() to name() (no need for the extra indirection)
> 
> There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.
> 
>   0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> is now:
> 
>   0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> but this shouldn't affect anything and makes things more consistent.
> 
> Notes: 
> 
> 1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
> 2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
> 3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
> 
> Testing (in progress):
>  - All builds in tiers 1-5
>  - GHA
>  - tiers 1-3 as a sanity test
> 
> Thanks,
> David
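
The name()/type_name() split described above can be sketched roughly as follows. This is only an illustration: the classes are hypothetical stand-ins rather than HotSpot's real Thread hierarchy, and `describe()` is an invented helper that mimics just the type and name portion of the hs_err line.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; not the real HotSpot classes.
class Thread {
 public:
  virtual ~Thread() = default;
  // Both accessors return const char*, so callers cannot modify the strings.
  virtual const char* name() const { return "Unknown thread"; }
  virtual const char* type_name() const { return "Thread"; }
};

class WatcherThread : public Thread {
 public:
  const char* name() const override { return "VM Periodic Task Thread"; }
  const char* type_name() const override { return "WatcherThread"; }
};

// Invented helper: builds the `<type> "<name>"` fragment without any
// "if (t->is_VM_Thread()) ..." style type checks.
std::string describe(const Thread& t) {
  return std::string(t.type_name()) + " \"" + t.name() + "\"";
}
```

With these stand-ins, `describe(WatcherThread())` yields the fragment `WatcherThread "VM Periodic Task Thread"`, which matches the new log appearance quoted above.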

This pull request has now been integrated.

Changeset: 08ee7ae6
Author:    David Holmes 
URL:       https://git.openjdk.java.net/jdk/commit/08ee7ae67246b45be9684a4a283f0103f5f1c0c4
Stats:     85 lines in 16 files changed: 34 ins; 14 del; 37 mod

8268855: Cleanup name handling in the Thread class and subclasses

Reviewed-by: lfoltan, coleenp

-------------

PR: https://git.openjdk.java.net/jdk/pull/4569

From dholmes at openjdk.java.net  Fri Jun 25 00:02:10 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 25 Jun 2021 00:02:10 GMT
Subject: RFR: 8268855: Cleanup name handling in the Thread class and
 subclasses
In-Reply-To: 
References: 
Message-ID: 

On Wed, 23 Jun 2021 06:21:43 GMT, David Holmes  wrote:

> Please review this small cleanup item.
> 
> We can simplify and clean up name() management:
> 
> - make name() return "const char *" and only cast away constness at API boundaries when essential
> - add type_name() so that we can avoid code like "if (t->is_VM_Thread()) print("VMThread");"
> - Rename JavaThread::get_thread_name() to name() (no need for the extra indirection)
> 
> There are a couple of minor changes to the appearance of some internal threads in the hs_err log e.g.
> 
>   0x000055af03e5b1b0 WatcherThread [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> is now:
> 
>   0x000055af03e5b1b0 WatcherThread "VM Periodic Task Thread" [stack: 0x00007f685df00000,0x00007f685e000000] [id=15952]
> 
> but this shouldn't affect anything and makes things more consistent.
> 
> Notes: 
> 
> 1. While "override" is the ideal style when declaring overriding methods it has to be applied to all virtual methods in a class. So unless "override" is already used in a class, I did not start using it. I have filed a separate RFE to convert the Thread classes to use "override" consistently.
> 2.  While there is no need to redeclare a virtual method as "virtual" I kept to the existing style in those classes where changes were made.
> 3. I did not override type_name() for all the JavaThread subclasses as it seemed unnecessary, but happy to hear other views on this.
> 
> Testing (in progress):
>  - All builds in tiers 1-5
>  - GHA
>  - tiers 1-3 as a sanity test
> 
> Thanks,
> David

Thanks for the reviews.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4569

From manc at openjdk.java.net  Fri Jun 25 02:30:12 2021
From: manc at openjdk.java.net (Man Cao)
Date: Fri, 25 Jun 2021 02:30:12 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v3]
In-Reply-To: <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>
References: 
 <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>
Message-ID: <6ZTly0bEu0J0IP7aKswWULZ_hsz5aBq5c3lU3Z3TTSs=.45dc1e1d-396f-4a2f-b96e-ef365179a36e@github.com>

On Tue, 22 Jun 2021 17:47:08 GMT, Kim Barrett  wrote:

>> Please review this change to the LockFreeQueue utility class.
>> 
>> The LockFreeQueue originated as an implementation detail of
>> G1DirtyCardQueueSet, and was recently refactored into a public utility
>> class.  In that refactoring it retained some limitations that were
>> acceptable in its original context, but may be problematic as a general
>> utility.
>> 
>> In particular, under some conditions a thread was not able to pop the
>> last element in the queue, due to interference by a concurrent operation.
>> And this state would persist, so retrying the pop operation wouldn't help until
>> the interfering thread had made sufficient progress. This was mitigated by
>> making the API more complex to provide notice to the client that the queue
>> may be in this state.
>> 
>> But it turns out we can do somewhat better, eliminating one of the
>> limitations, which is the point of this change.  We introduce a
>> pseudo-object used as an end of queue marker.  We can use the transition of
>> the last element's next value from the end marker to NULL by a pop operation
>> as a claim on the element, allowing the losing thread to recognize, retry,
>> and make progress.
>> 
>> This queue still has the limitation that an in-progress push/append may
>> prevent popping elements.  Because of this, the class is being renamed to
>> NonblockingQueue.  The old name suggests stronger guarantees than actually
>> provided.
>> 
>> The PR has two commits, the first for the functional changes, the second for
>> the renaming.  The github diffs don't seem to be recognizing the renaming of
>> the source files as a rename, instead treating the old files as deleted and
>> the new files as added.  The first commit by itself is probably more useful
>> for reviewing the functional changes.
>> 
>> Testing:
>> mach5 tier1-5
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - Merge branch 'master' into lfqueue
>  - Merge branch 'master' into lfqueue
>  - rename
>  - use end marker to improve pop
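
The "claim" idea in the description (moving the last element's next value from the end marker to NULL) can be illustrated with a small sketch. It is not the NonblockingQueue code: `Node`, `end_marker()` and `try_claim_last()` are hypothetical names, and only the claiming step is shown, not a full pop.

```cpp
#include <atomic>

// Hypothetical node type; the real queue is a template over client nodes.
struct Node {
  std::atomic<Node*> next{nullptr};
};

// Assumption: a distinguished object whose address marks the end of the queue.
inline Node* end_marker() {
  static Node marker;
  return &marker;
}

// Attempts to claim `last`, whose next is expected to be the end marker.
// Only one thread can move next from the end marker to nullptr; any other
// thread sees the CAS fail and knows it lost the race and must retry.
inline bool try_claim_last(Node* last) {
  Node* expected = end_marker();
  return last->next.compare_exchange_strong(expected, nullptr);
}
```

Because the end marker is distinct from nullptr, a losing thread can tell "another pop claimed this element" apart from "this element is still the tail", which is what lets it recognize the situation and retry.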

Thank you for this fix. I like how append() can just store to &_head on CAS failure.

Minor comments:
Is it recommended to add SpinPause() in pop()'s while loop body?
This sentence above append() in inline.hpp could be removed: "it is an invariant that the old tail's "next" value is NULL".
I can make a PR for these separately.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4379

From kvn at openjdk.java.net  Fri Jun 25 02:32:23 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 25 Jun 2021 02:32:23 GMT
Subject: [jdk17] RFR: 8269260: Add AVX512 and other SSE + AVX combinations
 testing for tests which generate vector instructions
Message-ID: 

Bug [8269179](https://bugs.openjdk.java.net/browse/JDK-8269179) shows that we (Oracle) don't test enough different vector instructions on x64.

I suggest creating new HotSpot compiler test groups for such tests and, together with the jdk_vector (jdk/incubator/vector) group, running them with different SSE and AVX HotSpot flag combinations: 

-XX:UseAVX=3 
-XX:UseAVX=2 
-XX:UseAVX=1 
-XX:UseAVX=0 
-XX:UseAVX=0 -XX:UseSSE=3 
-XX:UseAVX=0 -XX:UseSSE=2 (this is minimal setting for 64 bit)


Here is my suggestion for how to run them on windows-x64-debug and linux-x64-debug: 

hs-tier2: 
  hotspot_vector_1 - run with all flags combinations listed in Description 
  hotspot_vector_2 - run with `-XX:UseAVX=3` only 

hs-tier3: 
  jdk_vector - run with all flags combinations listed in Description 
  hotspot_vector_2 - run with all combinations except `-XX:UseAVX=3`


Tier1 already runs these tests in default mode.

Tested hs-tier1-3 internally.

-------------

Commit messages:
 - 8269260: Add AVX512 and other SSE + AVX combinations testing for tests which generate vector instructions

Changes: https://git.openjdk.java.net/jdk17/pull/144/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=144&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269260
  Stats: 23 lines in 1 file changed: 23 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk17/pull/144.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/144/head:pull/144

PR: https://git.openjdk.java.net/jdk17/pull/144

From github.com+6704669+asgibbons at openjdk.java.net  Fri Jun 25 03:23:09 2021
From: github.com+6704669+asgibbons at openjdk.java.net (Scott Gibbons)
Date: Fri, 25 Jun 2021 03:23:09 GMT
Subject: Integrated: 8268276: Base64 Decoding optimization for x86 using
 AVX-512
In-Reply-To: 
References: 
Message-ID: <6kPmBbPIBQ8tVmezz0Rp22gFotQ7nF2e0i93njbGI5k=.f0e06a2f-506f-41e9-bdbf-004fbc2b3398@github.com>

On Fri, 4 Jun 2021 20:55:51 GMT, Scott Gibbons  wrote:

> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding.
> 
> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.  This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so.  A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded.  This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded.
> 
> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding.  The non-AVX code is an assembly-optimized version of the Java DecodeBlock and behaves identically.
> 
> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k.  The numbers are given in the table below.
> 
> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**.
> 
> 
> Benchmark Name | Base Score | Optimized Score | Gain
> -- | -- | -- | --
> testBase64Decode size 1 | 15.36 | 15.32 | 1.00
> testBase64Decode size 3 | 17.00 | 16.72 | 1.02
> testBase64Decode size 7 | 20.60 | 18.82 | 1.09
> testBase64Decode size 32 | 34.21 | 26.77 | 1.28
> testBase64Decode size 64 | 54.43 | 38.35 | 1.42
> testBase64Decode size 80 | 66.40 | 48.34 | 1.37
> testBase64Decode size 96 | 73.16 | 52.90 | 1.38
> testBase64Decode size 112 | 84.93 | 51.82 | 1.64
> testBase64Decode size 512 | 288.81 | 32.04 | 9.01
> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74
> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72
> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15
> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07
> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10
> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02
> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10
> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05
> testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00
> testBase64MIMEDecode size 96 | 364.00 | 346.66 | 1.05
> testBase64MIMEDecode size 112 | 472.88 | 394.78 | 1.20
> testBase64MIMEDecode size 512 | 1814.96 | 1671.28 | 1.09
> testBase64MIMEDecode size 1000 | 3623.50 | 3227.61 | 1.12
> testBase64MIMEDecode size 20000 | 70484.09 | 64940.77 | 1.09
> testBase64MIMEDecode size 50000 | 191732.34 | 158158.95 | 1.21
> testBase64WithErrorInputsDecode size 1 | 1531.02 | 1185.19 | 1.29
> testBase64WithErrorInputsDecode size 3 | 1306.59 | 1170.99 | 1.12
> testBase64WithErrorInputsDecode size 7 | 1238.11 | 1176.62 | 1.05
> testBase64WithErrorInputsDecode size 32 | 1346.46 | 1138.47 | 1.18
> testBase64WithErrorInputsDecode size 64 | 1195.28 | 1172.52 | 1.02
> testBase64WithErrorInputsDecode size 80 | 1469.00 | 1180.94 | 1.24
> testBase64WithErrorInputsDecode size 96 | 1434.48 | 1167.74 | 1.23
> testBase64WithErrorInputsDecode size 112 | 1440.06 | 1162.56 | 1.24
> testBase64WithErrorInputsDecode size 512 | 1362.79 | 1193.42 | 1.14
> testBase64WithErrorInputsDecode size 1000 | 1426.07 | 1194.44 | 1.19
> testBase64WithErrorInputsDecode size 20000 | 1398.44 | 1138.17 | 1.23
> testBase64WithErrorInputsDecode size 50000 | 1409.41 | 1114.16 | 1.26
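
As a rough illustration of the loop structure described above (256-byte blocks, then 64-byte chunks, then end fixup), the following sketch only does the arithmetic of splitting an input length. `DecodePlan` and `plan_decode()` are hypothetical names, and the sketch assumes no minimum-length thresholds or padding rules, which the real intrinsic may well have.

```cpp
#include <cstddef>

// Hypothetical summary of how an input of a given length would be split.
struct DecodePlan {
  std::size_t blocks256;  // iterations of the 256-byte main loop
  std::size_t chunks64;   // 64-byte chunks handled after the main loop
  std::size_t tail;       // bytes left over for the end fixup
};

// Splits `length` bytes into 256-byte blocks, 64-byte chunks and a tail.
inline DecodePlan plan_decode(std::size_t length) {
  DecodePlan plan;
  plan.blocks256 = length / 256;
  const std::size_t rest = length % 256;
  plan.chunks64 = rest / 64;
  plan.tail = rest % 64;
  return plan;
}
```

For a 1000-byte input this gives 3 blocks of 256 bytes, 3 chunks of 64 bytes and a 40-byte tail, while any input shorter than 64 bytes falls entirely into the tail.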

This pull request has now been integrated.

Changeset: c37988d0
Author:    Scott Gibbons 
Committer: Sandhya Viswanathan 
URL:       https://git.openjdk.java.net/jdk/commit/c37988d0793b24d98d285530dfda69999a227937
Stats:     753 lines in 12 files changed: 735 ins; 4 del; 14 mod

8268276: Base64 Decoding optimization for x86 using AVX-512

Reviewed-by: erikj, sviswanathan, kvn

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From sviswanathan at openjdk.java.net  Fri Jun 25 03:23:08 2021
From: sviswanathan at openjdk.java.net (Sandhya Viswanathan)
Date: Fri, 25 Jun 2021 03:23:08 GMT
Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512
 [v7]
In-Reply-To: 
References: 
 
 
Message-ID: <6kkhnovupJKFv06Bz6fi1bpme2eOrsytWTs34mU5g0c=.f7f01265-f4ce-4607-af39-daecd7d5ef90@github.com>

On Thu, 24 Jun 2021 14:50:01 GMT, Vladimir Kozlov  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fixing Windows build warnings
>
> The rest of testing hs-tier1-4 and xcomp is finished and clean.
> So this is the only failure. I attached hs_err file to RFE.

Thanks a lot @vnkozlov for the review and test.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4368

From iwalulya at openjdk.java.net  Fri Jun 25 09:04:07 2021
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Fri, 25 Jun 2021 09:04:07 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v3]
In-Reply-To: <6ZTly0bEu0J0IP7aKswWULZ_hsz5aBq5c3lU3Z3TTSs=.45dc1e1d-396f-4a2f-b96e-ef365179a36e@github.com>
References: 
 <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>
 <6ZTly0bEu0J0IP7aKswWULZ_hsz5aBq5c3lU3Z3TTSs=.45dc1e1d-396f-4a2f-b96e-ef365179a36e@github.com>
Message-ID: 

On Fri, 25 Jun 2021 02:27:23 GMT, Man Cao  wrote:

> Minor comments:
> Is it recommended to add SpinPause() in pop()'s while loop body?

I think with a try_pop method, if one needs to have a SpinPause,  they can re-implement the retry-loop with the SpinPause and try_pop instead of having this in a generic pop().
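
A caller-side retry loop of that kind might look like the sketch below. It is an illustration only: `ToyQueue` is a single-threaded stand-in that merely simulates lost races, `pop_with_pause()` is a hypothetical helper, and `std::this_thread::yield()` stands in for SpinPause(). The assumed contract is that `try_pop()` returns false when it lost a race and true otherwise, with a null result meaning the queue was empty.

```cpp
#include <deque>
#include <thread>

// Toy stand-in, not the HotSpot queue: fails a fixed number of times to
// simulate contention, then behaves like an ordinary FIFO.
class ToyQueue {
 public:
  explicit ToyQueue(int simulated_failures) : _failures_left(simulated_failures) {}

  void push(int* item) { _items.push_back(item); }

  // Returns false on a (simulated) lost race; true otherwise, with *out set
  // to the popped item or to nullptr if the queue was empty.
  bool try_pop(int** out) {
    if (_failures_left > 0) {
      --_failures_left;
      return false;
    }
    if (_items.empty()) {
      *out = nullptr;
      return true;
    }
    *out = _items.front();
    _items.pop_front();
    return true;
  }

 private:
  std::deque<int*> _items;
  int _failures_left;
};

// Retries try_pop() until it completes, pausing between attempts.
// `retries` reports how many attempts lost a race.
template <typename Queue>
int* pop_with_pause(Queue& queue, int& retries) {
  int* result = nullptr;
  retries = 0;
  while (!queue.try_pop(&result)) {
    ++retries;
    std::this_thread::yield();  // stand-in for SpinPause()
  }
  return result;
}
```

Keeping the pause in the caller leaves the generic pop() free of any policy, so each use site can choose its own pause or backoff strategy.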

-------------

PR: https://git.openjdk.java.net/jdk/pull/4379

From aph at openjdk.java.net  Fri Jun 25 14:09:29 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 25 Jun 2021 14:09:29 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in Atomic
Message-ID: 

At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
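
To make the orderings concrete, here is a sketch in portable C++ rather than HotSpot's Atomic API. `CasOrder` and `cmpxchg()` are hypothetical names, and HotSpot's "conservative" ordering is deliberately not modelled because `std::atomic` has no direct equivalent.

```cpp
#include <atomic>

// Hypothetical order tag for illustration; not HotSpot's atomic_memory_order.
enum class CasOrder { relaxed, release, acq_rel, seq_cst };

// Compare-and-exchange that returns the value observed before the operation.
inline int cmpxchg(std::atomic<int>& dest, int compare_value,
                   int exchange_value, CasOrder order) {
  std::memory_order success = std::memory_order_seq_cst;
  std::memory_order failure = std::memory_order_seq_cst;
  switch (order) {
    case CasOrder::relaxed:
      success = std::memory_order_relaxed;
      failure = std::memory_order_relaxed;
      break;
    case CasOrder::release:
      // Release-only CAS: publishes prior writes on success, and the
      // failed (load-only) path needs no ordering at all.
      success = std::memory_order_release;
      failure = std::memory_order_relaxed;
      break;
    case CasOrder::acq_rel:
      success = std::memory_order_acq_rel;
      failure = std::memory_order_acquire;
      break;
    case CasOrder::seq_cst:
      break;
  }
  int expected = compare_value;
  // On failure, `expected` is overwritten with the value actually observed.
  dest.compare_exchange_strong(expected, exchange_value, success, failure);
  return expected;
}
```

The failure ordering is always a load-only ordering, since a failed compare-and-exchange performs no store; that is why the release case pairs with relaxed on failure.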

-------------

Commit messages:
 - Support atomic_memory_order for CAS on BSD
 - Release-only CAS
 - AArch64 Atomic::cmpxchg acq_rel and seq_cst

Changes: https://git.openjdk.java.net/jdk/pull/4597/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261579
  Stats: 93 lines in 6 files changed: 88 ins; 2 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From aph at openjdk.java.net  Fri Jun 25 14:19:22 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 25 Jun 2021 14:19:22 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v2]
In-Reply-To: 
References: 
Message-ID: 

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Generate LSE stubs for release-only CAS

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4597/files
  - new: https://git.openjdk.java.net/jdk/pull/4597/files/5f78f964..bf988ca0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=00-01

  Stats: 11 lines in 1 file changed: 11 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From aph at openjdk.java.net  Fri Jun 25 14:56:36 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 25 Jun 2021 14:56:36 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v3]
In-Reply-To: 
References: 
Message-ID: 

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Sanitize memory order for BSD CAS

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4597/files
  - new: https://git.openjdk.java.net/jdk/pull/4597/files/bf988ca0..bbf1a51a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=01-02

  Stats: 15 lines in 1 file changed: 14 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From aph at openjdk.java.net  Fri Jun 25 15:02:35 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Fri, 25 Jun 2021 15:02:35 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
Message-ID: 

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Sanitize memory order for BSD CAS

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4597/files
  - new: https://git.openjdk.java.net/jdk/pull/4597/files/bbf1a51a..6298e85a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From kbarrett at openjdk.java.net  Fri Jun 25 15:30:10 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Fri, 25 Jun 2021 15:30:10 GMT
Subject: RFR: 8268290: Improve LockFreeQueue<> utility [v3]
In-Reply-To: <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>
References: 
 <4fBS_LTY8B9qMRZTpBQw7b53tE-3Fy8iZ07SYQ1CXo0=.848cffe3-85b3-4343-b330-37591da3b7bc@github.com>
Message-ID: 

On Tue, 22 Jun 2021 17:47:08 GMT, Kim Barrett  wrote:

>> Please review this change to the LockFreeQueue utility class.
>> 
>> The LockFreeQueue originated as an implementation detail of
>> G1DirtyCardQueueSet, and was recently refactored into a public utility
>> class.  In that refactoring it retained some limitations that were
>> acceptable in its original context, but may be problematic as a general
>> utility.
>> 
>> In particular, under some conditions a thread was not able to pop the
>> last element in the queue, due to interference by a concurrent operation.
>> And this state would persist, so retrying the pop operation wouldn't help until
>> the interfering thread had made sufficient progress. This was mitigated by
>> making the API more complex to provide notice to the client that the queue
>> may be in this state.
>> 
>> But it turns out we can do somewhat better, eliminating one of the
>> limitations, which is the point of this change.  We introduce a
>> pseudo-object used as an end of queue marker.  We can use the transition of
>> the last element's next value from the end marker to NULL by a pop operation
>> as a claim on the element, allowing the losing thread to recognize, retry,
>> and make progress.
>> 
>> This queue still has the limitation that an in-progress push/append may
>> prevent popping elements.  Because of this, the class is being renamed to
>> NonblockingQueue.  The old name suggests stronger guarantees than actually
>> provided.
>> 
>> The PR has two commits, the first for the functional changes, the second for
>> the renaming.  The github diffs don't seem to be recognizing the renaming of
>> the source files as a rename, instead treating the old files as deleted and
>> the new files as added.  The first commit by itself is probably more useful
>> for reviewing the functional changes.
>> 
>> Testing:
>> mach5 tier1-5
>
> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - Merge branch 'master' into lfqueue
>  - Merge branch 'master' into lfqueue
>  - rename
>  - use end marker to improve pop

I don't think `pop()` needs `SpinPause()` in its body.  It's not really an empty spin.  It only loops on a cmpxchg failure.  Someone might want to use `try_pop()` and `SpinYield` or something similar in a highly contended case, but I'd rather leave that to the specific case to parameterize.

You are correct that this comment should be removed: "it is an invariant that the old tail's "next" value is NULL".  Oops, sorry I missed that.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4379

From iveresov at openjdk.java.net  Fri Jun 25 18:32:13 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Fri, 25 Jun 2021 18:32:13 GMT
Subject: [jdk17] RFR: 8269260: Add AVX512 and other SSE + AVX combinations
 testing for tests which generate vector instructions
In-Reply-To: 
References: 
Message-ID: 

On Fri, 25 Jun 2021 02:24:39 GMT, Vladimir Kozlov  wrote:

> Bug [8269179](https://bugs.openjdk.java.net/browse/JDK-8269179) shows that we (Oracle) don't test enough different vector instructions on x64.
> 
> I suggest creating new HotSpot compiler test groups for such tests and, together with the jdk_vector (jdk/incubator/vector) group, running them with different SSE and AVX HotSpot flag combinations: 
> 
> -XX:UseAVX=3 
> -XX:UseAVX=2 
> -XX:UseAVX=1 
> -XX:UseAVX=0 
> -XX:UseAVX=0 -XX:UseSSE=3 
> -XX:UseAVX=0 -XX:UseSSE=2 (this is minimal setting for 64 bit)
> 
> 
> Here is my suggestion for how to run them on windows-x64-debug and linux-x64-debug: 
> 
> hs-tier2: 
>   hotspot_vector_1 - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with `-XX:UseAVX=3` only 
> 
> hs-tier3: 
>   jdk_vector - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with all combinations except `-XX:UseAVX=3`
> 
> 
> Tier1 already runs these tests in default mode.
> 
> Tested hs-tier1-3 internally.

Marked as reviewed by iveresov (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk17/pull/144

From dlong at openjdk.java.net  Fri Jun 25 20:03:06 2021
From: dlong at openjdk.java.net (Dean Long)
Date: Fri, 25 Jun 2021 20:03:06 GMT
Subject: [jdk17] RFR: 8269260: Add AVX512 and other SSE + AVX combinations
 testing for tests which generate vector instructions
In-Reply-To: 
References: 
Message-ID: 

On Fri, 25 Jun 2021 02:24:39 GMT, Vladimir Kozlov  wrote:

> Bug [8269179](https://bugs.openjdk.java.net/browse/JDK-8269179) shows that we (Oracle) don't test enough different vector instructions on x64.
> 
> I suggest creating new HotSpot compiler test groups for such tests and, together with the jdk_vector (jdk/incubator/vector) group, running them with different SSE and AVX HotSpot flag combinations: 
> 
> -XX:UseAVX=3 
> -XX:UseAVX=2 
> -XX:UseAVX=1 
> -XX:UseAVX=0 
> -XX:UseAVX=0 -XX:UseSSE=3 
> -XX:UseAVX=0 -XX:UseSSE=2 (this is minimal setting for 64 bit)
> 
> 
> Here is my suggestion for how to run them on windows-x64-debug and linux-x64-debug: 
> 
> hs-tier2: 
>   hotspot_vector_1 - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with `-XX:UseAVX=3` only 
> 
> hs-tier3: 
>   jdk_vector - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with all combinations except `-XX:UseAVX=3`
> 
> 
> Tier1 already runs these tests in default mode.
> 
> Tested hs-tier1-3 internally.

Marked as reviewed by dlong (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk17/pull/144

From kvn at openjdk.java.net  Fri Jun 25 22:53:05 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 25 Jun 2021 22:53:05 GMT
Subject: [jdk17] RFR: 8269260: Add AVX512 and other SSE + AVX combinations
 testing for tests which generate vector instructions
In-Reply-To: 
References: 
Message-ID: 

On Fri, 25 Jun 2021 02:24:39 GMT, Vladimir Kozlov  wrote:

> Bug [8269179](https://bugs.openjdk.java.net/browse/JDK-8269179) shows that we (Oracle) don't test enough different vector instructions on x64.
> 
> I suggest creating new HotSpot compiler test groups for such tests and, together with the jdk_vector (jdk/incubator/vector) group, running them with different SSE and AVX HotSpot flag combinations: 
> 
> -XX:UseAVX=3 
> -XX:UseAVX=2 
> -XX:UseAVX=1 
> -XX:UseAVX=0 
> -XX:UseAVX=0 -XX:UseSSE=3 
> -XX:UseAVX=0 -XX:UseSSE=2 (this is minimal setting for 64 bit)
> 
> 
> Here is my suggestion for how to run them on windows-x64-debug and linux-x64-debug: 
> 
> hs-tier2: 
>   hotspot_vector_1 - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with `-XX:UseAVX=3` only 
> 
> hs-tier3: 
>   jdk_vector - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with all combinations except `-XX:UseAVX=3`
> 
> 
> Tier1 already runs these tests in default mode.
> 
> Tested hs-tier1-3 internally.

Thank you, Dean and Igor.

-------------

PR: https://git.openjdk.java.net/jdk17/pull/144

From kvn at openjdk.java.net  Fri Jun 25 22:53:06 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Fri, 25 Jun 2021 22:53:06 GMT
Subject: [jdk17] Integrated: 8269260: Add AVX512 and other SSE + AVX
 combinations testing for tests which generate vector instructions
In-Reply-To: 
References: 
Message-ID: 

On Fri, 25 Jun 2021 02:24:39 GMT, Vladimir Kozlov  wrote:

> Bug [8269179](https://bugs.openjdk.java.net/browse/JDK-8269179) shows that we (Oracle) don't test enough different vector instructions on x64.
> 
> I suggest creating new HotSpot compiler test groups for such tests and, together with the jdk_vector (jdk/incubator/vector) group, running them with different SSE and AVX HotSpot flag combinations: 
> 
> -XX:UseAVX=3 
> -XX:UseAVX=2 
> -XX:UseAVX=1 
> -XX:UseAVX=0 
> -XX:UseAVX=0 -XX:UseSSE=3 
> -XX:UseAVX=0 -XX:UseSSE=2 (this is minimal setting for 64 bit)
> 
> 
> Here is my suggestion for how to run them on windows-x64-debug and linux-x64-debug: 
> 
> hs-tier2: 
>   hotspot_vector_1 - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with `-XX:UseAVX=3` only 
> 
> hs-tier3: 
>   jdk_vector - run with all flags combinations listed in Description 
>   hotspot_vector_2 - run with all combinations except `-XX:UseAVX=3`
> 
> 
> Tier1 already runs these tests in default mode.
> 
> Tested hs-tier1-3 internally.

This pull request has now been integrated.

Changeset: 824a5169
Author:    Vladimir Kozlov 
URL:       https://git.openjdk.java.net/jdk17/commit/824a51693e10afba834823efb38195ee0d692e5e
Stats:     23 lines in 1 file changed: 23 ins; 0 del; 0 mod

8269260: Add AVX512 and other SSE + AVX combinations testing for tests which generate vector instructions

Reviewed-by: iveresov, dlong

-------------

PR: https://git.openjdk.java.net/jdk17/pull/144

From manc at openjdk.java.net  Sat Jun 26 02:36:12 2021
From: manc at openjdk.java.net (Man Cao)
Date: Sat, 26 Jun 2021 02:36:12 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility
Message-ID: 

Hi,

Could you review this change, which is mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
I also added an assertion to check, and help explain, why the Atomic::store() in append() is correct.

Stress tested with fastdebug build with:
$ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"

-------------

Commit messages:
 - Clarify NonblockingQueue's comments and add an assertion.

Changes: https://git.openjdk.java.net/jdk/pull/4600/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4600&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269417
  Stats: 25 lines in 2 files changed: 15 ins; 5 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4600.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4600/head:pull/4600

PR: https://git.openjdk.java.net/jdk/pull/4600

From kbarrett at openjdk.java.net  Sun Jun 27 00:37:04 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Sun, 27 Jun 2021 00:37:04 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility
In-Reply-To: 
References: 
Message-ID: <1WU7X_NlvSMt8rTB5605dKauFCtDek775ukIclcls-k=.e41adef2-5a38-461a-a494-f4c64112fbfa@github.com>

On Sat, 26 Jun 2021 02:29:20 GMT, Man Cao  wrote:

> Hi,
> 
> Could you review this change, which is mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
> I also added an assertion to check, and help explain, why the Atomic::store() in append() is correct.
> 
> Stress tested with fastdebug build with:
> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"

Looks good.  Just one suggestion in commentary.

src/hotspot/share/utilities/nonblockingQueue.inline.hpp line 116:

> 114:     // _head simultaneously, because the Atomic::xchg() above orders these
> 115:     // push/append operations so they perform Atomic::cmpxchg() on different
> 116:     // old_tail. Thus, at most one Atomic::cmpxchg() can fail.

s/Thus, ... fail./Thus, the cmpxchg can only fail because of a concurrent try_pop./
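
For readers following the ordering argument in that comment, it can be modelled roughly as below. This is a simplified sketch using std::atomic, not the HotSpot source: `SketchQueue`, `Node`, `end_` and `snapshot()` are hypothetical names, and try_pop is left out, so the fallback branch after the failed compare-exchange is never taken here.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

struct Node {
  int value;
  std::atomic<Node*> next{nullptr};
  explicit Node(int v) : value(v) {}
};

class SketchQueue {
  Node end_{0};                        // sentinel meaning "no successor yet"
  std::atomic<Node*> head_{nullptr};
  std::atomic<Node*> tail_{nullptr};

 public:
  void push(Node& node) {
    node.next.store(&end_);
    // The exchange orders concurrent pushes: each one gets a distinct old_tail.
    Node* old_tail = tail_.exchange(&node);
    if (old_tail == nullptr) {
      head_.store(&node);              // queue was empty
      return;
    }
    Node* expected = &end_;
    // No other push links onto this old_tail, so in a full queue this could
    // only fail if a concurrent pop had already claimed old_tail.
    if (!old_tail->next.compare_exchange_strong(expected, &node)) {
      head_.store(&node);
    }
  }

  // Single-threaded helper for inspection only.
  std::vector<int> snapshot() {
    std::vector<int> out;
    for (Node* n = head_.load(); n != nullptr && n != &end_; n = n->next.load()) {
      out.push_back(n->value);
    }
    return out;
  }
};
```

Pushing three nodes in sequence and calling `snapshot()` yields them in insertion order.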

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4600

From jwilhelm at openjdk.java.net  Sun Jun 27 22:56:05 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Sun, 27 Jun 2021 22:56:05 GMT
Subject: Withdrawn: Merge jdk17
In-Reply-To: 
References: 
Message-ID: 

On Thu, 24 Jun 2021 00:36:38 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4579

From jwilhelm at openjdk.java.net  Sun Jun 27 23:14:43 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Sun, 27 Jun 2021 23:14:43 GMT
Subject: RFR: Merge jdk17
Message-ID: 

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8258746: illegal access to global field _jvmci_old_thread_counters by terminated thread causes crash
 - 8266269: Lookup::accessClass fails with IAE when accessing an arrayClass with a protected inner class as component class
 - 8269351: Proxy::newProxyInstance and MethodHandleProxies::asInterfaceInstance should reject sealed interfaces
 - 8269260: Add AVX512 and other SSE + AVX combinations testing for tests which generate vector instructions
 - 8269302: serviceability/dcmd/framework/InvalidCommandTest.java still fails after JDK-8268433
 - 8269036: tools/jpackage/share/AppImagePackageTest.java failed with "hdiutil: create failed - Resource busy"
 - 8269074: (fs) Files.copy fails to copy from /proc on some linux kernel versions
 - 8256919: BCEL: Utility.encode forget to close
 - 8269335: Unable to load svml library
 - ... and 20 more: https://git.openjdk.java.net/jdk/compare/8bed3534...2d9b73c0

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=4606&range=00.0
 - jdk17: https://webrevs.openjdk.java.net/?repo=jdk&pr=4606&range=00.1

Changes: https://git.openjdk.java.net/jdk/pull/4606/files
  Stats: 1925 lines in 90 files changed: 1452 ins; 227 del; 246 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4606.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4606/head:pull/4606

PR: https://git.openjdk.java.net/jdk/pull/4606

From jwilhelm at openjdk.java.net  Sun Jun 27 23:55:06 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Sun, 27 Jun 2021 23:55:06 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: 
References: 
Message-ID: 

On Sun, 27 Jun 2021 23:05:10 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: a29953d8
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/a29953d805ac6360bcfe005bcefa60e112788494
Stats:     1925 lines in 90 files changed: 1452 ins; 227 del; 246 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4606

From stuefe at openjdk.java.net  Mon Jun 28 06:20:07 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 28 Jun 2021 06:20:07 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: 

On Thu, 10 Jun 2021 02:07:36 GMT, Yi Yang  wrote:

> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must convert the hex nid to an integer before finding that thread via `top -pid `. This slightly eases debugging, but would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun

Changes requested by stuefe (Reviewer).

src/hotspot/share/runtime/osThread.cpp line 41:

> 39: // Printing
> 40: void OSThread::print_on(outputStream *st) const {
> 41:   st->print("nid=%d ", thread_id());

thread_id is of an opaque type (e.g. pthread_t). I think we can reasonably assume it's numeric, but I would print it as an unsigned 64-bit int just in case.

src/hotspot/share/runtime/osThread.cpp line 49:

> 47:     case CONDVAR_WAIT:            st->print("waiting on condition ");      break;
> 48:     case OBJECT_WAIT:             st->print("in Object.wait() ");          break;
> 49:     case BREAKPOINTED:            st->print("at breakpoint");               break;

These cleanups don't seem to have anything to do with this change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Mon Jun 28 07:40:08 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Mon, 28 Jun 2021 07:40:08 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: 

On Mon, 28 Jun 2021 06:16:14 GMT, Thomas Stuefe  wrote:

>> From users' perspective, we can find corresponding os thread via top directly, otherwise, we must convert hex format based nid to an integer, and find that thread via `top -pid `. This slightly facilitates our debugging process, but would obviously break some existing jstack analysis tool.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> src/hotspot/share/runtime/osThread.cpp line 41:
> 
>> 39: // Printing
>> 40: void OSThread::print_on(outputStream *st) const {
>> 41:   st->print("nid=%d ", thread_id());
> 
> thread_id is of an opaque type (eg pthread_t). I think we can reasonably assume its numeric, but I would print it as an unsigned 64bit int just in case.

Hi Thomas, I tried other format specifiers (`%ld`, `%llu`), but they do not compile on my Mac:

> src/hotspot/share/runtime/osThread.cpp line 49:
> 
>> 47:     case CONDVAR_WAIT:            st->print("waiting on condition ");      break;
>> 48:     case OBJECT_WAIT:             st->print("in Object.wait() ");          break;
>> 49:     case BREAKPOINTED:            st->print("at breakpoint");               break;
> 
> These cleanups don't seem to have anything to do with this change.

Restored.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Mon Jun 28 07:46:29 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Mon, 28 Jun 2021 07:46:29 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: 

> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must convert the hex nid to an integer before finding that thread via `top -pid `. This slightly eases debugging, but would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
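
To illustrate the conversion step the description refers to, here is a minimal sketch that turns an nid token from either format into a number that can be compared with the PID column of top. `parse_nid` is a hypothetical helper, not part of any JDK tool.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: parses an nid token as it appears in a thread dump.
// Base 0 lets std::stoull accept both "0x..." hex and plain decimal input.
// Caveat: with base 0, a token with a leading '0' would be read as octal.
std::uint64_t parse_nid(const std::string& token) {
  return std::stoull(token, nullptr, 0);
}
```

With the old format, `parse_nid("0x12e67")` gives 77415; with the new format the token can be used as-is.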

Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  restore cleanup code

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4449/files
  - new: https://git.openjdk.java.net/jdk/pull/4449/files/eb469267..4baeb175

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4449.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4449/head:pull/4449

PR: https://git.openjdk.java.net/jdk/pull/4449

From stuefe at openjdk.java.net  Mon Jun 28 08:49:09 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 28 Jun 2021 08:49:09 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 
Message-ID: 

On Mon, 28 Jun 2021 07:37:10 GMT, Yi Yang  wrote:

>> src/hotspot/share/runtime/osThread.cpp line 41:
>> 
>>> 39: // Printing
>>> 40: void OSThread::print_on(outputStream *st) const {
>>> 41:   st->print("nid=%d ", thread_id());
>> 
>> thread_id is of an opaque type (eg pthread_t). I think we can reasonably assume its numeric, but I would print it as an unsigned 64bit int just in case.
>
> Hi Thomas, we can not use other format specifiers (`%ld`,`%llu`) after my practice, because it can not compile on my mac:

You'd do:

print("nid: " UINT64_FORMAT, (uint64_t) id);

thread_t is, among other things, pthread_t, which is opaque. Any current code treating that as signed int is incorrect too.
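
Outside of HotSpot, the same idea can be sketched with the standard `PRIu64` macro, which I take to serve the same purpose as `UINT64_FORMAT` here. `format_nid` is a hypothetical helper and it assumes the id has already been cast to an unsigned 64-bit value; the `nid=` prefix follows the existing thread dump output rather than the snippet above.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats an id that was cast to uint64_t by the caller.
std::string format_nid(std::uint64_t id) {
  char buf[32];  // "nid=" + up to 20 digits + space + NUL fits
  std::snprintf(buf, sizeof(buf), "nid=%" PRIu64 " ", id);
  return std::string(buf);
}
```

Because the format macro matches the argument type exactly, this avoids the `%ld` versus `%llu` mismatch between platforms.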

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 08:59:20 2021
From: github.com+25214855+casparcwang at openjdk.java.net (王超)
Date: Mon, 28 Jun 2021 08:59:20 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in the
 jit code
Message-ID: 

Many C1- and C2-compiled methods do not contain any oops, so the nmethod entry barrier can be skipped for them.

1. C1-compiled code may patch oops or Klass pointers into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction. If the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.

2. Only the JIT code of core library methods does not contain any oops.

3. Currently only ZGC is supported.

-------------

Commit messages:
 - Remove redundent whitepace
 - Fix 'int to char' cast warning
 - bypass the entry barrier if there is no oop in the nm

Changes: https://git.openjdk.java.net/jdk/pull/4610/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269476
  Stats: 89 lines in 9 files changed: 89 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4610.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4610/head:pull/4610

PR: https://git.openjdk.java.net/jdk/pull/4610

From kevinw at openjdk.java.net  Mon Jun 28 09:01:10 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Mon, 28 Jun 2021 09:01:10 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: 

On Mon, 28 Jun 2021 07:46:29 GMT, Yi Yang  wrote:

>> From users' perspective, we can find corresponding os thread via top directly, otherwise, we must convert hex format based nid to an integer, and find that thread via `top -pid `. This slightly facilitates our debugging process, but would obviously break some existing jstack analysis tool.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore cleanup code

Hi,
If you attach WinDbg on Windows to a JVM, you might be glad of the nid=0x... format as that is its choice of base for the thread ids.
So this depends on your tools.  Maybe frustrated top users outnumber happy WinDbg users for the JVM, and maybe they don't.  Maybe this change delights some users and frustrates others.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From stuefe at openjdk.java.net  Mon Jun 28 09:20:02 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 28 Jun 2021 09:20:02 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 
Message-ID: 

On Mon, 28 Jun 2021 08:58:09 GMT, Kevin Walls  wrote:

> Hi,
> If you attach WinDbg on Windows to a JVM, you might be glad of the nid=0x... format as that is its choice of base for the thread ids.
> So this depends on your tools. Maybe frustrated top users outnumber happy WinDbg users for the JVM, and maybe they don't. Maybe this change delights some users and frustrates others.

Why not make it platform-dependent then? This would make sense, especially since the type is opaque. Let each platform handle printing. Windows can hex-print its DWORD thread id. Linux can print its kernel LWP. And platforms where the thread id is 64-bit, or a structure, can print that.

For now default implementations could live in `os::Windows::print_thread_id(thread_t)` and `os::Posix::print_thread_id(thread_t)`, respectively.
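
As a rough illustration of what per-platform printing could look like, here is a sketch with two free functions. The names `format_thread_id_hex` and `format_thread_id_decimal` are hypothetical and only stand in for the proposed per-platform hooks; they are not existing HotSpot functions.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical Windows-style formatter: a 32-bit DWORD-like id printed in hex.
std::string format_thread_id_hex(std::uint32_t id) {
  char buf[16];  // "0x" + up to 8 hex digits + NUL
  std::snprintf(buf, sizeof(buf), "0x%" PRIx32, id);
  return buf;
}

// Hypothetical POSIX-style formatter: an id widened to 64 bits, in decimal.
std::string format_thread_id_decimal(std::uint64_t id) {
  char buf[24];  // up to 20 digits + NUL
  std::snprintf(buf, sizeof(buf), "%" PRIu64, id);
  return buf;
}
```

A platform layer would pick one of these, so the shared printing code never has to know the width or signedness of the underlying id.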

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From jiefu at openjdk.java.net  Mon Jun 28 09:44:05 2021
From: jiefu at openjdk.java.net (Jie Fu)
Date: Mon, 28 Jun 2021 09:44:05 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code
In-Reply-To: 
References: 
Message-ID: 

On Mon, 28 Jun 2021 08:40:16 GMT, 王超  wrote:

> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
> 
> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
> 
> 2, only the jit code of core library methods do not contain any oops.
> 
> 3, currently only support zgc

src/hotspot/cpu/aarch64/gc/shared/barrierSetNMethod_aarch64.cpp line 166:

> 164: 
> 165: void BarrierSetNMethod::fix_entry_barrier(nmethod*, bool) {
> 166:   // not implement yet

Shall we use `Unimplemented();` here?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From kevinw at openjdk.java.net  Mon Jun 28 10:10:02 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Mon, 28 Jun 2021 10:10:02 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 
 
Message-ID: 

On Mon, 28 Jun 2021 09:16:33 GMT, Thomas Stuefe  wrote:

> Why not do it platform dependent then? ...

Checked Visual Studio, and that goes with decimal for thread IDs. 8-)

It's the tools rather than the platform.  But yes, hex for thread IDs seems to be in the minority.  (I have occasionally found this annoying.)

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 11:21:53 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Mon, 28 Jun 2021 11:21:53 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
Message-ID: 

> Many C1- and C2-compiled methods do not contain any oops, so the nmethod entry barrier can be skipped for them.
> 
> 1. C1-compiled code may patch oops or Klass pointers into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction. If the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.
> 
> 2. Only the JIT code of core library methods does not contain any oops.
> 
> 3. Currently only ZGC is supported.

王超 has updated the pull request incrementally with one additional commit since the last revision:

  Only implement in x86

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4610/files
  - new: https://git.openjdk.java.net/jdk/pull/4610/files/aca47946..51871984

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=00-01

  Stats: 21 lines in 6 files changed: 0 ins; 20 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4610.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4610/head:pull/4610

PR: https://git.openjdk.java.net/jdk/pull/4610

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 11:21:56 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Mon, 28 Jun 2021 11:21:56 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 09:40:51 GMT, Jie Fu  wrote:

>> 王超 has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Only implement in x86
>
> src/hotspot/cpu/aarch64/gc/shared/barrierSetNMethod_aarch64.cpp line 166:
> 
>> 164: 
>> 165: void BarrierSetNMethod::fix_entry_barrier(nmethod*, bool) {
>> 166:   // not implement yet
> 
> Shall we use `Unimplemented();` here?

refactor to x86 only

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From zgu at openjdk.java.net  Mon Jun 28 12:15:03 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Mon, 28 Jun 2021 12:15:03 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 28 Jun 2021 11:17:11 GMT, 王超  wrote:

>> src/hotspot/cpu/aarch64/gc/shared/barrierSetNMethod_aarch64.cpp line 166:
>> 
>>> 164: 
>>> 165: void BarrierSetNMethod::fix_entry_barrier(nmethod*, bool) {
>>> 166:   // not implement yet
>> 
>> Shall we use `Unimplemented();` here?
>
> refactor to x86 only

I think there is a simpler way to accomplish this by playing with disarmed_value, e.g. setting nmethod's disarmed_value to certain pattern to indicate it is always disarmed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 12:37:03 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Mon, 28 Jun 2021 12:37:03 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Mon, 28 Jun 2021 12:11:48 GMT, Zhengyu Gu  wrote:

> I think there is a simpler way to accomplish this by playing with disarmed_value, e.g. setting nmethod's disarmed_value to certain pattern to indicate it is always disarmed.

`__ cmpl(disarmed_addr, 0);` only sets ZF=1 when `dest = source`. By playing with disarmed_value, do you mean also changing the cmpl instruction to some other instruction?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From david.holmes at oracle.com  Mon Jun 28 12:37:57 2021
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 28 Jun 2021 22:37:57 +1000
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 
 
Message-ID: <550a3797-9694-b4f3-1e64-c556210194b0@oracle.com>

On 28/06/2021 6:49 pm, Thomas Stuefe wrote:
> On Mon, 28 Jun 2021 07:37:10 GMT, Yi Yang  wrote:
> 
>>> src/hotspot/share/runtime/osThread.cpp line 41:
>>>
>>>> 39: // Printing
>>>> 40: void OSThread::print_on(outputStream *st) const {
>>>> 41:   st->print("nid=%d ", thread_id());
>>>
>>> thread_id is of an opaque type (eg pthread_t). I think we can reasonably assume its numeric, but I would print it as an unsigned 64bit int just in case.
>>
>> Hi Thomas, we can not use other format specifiers (`%ld`,`%llu`) after my practice, because it can not compile on my mac:
> 
> You'd do:
> 
> print("nid: " UINT64_FORMAT, (uint64_t) id);
> 
> thread_t is, among other things, pthread_t, which is opaque. Any current code treating that as signed int is incorrect too.

If it is opaque then I don't see how signed or unsigned makes any 
difference. You are assuming it can just be treated as a 64-bit value; 
whether you interpret that as a signed or unsigned value just changes 
how you print it. I agree printing only positive values is nicer visually.
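
A small sketch of that point: the same 64-bit pattern, printed under both interpretations. The helper names are hypothetical, and `memcpy` is used only to reinterpret the bits without relying on conversion rules.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Prints the 64-bit pattern interpreted as an unsigned value.
std::string print_as_unsigned(std::uint64_t bits) {
  char buf[24];
  std::snprintf(buf, sizeof(buf), "%" PRIu64, bits);
  return buf;
}

// Prints the very same 64-bit pattern interpreted as a signed value.
std::string print_as_signed(std::uint64_t bits) {
  std::int64_t value;
  std::memcpy(&value, &bits, sizeof(value));  // reinterpret, do not convert
  char buf[24];
  std::snprintf(buf, sizeof(buf), "%" PRId64, value);
  return buf;
}
```

For small ids both print identically; the two only diverge once the top bit is set, where the signed form turns negative.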

David

> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4449
> 

From zgu at openjdk.java.net  Mon Jun 28 12:52:09 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Mon, 28 Jun 2021 12:52:09 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
 
 
Message-ID: 

On Mon, 28 Jun 2021 12:34:20 GMT, 王超  wrote:

>> I think there is a simpler way to accomplish this by playing with disarmed_value, e.g. setting nmethod's disarmed_value to certain pattern to indicate it is always disarmed.
>
>> I think there is a simpler way to accomplish this by playing with disarmed_value, e.g. setting nmethod's disarmed_value to certain pattern to indicate it is always disarmed.
> 
> `__ cmpl(disarmed_addr, 0);` will only set ZF=1 when `dest = source`. By playing with disarmed_value, do you mean also change the cmpl instruction to other instruction?

You may need an additional instruction, e.g. use the MSB of disarmed_value to indicate always disarmed and test that bit before `cmpl`. 

Also, you need a new method to set the bit during nmethod registration.

The nmethod entry barrier tests is_armed() early, so it may benefit more.
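
One possible reading of this, modelled in plain C++ rather than in the assembler: the top bit of the per-nmethod guard value marks "always disarmed", and it is tested before the usual comparison. All names (`kAlwaysDisarmedBit`, `needs_barrier`, `mark_always_disarmed`) are hypothetical and this is only a model of the idea, not HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical marker bit: the MSB of the 32-bit guard value.
constexpr std::uint32_t kAlwaysDisarmedBit = 0x80000000u;

// Would be applied once, e.g. at nmethod registration, for oop-free nmethods.
std::uint32_t mark_always_disarmed(std::uint32_t guard) {
  return guard | kAlwaysDisarmedBit;
}

// Model of the entry check: extra bit test first, then the existing compare.
bool needs_barrier(std::uint32_t nmethod_guard,
                   std::uint32_t thread_disarmed_value) {
  if ((nmethod_guard & kAlwaysDisarmedBit) != 0) {
    return false;  // marked: never take the slow path
  }
  return nmethod_guard != thread_disarmed_value;
}
```

The cost is one extra test on every entry, which is the trade-off against patching the instruction stream.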

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From kevinw at openjdk.java.net  Mon Jun 28 13:10:04 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Mon, 28 Jun 2021 13:10:04 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: <37m5nc6KDsehgna2Z7xEbz_J5iodCHhUQgpcnXxVEIk=.5cea815d-6cfa-42d2-ba81-391aaaffb1d2@github.com>

On Fri, 18 Jun 2021 06:14:49 GMT, Yi Yang  wrote:

> Do you think this would facilitate debugging process? And is it acceptable? Any feedback is appreciated!

My first comments were to say that this makes things better for some people, but a little worse for others.
Maybe overall this looks like it makes things better for most people. 8-)

If so (and if we don't discover more tools that prefer hex for thread IDs!), then we want to be consistent, so in addition to the native/built in implementation, we should also update:

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JavaThread.java
..to keep the SA implementation in sync.  It would be odd to have thread dumps looking more different depending on what generated them.

And if changing that, also change:
test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java

I don't see other tests that parse this information.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 13:21:11 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Mon, 28 Jun 2021 13:21:11 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
 
 
 
Message-ID: 

On Mon, 28 Jun 2021 12:48:44 GMT, Zhengyu Gu  wrote:

> You may need an additional instruction, e.g. uses MSB of disarmed_value to indicate always disarmed and test the bit before `cmpl`.
> 
> Also, you need a new method to set the bit during nmethod registration.
> 
> nmethod entry barrier tests is_armed() early, so may benefit more.

The barrier can be bypassed only when the nmethod does not contain any oop relocation. If it is bypassed, it should always be bypassed, so there is no need to test a bit of disarmed_value.

I think several `nop` instructions can be added before `cmpl` and patched to a direct jump if the barrier can be bypassed. When the nmethod is changed to contain oops again, the direct jump should be patched back to `nop`.

`x86_instruction_opcode  disarmed_value_addr, some_magic_number_immediate; jcc label;`: if an instruction like this existed, we would only need to change `some_magic_number_immediate` to make the following `jcc` instruction always jump. That would be the ideal code sequence, but I have not figured out which instruction could be used to implement it.
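
The `nop`/jump toggling can be modelled as a small state machine. This is only a sketch of the bookkeeping: `SketchNMethod`, `EntrySlot` and the two functions are hypothetical, and real code patching would additionally have to deal with atomic instruction updates and instruction-cache coherence, which are ignored here.

```cpp
#include <cassert>
#include <cstdint>

// What currently sits in the patchable slot in front of the barrier.
enum class EntrySlot : std::uint8_t { Nop, JumpOverBarrier };

// Hypothetical stand-in for an nmethod: only what the sketch needs.
struct SketchNMethod {
  EntrySlot slot = EntrySlot::Nop;
  int oop_count = 0;
};

// Re-patch the slot whenever the set of embedded oops changes.
void update_entry_slot(SketchNMethod& nm) {
  nm.slot = (nm.oop_count == 0) ? EntrySlot::JumpOverBarrier : EntrySlot::Nop;
}

// With a nop in the slot, execution falls through into the barrier check.
bool runs_barrier_check(const SketchNMethod& nm) {
  return nm.slot == EntrySlot::Nop;
}
```

An oop-free nmethod skips the check after `update_entry_slot`; once an oop is patched in and the slot is updated again, the check is back.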

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From github.com+25214855+casparcwang at openjdk.java.net  Mon Jun 28 13:29:03 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Mon, 28 Jun 2021 13:29:03 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
 
 
 
 
Message-ID: <9DHGMZwxC14QFaCWcPdYYoYn7kfwDo2YpvlMFlEq8d0=.0b35cfed-87aa-4354-a33a-47fc9fdeeb17@github.com>

On Mon, 28 Jun 2021 13:18:13 GMT, 王超  wrote:

>> You may need an additional instruction, e.g. uses MSB of disarmed_value to indicate always disarmed and test the bit before `cmpl`. 
>> 
>> Also, you need a new method to set the bit during nmethod registration.
>> 
>> nmethod entry barrier tests is_armed() early, so may benefit more.
>
>> You may need an additional instruction, e.g. uses MSB of disarmed_value to indicate always disarmed and test the bit before `cmpl`.
>> 
>> Also, you need a new method to set the bit during nmethod registration.
>> 
>> nmethod entry barrier tests is_armed() early, so may benefit more.
> 
> The barrier can be bypassed only when the nmethod does not any contain oop relocation, if it's bypassed, it should always be bypassed, there is no need to test the bit of disarmed_value.
> 
> I think several `nop` instruction can be added before `cmpl`, and patched to direct jump if the barrier can be bypassed. And when the nmethod is changed to contain oops again, the direct jump should be patched to `nop` again. 
> 
> `x86_instruction_opcode  disarmed_value_addr, some_magic_number_immediate; jcc label;` : If there exists some instructions like this, only need to change `some_magic_number_immediate` to make the following `jcc` instruction alway jump, this is the dream version of code sequences, but I do not figure out which instructions can be used to implement this functionality.

Maybe `subl ; jcc` could be used to implement a more compact version, but the `nop`<->`jump` patching can benefit more.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From aph at openjdk.java.net  Mon Jun 28 13:45:06 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Jun 2021 13:45:06 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: <9DHGMZwxC14QFaCWcPdYYoYn7kfwDo2YpvlMFlEq8d0=.0b35cfed-87aa-4354-a33a-47fc9fdeeb17@github.com>
References: 
 
 
 
 
 
 
 <9DHGMZwxC14QFaCWcPdYYoYn7kfwDo2YpvlMFlEq8d0=.0b35cfed-87aa-4354-a33a-47fc9fdeeb17@github.com>
Message-ID: 

On Mon, 28 Jun 2021 13:26:15 GMT, 王超  wrote:

>>> You may need an additional instruction, e.g. uses MSB of disarmed_value to indicate always disarmed and test the bit before `cmpl`.
>>> 
>>> Also, you need a new method to set the bit during nmethod registration.
>>> 
>>> nmethod entry barrier tests is_armed() early, so may benefit more.
>> 
>> The barrier can be bypassed only when the nmethod does not contain any oop relocation. If it's bypassed, it should always be bypassed, so there is no need to test the bit of disarmed_value.
>> 
>> I think several `nop` instructions can be added before `cmpl`, and patched to a direct jump if the barrier can be bypassed. When the nmethod is changed to contain oops again, the direct jump should be patched back to `nop`.
>> 
>> `x86_instruction_opcode  disarmed_value_addr, some_magic_number_immediate; jcc label;`: if an instruction like this exists, we would only need to change `some_magic_number_immediate` to make the following `jcc` instruction always jump. This is the dream version of the code sequence, but I have not figured out which instruction can be used to implement it.
>
> Maybe `subl ; jcc` can be used to implement a more compact version, but the `nop`<->`jump` approach can benefit more.

> > You may need an additional instruction, e.g. uses MSB of disarmed_value to indicate always disarmed and test the bit before `cmpl`.
> > Also, you need a new method to set the bit during nmethod registration.
> > nmethod entry barrier tests is_armed() early, so may benefit more.
> 
> The barrier can be bypassed only when the nmethod does not contain any oop relocation. If it's bypassed, it should always be bypassed, so there is no need to test the bit of disarmed_value.

On AArch64 we have

  __ ldrw(rscratch1, guard);

  // Subsequent loads of oops must occur after load of guard value.
  // BarrierSetNMethod::disarm sets guard with release semantics.
  __ membar(__ LoadLoad);
  __ ldrw(rscratch2, thread_disarmed_addr);
  __ cmpw(rscratch1, rscratch2);
  __ br(Assembler::EQ, skip);

This `membar` is evil, and it would be very nice to be rid of it. If we have a `nop` before the guard load, we can patch it to jump around the whole sequence, so that would be my preferred thing to do.

On AArch64 we don't care about C1 patching at all, we just don't do any. It would be better IMO if everyone gave up C1 patching, but that's a discussion for another day.
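
The ordering requirement in the guard check above can be modeled in portable C++. This is only a sketch, not HotSpot code: `ModelNMethod`, `disarm`, and `entry_is_disarmed` are hypothetical names, and an acquire fence stands in for `membar(LoadLoad)` (it is at least as strong). It illustrates why the guard load has to be ordered before later loads of data that the disarming thread published with release semantics.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for an nmethod: a guard word plus some data
// (the "oops") that must only be read after the guard has been checked.
struct ModelNMethod {
  std::atomic<int32_t> guard{0};
  int32_t payload{0};
};

// Models BarrierSetNMethod::disarm: fix up the data first, then publish
// the new guard value with release semantics.
inline void disarm(ModelNMethod& nm, int32_t disarmed_value, int32_t fixed_payload) {
  nm.payload = fixed_payload;
  nm.guard.store(disarmed_value, std::memory_order_release);
}

// Models the entry check: load the guard, fence, compare with the
// value the current thread considers "disarmed".
inline bool entry_is_disarmed(const ModelNMethod& nm, int32_t thread_disarmed_value) {
  int32_t g = nm.guard.load(std::memory_order_relaxed);  // ldrw rscratch1, guard
  std::atomic_thread_fence(std::memory_order_acquire);   // stand-in for membar(LoadLoad)
  return g == thread_disarmed_value;                     // cmpw + br(EQ, skip)
}
```

A jump patched in front of the guard load would skip both the load and the fence, which is what makes the `nop` approach attractive for nmethods that never need the barrier.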

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From rkennke at redhat.com  Mon Jun 28 13:50:36 2021
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 28 Jun 2021 15:50:36 +0200
Subject: What to do with the biased-locking header bit?
Message-ID: 

A FYI/RFC. See some discussions here:

https://github.com/openjdk/lilliput/pull/10

Cheerio,
Roman


From stuefe at openjdk.java.net  Mon Jun 28 13:50:14 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 28 Jun 2021 13:50:14 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one
In-Reply-To: <550a3797-9694-b4f3-1e64-c556210194b0@oracle.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 <550a3797-9694-b4f3-1e64-c556210194b0@oracle.com>
Message-ID: 

On Mon, 28 Jun 2021 12:41:35 GMT, David Holmes  wrote:

> > You'd do:
> > print("nid: " UINT64_FORMAT, (uint64_t) id);
> > thread_t is, among other things, pthread_t, which is opaque. Any current code treating that as signed int is incorrect too.
> 
> If it is opaque then I don't see how signed or unsigned makes any
> difference. You are assuming it can just be treated as a 64-bit value;
> whether you interpret that as a signed or unsigned value just changes
> how you print it. I agree printing only positive values is nicer visually.
> 
> David

My `signed int` comment was referring to the existing use of `%d` in the code base. I'm more concerned with the 32-bit range of int than the signedness (though I never saw an OS tool displaying negative numbers for thread ids).
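
To make the range concern concrete, here is a small sketch in standard C++. It uses `PRIu64` from `<cinttypes>` as a stand-in for HotSpot's `UINT64_FORMAT` macro, and `format_nid` is a hypothetical helper, not a function from the code base. An id above the 32-bit range prints intact when it is widened to `uint64_t` and formatted as unsigned, whereas `%d` only ever consumes an `int`.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a thread id as an unsigned 64-bit decimal.
// PRIu64 plays the role that UINT64_FORMAT plays in HotSpot.
inline std::string format_nid(uint64_t id) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "nid: %" PRIu64, id);
  return std::string(buf);
}
```

For example, `format_nid(4294967297ULL)` yields `nid: 4294967297`, a value that does not fit in a 32-bit `int`.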

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From adinn at openjdk.java.net  Mon Jun 28 14:48:12 2021
From: adinn at openjdk.java.net (Andrew Dinn)
Date: Mon, 28 Jun 2021 14:48:12 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 25 Jun 2021 15:02:35 GMT, Andrew Haley  wrote:

>> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
>> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Sanitize memory order for BSD CAS

All looks good to me apart from the bsd changes, which I am not really in a position to comment on. I'll have to take your word for it that the builtin CAS requires the trailing mode arguments you specify.
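
For readers unfamiliar with the trailing mode arguments: the GCC/Clang builtin `__atomic_compare_exchange_n` takes two memory orders, one used when the exchange succeeds and one used when it fails, and the failure order may not be a release order. The following is a minimal sketch of a release-only CAS written that way; `cmpxchg_release` is a hypothetical wrapper and is not the code from the patch.

```cpp
#include <cstdint>

// Hypothetical wrapper around the GCC/Clang builtin. Returns the value
// that was in *dest before the operation, like HotSpot's cmpxchg.
inline uint32_t cmpxchg_release(uint32_t* dest, uint32_t compare_value,
                                uint32_t exchange_value) {
  uint32_t expected = compare_value;
  __atomic_compare_exchange_n(dest, &expected, exchange_value,
                              /* weak */ false,
                              __ATOMIC_RELEASE,   // ordering if the swap succeeds
                              __ATOMIC_RELAXED);  // ordering if it fails (no store happens)
  // On failure the builtin writes the current value into 'expected';
  // on success 'expected' still equals the old value.
  return expected;
}
```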

-------------

Marked as reviewed by adinn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4597

From shade at openjdk.java.net  Mon Jun 28 14:59:07 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Jun 2021 14:59:07 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Fri, 25 Jun 2021 15:02:35 GMT, Andrew Haley  wrote:

>> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
>> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Sanitize memory order for BSD CAS

Only stylistic nits. I am running the Shenandoah performance tests now.

src/hotspot/cpu/aarch64/atomic_aarch64.hpp line 52:

> 50: extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_8_seq_cst_impl;
> 51: extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_4_release_impl;
> 52: extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_8_release_impl;

Stylistic: I'd keep the ordering hierarchy here, first `release`, then `seq_cst`.

Suggestion:

extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_4_release_impl;
extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_8_release_impl;
extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_4_seq_cst_impl;
extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_8_seq_cst_impl;

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 122:

> 120:         cmp     w3, w1
> 121:         b.ne    1f
> 122:         stlxr    w8, w2, [x0]

Suggestion:

        stlxr   w8, w2, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 134:

> 132:         cmp     x3, x1
> 133:         b.ne    1f
> 134:         stlxr    w8, x2, [x0]

Suggestion:

        stlxr   w8, x2, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 143:

> 141: aarch64_atomic_cmpxchg_4_seq_cst_default_impl:
> 142:         prfm    pstl1strm, [x0]
> 143: 0:      ldaxr    w3, [x0]

Suggestion:

0:      ldaxr   w3, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 146:

> 144:         cmp     w3, w1
> 145:         b.ne    1f
> 146:         stlxr    w8, w2, [x0]

Suggestion:

        stlxr   w8, w2, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 155:

> 153: aarch64_atomic_cmpxchg_8_seq_cst_default_impl:
> 154:         prfm    pstl1strm, [x0]
> 155: 0:      ldaxr    x3, [x0]

Suggestion:

0:      ldaxr   x3, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 158:

> 156:         cmp     x3, x1
> 157:         b.ne    1f
> 158:         stlxr    w8, x2, [x0]

Suggestion:

        stlxr   w8, x2, [x0]

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp line 157:

> 155:     stub = aarch64_atomic_cmpxchg_4_release_impl; break;
> 156:   case memory_order_seq_cst:
> 157:   case memory_order_acq_rel:

Swap to keep the strength hierarchy.

Suggestion:

  case memory_order_acq_rel:
  case memory_order_seq_cst:

src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp line 180:

> 178:     stub = aarch64_atomic_cmpxchg_8_release_impl; break;
> 179:   case memory_order_seq_cst:
> 180:   case memory_order_acq_rel:

Swap to keep strength hierarchy. 

Suggestion:

  case memory_order_acq_rel:
  case memory_order_seq_cst:

-------------

PR: https://git.openjdk.java.net/jdk/pull/4597

From shade at openjdk.java.net  Mon Jun 28 15:38:05 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Jun 2021 15:38:05 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
 
Message-ID: <9T9ns37WY3qcm_MlIDZAC_Wm1JQA32foyD2WBIBoud0=.5a2a9757-eed1-434a-9f54-66439c839ae2@github.com>

On Fri, 25 Jun 2021 15:02:35 GMT, Andrew Haley  wrote:

>> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
>> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Sanitize memory order for BSD CAS

Also, if you want all platforms built without errors with GHA, you need to merge in current master.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4597

From aph at openjdk.java.net  Mon Jun 28 16:08:34 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Jun 2021 16:08:34 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v5]
In-Reply-To: 
References: 
Message-ID: 

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.

Andrew Haley has updated the pull request incrementally with ten additional commits since the last revision:

 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/cpu/aarch64/atomic_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4597/files
  - new: https://git.openjdk.java.net/jdk/pull/4597/files/6298e85a..e691806f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=03-04

  Stats: 15 lines in 3 files changed: 4 ins; 5 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From aph at openjdk.java.net  Mon Jun 28 16:08:38 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Jun 2021 16:08:38 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 28 Jun 2021 14:54:34 GMT, Aleksey Shipilev  wrote:

>> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Sanitize memory order for BSD CAS
>
> src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S line 122:
> 
>> 120:         cmp     w3, w1
>> 121:         b.ne    1f
>> 122:         stlxr    w8, w2, [x0]
> 
> Suggestion:
> 
>         stlxr   w8, w2, [x0]

Well spotted!

-------------

PR: https://git.openjdk.java.net/jdk/pull/4597

From zgu at openjdk.java.net  Mon Jun 28 16:21:06 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Mon, 28 Jun 2021 16:21:06 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 11:21:53 GMT, ??  wrote:

>> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
>> 
>> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2, only the jit code of core library methods do not contain any oops.
>> 
>> 3, currently only support zgc
>
> ?? has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only implement in x86

nmethod entry barrier is inserted in SharedRuntime::generate_native_wrapper(). 

At that point, the method is already compiled and its metadata information should be available. It should be possible to determine if there are any embedded oops there and elide barrier completely if there is none.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From aph at openjdk.java.net  Mon Jun 28 16:21:50 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 28 Jun 2021 16:21:50 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v6]
In-Reply-To: 
References: 
Message-ID: 

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.

Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:

 - Merge branch 'master' into aarch64_acq_rel_cas
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/cpu/aarch64/atomic_aarch64.hpp
   
   Co-authored-by: Aleksey Shipilëv
 - ... and 6 more: https://git.openjdk.java.net/jdk/compare/dac70d7e...e2327629

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4597/files
  - new: https://git.openjdk.java.net/jdk/pull/4597/files/e691806f..e2327629

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4597&range=04-05

  Stats: 35202 lines in 671 files changed: 17658 ins; 15651 del; 1893 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4597/head:pull/4597

PR: https://git.openjdk.java.net/jdk/pull/4597

From shade at openjdk.java.net  Mon Jun 28 18:06:24 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Mon, 28 Jun 2021 18:06:24 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v6]
In-Reply-To: 
References: 
 
Message-ID: <7USkENUjTZw_zWVhHO1NnwtWE2i8XqU9r22XZMX-VyI=.4575bc26-e92e-42c5-bc5c-8c8edd3eb493@github.com>

On Mon, 28 Jun 2021 16:21:50 GMT, Andrew Haley  wrote:

>> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
>> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
>
> Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
> 
>  - Merge branch 'master' into aarch64_acq_rel_cas
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.S
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp
>    
>    Co-authored-by: Aleksey Shipilëv
>  - Update src/hotspot/cpu/aarch64/atomic_aarch64.hpp
>    
>    Co-authored-by: Aleksey Shipilëv
>  - ... and 6 more: https://git.openjdk.java.net/jdk/compare/9759b6e0...e2327629

Performance data looks good, I think it does what it should.

-------------

Marked as reviewed by shade (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4597

From manc at openjdk.java.net  Mon Jun 28 18:17:34 2021
From: manc at openjdk.java.net (Man Cao)
Date: Mon, 28 Jun 2021 18:17:34 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility [v2]
In-Reply-To: 
References: 
Message-ID: 

> Hi,
> 
> Could you review this change mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
> I also added an assertion to ensure and better understand why the Atomic::store() in append() is correct.
> 
> Stress tested with fastdebug build with:
> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"

Man Cao has updated the pull request incrementally with one additional commit since the last revision:

  Clarify cmpxchg only fail due to try_pop.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4600/files
  - new: https://git.openjdk.java.net/jdk/pull/4600/files/b4150ec9..e7890729

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4600&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4600&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4600.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4600/head:pull/4600

PR: https://git.openjdk.java.net/jdk/pull/4600

From lmesnik at openjdk.java.net  Mon Jun 28 18:44:25 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Mon, 28 Jun 2021 18:44:25 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL _value
 in JvmtiExport::post_compiled_method_load
Message-ID: 

The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. Also, I verified that protecting only in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

-------------

Commit messages:
 - fix in problemlist.txt
 - Merge branch 'master' of https://github.com/openjdk/jdk into 8245877
 - ident fixed.
 - test unproblemlisted.
 - fix

Changes: https://git.openjdk.java.net/jdk/pull/4602/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4602&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8245877
  Stats: 9 lines in 4 files changed: 5 ins; 2 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4602.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4602/head:pull/4602

PR: https://git.openjdk.java.net/jdk/pull/4602

From sspitsyn at openjdk.java.net  Mon Jun 28 19:02:08 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Mon, 28 Jun 2021 19:02:08 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load
In-Reply-To: 
References: 
Message-ID: 

On Sat, 26 Jun 2021 17:48:15 GMT, Leonid Mesnik  wrote:

> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. Also, I verified that protecting only in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

Hi Leonid,
It looks good to me.
Thank you for addressing it!
Thanks,
Serguei

-------------

Marked as reviewed by sspitsyn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4602

From iklam at openjdk.java.net  Mon Jun 28 19:46:59 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 28 Jun 2021 19:46:59 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
Message-ID: 

> In HotSpot we have (at least) two hashtable designs in the C++ code:
> 
> - share/utilities/hashtable.hpp
> - share/utilities/resourceHash.hpp
> 
> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
> 
> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
> 
> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
> 
> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
> 
> *before*
> ResourceHashtable: 2.70 sec
> 
> *after*
> ResourceHashtable: 2.72 sec
> ResizableResourceHashtable: 5.29 sec
> 
> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`

Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  @coleenp comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4536/files
  - new: https://git.openjdk.java.net/jdk/pull/4536/files/468a1e0d..cad6e8e6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=00-01

  Stats: 6 lines in 3 files changed: 0 ins; 4 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4536.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4536/head:pull/4536

PR: https://git.openjdk.java.net/jdk/pull/4536

From iklam at openjdk.java.net  Mon Jun 28 19:47:00 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Mon, 28 Jun 2021 19:47:00 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Wed, 23 Jun 2021 01:31:51 GMT, Coleen Phillimore  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   @coleenp comments
>
> src/hotspot/share/cds/classListParser.hpp line 37:
> 
>> 35: 
>> 36: class constantPoolHandle;
>> 37: class Thread;
> 
> I was looking for the use of constantPoolHandle in the header and I know why the forward declaration is needed.  Shouldn't this declaration use a const reference so that the handle code doesn't create an unnecessary copy?
> 
> bool ClassListParser::is_matching_cp_entry(constantPoolHandle &pool, int cp_index, TRAPS) {

I changed it to pass `const constantPoolHandle &pool`. As we discussed offline, neither will create a new copy of the handle, but passing a `const` reference is cleaner.
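
The point that neither reference form copies the handle can be checked with a small stand-alone sketch. `CountingHandle` and the `read_by_*` functions are hypothetical and only count copy-constructor calls; they are not HotSpot's handle classes.

```cpp
// Global copy counter for the sketch (hypothetical, for illustration only).
static int g_copies = 0;

// Minimal handle-like type that records every copy construction.
struct CountingHandle {
  int value;
  explicit CountingHandle(int v) : value(v) {}
  CountingHandle(const CountingHandle& other) : value(other.value) { ++g_copies; }
};

// Neither reference parameter triggers the copy constructor.
inline int read_by_ref(CountingHandle& h) { return h.value; }
inline int read_by_const_ref(const CountingHandle& h) { return h.value; }

// Passing by value copies the argument when it is an lvalue.
inline int read_by_value(CountingHandle h) { return h.value; }
```

The `const` reference form additionally accepts temporaries and documents that the callee does not modify the handle, which is the sense in which it is cleaner.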

> src/hotspot/share/utilities/resourceHash.hpp line 252:
> 
>> 250:     // http://stackoverflow.com/questions/8532961/template-argument-of-type-that-is-defined-by-inner-typedef-from-other-template-c
>> 251:     //typename ResourceHashtableFns::hash_fn   HASH   = primitive_hash,
>> 252:     //typename ResourceHashtableFns::equals_fn EQUALS = primitive_equals,
> 
> Can you remove this xlC comment?  Not sure why we care.

Done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From zgu at openjdk.java.net  Mon Jun 28 19:47:10 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Mon, 28 Jun 2021 19:47:10 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 11:21:53 GMT, ??  wrote:

>> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
>> 
>> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2, only the jit code of core library methods do not contain any oops.
>> 
>> 3, currently only support zgc
>
> ?? has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only implement in x86

Following code should help you to find out if the method contains any relocatable oops, so you can generate nmethod entry barrier accordingly ...

diff --git a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
index fb4e3f54400..4c2686d27db 100644
--- a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
+++ b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
@@ -1803,6 +1803,9 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
   // -2 because return address is already present and so is saved rbp
   __ subptr(rsp, stack_size - 2*wordSize);
 
+  bool can_elide_nmethod_barrier = !nmethod::has_relocatable_oops(masm->code());
+  // This nmethod has no relocatable oop, can elide the barrier.
+  // However, we still need to generate something to not crash nmethod arm/disarm calls.
   BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
   bs->nmethod_entry_barrier(masm);
 
diff --git a/src/hotspot/share/code/nmethod.cpp b/src/hotspot/share/code/nmethod.cpp
index e2b27e5f4f0..99c7276e952 100644
--- a/src/hotspot/share/code/nmethod.cpp
+++ b/src/hotspot/share/code/nmethod.cpp
@@ -879,6 +879,19 @@ void nmethod::log_identity(xmlStream* log) const {
 #endif
 }
 
+bool nmethod::has_relocatable_oops(const CodeBuffer* cb) {
+  for (int n = (int) CodeBuffer::SECT_FIRST; n < (int)CodeBuffer::SECT_LIMIT; n++) {
+    const CodeSection* cs = cb->code_section(n);
+    RelocIterator iter(const_cast<CodeSection*>(cs));
+    while (iter.next()) {
+      if (iter.type() == relocInfo::oop_type) {
+        // Found relocatable oop
+        return true;
+      }
+    }
+  }
+  return false;
+}
 
 #define LOG_OFFSET(log, name)                    \
   if (p2i(name##_end()) - p2i(name##_begin())) \
diff --git a/src/hotspot/share/code/nmethod.hpp b/src/hotspot/share/code/nmethod.hpp
index 893f28863a6..1504ae9bc51 100644
--- a/src/hotspot/share/code/nmethod.hpp
+++ b/src/hotspot/share/code/nmethod.hpp
@@ -555,6 +555,8 @@ public:
   // Verify calls to dead methods have been cleaned.
   void verify_clean_inline_caches();
 
+  static bool has_relocatable_oops(const CodeBuffer* cb);
+
   // unlink and deallocate this nmethod
   // Only NMethodSweeper class is expected to use this. NMethodSweeper is not
   // expected to use any other private methods/data in this class.
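
The scan in `has_relocatable_oops` can be tried outside the VM with a simplified model. In the sketch below, `RelocType`, `Section`, and `has_relocatable_oops_model` are hypothetical stand-ins for `relocInfo`, `CodeSection`, and the proposed method; real relocation iteration in HotSpot is more involved.

```cpp
#include <vector>

// Hypothetical, reduced set of relocation kinds.
enum class RelocType { none, oop, metadata, runtime_call };

// Hypothetical code section: just the list of relocation kinds it carries.
struct Section {
  std::vector<RelocType> relocs;
};

// Returns true as soon as any section carries an oop relocation,
// mirroring the early return in the proposed nmethod method.
inline bool has_relocatable_oops_model(const std::vector<Section>& sections) {
  for (const Section& cs : sections) {
    for (RelocType t : cs.relocs) {
      if (t == RelocType::oop) {
        return true;  // found a relocatable oop
      }
    }
  }
  return false;
}
```

A wrapper generator could then emit the full entry barrier only when this check returns true, subject to the caveat in the diff that arm/disarm calls still need something to patch.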

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From smonteith at openjdk.java.net  Mon Jun 28 20:43:08 2021
From: smonteith at openjdk.java.net (Stuart Monteith)
Date: Mon, 28 Jun 2021 20:43:08 GMT
Subject: RFR: 8261579: AArch64: Support for weaker memory ordering in
 Atomic [v4]
In-Reply-To: 
References: 
 
Message-ID: <9OAA5ZJ4EX4xvYKDIqFWXv-NeC1KC4doMVKmrAwNQzA=.bf100b26-4753-4977-94a0-cc65f789666f@github.com>

On Fri, 25 Jun 2021 15:02:35 GMT, Andrew Haley  wrote:

>> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
>> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Sanitize memory order for BSD CAS

I can find no fault in this - this will be a useful change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4597

From kbarrett at openjdk.java.net  Mon Jun 28 21:52:06 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Mon, 28 Jun 2021 21:52:06 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 19:46:59 GMT, Ioi Lam  wrote:

>> In HotSpot we have (at least) two hashtable designs in the C++ code:
>> 
>> - share/utilities/hashtable.hpp
>> - share/utilities/resourceHash.hpp
>> 
>> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
>> 
>> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
>> 
>> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
>> 
>> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
>> 
>> *before*
>> ResourceHashtable: 2.70 sec
>> 
>> *after*
>> ResourceHashtable: 2.72 sec
>> ResizableResourceHashtable: 5.29 sec
>> 
>> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @coleenp comments

Changes requested by kbarrett (Reviewer).

src/hotspot/share/utilities/resourceHash.hpp line 38:

> 36:     MEMFLAGS MEM_TYPE
> 37:     >
> 38: class ResourceHashtableBase : public ResourceObj {

Rather than a CRTP base class, I think it might be simpler to have a base class that has a type template parameter that provides the sizing/resizing policy. That type might be used either to specify the type of a new member or even a further base class (to benefit from EBO in the size-is-constant case). The derived class constructor would call the base class constructor with a policy object as an argument.
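
A minimal sketch of the policy-based shape described here, with hypothetical names (`FixedBuckets`, `RuntimeBuckets`, `TableBase`, `table_size`) that are not taken from the patch. The policy is a private base class, so the constant-size case can benefit from the empty base optimization, and the modulo stays a compile-time constant when the policy's size is a template parameter.

```cpp
// Policy for a table whose bucket count is a compile-time constant.
template <unsigned N>
struct FixedBuckets {
  unsigned table_size() const { return N; }
};

// Policy for a table whose bucket count is chosen at run time.
struct RuntimeBuckets {
  unsigned _n;
  explicit RuntimeBuckets(unsigned n) : _n(n) {}
  unsigned table_size() const { return _n; }
};

// The table base takes the sizing policy as a type parameter and
// receives a policy object through its constructor. No virtual call
// is involved in the hash-to-index computation.
template <typename SizePolicy>
class TableBase : private SizePolicy {
public:
  explicit TableBase(SizePolicy policy) : SizePolicy(policy) {}
  unsigned table_size() const { return SizePolicy::table_size(); }
  unsigned index_for(unsigned hash) const { return hash % table_size(); }
};
```

With this layout, `TableBase<FixedBuckets<16>>` carries no per-object size field, while `TableBase<RuntimeBuckets>` stores one `unsigned`.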

src/hotspot/share/utilities/resourceHash.hpp line 115:

> 113:   }
> 114: 
> 115:   unsigned size() const { return static_cast(this)->size_impl(); }

I think size() should return the number of entries. The number of buckets should use a different name (assuming it needs to be publicly accessible).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From kbarrett at openjdk.java.net  Mon Jun 28 22:07:08 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Mon, 28 Jun 2021 22:07:08 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 18:17:34 GMT, Man Cao  wrote:

>> Hi,
>> 
>> Could you review this change mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
>> I also added an assertion to ensure and better understand why the Atomic::store() in append() is correct.
>> 
>> Stress tested with fastdebug build with:
>> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"
>
> Man Cao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Clarify cmpxchg only fail due to try_pop.

Looks good.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4600

From jwilhelm at openjdk.java.net  Mon Jun 28 22:07:38 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 28 Jun 2021 22:07:38 GMT
Subject: RFR: Merge jdk17
Message-ID: 

Forwardport JDK 17 -> JDK 18

-------------

Commit messages:
 - Merge
 - 8269426: Rename test/jdk/java/lang/invoke/t8150782 to accessClassAndFindClass
 - 8267952: async logging supports to dynamically change tags and decorators
 - 8269534: Remove java/util/concurrent/locks/Lock/TimedAcquireLeak.java from ProblemList.txt
 - 8269403: Fix jpackage tests to gracefully handle jpackage app launcher crashes
 - 8269304: Regression ~5% in 2005 in b27
 - 8268236: The documentation of the String.regionMatches method contains error

The webrevs contain the adjustments done while merging with regards to each parent branch:
 - master: https://webrevs.openjdk.java.net/?repo=jdk&pr=4619&range=00.0
 - jdk17: https://webrevs.openjdk.java.net/?repo=jdk&pr=4619&range=00.1

Changes: https://git.openjdk.java.net/jdk/pull/4619/files
  Stats: 224 lines in 20 files changed: 163 ins; 15 del; 46 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4619.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4619/head:pull/4619

PR: https://git.openjdk.java.net/jdk/pull/4619

From manc at openjdk.java.net  Mon Jun 28 22:38:43 2021
From: manc at openjdk.java.net (Man Cao)
Date: Mon, 28 Jun 2021 22:38:43 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility [v3]
In-Reply-To: 
References: 
Message-ID: 

> Hi,
> 
> Could you review this change mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
> I also added an assertion to ensure and better understand why the Atomic::store() in append() is correct.
> 
> Stress tested with fastdebug build with:
> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"

Man Cao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Merge remote-tracking branch 'origin/master' into 8269417
 - Clarify cmpxchg only fail due to try_pop.
 - Clarify NonblockingQueue's comments and add an assertion.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4600/files
  - new: https://git.openjdk.java.net/jdk/pull/4600/files/e7890729..32d68b7b

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4600&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4600&range=01-02

  Stats: 2543 lines in 133 files changed: 1729 ins; 433 del; 381 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4600.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4600/head:pull/4600

PR: https://git.openjdk.java.net/jdk/pull/4600

From jwilhelm at openjdk.java.net  Mon Jun 28 23:07:44 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 28 Jun 2021 23:07:44 GMT
Subject: RFR: Merge jdk17 [v2]
In-Reply-To: 
References: 
Message-ID: 

> Forwardport JDK 17 -> JDK 18

Jesper Wilhelmsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 103 commits:

 - Merge
 - 8269409: Post JEP 411 refactoring: core-libs with maximum covering > 10K
   
   Reviewed-by: lancea, naoto
 - 8269433: Remove effectively unused ReferenceProcessor::_enqueuing_is_done
   
   Reviewed-by: kbarrett, tschatzl
 - 8268902: Testing for threadObj != NULL is unnecessary in suspend handshake
   
   Reviewed-by: pchilanomate, dcubed
 - 8269222: Incorrect number of workers reported for reference processing
   
   Reviewed-by: tschatzl, sangheki
 - 8269122: The use of "extern const" for Register definitions generates poor code
   
   Reviewed-by: adinn, kbarrett, kvn
 - 8269003: Update the java manpage for JDK 18
   
   Reviewed-by: minqi
 - Merge
 - 8269261: The PlaceHolder code uses Thread everywhere but is always dealing with JavaThreads
   
   Reviewed-by: ccheung, coleenp
 - 8269129: Multiple tier1 tests in hotspot/jtreg/compiler are failing for client VMs
   
   Reviewed-by: kvn, iveresov
 - ... and 93 more: https://git.openjdk.java.net/jdk/compare/56240690...8863e7a7

-------------

Changes: https://git.openjdk.java.net/jdk/pull/4619/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4619&range=01
  Stats: 27175 lines in 592 files changed: 16042 ins; 9481 del; 1652 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4619.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4619/head:pull/4619

PR: https://git.openjdk.java.net/jdk/pull/4619

From jwilhelm at openjdk.java.net  Mon Jun 28 23:07:45 2021
From: jwilhelm at openjdk.java.net (Jesper Wilhelmsson)
Date: Mon, 28 Jun 2021 23:07:45 GMT
Subject: Integrated: Merge jdk17
In-Reply-To: 
References: 
Message-ID: 

On Mon, 28 Jun 2021 21:58:36 GMT, Jesper Wilhelmsson  wrote:

> Forwardport JDK 17 -> JDK 18

This pull request has now been integrated.

Changeset: 03d54e6e
Author:    Jesper Wilhelmsson 
URL:       https://git.openjdk.java.net/jdk/commit/03d54e6ef1a40ee78b0cc65ca0aea276fbdbc7b7
Stats:     224 lines in 20 files changed: 163 ins; 15 del; 46 mod

Merge

-------------

PR: https://git.openjdk.java.net/jdk/pull/4619

From github.com+25214855+casparcwang at openjdk.java.net  Tue Jun 29 00:59:06 2021
From: github.com+25214855+casparcwang at openjdk.java.net (王超)
Date: Tue, 29 Jun 2021 00:59:06 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
 
 
 
 
 <9DHGMZwxC14QFaCWcPdYYoYn7kfwDo2YpvlMFlEq8d0=.0b35cfed-87aa-4354-a33a-47fc9fdeeb17@github.com>
 
Message-ID: 

On Mon, 28 Jun 2021 13:42:03 GMT, Andrew Haley  wrote:

> This `membar` is evil, and it would be very nice to be rid of it. If we have a `nop` before the guard load, we can patch it to jump around the whole sequence, so that would be my preferred thing to do.

Thanks for your suggestion. I'll try to implement the `nop` version on the aarch64 architecture.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From github.com+25214855+casparcwang at openjdk.java.net  Tue Jun 29 01:16:05 2021
From: github.com+25214855+casparcwang at openjdk.java.net (王超)
Date: Tue, 29 Jun 2021 01:16:05 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 28 Jun 2021 16:18:08 GMT, Zhengyu Gu  wrote:

> nmethod entry barrier is inserted in SharedRuntime::generate_native_wrapper().
> 
> At that point, the method is already compiled and its metadata information should be available. It should be possible to determine if there are any embedded oops there and elide barrier completely if there is none.

Thanks for providing the patch. But we cannot always determine whether there are any oops at the time the entry barrier is generated:
1. The C1 patching mechanism can dynamically introduce a Java mirror oop into the nmethod if the patching code is executed. If an oop is added to the method this way, the entry barrier has to be activated again.
2. In C2 compilation, `MachPrologNode::emit` calls `MacroAssembler::verified_entry` to generate the nmethod entry barrier. That is the very first node to emit machine instructions, so at that point the number of embedded oops is unknown.
3. `nmethod::new_nmethod` calls `CodeBuffer::finalize_oop_references` to add the class loaders of embedded Klass metadata to the oop recorder. Visiting only the relocations would miss these class loaders.

So the entry barrier cannot easily be eliminated in the current code base; eliminating it would require a lot of additional work.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From david.holmes at oracle.com  Tue Jun 29 02:09:07 2021
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Jun 2021 12:09:07 +1000
Subject: What to do with the biased-locking header bit?
In-Reply-To: 
References: 
Message-ID: <7f10650d-0782-6815-0721-c3d3444c96a7@oracle.com>

On 28/06/2021 11:50 pm, Roman Kennke wrote:
> A FYI/RFC. See some discussions here:
> 
> https://github.com/openjdk/lilliput/pull/10

I think Valhalla will probably want this. In any case valhalla is where 
this should be discussed as any changes to the header need to be 
coordinated through valhalla.

Cheers,
David

> Cheerio,
> Roman
> 

From dholmes at openjdk.java.net  Tue Jun 29 02:28:00 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 02:28:00 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load
In-Reply-To: 
References: 
Message-ID: 

On Sat, 26 Jun 2021 17:48:15 GMT, Leonid Mesnik  wrote:

> The crash happens because nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K  runs to hit it. I verified the fix by running this test >100K on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

Hi Leonid,

I'm not clear on the details here - please see comments below.

Thanks,
David

src/hotspot/share/code/nmethod.cpp line 1611:

> 1609:       return;
> 1610:     }
> 1611:     mark_as_seen_on_stack();

Not obvious what this actually does in relation to the dequeuing problem.

src/hotspot/share/prims/jvmtiImpl.cpp line 968:

> 966:   for (QueueNode* node = _queue_head; node != NULL; node = node->next()) {
> 967:     node->event().post_compiled_method_load_event(env);
> 968:   }

Can't you dequeue() immediately after calling post_compiled_method_load_event()?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From yyang at openjdk.java.net  Tue Jun 29 03:30:06 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Tue, 29 Jun 2021 03:30:06 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one
In-Reply-To: <37m5nc6KDsehgna2Z7xEbz_J5iodCHhUQgpcnXxVEIk=.5cea815d-6cfa-42d2-ba81-391aaaffb1d2@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 <37m5nc6KDsehgna2Z7xEbz_J5iodCHhUQgpcnXxVEIk=.5cea815d-6cfa-42d2-ba81-391aaaffb1d2@github.com>
Message-ID: 

On Mon, 28 Jun 2021 13:07:21 GMT, Kevin Walls  wrote:

> If so (and if we don't discover more tools that prefer hex for thread IDs!), then we want to be consistent, so in addition to the native/built in implementation, we should also update:

> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JavaThread.java
..to keep the SA implementation in sync. It would be odd to have thread dumps looking more different depending on what generated them.

> And if changing that, also change:
test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java

Thanks for the comments! I will change the corresponding SA implementation and tests.


> > Hi,
> > If you attach WinDbg on Windows to a JVM, you might be glad of the nid=0x... format as that is its choice of base for the thread ids.
> > So this depends on your tools. Maybe frustrated top users outnumber happy WinDbg users for the JVM, and maybe they don't. Maybe this change delights some users and frustrates others.
> 
> Why not do it platform dependent then? This would make sense especially since the type is opaque. Let each platform handling printing. Windows can hex-print its DWORD thread id. Linux can print its kernel LWP. And platforms where the thread id is 64bit, or a structure, can print that.
> 
> For now default implementations could live in `os::Windows::print_thread_id(thread_t)` and `os::Posix::print_thread_id(thread_t)`, respectively.

Wouldn't it be too heavyweight to add a platform-dependent implementation for such a small function? As Kevin said, maybe this change delights some users and frustrates others. But since POSIX users are the vast majority, adapting to them may be the better choice. Just IMHO.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Tue Jun 29 03:42:32 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Tue, 29 Jun 2021 03:42:32 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v3]
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: 

> From the user's perspective, with a decimal nid we can find the corresponding OS thread via top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases our debugging process, but would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
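
To make the conversion step in the description concrete, here is a minimal sketch of reading an `nid=` value in either notation. The helper name `parse_nid` is hypothetical and not part of the PR; it only assumes the standard `std::stoull`, which with base 0 accepts a `0x` prefix as hexadecimal.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the PR): parse the value of an "nid=" field.
// With base 0, std::stoull treats a "0x" prefix as hexadecimal and plain
// digits as decimal. A value with a leading '0' would be read as octal,
// which is acceptable here because thread ids are not printed zero-padded.
inline std::uint64_t parse_nid(const std::string& text) {
  return std::stoull(text, nullptr, 0);
}
```

For example, `parse_nid("0x12e67")` yields 77415, the decimal value one would look for in the PID column of top for "ParGC Thread#7" in the old format, while `parse_nid("117707")` yields 117707 unchanged for the new format.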

Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  use UINT64_FORMAT; change SA and test impl

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4449/files
  - new: https://git.openjdk.java.net/jdk/pull/4449/files/4baeb175..0f230755

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=01-02

  Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4449.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4449/head:pull/4449

PR: https://git.openjdk.java.net/jdk/pull/4449

From iklam at openjdk.java.net  Tue Jun 29 03:45:41 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 29 Jun 2021 03:45:41 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v3]
In-Reply-To: 
References: 
Message-ID: 

> In HotSpot we have (at least) two hashtable designs in the C++ code:
> 
> - share/utilities/hashtable.hpp
> - share/utilities/resourceHash.hpp
> 
> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
> 
> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
> 
> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
> 
> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
> 
> *before*
> ResourceHashtable: 2.70 sec
> 
> *after*
> ResourceHashtable: 2.72 sec
> ResizableResourceHashtable: 5.29 sec
> 
> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`

Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  @kimbarrett feedback to move the storage code to a base class

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4536/files
  - new: https://git.openjdk.java.net/jdk/pull/4536/files/cad6e8e6..e5f9c16f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=01-02

  Stats: 178 lines in 4 files changed: 95 ins; 42 del; 41 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4536.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4536/head:pull/4536

PR: https://git.openjdk.java.net/jdk/pull/4536

From iklam at openjdk.java.net  Tue Jun 29 03:51:05 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 29 Jun 2021 03:51:05 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Mon, 28 Jun 2021 21:49:16 GMT, Kim Barrett  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   @coleenp comments
>
> src/hotspot/share/utilities/resourceHash.hpp line 38:
> 
>> 36:     MEMFLAGS MEM_TYPE
>> 37:     >
>> 38: class ResourceHashtableBase : public ResourceObj {
> 
> Rather than a CRTP base class, I think it might be simpler to have a base class that has a type template parameter that provides the sizing/resizing policy. That type might be used either to specify the type of a new member or even a further base class (to benefit from EBO in the size-is-constant case). The derived class constructor would call the base class constructor with a policy object as an argument.

Per Kim's suggestion, I moved the storage management code to two base classes: FixedResourceHashtableStorage and ResizeableResourceHashtableStorage. 

Now the `ResourceHashtable::_table[]` is in-line allocated (same as before this PR). I checked with gcc and it generates code identical to what it generated before this PR.

> src/hotspot/share/utilities/resourceHash.hpp line 115:
> 
>> 113:   }
>> 114: 
>> 115:   unsigned size() const { return static_cast(this)->size_impl(); }
> 
> I think size() should return the number of entries. The number of buckets should use a different name (assuming it needs to be publicly accessible).

In the latest version, I am following the same naming convention in hashtable.hpp:
- table_size() = number of buckets
- number_of_entries() = number of entries
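
For readers following along, the split between a storage base class and the table logic could look roughly like the sketch below. It is not the code in the PR: `FixedStorage`, `ResizableStorage`, `IntTable` and `IntNode` are hypothetical names, and only `table_size()` and `number_of_entries()` are taken from the naming convention above. The point is that the table calls `table_size()` without a virtual call, so the fixed variant keeps a compile-time constant in `hash % table_size()`.

```cpp
#include <cassert>

// Hypothetical storage with a compile-time bucket count; the bucket array is
// allocated in-line and table_size() is a constant expression.
template <typename Node, unsigned SIZE>
class FixedStorage {
  Node* _table[SIZE] = {};
protected:
  Node** table() { return _table; }
public:
  static constexpr unsigned table_size() { return SIZE; }
};

// Hypothetical storage whose bucket count is chosen at run time.
template <typename Node>
class ResizableStorage {
  Node** _table;
  unsigned _table_size;
protected:
  Node** table() { return _table; }
public:
  explicit ResizableStorage(unsigned size)
    : _table(new Node*[size]()), _table_size(size) {}
  ~ResizableStorage() { delete[] _table; }
  ResizableStorage(const ResizableStorage&) = delete;
  ResizableStorage& operator=(const ResizableStorage&) = delete;
  unsigned table_size() const { return _table_size; }
};

struct IntNode { int key; int value; IntNode* next; };

// The table logic is written once and inherits whichever storage it is given.
template <typename Storage>
class IntTable : public Storage {
  unsigned _number_of_entries = 0;

  IntNode** bucket_for(int key) {
    return &this->table()[static_cast<unsigned>(key) % this->table_size()];
  }
public:
  using Storage::Storage;  // forward the storage constructor, if any

  ~IntTable() {
    for (unsigned i = 0; i < this->table_size(); i++) {
      IntNode* n = this->table()[i];
      while (n != nullptr) { IntNode* next = n->next; delete n; n = next; }
    }
  }

  // Returns true if a new entry was added, false if an existing one was updated.
  bool put(int key, int value) {
    IntNode** head = bucket_for(key);
    for (IntNode* n = *head; n != nullptr; n = n->next) {
      if (n->key == key) { n->value = value; return false; }
    }
    *head = new IntNode{key, value, *head};
    _number_of_entries++;
    return true;
  }

  const int* get(int key) {
    for (IntNode* n = *bucket_for(key); n != nullptr; n = n->next) {
      if (n->key == key) return &n->value;
    }
    return nullptr;
  }

  unsigned number_of_entries() const { return _number_of_entries; }
};
```

With this shape, `IntTable<FixedStorage<IntNode, 8>>` reports `table_size() == 8` regardless of how many entries it holds, and `IntTable<ResizableStorage<IntNode>> t(16)` picks the bucket count at construction. Actual resizing (rehashing into a larger array) is left out of the sketch.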

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From lmesnik at openjdk.java.net  Tue Jun 29 05:52:38 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 05:52:38 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: 
References: 
Message-ID: <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>

> The crash happens because nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K  runs to hit it. I verified the fix by running this test >100K on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  Added comment.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4602/files
  - new: https://git.openjdk.java.net/jdk/pull/4602/files/35d5848c..9ea2cb9c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4602&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4602&range=00-01

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4602.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4602/head:pull/4602

PR: https://git.openjdk.java.net/jdk/pull/4602

From lmesnik at openjdk.java.net  Tue Jun 29 06:16:02 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 06:16:02 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 02:24:37 GMT, David Holmes  wrote:

>> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Added comment.
>
> Hi Leonid,
> 
> I'm not clear on the details here - please see comments below.
> 
> Thanks,
> David

Replied to [@dholmes-ora] comments but don't see any notifications from GitHub yet. Not clear if it is my PR/GitHub/mail issues.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From eosterlund at openjdk.java.net  Tue Jun 29 07:08:01 2021
From: eosterlund at openjdk.java.net (Erik Österlund)
Date: Tue, 29 Jun 2021 07:08:01 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 11:21:53 GMT, 王超  wrote:

>> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
>> 
>> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2, only the jit code of core library methods do not contain any oops.
>> 
>> 3, currently only support zgc
>
> 王超 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only implement in x86

I don't think we should do this at all.

1) When this was introduced, we did not see the overhead of nmethod entry barriers in performance profiles. Did you see any improvement with the patch?
2) Any method that can be on-stack has at least an oop for the method holder, to prevent the method from being unloaded.
3) These barriers are like a swiss army knife. They do not only protect against oops in machine code. As mentioned they also protect against the class being unloaded, and racing between unloading and loading. In project loom they are also used for the nmethod lifecycle, and in generational ZGC they are used to patch the barrier code itself.
4) Having special nmethods that don't quack the same way makes the GC code unintuitive to work with and understand.

In summary, I don't think we should be eliding any nmethod entry barriers.

-------------

Changes requested by eosterlund (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4610

From aph at openjdk.java.net  Tue Jun 29 07:39:08 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Tue, 29 Jun 2021 07:39:08 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 11:21:53 GMT, 王超  wrote:

>> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
>> 
>> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2, only the jit code of core library methods do not contain any oops.
>> 
>> 3, currently only support zgc
>
> 王超 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Only implement in x86

On 6/29/21 8:03 AM, Erik Österlund wrote:
> 1) When this was introduced, we did not see the overhead of nmethod entry barriers in performance profiles. Did you see any improvement with the patch?

This is a LoadLoad fence at the start of every method followed by a load
with a dependent branch. If there wasn't a significant hit in the profiles
I wouldn't believe the profiles. Even adding code that doesn't appear to slow
things down uses additional speculation resources.
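
As a rough model of the sequence being discussed (a guard load, a fence that orders it before later loads, and a compare with a dependent branch), here is a portable C++ sketch. It is an illustration only, not the code HotSpot emits: `MethodStub`, `g_disarmed_value` and `enter` are hypothetical names, and the acquire fence stands in for the LoadLoad barrier.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a compiled method with an entry guard.
struct MethodStub {
  std::atomic<int> guard{0};
  int slow_path_calls = 0;
};

// Hypothetical "disarmed" value that the guard is compared against.
std::atomic<int> g_disarmed_value{0};

// Model of the check executed on every method entry.
inline void enter(MethodStub& m) {
  int guard = m.guard.load(std::memory_order_relaxed);
  // Order the guard load before any later loads in the method body.
  std::atomic_thread_fence(std::memory_order_acquire);
  int disarmed = g_disarmed_value.load(std::memory_order_relaxed);
  if (guard != disarmed) {
    // Slow path: in this model we only count the call and disarm the guard.
    m.slow_path_calls++;
    m.guard.store(disarmed, std::memory_order_relaxed);
  }
}
```

Even on the fast path, every call pays for two loads, a fence and a branch, which is the cost being pointed out here.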

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From aph at openjdk.java.net  Tue Jun 29 07:43:07 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Tue, 29 Jun 2021 07:43:07 GMT
Subject: Integrated: 8261579: AArch64: Support for weaker memory ordering in
 Atomic
In-Reply-To: 
References: 
Message-ID: 

On Fri, 25 Jun 2021 14:02:46 GMT, Andrew Haley  wrote:

> At present the Atomic operations in HotSpot only support conservative (the very strongest) and relaxed (the weakest) memory ordering.
> We should add at least seq_cst for LSE. This patch also adds a release-only CAS, needed for Shenandoah.
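
For comparison, the orderings named above can be illustrated with standard C++ atomics. This sketch uses `std::atomic`, not HotSpot's `Atomic` class, and the wrapper names are hypothetical. Note that HotSpot's conservative ordering is stronger than `seq_cst` and has no direct `std::atomic` equivalent, so it is not shown.

```cpp
#include <atomic>
#include <cassert>

// Sequentially consistent compare-and-swap.
inline bool cas_seq_cst(std::atomic<int>& cell, int expected, int desired) {
  return cell.compare_exchange_strong(expected, desired,
                                      std::memory_order_seq_cst);
}

// Release-only compare-and-swap: earlier writes become visible to a thread
// that later acquires the new value; the failure case is only a relaxed load.
inline bool cas_release(std::atomic<int>& cell, int expected, int desired) {
  return cell.compare_exchange_strong(expected, desired,
                                      std::memory_order_release,
                                      std::memory_order_relaxed);
}

// Relaxed compare-and-swap: atomic, but imposes no ordering on other accesses.
inline bool cas_relaxed(std::atomic<int>& cell, int expected, int desired) {
  return cell.compare_exchange_strong(expected, desired,
                                      std::memory_order_relaxed);
}
```

All three variants are equally atomic on the cell itself; they differ only in how surrounding memory accesses may be reordered around them, which is where the weaker forms can be cheaper on AArch64.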

This pull request has now been integrated.

Changeset: a9771575
Author:    Andrew Haley 
URL:       https://git.openjdk.java.net/jdk/commit/a97715755d01b88ad9e4cf32f10ca5a3f2fda898
Stats:     117 lines in 6 files changed: 112 ins; 2 del; 3 mod

8261579: AArch64: Support for weaker memory ordering in Atomic

Reviewed-by: adinn, shade

-------------

PR: https://git.openjdk.java.net/jdk/pull/4597

From dholmes at openjdk.java.net  Tue Jun 29 07:45:07 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 07:45:07 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>
References: 
 <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>
Message-ID: <65S6fJoltY4UvUNCYWeIIouULgtn7OAgBMU12VocwGQ=.ba555918-d9bb-4058-9ade-358fa5a9f2d9@github.com>

On Tue, 29 Jun 2021 05:52:38 GMT, Leonid Mesnik  wrote:

>> The crash happens because nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K  runs to hit it. I verified the fix by running this test >100K on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added comment.

I see the comment update in the PR and email, but no response to my comment about dequeuing within the for-loop.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From lmesnik at openjdk.java.net  Tue Jun 29 07:51:05 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 07:51:05 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>
References: 
 <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>
Message-ID: 

On Tue, 29 Jun 2021 05:52:38 GMT, Leonid Mesnik  wrote:

>> The crash happens because nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K  runs to hit it. I verified the fix by running this test >100K on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added comment.

replied to comments

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From lmesnik at openjdk.java.net  Tue Jun 29 07:51:06 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 07:51:06 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 02:23:58 GMT, David Holmes  wrote:

>> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Added comment.
>
> src/hotspot/share/code/nmethod.cpp line 1611:
> 
>> 1609:       return;
>> 1610:     }
>> 1611:     mark_as_seen_on_stack();
> 
> Not obvious what this actually does in relation to the dequeuing problem.

updated comments

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From lmesnik at openjdk.java.net  Tue Jun 29 07:55:07 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 07:55:07 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 02:23:25 GMT, David Holmes  wrote:

>> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Added comment.
>
> src/hotspot/share/prims/jvmtiImpl.cpp line 968:
> 
>> 966:   for (QueueNode* node = _queue_head; node != NULL; node = node->next()) {
>> 967:     node->event().post_compiled_method_load_event(env);
>> 968:   }
> 
> Can't you dequeue() immediately after calling post_compiled_method_load_event()?

It seems that calling dequeue in the for-loop deletes the node that was just posted. It is possible to restructure the loop so that the dequeue happens in the same iteration; however, I don't think it is a good idea to mix iteration and deletion in the same loop. What is the reason for this change?
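
To illustrate the two loop shapes under discussion, here is a small self-contained sketch. `EventQueue`, `EventNode`, `post_all` and `post_and_drain` are hypothetical names and do not reflect the actual `JvmtiDeferredEventQueue` code; the events are plain integers.

```cpp
#include <cassert>
#include <vector>

struct EventNode { int event; EventNode* next; };

// Hypothetical singly linked FIFO queue.
class EventQueue {
  EventNode* _head = nullptr;
  EventNode* _tail = nullptr;
public:
  EventQueue() = default;
  EventQueue(const EventQueue&) = delete;
  EventQueue& operator=(const EventQueue&) = delete;
  ~EventQueue() { while (!is_empty()) dequeue(); }

  bool is_empty() const { return _head == nullptr; }

  void enqueue(int event) {
    EventNode* n = new EventNode{event, nullptr};
    if (_tail == nullptr) { _head = n; } else { _tail->next = n; }
    _tail = n;
  }

  // Removes the head node, frees it and returns its event.
  int dequeue() {
    assert(!is_empty());
    EventNode* n = _head;
    int event = n->event;
    _head = n->next;
    if (_head == nullptr) _tail = nullptr;
    delete n;
    return event;
  }

  // Shape 1: iterate and post; the queue is left untouched.
  template <typename Post>
  void post_all(Post post) const {
    for (EventNode* n = _head; n != nullptr; n = n->next) post(n->event);
  }

  // Shape 2: post and dequeue in the same step. There is no separate
  // iterator, so the loop never advances through a node that was deleted.
  template <typename Post>
  void post_and_drain(Post post) {
    while (!is_empty()) post(dequeue());
  }
};
```

Calling `dequeue()` inside the body of the first loop would delete the node that `n = n->next` reads next, which is the hazard of mixing iteration and deletion. The second shape avoids that by always working on the head.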

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From sgehwolf at openjdk.java.net  Tue Jun 29 08:25:06 2021
From: sgehwolf at openjdk.java.net (Severin Gehwolf)
Date: Tue, 29 Jun 2021 08:25:06 GMT
Subject: RFR: JDK-8266490: Extend the OSContainer API to support the pids
 controller of cgroups [v2]
In-Reply-To: 
References: 
 
Message-ID: <4TD_2jJOnOQ6-D2eCFdJzF3tQg_H-Vm6IrFcyX_xSIw=.028fbe3f-bc04-4b9c-8b35-a6a450a80f7f@github.com>

On Wed, 23 Jun 2021 13:37:59 GMT, Matthias Baesken  wrote:

>> Hello, please review this PR; it extends the OSContainer API to also support the pids controller of cgroups.
>> 
>> I noticed that, unlike the other controllers ("cpu", "cpuset", "cpuacct", "memory"), the pids controller might be missing (or not fully supported) on some older Linux distros (SLES 12.1, RHEL 7.1), so it was added as optional; see the code:
>> 
>> 
>>   if (!cg_infos[PIDS_IDX]._data_complete) {
>>     log_debug(os, container)("Optional cgroup v1 pids subsystem not found");
>>     // keep the other controller info, pids is optional
>>   }
>
> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Adjustments following Severins comments

This looks pretty good now. Looking forward to seeing container tests for this new code.

src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 559:

> 557:     return OSCONTAINER_ERROR;
> 558:   }
> 559:   // Unlimited memory in Cgroups V2 is the literal string 'max'

Please don't add version specific comments to version agnostic code. Suggestion: "Unlimited memory in cgroups is the literal string 'max' for some controllers, for example the pids controller."

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 268:

> 266:   char * pidsmax_str = pids_max_val();
> 267:   jlong pidsmax = limit_from_str(pidsmax_str);
> 268:   return pidsmax;

Do we need this local variable? Consider using `return limit_from_str(pidsmax_str);` instead.

src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 250:

> 248:   char * pidsmax_str = pids_max_val();
> 249:   jlong pidsmax = limit_from_str(pidsmax_str);
> 250:   return pidsmax;

Same here. Use `return limit_from_str(pidsmax_str);`
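
The two suggestions above (treat the literal string 'max' as "unlimited", and return the parsed value directly) can be illustrated with a small standalone sketch. This is not the HotSpot code: `limit_from_string`, `pids_max`, and the constants `kUnlimited` and `kError` are hypothetical stand-ins, and the error handling shown is an assumption rather than what `limit_from_str` actually does.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical sentinel values, chosen only for this sketch.
constexpr int64_t kUnlimited = -1;
constexpr int64_t kError = -2;

// Converts the textual content of a cgroup limit file to a number.
// The literal string "max" means "no limit" for controllers such as pids.
int64_t limit_from_string(const char* s) {
  if (s == nullptr) return kError;
  if (std::strcmp(s, "max") == 0) return kUnlimited;
  char* end = nullptr;
  errno = 0;
  const long long value = std::strtoll(s, &end, 10);
  assert(end != nullptr);  // strtoll always sets end when given a non-null pointer
  if (end == s || errno == ERANGE) return kError;  // no digits, or out of range
  return value;
}

// Returns the parsed limit directly; no intermediate local variable is needed.
int64_t pids_max(const char* raw_value) {
  return limit_from_string(raw_value);
}
```

A caller would pass the string read from the controller's limit file, for example `pids_max("max")` or `pids_max("4096")`.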

-------------

PR: https://git.openjdk.java.net/jdk/pull/4518

From luhenry at openjdk.java.net  Tue Jun 29 08:25:05 2021
From: luhenry at openjdk.java.net (Ludovic Henry)
Date: Tue, 29 Jun 2021 08:25:05 GMT
Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java
 stacks
In-Reply-To: 
References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com>
 
Message-ID: 

On Wed, 9 Jun 2021 19:04:54 GMT, Serguei Spitsyn  wrote:

>> When the signal sent for AsyncGetCallTrace or JFR landed on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, it could not detect the sender (caller) frame, for multiple reasons. This patch fixes these cases by adding CodeBlob-specific frame parsers, which are in the best position to know how a frame is set up.
>> 
>> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`.
>> 
>> # `Prof1`
>> 
>> public class Prof1 {
>> 
>>     public static void main(String[] args) {
>>         StringBuilder sb = new StringBuilder();
>>         for (int i = 0; i < 1000000; i++) {
>>             sb.append("ab");
>>             sb.delete(0, 1);
>>         }
>>         System.out.println(sb.length());
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>>         (t  0.5,s  0.2) Prof1::main
>>         (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>>         (t  0.0,s  0.0) java.lang.String::getBytes
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>>         (t  0.0,s  0.0) java.lang.StringBuilder::delete
>>         (t  0.2,s  0.0) java.lang.StringBuilder::append
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> 
>> - With `StubRoutinesBlob::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>>         (t 99.8,s  0.2) Prof1::main
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>>         (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>>         (t 98.8,s  0.0) java.lang.StringBuilder::append
>>         (t  0.9,s  0.0) java.lang.StringBuilder::delete
>> 
>> 
>> # `Prof2`
>> 
>> import java.util.function.Supplier;
>> 
>> public class Prof2 {
>> 
>>     public static void main(String[] args) {
>>         var rand = new java.util.Random(0);
>>         Supplier[] suppliers = {
>>                 () -> 0,
>>                 () -> 1,
>>                 () -> 2,
>>                 () -> 3,
>>         };
>> 
>>         long sum = 0;
>>         for (int i = 0; i >= 0; i++) {
>>             sum += (int)suppliers[i % suppliers.length].get();
>>         }
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>>         (t 39.2,s 35.2) Prof2::main
>>         (t  1.4,s  1.4) Prof2::lambda$main$3
>>         (t  1.0,s  1.0) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.Thread::exit
>>         (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>>         (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>>         (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>>         (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>> 
>> 
>> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 74.1,s 70.3) Prof2::main
>>         (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>>         (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>>         (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>>         (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>>         (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>>         (t  1.2,s  1.2) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$3
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>
> Hi Ludovic,
> Thank you for working on this fix in the AsyncGetCallTrace.
> Which JDK release do you intend to target?
> Just wanted to make sure you know the JDK 17 development cycle will be closed tomorrow for P4 bugs and enhancements. The repository will be forked and the RDP 1 phase started.
> I doubt the review of your fix will be completed by this time.
> So, please, keep in mind your PR will go to 18, not 17.
> Thanks,
> Serguei

@sspitsyn how could we take this change forward? Thank you

-------------

PR: https://git.openjdk.java.net/jdk/pull/4436

From github.com+25214855+casparcwang at openjdk.java.net  Tue Jun 29 08:37:41 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Tue, 29 Jun 2021 08:37:41 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v3]
In-Reply-To: 
References: 
Message-ID: 

> Many C1 and C2 JIT-compiled methods do not contain any oops, so the nmethod entry barrier can be skipped for them.
> 
> 1. C1 JIT code patches oops or Klass pointers into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction; if the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.
> 
> 2. Only the JIT code of core library methods does not contain any oops.
> 
> 3. Currently only ZGC is supported.

王超 has updated the pull request incrementally with two additional commits since the last revision:

 - Add support for aarch64
 - Change to nop version

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4610/files
  - new: https://git.openjdk.java.net/jdk/pull/4610/files/51871984..442475b1

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=01-02

  Stats: 179 lines in 6 files changed: 110 ins; 61 del; 8 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4610.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4610/head:pull/4610

PR: https://git.openjdk.java.net/jdk/pull/4610

From dnsimon at openjdk.java.net  Tue Jun 29 08:57:15 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Tue, 29 Jun 2021 08:57:15 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file
Message-ID: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>

When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.

For example:

> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
#  fatal error: thread 41219: Fatal error in JVMCI shared library
#
# JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
# Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
#
# The JVMCI shared library error data is saved as:
# /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

-------------

Commit messages:
 - capture libjvmci crash data to a file

Changes: https://git.openjdk.java.net/jdk/pull/4620/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269416
  Stats: 90 lines in 7 files changed: 87 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4620/head:pull/4620

PR: https://git.openjdk.java.net/jdk/pull/4620

From iwalulya at openjdk.java.net  Tue Jun 29 09:25:11 2021
From: iwalulya at openjdk.java.net (Ivan Walulya)
Date: Tue, 29 Jun 2021 09:25:11 GMT
Subject: RFR: 8269417: Minor clarification on NonblockingQueue utility [v3]
In-Reply-To: 
References: 
 
Message-ID: 

On Mon, 28 Jun 2021 22:38:43 GMT, Man Cao  wrote:

>> Hi,
>> 
>> Could you review this change mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
>> I also added an assertion to ensure and better understand why the Atomic::store() in append() is correct.
>> 
>> Stress tested with fastdebug build with:
>> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"
>
> Man Cao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge remote-tracking branch 'origin/master' into 8269417
>  - Clarify cmpxchg only fail due to try_pop.
>  - Clarify NonblockingQueue's comments and add an assertion.

lgtm!

-------------

Marked as reviewed by iwalulya (Committer).

PR: https://git.openjdk.java.net/jdk/pull/4600

From dholmes at openjdk.java.net  Tue Jun 29 10:33:04 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 10:33:04 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 29 Jun 2021 07:51:52 GMT, Leonid Mesnik  wrote:

>> src/hotspot/share/prims/jvmtiImpl.cpp line 968:
>> 
>>> 966:   for (QueueNode* node = _queue_head; node != NULL; node = node->next()) {
>>> 967:     node->event().post_compiled_method_load_event(env);
>>> 968:   }
>> 
>> Can't you dequeue() immediately after calling post_compiled_method_load_event()?
>
> It seems that calling dequeue() inside the for-loop deletes the node that was just posted. It is possible to restructure the loop so that the dequeue happens in the same iteration; however, I don't think it is a good idea to mix iteration and deletion in the same loop. What is the reason for this change?

Just to save iterating the queue twice. And that is what the original loop did - you just have to switch the order of processing and deleting.
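
The single-pass approach described above (process each node first, then delete it) can be sketched on a plain singly linked queue. This is only an illustration under stated assumptions: `DeferredQueue`, `Node`, `post_all`, and the string payload are hypothetical and do not reproduce `JvmtiDeferredEventQueue`; the `sink` vector stands in for posting an event.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical queue node; the real queue stores JVMTI deferred events.
struct Node {
  std::string event;
  Node* next;
};

class DeferredQueue {
 public:
  ~DeferredQueue() {
    while (head_ != nullptr) pop_front();
  }

  void enqueue(std::string event) {
    Node* node = new Node{std::move(event), nullptr};
    if (tail_ == nullptr) {
      head_ = tail_ = node;
    } else {
      tail_->next = node;
      tail_ = node;
    }
  }

  // Posts every pending event in one pass. Each node is processed first
  // and deleted second, so the loop never touches a freed node.
  void post_all(std::vector<std::string>& sink) {
    while (head_ != nullptr) {
      sink.push_back(head_->event);  // stand-in for posting the event
      pop_front();                   // dequeue only after posting
    }
  }

  bool empty() const { return head_ == nullptr; }

 private:
  void pop_front() {
    assert(head_ != nullptr);
    Node* node = head_;
    head_ = node->next;
    if (head_ == nullptr) tail_ = nullptr;
    delete node;
  }

  Node* head_ = nullptr;
  Node* tail_ = nullptr;
};
```

Because the loop always re-reads the head after each deletion, it does not keep an iterator to a node that has been freed, which is the hazard raised earlier in the thread.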

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From github.com+25214855+casparcwang at openjdk.java.net  Tue Jun 29 11:35:52 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Tue, 29 Jun 2021 11:35:52 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v4]
In-Reply-To: 
References: 
Message-ID: 

> Many C1 and C2 JIT-compiled methods do not contain any oops, so the nmethod entry barrier can be skipped for them.
> 
> 1. C1 JIT code patches oops or Klass pointers into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction; if the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.
> 
> 2. Only the JIT code of core library methods does not contain any oops.
> 
> 3. Currently only ZGC is supported.

王超 has updated the pull request incrementally with one additional commit since the last revision:

  Fix assert error

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4610/files
  - new: https://git.openjdk.java.net/jdk/pull/4610/files/442475b1..a7d7dbc0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4610&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4610.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4610/head:pull/4610

PR: https://git.openjdk.java.net/jdk/pull/4610

From kevinw at openjdk.java.net  Tue Jun 29 11:52:09 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Tue, 29 Jun 2021 11:52:09 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v3]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: <7d4eUfnPx_xpdF2AbHyp6Kuzzg7ppyZesyXMusS7lDk=.2c96c78a-d3af-4684-b5c6-b5393c6a1ced@github.com>

On Tue, 29 Jun 2021 03:42:32 GMT, Yi Yang  wrote:

>> From the user's perspective, with a decimal nid we can find the corresponding OS thread in top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases debugging, but would obviously break some existing jstack analysis tools.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use UINT64_FORMAT; change SA and test impl

Maybe we can't make everybody happy, but what we have now is a good improvement (for many/most).
I tested with the SA change: the test passes, and on Windows I manually ran jstack, jhsdb jstack, and jhsdb clhsdb.

> Will it be too heavy to add a platform-dependent implementation for this small function?...

I think so.
I think this looks good now.

I don't think it's a change we should backport, or at least not quickly, out of concern for people/tools who might be parsing this output.
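
For tools that parse this output, one way to tolerate both formats is to accept the nid in either hex or decimal. The following is a minimal sketch under that assumption; `parse_nid` is a hypothetical helper, not part of the JDK or of any existing jstack analysis tool, and it assumes the field appears as ` nid=` followed by the value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper for a tool that parses jstack thread lines.
// Extracts the nid field whether it is printed as hex ("nid=0x12e67")
// or as decimal ("nid=117707"). Returns false if no nid is found.
bool parse_nid(const std::string& line, uint64_t* out) {
  assert(out != nullptr);
  const std::string key = " nid=";
  // Use the last occurrence so that a thread name containing " nid="
  // is less likely to confuse the parser.
  const std::string::size_type pos = line.rfind(key);
  if (pos == std::string::npos) return false;
  const char* start = line.c_str() + pos + key.size();
  const bool is_hex = start[0] == '0' && (start[1] == 'x' || start[1] == 'X');
  char* end = nullptr;
  const unsigned long long value = std::strtoull(start, &end, is_hex ? 16 : 10);
  if (end == start) return false;  // no digits after "nid="
  *out = value;
  return true;
}
```

With the "before" line for "ParGC Thread#7" the helper yields 77415 (0x12e67), and with the "after" line for "G1 Conc#0" it yields 117707.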

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From kevinw at openjdk.java.net  Tue Jun 29 11:58:05 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Tue, 29 Jun 2021 11:58:05 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: 

On Mon, 28 Jun 2021 07:46:29 GMT, Yi Yang  wrote:

>> From the user's perspective, with a decimal nid we can find the corresponding OS thread in top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases debugging, but would obviously break some existing jstack analysis tools.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   restore cleanup code

src/hotspot/share/runtime/osThread.cpp line 41:

> 39: // Printing
> 40: void OSThread::print_on(outputStream *st) const {
> 41:   st->print("nid=%d ", thread_id());

Just update the (C) year above from 2019 to 2021.
JhsdbThreadInfoTest.java has it already in the latest revision.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From kevinw at openjdk.java.net  Tue Jun 29 11:58:04 2021
From: kevinw at openjdk.java.net (Kevin Walls)
Date: Tue, 29 Jun 2021 11:58:04 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v3]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
Message-ID: 

On Tue, 29 Jun 2021 03:42:32 GMT, Yi Yang  wrote:

>> From the user's perspective, with a decimal nid we can find the corresponding OS thread in top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases debugging, but would obviously break some existing jstack analysis tools.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use UINT64_FORMAT; change SA and test impl

Marked as reviewed by kevinw (Committer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From eosterlund at openjdk.java.net  Tue Jun 29 13:19:04 2021
From: eosterlund at openjdk.java.net (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Tue, 29 Jun 2021 13:19:04 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v2]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 29 Jun 2021 07:36:04 GMT, Andrew Haley  wrote:

> On 6/29/21 8:03 AM, Erik Österlund wrote:
> > 1) When this was introduced, we did not see the overhead of nmethod entry barriers in performance profiles. Did you see any improvement with the patch?
> 
> This is a LoadLoad fence at the start of every method followed by a load
> with a dependent branch. If there wasn't a significant hit in the profiles
> I wouldn't believe the profiles. Even adding code that doesn't appear to slow
> things down uses additional speculation resources.

My main concern is that I am not convinced that eliding nmethod entry barriers is sound. It really isn't all about protecting oops found in the machine code as I explained previously. There is a lot more to it. Therefore I would like to at least have some convincing number showing that such an optimization effort is worthwhile.

If you are looking at ways of getting rid of the LoadLoad, I have a scheme that can elide it if you are interested. I discussed it with Stuart when he was prototyping how to build these barriers, but I think he concluded we didn't need such optimizations in the end. It came with some more complexity. That's why I am curious whether you have numbers suggesting the opposite. If so, I have a better idea that targets the LoadLoad in particular, without violating our invariants and while still being correct.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From aph at redhat.com  Tue Jun 29 14:51:22 2021
From: aph at redhat.com (Andrew Haley)
Date: Tue, 29 Jun 2021 15:51:22 +0100
Subject: RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals
In-Reply-To: 
References: 
Message-ID: <7f980cea-4608-e455-0dba-b2f7c4100ce3@redhat.com>

I had to make some changes to the benchmark to get accurate timing, because
it is swamped by JMH overhead for very small strings.

It should be clear from my patch what I did. The most important part is
to run the test code in a loop, or you won't see small effects. We're
trying to measure something that only takes a few nanoseconds.

This is what I see, Apple M1, two equal strings:

Old:

Benchmark           (size)  Mode  Cnt   Score   Error  Units
StringEquals.equal       8  avgt    5   0.948 ± 0.001  us/op
StringEquals.equal      11  avgt    5   0.948 ± 0.004  us/op
StringEquals.equal      16  avgt    5   0.948 ± 0.001  us/op
StringEquals.equal      22  avgt    5   1.260 ± 0.002  us/op
StringEquals.equal      32  avgt    5   1.886 ± 0.001  us/op
StringEquals.equal      45  avgt    5   2.514 ± 0.001  us/op
StringEquals.equal      64  avgt    5   3.141 ± 0.003  us/op
StringEquals.equal      91  avgt    5   4.395 ± 0.002  us/op
StringEquals.equal     121  avgt    5   5.653 ± 0.014  us/op
StringEquals.equal     181  avgt    5   8.011 ± 0.010  us/op
StringEquals.equal     256  avgt    5  11.433 ± 0.014  us/op
StringEquals.equal     512  avgt    5  23.005 ± 0.124  us/op
StringEquals.equal    1024  avgt    5  49.185 ± 0.032  us/op

Your patch:

Benchmark           (size)  Mode  Cnt   Score   Error  Units
StringEquals.equal       8  avgt    5   1.574 ± 0.001  us/op
StringEquals.equal      11  avgt    5   1.734 ± 0.004  us/op
StringEquals.equal      16  avgt    5   1.888 ± 0.002  us/op
StringEquals.equal      22  avgt    5   1.892 ± 0.003  us/op
StringEquals.equal      32  avgt    5   2.517 ± 0.003  us/op
StringEquals.equal      45  avgt    5   2.988 ± 0.002  us/op
StringEquals.equal      64  avgt    5   2.517 ± 0.003  us/op
StringEquals.equal      91  avgt    5   8.659 ± 0.007  us/op
StringEquals.equal     121  avgt    5   5.649 ± 0.007  us/op
StringEquals.equal     181  avgt    5   6.050 ± 0.009  us/op
StringEquals.equal     256  avgt    5   7.088 ± 0.016  us/op
StringEquals.equal     512  avgt    5  14.163 ± 0.018  us/op
StringEquals.equal    1024  avgt    5  29.998 ± 0.052  us/op


As you can see, we're looking at regressions all the way up to size=45,
with something very odd happening at size=91. Finally the vectorized
code starts to pull ahead at size=181.

A few things:

You should never be executing the TAIL unless the string is really
short. Just do one pair of unaligned loads at the end to finish.
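
The idea of finishing with one pair of unaligned loads, instead of a byte-by-byte tail, can be sketched in portable C++. This is an illustration of the general technique only, not the AArch64 stub from either patch: `equal_bytes` and `load64` are hypothetical names, and `memcpy` stands in for an unaligned 8-byte load instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Portable unaligned 8-byte load; compilers typically lower this to a
// single load instruction on targets that allow unaligned access.
static inline uint64_t load64(const char* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Compares n bytes of a and b for equality.
bool equal_bytes(const char* a, const char* b, std::size_t n) {
  assert(a != nullptr && b != nullptr);
  if (n < 8) {
    // Only really short inputs take the byte-wise path.
    for (std::size_t i = 0; i < n; ++i) {
      if (a[i] != b[i]) return false;
    }
    return true;
  }
  // Main loop: compare 8 bytes at a time.
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    if (load64(a + i) != load64(b + i)) return false;
  }
  // Finish with one pair of loads covering the last 8 bytes. They may
  // overlap bytes that were already compared, which is harmless.
  return load64(a + n - 8) == load64(b + n - 8);
}
```

The overlapping final load re-checks up to seven bytes that the main loop already covered, trading a little redundant work for a branch-free tail.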

Please don't use aliases for rscratch1 and rscratch2. Calling them tmp1
and tmp2 doesn't help the reader.

So: please make sure the smaller strings are at least as good as
they are now. Remember strings are usually short, so we can tolerate
no regressions with the smaller sizes.

I don't think that Neon does any good here. This is what I get by rewriting
(just) the stub with scalar registers, in the attached patch:

Benchmark           (size)  Mode  Cnt   Score   Error  Units
StringEquals.equal       8  avgt    5   1.574 ± 0.004  us/op
StringEquals.equal      11  avgt    5   1.734 ± 0.003  us/op
StringEquals.equal      16  avgt    5   1.888 ± 0.002  us/op
StringEquals.equal      22  avgt    5   1.891 ± 0.003  us/op
StringEquals.equal      32  avgt    5   2.517 ± 0.001  us/op
StringEquals.equal      45  avgt    5   2.988 ± 0.002  us/op
StringEquals.equal      64  avgt    5   2.595 ± 0.004  us/op
StringEquals.equal      91  avgt    5   4.083 ± 0.006  us/op
StringEquals.equal     121  avgt    5   5.432 ± 0.006  us/op
StringEquals.equal     181  avgt    5   6.292 ± 0.009  us/op
StringEquals.equal     256  avgt    5   7.232 ± 0.008  us/op
StringEquals.equal     512  avgt    5  13.304 ± 0.012  us/op
StringEquals.equal    1024  avgt    5  25.537 ± 0.012  us/op

I use an editor with automatic indentation, as do many people, so
I inserted brackets in the right places in the assembly code.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8268229.patch
Type: text/x-patch
Size: 12464 bytes
Desc: not available
URL: 

From kvn at openjdk.java.net  Tue Jun 29 16:05:01 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 29 Jun 2021 16:05:01 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: 

On Mon, 28 Jun 2021 22:58:04 GMT, Doug Simon  wrote:

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Someone from the Runtime group may have good suggestions for these changes.
I think you need to honor the flags used for error reporting (or create JVMCI-specific ones):

SuppressFatalErrorMessage
ShowMessageBoxOnError
UseOSErrorReporting
ErrorFileToStdout
ErrorFileToStderr

src/hotspot/share/utilities/vmError.cpp line 1592:

> 1590: #if INCLUDE_JVMCI
> 1591:   if (JVMCI::fatal_log_filename() != NULL) {
> 1592:     out.print_raw("#\n# The JVMCI shared library error data is saved as:\n# ");

I prefer `report file` instead of `data`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4620

From lmesnik at openjdk.java.net  Tue Jun 29 16:38:29 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 16:38:29 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v2]
In-Reply-To: <65S6fJoltY4UvUNCYWeIIouULgtn7OAgBMU12VocwGQ=.ba555918-d9bb-4058-9ade-358fa5a9f2d9@github.com>
References: 
 <6J6aOiqWbHZapyork2iniBsbRjxhxNhsLfiK01kBPlU=.8d7b894a-d848-43e0-a5b7-21e6b0f476c6@github.com>
 <65S6fJoltY4UvUNCYWeIIouULgtn7OAgBMU12VocwGQ=.ba555918-d9bb-4058-9ade-358fa5a9f2d9@github.com>
Message-ID: 

On Tue, 29 Jun 2021 07:42:17 GMT, David Holmes  wrote:

>> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Added comment.
>
> I see the comment update in the PR and email, but no response to my comment about dequeuing within the for-loop.

@dholmes-ora I've updated the loop in post(). It seems your comments were removed because I removed the first loop.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From lmesnik at openjdk.java.net  Tue Jun 29 16:38:28 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Tue, 29 Jun 2021 16:38:28 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v3]
In-Reply-To: 
References: 
Message-ID: 

> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:

  post updated.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4602/files
  - new: https://git.openjdk.java.net/jdk/pull/4602/files/9ea2cb9c..2e222f03

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4602&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4602&range=01-02

  Stats: 4 lines in 1 file changed: 1 ins; 3 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4602.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4602/head:pull/4602

PR: https://git.openjdk.java.net/jdk/pull/4602

From manc at openjdk.java.net  Tue Jun 29 17:07:08 2021
From: manc at openjdk.java.net (Man Cao)
Date: Tue, 29 Jun 2021 17:07:08 GMT
Subject: Integrated: 8269417: Minor clarification on NonblockingQueue utility
In-Reply-To: 
References: 
Message-ID: 

On Sat, 26 Jun 2021 02:29:20 GMT, Man Cao  wrote:

> Hi,
> 
> Could you review this change mainly based on the comments in https://github.com/openjdk/jdk/pull/4379?
> I also added an assertion to ensure and better understand why the Atomic::store() in append() is correct.
> 
> Stress tested with fastdebug build with:
> $ make test TEST="gtest:NonblockingQueueTestBasics gtest:NonblockingQueueTest" GTEST="REPEAT=-1"

This pull request has now been integrated.

Changeset: bb42d751
Author:    Man Cao 
URL:       https://git.openjdk.java.net/jdk/commit/bb42d75161cdf5d9ef2b1b227000df5165ab1198
Stats:     25 lines in 2 files changed: 15 ins; 5 del; 5 mod

8269417: Minor clarification on NonblockingQueue utility

Reviewed-by: kbarrett, iwalulya

-------------

PR: https://git.openjdk.java.net/jdk/pull/4600

From duke at openjdk.java.net  Tue Jun 29 17:53:09 2021
From: duke at openjdk.java.net (duke)
Date: Tue, 29 Jun 2021 17:53:09 GMT
Subject: Withdrawn: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen
 expansion
In-Reply-To: 
References: 
Message-ID: 

On Fri, 12 Mar 2021 19:56:23 GMT, Amit Pawar  wrote:

> In ParallelGC, oldgen expansion can happen during promotion. The expanding thread touches the pages itself and cannot request task execution, because this GC thread is already executing a task. The expanding thread holds "ExpandHeap_lock" while it resizes the oldgen, and other threads may wait for their turn. This is a blocking call.
> 
> This patch changes this behavior by adding another constructor to the "MutexLocker" class to enable a non-blocking (try_lock) operation. This way one thread acquires the lock and the other threads can join the pretouch work. Threads that fail to acquire the lock join the pretouch only once the task has been marked ready by the expanding thread.
> 
> The following minimum expansion sizes are seen during expansion:
> 1. 512KB without largepages and without UseNUMA.
> 2. 64MB without largepages and with UseNUMA,
> 3. 2MB (on x86)  with large pages and without UseNUMA,
> 4. 64MB without large pages and with UseNUMA.
> 
> When oldgen expands repeatedly in small increments, this change won't help. For such cases, the resize amount should adapt to application demand in order to benefit from this change. For example, if the application triggers 100 small expansions in the same GC, it is better to increase the expansion size on each resize to reduce the number of resizes. If this patch is accepted, I plan to address this case in another patch.
> 
> All jtreg tests passed.
> 
> Please review this change.

This pull request has been closed without being integrated.
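
The try_lock scheme described in the quoted text can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the withdrawn patch: `CooperativeExpander`, `expand_or_help`, and the byte-per-page model are hypothetical, and `std::mutex` with `std::try_to_lock` stands in for the proposed non-blocking `MutexLocker` constructor. One caller wins the lock, publishes the pretouch task, and works on it; callers that fail to get the lock help only once the task has been marked ready.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <vector>

class CooperativeExpander {
  std::mutex _expand_lock;                 // stands in for ExpandHeap_lock
  std::atomic<bool> _task_ready{false};    // set once the pretouch task is published
  std::atomic<std::size_t> _next_page{0};  // next unclaimed page index
  std::vector<char> _pages;                // one byte per simulated page
  bool _expanded = false;                  // guarded by _expand_lock

  // Claim pages one at a time; each page is written by exactly one thread.
  void pretouch() {
    for (;;) {
      std::size_t i = _next_page.fetch_add(1);
      if (i >= _pages.size()) return;
      _pages[i] = 1;
    }
  }

 public:
  // Returns true if the caller performed the expansion itself.
  bool expand_or_help(std::size_t page_count) {
    std::unique_lock<std::mutex> lock(_expand_lock, std::try_to_lock);
    if (lock.owns_lock()) {
      if (_expanded) return false;         // an earlier caller already expanded
      _pages.assign(page_count, 0);
      _task_ready.store(true, std::memory_order_release);
      pretouch();
      _expanded = true;
      return true;
    }
    // The lock is held by the expanding thread: help only once the task is published.
    if (_task_ready.load(std::memory_order_acquire)) {
      pretouch();
    }
    return false;
  }

  // Intended to be called after all participating threads have finished.
  std::size_t touched_pages() const {
    std::size_t n = 0;
    for (char c : _pages) n += (c != 0);
    return n;
  }
};
```

In this sketch a thread that fails the try_lock before the task is published simply returns, which matches the "join pretouch only when task is marked ready" wording above.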

-------------

PR: https://git.openjdk.java.net/jdk/pull/2976

From coleenp at openjdk.java.net  Tue Jun 29 18:02:06 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 18:02:06 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v3]
In-Reply-To: 
References: 
 
Message-ID: <9Wb7MLLfo_Rud35xgv0mnJC0XdomfmpYJfwyUaWTV5o=.eac10b02-e8d1-4931-abf6-a57c0ef5a35c@github.com>

On Tue, 29 Jun 2021 16:38:28 GMT, Leonid Mesnik  wrote:

>> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java. However, it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. Also, I verified that protecting in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   post updated.

Looks good!

src/hotspot/share/prims/jvmtiImpl.cpp line 968:

> 966:   while (_queue_head != NULL) {
> 967:     _queue_head->event().post_compiled_method_load_event(env);
> 968:     dequeue();

Good find!  So we _can_ zombie the nmethod after we take it off the list.  Makes a lot of sense.  Thank you for your perseverance in tracking this down!
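
A minimal sketch of the ordering in the hunk above, not the HotSpot code: each event is posted while its node is still the head of the queue, and only then is the node dequeued. `Event`, `DeferredQueue`, and `post_all` are hypothetical stand-ins for `JvmtiDeferredEvent` and `JvmtiDeferredEventQueue`. The sketch assumes, following the discussion in this thread, that an nmethod is safe from becoming a zombie only while its event is still linked into the queue.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a deferred compiled-method-load event.
struct Event {
  std::string method_name;
};

// Hypothetical singly linked queue; a node counts as "protected"
// only while it is still linked into the queue.
class DeferredQueue {
  struct Node {
    Event event;
    Node* next;
  };
  Node* _head = nullptr;
  Node* _tail = nullptr;

 public:
  DeferredQueue() = default;
  DeferredQueue(const DeferredQueue&) = delete;
  DeferredQueue& operator=(const DeferredQueue&) = delete;
  ~DeferredQueue() {
    while (_head != nullptr) dequeue();
  }

  void enqueue(const Event& e) {
    Node* n = new Node{e, nullptr};
    if (_tail == nullptr) {
      _head = _tail = n;
    } else {
      _tail->next = n;
      _tail = n;
    }
  }

  void dequeue() {
    assert(_head != nullptr);
    Node* n = _head;
    _head = n->next;
    if (_head == nullptr) _tail = nullptr;
    delete n;
  }

  bool contains(const std::string& name) const {
    for (Node* n = _head; n != nullptr; n = n->next) {
      if (n->event.method_name == name) return true;
    }
    return false;
  }

  // Post every event while its node is still the queue head, then dequeue.
  // The callback receives the event and whether it is still queued.
  template <typename PostFn>
  void post_all(PostFn post) {
    while (_head != nullptr) {
      post(_head->event, contains(_head->event.method_name));
      dequeue();
    }
  }
};

// Returns true if every event was still queued at the time it was posted.
inline bool all_posted_while_queued() {
  DeferredQueue q;
  q.enqueue({"A::foo"});
  q.enqueue({"B::bar"});
  bool ok = true;
  int posted = 0;
  q.post_all([&](const Event&, bool still_queued) {
    ok = ok && still_queued;
    ++posted;
  });
  return ok && posted == 2;
}
```

Dequeuing first and posting afterwards would hand the callback an event whose node is no longer on the list, which is the window discussed in this review.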

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4602

From coleenp at openjdk.java.net  Tue Jun 29 18:02:07 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 18:02:07 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v3]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 29 Jun 2021 07:46:39 GMT, Leonid Mesnik  wrote:

>> src/hotspot/share/code/nmethod.cpp line 1611:
>> 
>>> 1609:       return;
>>> 1610:     }
>>> 1611:     mark_as_seen_on_stack();
>> 
>> Not obvious what this actually does in relation to the dequeuing problem.
>
> updated comments

I'm not 100% sure this is needed because we don't handshake this thread and we just checked that it can't be converted to zombie, but I don't think this hurts anything and I thought it might be needed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From coleenp at openjdk.java.net  Tue Jun 29 20:19:03 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 20:19:03 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v3]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 03:45:41 GMT, Ioi Lam  wrote:

>> In HotSpot we have (at least) two hashtable designs in the C++ code:
>> 
>> - share/utilities/hashtable.hpp
>> - share/utilities/resourceHash.hpp
>> 
>> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
>> 
>> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
>> 
>> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
>> 
>> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
>> 
>> *before*
>> ResourceHashtable: 2.70 sec
>> 
>> *after*
>> ResourceHashtable: 2.72 sec
>> ResizableResourceHashtable: 5.29 sec
>> 
>> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @kimbarrett feedback to move the storage code to a base class

The cross product of Resizeable/Fixed x CHeap/ResourceAllocated is something that makes this hard to get my head around.  Would you have a resizeable ResourceAllocated hashtable?
Something doesn't seem right.
We have ResourceHashtables with fixed sizes in the code, and there are a couple of cases that use the CHeap allocation, and they need resizing.  Maybe only have two choices to make things easier?  Lots of indirection makes this really challenging to see if it's correct.
It could be that you want the hashtable to be resource allocated but the elements are CHeap allocated.  Which table is that?  We had this discussion with GrowableArray and I believe we settled on that if the GrowableArray is CHeap allocated, all the elements are also.

src/hotspot/share/utilities/resizeableResourceHash.hpp line 127:

> 125:     }
> 126: 
> 127:     FREE_C_HEAP_ARRAY(Node*, old_table);

If you resource allocated old_table, this doesn't seem right.
It seems like the ResizeableHashtable should be a CHeapHashtable and it's always resizeable.
And the code either has a fixed size ResourceHashtable or a variable/resizeable CHeapHashtable.

src/hotspot/share/utilities/resourceHash.hpp line 246:

> 244:     K, V, HASH, EQUALS, ALLOC_TYPE, MEM_TYPE> {
> 245: public:
> 246:   ResourceHashtable() : ResourceHashtableBase, K, V, HASH, EQUALS, ALLOC_TYPE, MEM_TYPE>() {}

nit: can you break up this line after the V> ?

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From iklam at openjdk.java.net  Tue Jun 29 20:45:05 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 29 Jun 2021 20:45:05 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v3]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 29 Jun 2021 20:15:53 GMT, Coleen Phillimore  wrote:

> The cross product of Resizeable/Fixed x CHeap/ResourceAllocated is something that makes this hard to get my head around. Would you have a resizeable ResourceAllocated hashtable?
> Something doesn't seem right.

The choice of Resource vs CHeap depends on whether the table is accessed in a local scope. If you have a table that's used only locally, but it could (sometimes) store lots of elements, it may make sense to make it ResourceAllocated + Resizable.

> We have ResourceHashtables with fixed sizes in the code, and there are a couple of cases that use the CHeap allocation, and they need resizing. Maybe only have two choices to make things easier? Lots of indirection makes this really challenging to see if it's correct.
> It could be that you want the hashtable to be resource allocated but the elements are CHeap allocated. Which table is that? We had this discussion with GrowableArray and I believe we settled on that if the GrowableArray is CHeap allocated, all the elements are also.

Currently, the table and its elements are allocated by the same allocator. If `_table` is resource-allocated, then all the `Nodes` are also resource-allocated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From coleenp at openjdk.java.net  Tue Jun 29 21:04:40 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 21:04:40 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Tue, 29 Jun 2021 03:48:14 GMT, Ioi Lam  wrote:

>> src/hotspot/share/utilities/resourceHash.hpp line 115:
>> 
>>> 113:   }
>>> 114: 
>>> 115:   unsigned size() const { return static_cast(this)->size_impl(); }
>> 
>> I think size() should return the number of entries. The number of buckets should use a different name (assuming it needs to be publically accessible).
>
> In the latest version, I am following the same naming convention in hashtable.hpp:
> - table_size() = number of buckets
> - number_of_entries() = number of entries

Yes, please.
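
The naming convention and the non-virtual dispatch discussed above can be illustrated with a small CRTP sketch. It is an illustration under assumptions, not the code in the PR: `TableBase`, `FixedTable`, `ResizableTable`, and `table_size_impl` are hypothetical names, and the table only counts entries instead of storing nodes.

```cpp
// Base class parameterized on the derived type (CRTP): table_size() is
// resolved at compile time, so no virtual call is needed.
template <typename Derived>
class TableBase {
 public:
  // Number of buckets.
  unsigned table_size() const {
    return static_cast<const Derived*>(this)->table_size_impl();
  }

  // Number of stored entries.
  unsigned number_of_entries() const { return _entries; }

  // Bucket index for a hash value.
  unsigned index_for(unsigned hash) const { return hash % table_size(); }

  // Stand-in for a real insert: only the entry count is tracked here.
  void note_insert() { ++_entries; }

 private:
  unsigned _entries = 0;
};

// Bucket count is a compile-time constant, so the compiler can optimize
// `hash % SIZE` after inlining.
template <unsigned SIZE>
class FixedTable : public TableBase<FixedTable<SIZE>> {
 public:
  unsigned table_size_impl() const { return SIZE; }
};

// Bucket count is chosen at run time and may change.
class ResizableTable : public TableBase<ResizableTable> {
 public:
  explicit ResizableTable(unsigned size) : _size(size) {}
  unsigned table_size_impl() const { return _size; }
  void resize(unsigned new_size) { _size = new_size; }

 private:
  unsigned _size;
};
```

With this layout `table_size()` reports buckets and `number_of_entries()` reports stored elements, so the two can no longer be confused under a single `size()` name.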

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From iklam at openjdk.java.net  Tue Jun 29 21:04:37 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 29 Jun 2021 21:04:37 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v4]
In-Reply-To: 
References: 
Message-ID: <29ADEYUJ_QfBxtes5NSY8CwtQnw06eXQWJMAP0MdJ60=.cbdab722-ac5c-40c7-a481-dbad7bed54cd@github.com>

> In HotSpot we have (at least) two hashtable designs in the C++ code:
> 
> - share/utilities/hashtable.hpp
> - share/utilities/resourceHash.hpp
> 
> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
> 
> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
> 
> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
> 
> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
> 
> *before*
> ResourceHashtable: 2.70 sec
> 
> *after*
> ResourceHashtable: 2.72 sec
> ResizableResourceHashtable: 5.29 sec
> 
> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`

Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  @coleenp comments

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4536/files
  - new: https://git.openjdk.java.net/jdk/pull/4536/files/e5f9c16f..543de2b7

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4536&range=02-03

  Stats: 7 lines in 2 files changed: 4 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4536.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4536/head:pull/4536

PR: https://git.openjdk.java.net/jdk/pull/4536

From iklam at openjdk.java.net  Tue Jun 29 21:12:05 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 29 Jun 2021 21:12:05 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v3]
In-Reply-To: 
References: 
 
 
Message-ID: 

On Tue, 29 Jun 2021 20:11:23 GMT, Coleen Phillimore  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   @kimbarrett feedback to move the storage code to a base class
>
> src/hotspot/share/utilities/resizeableResourceHash.hpp line 127:
> 
>> 125:     }
>> 126: 
>> 127:     FREE_C_HEAP_ARRAY(Node*, old_table);
> 
> If you resource allocated old_table, this doesn't seem right.
> It seems like the ResizeableHashtable should be a CHeapHashtable and it's always resizeable.
> And the code either has a fixed size ResourceHashtable or a variable/resizeable CHeapHashtable.

Fixed. The `FREE_C_HEAP_ARRAY` should be called only if the table is c-heap allocated.
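
A minimal sketch of that rule, assuming a hypothetical `AllocType` enum and a toy `Arena` in place of HotSpot's resource area (neither is the real HotSpot API): when a resize replaces the bucket array, the old array is freed explicitly only in the C-heap case, because arena memory is released in bulk when the arena goes away.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

enum class AllocType { RESOURCE_AREA, C_HEAP };

// Toy arena: individual blocks are never freed, only the arena as a whole.
class Arena {
  std::vector<void*> _blocks;

 public:
  Arena() = default;
  Arena(const Arena&) = delete;
  Arena& operator=(const Arena&) = delete;
  ~Arena() {
    for (void* p : _blocks) std::free(p);
  }
  void* allocate(std::size_t bytes) {
    void* p = std::calloc(1, bytes);
    _blocks.push_back(p);
    return p;
  }
};

struct Node;  // bucket chains are omitted in this sketch

class BucketArray {
  Node** _table;
  unsigned _size;
  AllocType _alloc_type;
  Arena* _arena;                 // used only for RESOURCE_AREA
  unsigned _explicit_frees = 0;  // bookkeeping for the example

  Node** allocate_table(unsigned size) {
    std::size_t bytes = size * sizeof(Node*);
    if (_alloc_type == AllocType::C_HEAP) {
      return static_cast<Node**>(std::calloc(1, bytes));
    }
    return static_cast<Node**>(_arena->allocate(bytes));
  }

  // Free explicitly only when the table lives on the C heap.
  void release_table(Node** table) {
    if (_alloc_type == AllocType::C_HEAP) {
      std::free(table);
      ++_explicit_frees;
    }
  }

 public:
  BucketArray(unsigned size, AllocType type, Arena* arena)
      : _table(nullptr), _size(size), _alloc_type(type), _arena(arena) {
    _table = allocate_table(size);
  }
  BucketArray(const BucketArray&) = delete;
  BucketArray& operator=(const BucketArray&) = delete;
  ~BucketArray() { release_table(_table); }

  void resize(unsigned new_size) {
    Node** old_table = _table;
    _table = allocate_table(new_size);
    _size = new_size;
    // Rehashing of nodes from old_table into _table is omitted here.
    release_table(old_table);
  }

  unsigned table_size() const { return _size; }
  unsigned explicit_frees() const { return _explicit_frees; }
};
```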

Per our off-line conversation, let's leave the 4 combinations of {res, c-heap} x {fixed, resizeable}, and live with the bad naming for now (even `ResourceHashtable` is misleading since it can be C-heap allocated). We should try to start getting rid of the old `Hashtable` classes. Eventually, we can clean up the naming with something like:

- ResizeableResourceHashtable -> Hashtable
- ResourceHashtable -> FixedHashtable

> src/hotspot/share/utilities/resourceHash.hpp line 246:
> 
>> 244:     K, V, HASH, EQUALS, ALLOC_TYPE, MEM_TYPE> {
>> 245: public:
>> 246:   ResourceHashtable() : ResourceHashtableBase, K, V, HASH, EQUALS, ALLOC_TYPE, MEM_TYPE>() {}
> 
> nit: can you break up this line after the V> ?

Fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From dnsimon at openjdk.java.net  Tue Jun 29 21:14:03 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Tue, 29 Jun 2021 21:14:03 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
Message-ID: 

On Tue, 29 Jun 2021 16:02:27 GMT, Vladimir Kozlov  wrote:

> Someone from the Runtime group may have good suggestions for these changes.
> I think you need to follow the flags used for error reporting (or create JVMCI-specific flags):
> 
> SuppressFatalErrorMessage

I don't think extra handling is needed for that flag. It cuts off all paths that lead to libjvmci error handling.

> ErrorFileToStdout
> ErrorFileToStderr

I will push a change to respect these flags in `JVMCI::fatal_log`.
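
As a rough sketch of what respecting these two flags could look like (the actual change is in the PR; the signature of `JVMCI::fatal_log` is not shown in this thread): `ErrorFlags`, `LogDestination`, `choose_log_destination`, and `libjvmci_log_name` are hypothetical names, and the precedence of stdout over stderr when both flags are set is an assumption.

```cpp
#include <string>

// Hypothetical mirror of the two HotSpot flags; in the VM these are globals.
struct ErrorFlags {
  bool error_file_to_stdout = false;
  bool error_file_to_stderr = false;
};

enum class LogDestination { Stdout, Stderr, File };

// Assumed precedence: stdout wins over stderr, and a file is the default.
inline LogDestination choose_log_destination(const ErrorFlags& flags) {
  if (flags.error_file_to_stdout) return LogDestination::Stdout;
  if (flags.error_file_to_stderr) return LogDestination::Stderr;
  return LogDestination::File;
}

// File name pattern taken from the example crash summary in this thread.
inline std::string libjvmci_log_name(int pid) {
  return "hs_err_pid" + std::to_string(pid) + "_libjvmci.log";
}
```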

> ShowMessageBoxOnError
> UseOSErrorReporting

It's not clear to me that I need to worry about these - advice welcome.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4620

From coleenp at openjdk.java.net  Tue Jun 29 21:37:04 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 21:37:04 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v4]
In-Reply-To: <29ADEYUJ_QfBxtes5NSY8CwtQnw06eXQWJMAP0MdJ60=.cbdab722-ac5c-40c7-a481-dbad7bed54cd@github.com>
References: 
 <29ADEYUJ_QfBxtes5NSY8CwtQnw06eXQWJMAP0MdJ60=.cbdab722-ac5c-40c7-a481-dbad7bed54cd@github.com>
Message-ID: 

On Tue, 29 Jun 2021 21:04:37 GMT, Ioi Lam  wrote:

> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @coleenp comments

Marked as reviewed by coleenp (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From coleenp at openjdk.java.net  Tue Jun 29 21:37:05 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 29 Jun 2021 21:37:05 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v3]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Tue, 29 Jun 2021 21:08:34 GMT, Ioi Lam  wrote:

>> src/hotspot/share/utilities/resizeableResourceHash.hpp line 127:
>> 
>>> 125:     }
>>> 126: 
>>> 127:     FREE_C_HEAP_ARRAY(Node*, old_table);
>> 
>> If you resource allocated old_table, this doesn't seem right.
>> It seems like the ResizeableHashtable should be a CHeapHashtable and it's always resizeable.
>> And the code either has a fixed size ResourceHashtable or a variable/resizeable CHeapHashtable.
>
> Fixed. The `FREE_C_HEAP_ARRAY` should be called only if the table is c-heap allocated.
> 
> Per our off-line conversation, let's leave the 4 combinations of {res, c-heap} x {fixed, resizeable}, and live with the bad naming for now (even `ResourceHashtable` is misleading since it can be C-heap allocated). We should try to start getting rid of the old `Hashtable` classes. Eventually, we can clean up the naming with something like:
> 
> - ResizeableResourceHashtable -> Hashtable
> - ResourceHashtable -> FixedHashtable

Ok, thank you for fixing this.  Yes, these are better names.  We can fix all the uses when we agree on some convention and when we replace the utilities/hashtable hashtable and KVHashtable.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From dnsimon at openjdk.java.net  Tue Jun 29 21:43:32 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Tue, 29 Jun 2021 21:43:32 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v2]
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: <6EJyr_Jao1QuTa5NE-Ub_TLBezK82p1ucUzH3DjfLOs=.803cfe50-f086-4355-b062-72763f8d935d@github.com>

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  respect ErrorFileToStdout and ErrorFileToStderr flags

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4620/files
  - new: https://git.openjdk.java.net/jdk/pull/4620/files/1720090f..fbeaedef

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=00-01

  Stats: 16 lines in 2 files changed: 7 ins; 0 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4620/head:pull/4620

PR: https://git.openjdk.java.net/jdk/pull/4620

From dnsimon at openjdk.java.net  Tue Jun 29 21:54:30 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Tue, 29 Jun 2021 21:54:30 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: 

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  respect ErrorFileToStdout and ErrorFileToStderr flags

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4620/files
  - new: https://git.openjdk.java.net/jdk/pull/4620/files/fbeaedef..ba004ef0

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=01-02

  Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4620/head:pull/4620

PR: https://git.openjdk.java.net/jdk/pull/4620

From kvn at openjdk.java.net  Tue Jun 29 22:18:02 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Tue, 29 Jun 2021 22:18:02 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
Message-ID: 

On Tue, 29 Jun 2021 21:54:30 GMT, Doug Simon  wrote:

> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

Okay, this is fine.

You still need a review from the Runtime group. I will ping them.

-------------

Marked as reviewed by kvn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4620

From dholmes at openjdk.java.net  Tue Jun 29 22:30:02 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 22:30:02 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
Message-ID: 

On Tue, 29 Jun 2021 21:54:30 GMT, Doug Simon  wrote:

> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

Personally, I think JVMCI error reporting can be handled independently of the other VM error reporting flags. There is no need to interact with ErrorFileToStdout/ErrorFileToStderr, as this is not the error file that those flags refer to.

I will take a look at the changes in more detail.

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/4620

From dholmes at openjdk.java.net  Tue Jun 29 23:23:04 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 23:23:04 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
Message-ID: <4mDVlXQ8lFRejMqk7zhF-vCGkTplVPlBKHu2qb9Sezo=.72fc6f41-e55e-41fe-ba13-7f7b683b04a8@github.com>

On Tue, 29 Jun 2021 21:54:30 GMT, Doug Simon  wrote:

>> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
>> 
>> For example:
>> 
>>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
>> #  fatal error: thread 41219: Fatal error in JVMCI shared library
>> #
>> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
>> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
>> #
>> # The JVMCI shared library error data is saved as:
>> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
>> #
>> # If you would like to submit a bug report, please visit:
>> #   https://bugreport.java.com/bugreport/crash.jsp
>> # The crash happened outside the Java Virtual Machine in native code.
>> # See problematic frame for where to report the bug.
>> #
>
> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.

This is predominantly a compiler change so I can only approve the shared runtime changes in vmError.

I have no concerns about this change from a runtime perspective.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4620

From dholmes at openjdk.java.net  Tue Jun 29 23:39:03 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Tue, 29 Jun 2021 23:39:03 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v3]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 16:38:28 GMT, Leonid Mesnik  wrote:

>> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java, although it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. I also verified that adding protection only in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   post updated.

Looks good!

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4602

From sspitsyn at openjdk.java.net  Wed Jun 30 00:21:03 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Wed, 30 Jun 2021 00:21:03 GMT
Subject: RFR: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load [v3]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 16:38:28 GMT, Leonid Mesnik  wrote:

>> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java, although it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. I also verified that adding protection only in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.
>
> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   post updated.

Marked as reviewed by sspitsyn (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From github.com+25214855+casparcwang at openjdk.java.net  Wed Jun 30 02:05:10 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Wed, 30 Jun 2021 02:05:10 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 11:35:52 GMT, 王超 wrote:

>> Many C1 and C2 JIT-compiled methods do not contain any oops, so their nmethod entry barrier can be skipped.
>> 
>> 1. C1 JIT code can patch oops or Klass into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction; if the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2. Only the JIT code of core library methods does not contain any oops.
>> 
>> 3. Only ZGC is supported currently.
>
> 王超 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix assert error

In summary, the benefit is too small, and the change would add a maintenance burden for the future Loom and generational ZGC designs, so I am closing this PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From github.com+25214855+casparcwang at openjdk.java.net  Wed Jun 30 02:05:10 2021
From: github.com+25214855+casparcwang at openjdk.java.net (=?UTF-8?B?546L6LaF?=)
Date: Wed, 30 Jun 2021 02:05:10 GMT
Subject: Withdrawn: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code
In-Reply-To: 
References: 
Message-ID: 

On Mon, 28 Jun 2021 08:40:16 GMT, 王超 wrote:

> Many C1 and C2 JIT-compiled methods do not contain any oops, so their nmethod entry barrier can be skipped.
> 
> 1. C1 JIT code can patch oops or Klass into the nmethod, so the entry barrier cannot simply be eliminated. The current implementation replaces the jcc instruction with a jump instruction; if the JIT code is later patched to contain oops, the entry barrier is patched back to the jcc instruction.
> 
> 2. Only the JIT code of core library methods does not contain any oops.
> 
> 3. Only ZGC is supported currently.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From lmesnik at openjdk.java.net  Wed Jun 30 02:09:07 2021
From: lmesnik at openjdk.java.net (Leonid Mesnik)
Date: Wed, 30 Jun 2021 02:09:07 GMT
Subject: Integrated: 8245877: assert(_value != __null) failed: resolving NULL
 _value in JvmtiExport::post_compiled_method_load
In-Reply-To: 
References: 
Message-ID: 

On Sat, 26 Jun 2021 17:48:15 GMT, Leonid Mesnik  wrote:

> The crash happens because an nmethod might become a zombie before it is enqueued in JvmtiDeferredEventQueue or after it is dequeued from it. The crash is reproduced by serviceability/jvmti/CompiledMethodLoad/Zombie.java, although it takes ~3K runs to hit it. I verified the fix by running this test >100K times on each platform. I also verified that adding protection only in 'void JvmtiDeferredEventQueue::post(JvmtiEnv* env)' is not enough.

This pull request has now been integrated.

Changeset: b969136b
Author:    Leonid Mesnik 
URL:       https://git.openjdk.java.net/jdk/commit/b969136b9fcf5f977ebe466f5f9de5c520413e84
Stats:     7 lines in 4 files changed: 3 ins; 1 del; 3 mod

8245877: assert(_value != __null) failed: resolving NULL _value in JvmtiExport::post_compiled_method_load

Reviewed-by: sspitsyn, dholmes, coleenp

-------------

PR: https://git.openjdk.java.net/jdk/pull/4602

From yyang at openjdk.java.net  Wed Jun 30 02:43:29 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 30 Jun 2021 02:43:29 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v4]
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>

> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases our debugging process, but would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
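
For anyone still reading dumps in the old format, the manual conversion the description refers to can be sketched in Java as below. This is only an illustration: the class and method names are made up for this note and are not part of the PR.

```java
class NidConverter {
    // Hypothetical helper: turns an old-style hex nid such as "0x12e67"
    // into the decimal value that tools like top display.
    static long hexNidToDecimal(String nid) {
        String digits = (nid.startsWith("0x") || nid.startsWith("0X"))
                ? nid.substring(2)
                : nid;
        return Long.parseLong(digits, 16);
    }

    public static void main(String[] args) {
        // nid taken from the "Jstack Before" sample above
        System.out.println(hexNidToDecimal("0x12e67")); // prints 77415
    }
}
```

With the decimal output this step is no longer needed, which is the convenience the change is after.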

Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  update copyright

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4449/files
  - new: https://git.openjdk.java.net/jdk/pull/4449/files/0f230755..6d3d0523

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=02-03

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4449.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4449/head:pull/4449

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Wed Jun 30 02:43:30 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 30 Jun 2021 02:43:30 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v2]
In-Reply-To: 
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 
 
Message-ID: 

On Tue, 29 Jun 2021 11:54:55 GMT, Kevin Walls  wrote:

>> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   restore cleanup code
>
> src/hotspot/share/runtime/osThread.cpp line 41:
> 
>> 39: // Printing
>> 40: void OSThread::print_on(outputStream *st) const {
>> 41:   st->print("nid=%d ", thread_id());
> 
> Just update the (C) year above from 2019 to 2021.
> JhsdbThreadInfoTest.java has it already in the latest revision.

Updated in both osThread.cpp and test file.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From duke at openjdk.java.net  Wed Jun 30 03:07:04 2021
From: duke at openjdk.java.net (duke)
Date: Wed, 30 Jun 2021 03:07:04 GMT
Subject: Withdrawn: 8266519: Cleanup resolve() leftovers from BarrierSet et al
In-Reply-To: 
References: 
Message-ID: 

On Tue, 4 May 2021 16:38:30 GMT, Roman Kennke  wrote:

> Shenandoah used to require a way to resolve oops, but it's long unused and the corresponding code in the access machinery is obsolete. Let's remove it.
> 
> Testing:
>  - [x] hotspot_gc_shenandoah
>  - [x] tier1
>  - [x] tier2

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3862

From stuefe at openjdk.java.net  Wed Jun 30 04:45:08 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 30 Jun 2021 04:45:08 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v4]
In-Reply-To: <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>
Message-ID: <5j7HXm2U9KqM6te-IT7o60Bb3OsLK_temQ4UGi88WWU=.7b8d6548-cf70-49ec-a458-198edf79c35f@github.com>

On Wed, 30 Jun 2021 02:43:29 GMT, Yi Yang  wrote:

>> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases our debugging process, but would obviously break some existing jstack analysis tools.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update copyright

LGTM

test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java line 64:

> 62: 
> 63:             out.shouldMatch("\".+\" #\\d+ daemon prio=\\d+ tid=0x[0-9a-f]+ nid=[0-9]+ .+ \\[0x[0-9a-f]+]");
> 64:             out.shouldMatch("\"main\" #\\d+ prio=\\d+ tid=0x[0-9a-f]+ nid=[0-9]+ .+ \\[0x[0-9a-f]+]");

small nit, instead of `[0-9]` you could use `\d`, and to match a hex number `\p{XDigit}` could be used. But since you just follow the existing pattern, I leave it up to you whether you want to change this.

-------------

Marked as reviewed by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Wed Jun 30 06:01:00 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 30 Jun 2021 06:01:00 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v4]
In-Reply-To: <5j7HXm2U9KqM6te-IT7o60Bb3OsLK_temQ4UGi88WWU=.7b8d6548-cf70-49ec-a458-198edf79c35f@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>
 <5j7HXm2U9KqM6te-IT7o60Bb3OsLK_temQ4UGi88WWU=.7b8d6548-cf70-49ec-a458-198edf79c35f@github.com>
Message-ID: 

On Wed, 30 Jun 2021 04:42:19 GMT, Thomas Stuefe  wrote:

>> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   update copyright
>
> test/hotspot/jtreg/serviceability/sa/JhsdbThreadInfoTest.java line 64:
> 
>> 62: 
>> 63:             out.shouldMatch("\".+\" #\\d+ daemon prio=\\d+ tid=0x[0-9a-f]+ nid=[0-9]+ .+ \\[0x[0-9a-f]+]");
>> 64:             out.shouldMatch("\"main\" #\\d+ prio=\\d+ tid=0x[0-9a-f]+ nid=[0-9]+ .+ \\[0x[0-9a-f]+]");
> 
> small nit, instead of `[0-9]` you could use `\d`, and to match a hex number `\p{XDigit}` could be used. But since you just follow the existing pattern, I leave it up to you whether you want to change this.

Good catch! [`\p{XDigit}`](https://www.tutorialspoint.com/javaregex/javaregex_posix_class_xdigit.htm) seems to be a more standard way to match any hexadecimal character than `[0-9a-fA-F]+`.
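
As a quick sanity check of the two character classes, here is a small standalone sketch. It uses a shortened pattern rather than the full one from JhsdbThreadInfoTest.java, and the class name is made up for illustration.

```java
import java.util.regex.Pattern;

class NidPatternDemo {
    // Shortened, illustrative pattern: hex tid via \p{XDigit}, decimal nid via \d.
    static final Pattern THREAD_LINE =
            Pattern.compile("tid=0x\\p{XDigit}+ nid=\\d+ ");

    static boolean matches(String line) {
        return THREAD_LINE.matcher(line).find();
    }

    public static void main(String[] args) {
        String decimalNid = "\"G1 Service\" os_prio=0 cpu=87.05ms elapsed=1295.22s "
                + "tid=0x00007f99dc2cc140 nid=117709 runnable";
        String hexNid = "\"ParGC Thread#7\" os_prio=0 cpu=103260.18ms elapsed=5255043.58s "
                + "tid=0x00007f967000b000 nid=0x12e67 runnable";
        System.out.println(matches(decimalNid)); // prints true
        System.out.println(matches(hexNid));     // prints false
    }
}
```

Note that `\p{XDigit}` accepts both upper- and lower-case hex digits, so it is slightly more permissive than the `[0-9a-f]` class it replaces.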

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Wed Jun 30 06:30:07 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 30 Jun 2021 06:30:07 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v4]
In-Reply-To: <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
 <4_6BIdsGYYONAjubvPtHZakuSRRrM3N5RSKK5-HFI-8=.95ceff04-68cc-4ae5-8338-6e61bbf6edcc@github.com>
Message-ID: <0m0q3DCpLM8PFQK11Kur48hGpRPxT5JOa3L2klxlGMg=.c1330a91-4e0d-4fc8-a1ff-1930369fe9ef@github.com>

On Wed, 30 Jun 2021 02:43:29 GMT, Yi Yang  wrote:

>> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases our debugging process, but would obviously break some existing jstack analysis tools.
>> 
>> Jstack Before:
>> 
>> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
>> 
>> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
>> 
>> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
>> 
>> Jstack After:
>> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
>> 
>> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
>> 
>> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
>> 
>> Top:
>>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
>> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun
>
> Yi Yang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update copyright

> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_
> 
> On 28/06/2021 6:49 pm, Thomas Stuefe wrote:
> 
> > On Mon, 28 Jun 2021 07:37:10 GMT, Yi Yang  wrote:
> > > > src/hotspot/share/runtime/osThread.cpp line 41:
> > > > > 39: // Printing
> > > > > 40: void OSThread::print_on(outputStream *st) const {
> > > > > 41:   st->print("nid=%d ", thread_id());
> > > > 
> > > > 
> > > > thread_id is of an opaque type (e.g. pthread_t). I think we can reasonably assume it's numeric, but I would print it as an unsigned 64-bit int just in case.
> > > 
> > > 
> > > Hi Thomas, we cannot use other format specifiers (`%ld`, `%llu`); I tried, and they do not compile on my Mac:
> > 
> > 
> > You'd do:
> > print("nid: " UINT64_FORMAT, (uint64_t) id);
> > thread_t is, among other things, pthread_t, which is opaque. Any current code treating that as signed int is incorrect too.
> 
> If it is opaque then I don't see how signed or unsigned makes any
> difference. You are assuming it can just be treated as a 64-bit value;
> whether you interpret that as a signed or unsigned value just changes
> how you print it. I agree printing only positive values is nicer visually.
> 
> David

@dholmes-ora David, could you please take a look at the latest version when you have time? Thanks!
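
To illustrate the signed-versus-unsigned point in the quoted mail, the sketch below prints one and the same 64-bit pattern both ways. It is a Java analogy only, not the HotSpot printing code, and the names are invented for this note.

```java
class ThreadIdFormat {
    // The same 64 bits, interpreted as a signed value.
    static String asSigned(long raw) {
        return Long.toString(raw);
    }

    // The same 64 bits, interpreted as an unsigned value.
    static String asUnsigned(long raw) {
        return Long.toUnsignedString(raw);
    }

    public static void main(String[] args) {
        long raw = 0xFFFFFFFFFFFFFFFEL; // arbitrary bit pattern with the top bit set
        System.out.println(asSigned(raw));   // prints -2
        System.out.println(asUnsigned(raw)); // prints 18446744073709551614
    }
}
```

The stored value is identical in both cases; only the rendering differs, and the unsigned form avoids a minus sign in thread dumps.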

-------------

PR: https://git.openjdk.java.net/jdk/pull/4449

From yyang at openjdk.java.net  Wed Jun 30 06:38:30 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Wed, 30 Jun 2021 06:38:30 GMT
Subject: RFR: 8268425: Show decimal nid of OSThread instead of hex format
 one [v5]
In-Reply-To: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
References: <2XpHch1KL91iW9wQ9VdboCFdkyUxdCwCq_-Dad6zo4E=.b01db185-1596-4f0c-b1ee-2d125d50963c@github.com>
Message-ID: 

> From a user's perspective, a decimal nid lets us find the corresponding OS thread via top directly; otherwise, we must first convert the hex nid to an integer and then find that thread via `top -pid `. This slightly eases our debugging process, but would obviously break some existing jstack analysis tools.
> 
> Jstack Before:
> 
> "ParGC Thread#7" os_prio=0 cpu=103260.18ms elapsed=5255043.58s tid=0x00007f967000b000 nid=0x12e67 runnable
> 
> "ParGC Thread#8" os_prio=0 cpu=104818.76ms elapsed=5255043.58s tid=0x00007f967000c000 nid=0x12e68 runnable
> 
> "ParGC Thread#9" os_prio=0 cpu=102164.69ms elapsed=5255043.58s tid=0x00007f967000e000 nid=0x12e69 runnable
> 
> Jstack After:
> "G1 Conc#0" os_prio=0 cpu=0.03ms elapsed=1295.27s tid=0x00007f99dc096490 nid=117707 runnable
> 
> "G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=1295.22s tid=0x00007f99dc2cad20 nid=117708 runnable
> 
> "G1 Service" os_prio=0 cpu=87.05ms elapsed=1295.22s tid=0x00007f99dc2cc140 nid=117709 runnable
> 
> Top:
>    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>  49083 tianxia+ 20 0 32.8g 594148 10796 S 103.3 0.1 0:10.05 java
>  71291 qingfen+ 20 0 39.3g 26.7g 18312 S 100.7 5.3 16861:35 jhsdb
>  50407 tianxia+ 20 0 32.5g 32796 9768 S 100.3 0.0 0:05.80 java
> 107429 maolian+ 20 0 11.4g 1.1g 10956 S 100.3 0.2 20173:52 java
>  99923 root 10 -10 288520 163228 5088 S 5.9 0.0 6463:53 AliYunDun

Yi Yang has updated the pull request incrementally with one additional commit since the last revision:

  use \p{XDigit}

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4449/files
  - new: https://git.openjdk.java.net/jdk/pull/4449/files/6d3d0523..7a30c3df

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4449&range=03-04

  Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4449.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4449/head:pull/4449

PR: https://git.openjdk.java.net/jdk/pull/4449

From dnsimon at openjdk.java.net  Wed Jun 30 07:17:00 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 30 Jun 2021 07:17:00 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
 
Message-ID: 

On Tue, 29 Jun 2021 22:14:50 GMT, Vladimir Kozlov  wrote:

>> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>> 
>>   respect ErrorFileToStdout and ErrorFileToStderr flags
>
> Okay. this is fine.
> 
> Still you need review from Runtime group. I will ping them.

Thanks for the review David. If it's all the same, I'll keep the ErrorFileToStdOut/Err behavior for this libjvmci error reporting. Looking at the rationale for https://bugs.openjdk.java.net/browse/JDK-8220786, it makes sense for libjvmci error reporting to go to the console when retrieving log files from container environments is problematic. It might even be worth having CI compiler replay respect these flags but that's a separate question that I leave up to @vnkozlov.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4620

From sspitsyn at openjdk.java.net  Wed Jun 30 09:21:05 2021
From: sspitsyn at openjdk.java.net (Serguei Spitsyn)
Date: Wed, 30 Jun 2021 09:21:05 GMT
Subject: RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java
 stacks [v3]
In-Reply-To: 
References: <9qfnLj_-jz8MocK7UIIs5-NYZsVPJ7J20ZLiORqpUlM=.cb712662-0eb9-4d17-a67d-42451423f470@github.com>
 
Message-ID: 

On Fri, 18 Jun 2021 08:56:32 GMT, Ludovic Henry  wrote:

>> When the signal sent for AsyncGetCallTrace or JFR landed on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method, the stack walker could not detect the sender (caller) frame, for multiple reasons. This patch fixes these cases by adding CodeBlob-specific frame parsers, which are in the best position to know how a frame is set up.
>> 
>> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`.
>> 
>> # `Prof1`
>> 
>> public class Prof1 {
>> 
>>     public static void main(String[] args) {
>>         StringBuilder sb = new StringBuilder();
>>         for (int i = 0; i < 1000000; i++) {
>>             sb.append("ab");
>>             sb.delete(0, 1);
>>         }
>>         System.out.println(sb.length());
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>>         (t  0.5,s  0.2) Prof1::main
>>         (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>>         (t  0.0,s  0.0) java.lang.String::getBytes
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>>         (t  0.0,s  0.0) java.lang.StringBuilder::delete
>>         (t  0.2,s  0.0) java.lang.StringBuilder::append
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> 
>> - With `StubRoutinesBlob::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>>         (t 99.8,s  0.2) Prof1::main
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>>         (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>>         (t 98.8,s  0.0) java.lang.StringBuilder::append
>>         (t  0.9,s  0.0) java.lang.StringBuilder::delete
>> 
>> 
>> # `Prof2`
>> 
>> import java.util.function.Supplier;
>> 
>> public class Prof2 {
>> 
>>     public static void main(String[] args) {
>>         var rand = new java.util.Random(0);
>>         Supplier[] suppliers = {
>>                 () -> 0,
>>                 () -> 1,
>>                 () -> 2,
>>                 () -> 3,
>>         };
>> 
>>         long sum = 0;
>>         for (int i = 0; i >= 0; i++) {
>>             sum += (int)suppliers[i % suppliers.length].get();
>>         }
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>>         (t 39.2,s 35.2) Prof2::main
>>         (t  1.4,s  1.4) Prof2::lambda$main$3
>>         (t  1.0,s  1.0) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.Thread::exit
>>         (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>>         (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>>         (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>>         (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>> 
>> 
>> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 74.1,s 70.3) Prof2::main
>>         (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>>         (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>>         (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>>         (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>>         (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>>         (t  1.2,s  1.2) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$3
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>
> Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix comments

Hi Ludovic,
You need two reviews including one (R)eviewer before integration of your fix.
Thanks,
Serguei

-------------

PR: https://git.openjdk.java.net/jdk/pull/4436

From dnsimon at openjdk.java.net  Wed Jun 30 09:57:29 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 30 Jun 2021 09:57:29 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v4]
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: <6Cl9W6gwy79SypouhXiUwwwmGKmf0nl3iC9Pg0fu41w=.4552c8c6-3c20-4a3f-abbc-29b2a0532165@github.com>

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  use fdStream instead of fileStream

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4620/files
  - new: https://git.openjdk.java.net/jdk/pull/4620/files/ba004ef0..fac6c3dc

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=02-03

  Stats: 19 lines in 2 files changed: 2 ins; 3 del; 14 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4620/head:pull/4620

PR: https://git.openjdk.java.net/jdk/pull/4620

From mli at openjdk.java.net  Wed Jun 30 12:05:17 2021
From: mli at openjdk.java.net (Hamlin Li)
Date: Wed, 30 Jun 2021 12:05:17 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in [Get|Release]StringCritical
 for latin string
Message-ID: 

Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.
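
To make the proposal concrete, here is a minimal, self-contained C++ sketch of the idea. It is not the actual jni.cpp code: `ToyGCLocker`, `ToyString`, `toy_get_string_critical` and `toy_release_string_critical` are hypothetical names used only for illustration, and the sketch assumes (as described above) that a Latin-1 string is always inflated into a freshly allocated native buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the GC locker: counts open critical regions.
struct ToyGCLocker {
  int depth = 0;
  void lock()   { ++depth; }
  void unlock() { --depth; }
};

// Hypothetical stand-in for a compact java.lang.String.
struct ToyString {
  bool is_latin1;
  std::vector<std::uint8_t>  latin1;  // payload when is_latin1 is true
  std::vector<std::uint16_t> utf16;   // payload otherwise
};

inline const std::uint16_t* toy_get_string_critical(ToyGCLocker& gc, ToyString& s) {
  gc.lock();
  if (s.is_latin1) {
    // Latin-1: inflate into a native copy. Nothing in the "heap" is handed
    // out, so the locker can be released before returning.
    std::uint16_t* copy = new std::uint16_t[s.latin1.size()];
    for (std::size_t i = 0; i < s.latin1.size(); i++) {
      copy[i] = s.latin1[i];
    }
    gc.unlock();
    return copy;
  }
  // UTF-16: the caller sees the string's own data; stay locked until release.
  return s.utf16.data();
}

inline void toy_release_string_critical(ToyGCLocker& gc, const ToyString& s,
                                        const std::uint16_t* chars) {
  if (s.is_latin1) {
    delete[] chars;   // free the inflated copy; the locker was already released
  } else {
    gc.unlock();      // pairs with the lock taken in toy_get_string_critical
  }
}
```

In this toy model the locker depth is already back to zero when the Latin-1 call returns, while the UTF-16 path keeps it held until the matching release.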

-------------

Commit messages:
 - optimize gc-locker in GetStringCritical for latin str

Changes: https://git.openjdk.java.net/jdk/pull/4637/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4637&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269650
  Stats: 4 lines in 1 file changed: 3 ins; 1 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4637.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4637/head:pull/4637

PR: https://git.openjdk.java.net/jdk/pull/4637

From david.holmes at oracle.com  Wed Jun 30 12:32:42 2021
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 30 Jun 2021 22:32:42 +1000
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v3]
In-Reply-To: 
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
 
 
 
Message-ID: <10aef568-5558-6860-62eb-beab6edfd237@oracle.com>

On 30/06/2021 5:17 pm, Doug Simon wrote:
> On Tue, 29 Jun 2021 22:14:50 GMT, Vladimir Kozlov  wrote:
> 
>>> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>>>
>>>    respect ErrorFileToStdout and ErrorFileToStderr flags
>>
>> Okay. this is fine.
>>
>> Still you need review from Runtime group. I will ping them.
> 
> Thanks for the review, David. If it's all the same, I'll keep the ErrorFileToStdout/ErrorFileToStderr behavior for this libjvmci error reporting. Looking at the rationale for https://bugs.openjdk.java.net/browse/JDK-8220786, it makes sense for libjvmci error reporting to go to the console when retrieving log files from container environments is problematic. It might even be worth having CI compiler replay respect these flags but that's a separate question that I leave up to @vnkozlov.

No problem. I hadn't seen that you'd made the change when I made the 
comment.

David

> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4620
> 

From tschatzl at openjdk.java.net  Wed Jun 30 12:35:07 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Wed, 30 Jun 2021 12:35:07 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
Message-ID: 

On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:

> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.

Actually I think the *String* object can be unlocked regardless of `is_latin1` or not. The code returns the *char array* that the native code is going to process after all - which is not locked *at all* but probably should be. I filed [JDK-8269661](https://bugs.openjdk.java.net/browse/JDK-8269661) for this.

Which should probably be fixed first, because if the change correctly unconditionally unlocked the String object, Shenandoah would start to fail.
It probably works at this time because the String object and the char array are typically located in the same region anyway.

-------------

Changes requested by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4637

From dnsimon at openjdk.java.net  Wed Jun 30 12:38:32 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 30 Jun 2021 12:38:32 GMT
Subject: Integrated: 8269416: [JVMCI] capture libjvmci crash data to a file
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: 

On Mon, 28 Jun 2021 22:58:04 GMT, Doug Simon  wrote:

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

This pull request has now been integrated.

Changeset: a6b253d8
Author:    Doug Simon 
URL:       https://git.openjdk.java.net/jdk/commit/a6b253d85c732ddd1d3154d5fc108d2bba66ab01
Stats:     96 lines in 7 files changed: 93 ins; 0 del; 3 mod

8269416: [JVMCI] capture libjvmci crash data to a file

Reviewed-by: kvn, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/4620

From dnsimon at openjdk.java.net  Wed Jun 30 12:38:31 2021
From: dnsimon at openjdk.java.net (Doug Simon)
Date: Wed, 30 Jun 2021 12:38:31 GMT
Subject: RFR: 8269416: [JVMCI] capture libjvmci crash data to a file [v5]
In-Reply-To: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
References: <0fS5y8qZ0_n3dJHKPW23m_gRKtBZ2eX6NtoyhDoZpfg=.b55e1aa6-e305-4dfb-bc33-e799394b8a0e@github.com>
Message-ID: 

> When a fatal error occurs in libgraal, it writes a crash dump to `tty`. Instead, it should be captured in a separate log file that is then referenced in the HotSpot crash summary (just like the hs_err_pid and CI replay compile log files are). This allows libgraal crash data to be more easily submitted along with VM crash reports.
> 
> For example:
> 
>> java -Dlibgraal.CrashAtIsFatal=true -Dgraal.CrashAt=String.equals -cp bin CountUppercase skjdf
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (jvmciRuntime.cpp:909), pid=36298, tid=41219
> #  fatal error: thread 41219: Fatal error in JVMCI shared library
> #
> # JRE version: OpenJDK Runtime Environment GraalVM LIBGRAAL 21.3.0-dev (16.0.2) (build 16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16)
> # Java VM: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 21.3.0-dev (16.0.2-internal+0-adhoc.dnsimon.labsjdk-ce-16, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298.log
> #
> # The JVMCI shared library error data is saved as:
> # /Users/dnsimon/graal/graal/compiler/hs_err_pid36298_libjvmci.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Doug Simon has updated the pull request incrementally with one additional commit since the last revision:

  avoid allocating name_buffer when using ErrorFileToStdout/err

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4620/files
  - new: https://git.openjdk.java.net/jdk/pull/4620/files/fac6c3dc..b9f27c7c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4620&range=03-04

  Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4620.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4620/head:pull/4620

PR: https://git.openjdk.java.net/jdk/pull/4620

From kbarrett at openjdk.java.net  Wed Jun 30 12:53:06 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 30 Jun 2021 12:53:06 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v4]
In-Reply-To: <29ADEYUJ_QfBxtes5NSY8CwtQnw06eXQWJMAP0MdJ60=.cbdab722-ac5c-40c7-a481-dbad7bed54cd@github.com>
References: 
 <29ADEYUJ_QfBxtes5NSY8CwtQnw06eXQWJMAP0MdJ60=.cbdab722-ac5c-40c7-a481-dbad7bed54cd@github.com>
Message-ID: 

On Tue, 29 Jun 2021 21:04:37 GMT, Ioi Lam  wrote:

>> In HotSpot we have (at least) two hashtable designs in the C++ code:
>> 
>> - share/utilities/hashtable.hpp
>> - share/utilities/resourceHash.hpp
>> 
>> Of the two, the `ResourceHashtable` API is much cleaner and most new code has been written with it. However, one issue is that the `SIZE` of `ResourceHashtable` is a compile-time constant. This makes the hash-to-index computation very fast on x64 (gcc can avoid using the slow divq instruction for modulo). However, the downside is we cannot use `ResourceHashtable` when we need a hashtable whose size is determined at run time (and, optionally, resizeable).
>> 
>> This PR refactors `ResourceHashtable` into a base template class `ResourceHashtableBase`, whose `size()` function can be configured by a subclass to be either constant or runtime-configurable. 
>> 
>> Note: since we want to preserve the performance of `hash % SIZE`, we can't make `size()` a virtual function.
>> 
>> Preliminary benchmark shows that this refactoring has no impact on the performance of the constant `ResourceHashtable`. See https://github.com/iklam/tools/tree/main/bench/resourceHash:
>> 
>> *before*
>> ResourceHashtable: 2.70 sec
>> 
>> *after*
>> ResourceHashtable: 2.72 sec
>> ResizableResourceHashtable: 5.29 sec
>> 
>> To make sure `ResizableResourceHashtable` works, I rewrote some CDS code to use `ResizableResourceHashtable` instead of `KVHashtable`
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @coleenp comments

Mostly good, but a few minor nits.

src/hotspot/share/utilities/resourceHash.hpp line 80:

> 78: 
> 79:   Node const** lookup_node(unsigned hash, K const& key) const {
> 80:     return const_cast<Node const**>(

[pre-existing] I think this `const_cast` to add const is unnecessary.

src/hotspot/share/utilities/resourceHash.hpp line 88:

> 86: 
> 87:  public:
> 88:   ResourceHashtableBase() : _number_of_entries(0) {}

I'd prefer this explicitly initialize STORAGE, e.g. add `STORAGE()` to value-initialize rather than default-initialize it.  (I think it doesn't currently make a difference, but I think being explicit is clearer.)
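
As a small illustration of the difference, here is a self-contained sketch (all `Toy*` names are hypothetical, not the classes in resourceHash.hpp). It assumes a storage base with no user-provided constructor; in that case writing `STORAGE()` in the mem-initializer list value-initializes the base, which zero-initializes its members, whereas omitting it leaves them indeterminate.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical storage base with no user-provided constructor.
struct ToyStorage {
  void* _table[4];
};

template <typename STORAGE>
class ToyTableBase : public STORAGE {
  int _number_of_entries;
 public:
  // `STORAGE()` value-initializes the base subobject. For a class without a
  // user-provided constructor that means its members are zero-initialized.
  ToyTableBase() : STORAGE(), _number_of_entries(0) {}

  bool all_buckets_null() const {
    for (void* p : this->_table) {
      if (p != nullptr) return false;
    }
    return true;
  }
};

// Constructs a table on top of deliberately dirtied memory and reports
// whether every bucket came out null.
inline bool toy_value_init_clears_table() {
  using Table = ToyTableBase<ToyStorage>;
  alignas(Table) unsigned char buf[sizeof(Table)];
  std::memset(buf, 0xFF, sizeof(buf));
  Table* t = new (buf) Table();
  bool ok = t->all_buckets_null();
  t->~Table();
  return ok;
}
```

With a storage class that does have its own constructor, `STORAGE()` simply calls it, so being explicit costs nothing there.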

src/hotspot/share/utilities/resourceHash.hpp line 90:

> 88:   ResourceHashtableBase() : _number_of_entries(0) {}
> 89: 
> 90:   ResourceHashtableBase(unsigned size) : STORAGE(size), _number_of_entries(0) {}

These constructors seem like they should be non-public.

src/hotspot/share/utilities/resourceHash.hpp line 92:

> 90:   ResourceHashtableBase(unsigned size) : STORAGE(size), _number_of_entries(0) {}
> 91: 
> 92:   ~ResourceHashtableBase() {

Base class constructor should be non-public to avoid slicing.  Also need to decide what to do about copying, either disallow or deep copy.  Default shallow copy seems likely to be wrong.
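
A minimal sketch of what this could look like, choosing the "disallow copying" option; `ToyBase` and `ToyTable` are hypothetical names, not the real classes. The static assertions show what outside code is no longer able to do.

```cpp
#include <cassert>
#include <type_traits>

class ToyBase {
 protected:
  // Only derived classes may create or destroy the base part.
  ToyBase() : _entries(0) {}
  ~ToyBase() = default;

 public:
  // Copying is disallowed outright; a deep copy would be the alternative.
  ToyBase(const ToyBase&) = delete;
  ToyBase& operator=(const ToyBase&) = delete;

  void add() { ++_entries; }
  int entries() const { return _entries; }

 private:
  int _entries;
};

class ToyTable : public ToyBase {
 public:
  ToyTable() : ToyBase() {}
};

// The base cannot be created or destroyed on its own...
static_assert(!std::is_default_constructible<ToyBase>::value, "base ctor is protected");
static_assert(!std::is_destructible<ToyBase>::value, "base dtor is protected");
// ...and the derived type inherits the copy prohibition.
static_assert(!std::is_copy_constructible<ToyTable>::value, "no shallow copies");
static_assert(!std::is_copy_assignable<ToyTable>::value, "no shallow assignment");
static_assert(std::is_default_constructible<ToyTable>::value, "derived type is usable");
```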

src/hotspot/share/utilities/resourceHash.hpp line 222:

> 220: protected:
> 221:   FixedResourceHashtableStorage() {
> 222:     memset(_table, 0, TABLE_SIZE * sizeof(Node*));

Instead of memset, consider `FixedResourceHashtableStorage() : _table{} {}`.
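
For comparison, a self-contained sketch of the two forms side by side (the `Toy*` names are hypothetical). Both leave every bucket null on common platforms; the `_table{}` form does so without relying on a null pointer being all-zero bits.

```cpp
#include <cassert>
#include <cstring>
#include <new>

struct ToyNode;  // only pointers are stored, so a declaration is enough

// Current style: clear the bucket array by hand.
template <unsigned TABLE_SIZE>
struct ToyMemsetStorage {
  ToyNode* _table[TABLE_SIZE];
  ToyMemsetStorage() { std::memset(_table, 0, TABLE_SIZE * sizeof(ToyNode*)); }
};

// Suggested style: value-initialize the array in the mem-initializer list.
template <unsigned TABLE_SIZE>
struct ToyBraceStorage {
  ToyNode* _table[TABLE_SIZE];
  ToyBraceStorage() : _table{} {}
};

// Constructs S on top of deliberately dirtied memory and checks that every
// bucket is null afterwards.
template <typename S>
bool toy_all_null_after_construction() {
  alignas(S) unsigned char buf[sizeof(S)];
  std::memset(buf, 0xFF, sizeof(buf));
  S* s = new (buf) S;
  for (ToyNode* p : s->_table) {
    if (p != nullptr) return false;
  }
  return true;
}
```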

src/hotspot/share/utilities/resourceHash.hpp line 224:

> 222:     memset(_table, 0, TABLE_SIZE * sizeof(Node*));
> 223:   }
> 224: 

Destructor should be non-public to prevent slicing.  Also need to consider what to do about copying.

-------------

Changes requested by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4536

From dholmes at openjdk.java.net  Wed Jun 30 12:54:01 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Wed, 30 Jun 2021 12:54:01 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
Message-ID: 

On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:

> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.

Hi Hamlin,

This seems quite reasonable - so reasonable that I really need to know why we were not doing this from day one. So I need to do a bit of digging into the history here, but that will have to wait for tomorrow morning. :)

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/4637

From kbarrett at openjdk.java.net  Wed Jun 30 13:04:02 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 30 Jun 2021 13:04:02 GMT
Subject: RFR: 8269004 Implement ResizableResourceHashtable [v2]
In-Reply-To: 
References: 
 
 
 
Message-ID: 

On Tue, 29 Jun 2021 03:48:12 GMT, Ioi Lam  wrote:

>> src/hotspot/share/utilities/resourceHash.hpp line 38:
>> 
>>> 36:     MEMFLAGS MEM_TYPE
>>> 37:     >
>>> 38: class ResourceHashtableBase : public ResourceObj {
>> 
>> Rather than a CRTP base class, I think it might be simpler to have a base class that has a type template parameter that provides the sizing/resizing policy. That type might be used either to specify the type of a new member or even a further base class (to benefit from EBO in the size-is-constant case). The derived class constructor would call the base class constructor with a policy object as an argument.
>
> Per Kim's suggestion, I moved the storage management code to two base classes: FixedResourceHashtableStorage and ResizeableResourceHashtableStorage. 
> 
> Now the `ResourceHashtable::_table[]` is in-line allocated (same as before this PR). I checked with gcc and it generates identical code as before this PR.

It's kind of unfortunate / confusing that ResourceHashtableBase has two "incompatible" constructors as a result of factoring out the storage, but otherwise I like this.  One consequence is that the two derived types actually need to be classes with different constructors, and can't just be type aliases of the base.
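
To illustrate the shape being discussed, here is a simplified, self-contained sketch. All names (`ToyFixedStorage`, `ToyRuntimeStorage`, `ToyTableBase`, `ToyFixedTable`, `ToyRuntimeTable`) are hypothetical, and it uses plain `new`/`delete` and `std::vector` rather than HotSpot allocation. The storage base supplies a non-virtual `table_size()`, and the shared base carries the two "incompatible" constructors, only one of which is usable for a given storage type.

```cpp
#include <cassert>
#include <vector>

struct ToyNode { unsigned key; int value; ToyNode* next; };

// Storage with a compile-time bucket count, so `key % SIZE` uses a constant.
template <unsigned SIZE>
class ToyFixedStorage {
 protected:
  ToyFixedStorage() : _table{} {}
  ~ToyFixedStorage() = default;
  unsigned table_size() const { return SIZE; }
  ToyNode** table() { return _table; }
 private:
  ToyNode* _table[SIZE];
};

// Storage whose bucket count is chosen at run time (assumed to be > 0).
class ToyRuntimeStorage {
 protected:
  explicit ToyRuntimeStorage(unsigned size) : _table(size, nullptr) {}
  ~ToyRuntimeStorage() = default;
  unsigned table_size() const { return static_cast<unsigned>(_table.size()); }
  ToyNode** table() { return _table.data(); }
 private:
  std::vector<ToyNode*> _table;
};

// Shared logic. table_size() is resolved statically; there is no virtual call.
template <typename STORAGE>
class ToyTableBase : public STORAGE {
 protected:
  ToyTableBase() : STORAGE() {}                            // fixed storage only
  explicit ToyTableBase(unsigned size) : STORAGE(size) {}  // runtime storage only
  ~ToyTableBase() {
    for (unsigned i = 0; i < this->table_size(); i++) {
      for (ToyNode* n = this->table()[i]; n != nullptr; ) {
        ToyNode* next = n->next;
        delete n;
        n = next;
      }
    }
  }
 public:
  ToyTableBase(const ToyTableBase&) = delete;
  ToyTableBase& operator=(const ToyTableBase&) = delete;

  void put(unsigned key, int value) {
    ToyNode** bucket = &this->table()[key % this->table_size()];
    for (ToyNode* n = *bucket; n != nullptr; n = n->next) {
      if (n->key == key) { n->value = value; return; }
    }
    *bucket = new ToyNode{key, value, *bucket};
  }

  const int* get(unsigned key) {
    for (ToyNode* n = this->table()[key % this->table_size()]; n != nullptr; n = n->next) {
      if (n->key == key) return &n->value;
    }
    return nullptr;
  }
};

// The derived types are real classes because their constructors differ.
template <unsigned SIZE>
class ToyFixedTable : public ToyTableBase<ToyFixedStorage<SIZE>> {
 public:
  ToyFixedTable() : ToyTableBase<ToyFixedStorage<SIZE>>() {}
};

class ToyRuntimeTable : public ToyTableBase<ToyRuntimeStorage> {
 public:
  explicit ToyRuntimeTable(unsigned size) : ToyTableBase<ToyRuntimeStorage>(size) {}
};
```

Because member functions of a class template are only instantiated when used, the constructor that does not match a given storage type is harmless as long as nobody calls it.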

-------------

PR: https://git.openjdk.java.net/jdk/pull/4536

From tschatzl at openjdk.java.net  Wed Jun 30 14:10:02 2021
From: tschatzl at openjdk.java.net (Thomas Schatzl)
Date: Wed, 30 Jun 2021 14:10:02 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
Message-ID: 

On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:

> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.

The GC team would like to ask you to hold off on pushing/finishing this optimization until the fix for [JDK-8269661](https://bugs.openjdk.java.net/browse/JDK-8269661) bubbles up to JDK 18. JDK-8269661 is a P2 bug that should be fixed in JDK 17 too; it would be more convenient for us to fix it there first, then wait until we merge changes from there to 18.

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4637

From aph at openjdk.java.net  Wed Jun 30 16:52:08 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 30 Jun 2021 16:52:08 GMT
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v4]
In-Reply-To: 
References: 
 
Message-ID: 

On Tue, 29 Jun 2021 11:35:52 GMT, ??  wrote:

>> Lots of c1 and c2 jit methods do not contain any oop, so the nmethod entry barrier can be skipped.
>> 
>> 1, c1 jit code will patch oops or Klass into the nmethod, so the entry barrier cannot directly be eliminated, current implementation uses a jump instruction to replace the jcc instruction. If the jit code is patched to contain oops, the entry barrier is patched back to the jcc instruction.
>> 
>> 2, only the jit code of core library methods do not contain any oops.
>> 
>> 3, currently only support zgc
>
> ?? has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix assert error

On 6/29/21 2:16 PM, Erik Österlund wrote:
>> On 6/29/21 8:03 AM, Erik Österlund wrote:
>>> 1) When this was introduced, we did not see the overhead of nmethod entry barriers in performance profiles. Did you see any improvement with the patch?
>>
>> This is a LoadLoad fence at the start of every method followed by a load
>> with a dependent branch. If there wasn't a significant hit in the profiles
>> I wouldn't believe the profiles. Even adding code that doesn't appear to slow
>> things down uses additional speculation resources.
> 
> My main concern is that I am not convinced that eliding nmethod entry barriers is sound. It really isn't all about protecting oops found in the machine code as I explained previously. There is a lot more to it. Therefore I would like to at least have some convincing number showing that such an optimization effort is worthwhile.
> 
> If you are looking at ways of getting rid of the loadload, I have a scheme that can elide it if you are interested. I discussed it with Stuart when he was prototyping how to build these barriers, but I think he concluded we didn't need such optimizations in the end. It came with some more complexity. That's why I am curious if you have numbers suggesting the opposite. Because if so, I have a better idea to target the loadload in particular, without violating our invariants and still being correct.

No, I don't have numbers, and it would be a moderately serious effort to get them.
I could do it, but it seems like a ridiculous effort to eliminate something that
obviously should be eliminated.
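
For readers unfamiliar with the barrier shape described above (a LoadLoad fence, then a load with a dependent branch), here is a rough portable C++ picture of it. This is only an illustration, not the generated machine code: `ToyMethod`, `g_disarmed_value` and `toy_entry_barrier` are hypothetical names, and since C++ has no pure LoadLoad fence an acquire fence is used as the closest (slightly stronger) approximation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a compiled method with a per-method guard word.
struct ToyMethod {
  std::atomic<int> guard{0};
  int slow_path_calls = 0;
};

// Hypothetical global value a guard must match to be considered disarmed.
std::atomic<int> g_disarmed_value{0};

// Returns true if the slow path had to run.
inline bool toy_entry_barrier(ToyMethod& m) {
  // Approximation of the LoadLoad fence at method entry.
  std::atomic_thread_fence(std::memory_order_acquire);
  // The load followed by the dependent branch.
  int expected = g_disarmed_value.load(std::memory_order_relaxed);
  if (m.guard.load(std::memory_order_relaxed) != expected) {
    ++m.slow_path_calls;                                 // stand-in for the runtime call
    m.guard.store(expected, std::memory_order_relaxed);  // disarm for later entries
    return true;
  }
  return false;
}
```

On the fast path the cost is the fence, one load and one usually well-predicted branch, which is the overhead being argued about.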

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

-------------

PR: https://git.openjdk.java.net/jdk/pull/4610

From john.r.rose at oracle.com  Wed Jun 30 20:02:48 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 30 Jun 2021 20:02:48 +0000
Subject: RFR: 8269476: Skip nmethod entry barrier if there is no oops in
 the jit code [v4]
In-Reply-To: 
References: 
 
 
Message-ID: <9FFE0079-6EC2-4C0B-B235-C9F9D2C9E90D@oracle.com>

On Jun 30, 2021, at 9:52 AM, Andrew Haley wrote:

> No, I don't have numbers, and it would be a moderately serious effort to get them.
> I could do it, but it seems like a ridiculous effort to eliminate something that
> obviously should be eliminated.

I'll pile on here, because there seem to be a number
of interesting issues in play: nmethod dynamics,
special-purpose vs general-purpose mechanisms,
speculatively applied optimizations, and instruction
patching.

There are obvious micro-benchmarks where the cost
of nmethod entry would be detectable.  But nmethods
tend to be large and loopy and calls to them are typically
infrequent and expensive.  (Expenses include spilling
registers, redoing checks on inputs, and lots more.)

So I do think the burden of proof is on any proposer
of nmethod linkage changes to show objective
evidence of performance effects, either neutral
in the case of cleanups or enhancements, or
improvements in the case of purported
"optimizations" that rely on tricky code.

Meanwhile, this change "optimizes" a general
mechanism under the theory that it's only for
one purpose, so we'd tax all additional uses of
Erik's "swiss army knife" in the future.  As some
may know, I am fond of such swiss army knives.
HotSpot ought to have a goodly collection of
them.  (See current efforts to upgrade the
hash tables, for example.)

Also, without fully understanding the specific
issue, I'm nervous about the comments that the
conditions for applying the "optimization" are
hard to state and hard to check.  We all know that
a "fingers crossed" approach to gating optimizations
leads to crashes in the future and headaches for
everybody.  I'm not against optimizations "just
because it's obviously beneficial", but they also
have to be obviously reliable and maintainable,
or, if not, we have to prove they are worth the
short- and long-term costs.

The specific thing that scares me about this change
is that the entry barrier has a complex state diagram.
The barrier is elided but can be inserted again later.
Among all optimization tactics we use, instruction
patching has (probably) the highest cost in terms
of risk of race conditions and unpredictable behavior
on present and future platforms.

Also, this particular change requires platform
specific changes.  Those are expensive, although
the cost in this case may be lower because they
are optional.  (Actually, it's not lower, just deferred,
because eventually other platforms would adopt
the changes, in the name of regularity.)

To me all that sounds costly, except for an unproven
performance gain on some platforms with some
configurations.

So I'm glad this PR is withdrawn.  If we find out later
that nmethod call overhead is a problem, let's take
a broader approach to it.  We can consider this PR
to be a proof of concept, that there are some ways to
reduce nmethod call overhead.

I do appreciate the hard work that Caspar put into
designing and coding this PR.  And yet I *also* appreciate
that the review process prevented this proof of concept
from being committed in its present state.

- John

From zgu at openjdk.java.net  Wed Jun 30 20:17:22 2021
From: zgu at openjdk.java.net (Zhengyu Gu)
Date: Wed, 30 Jun 2021 20:17:22 GMT
Subject: [jdk17] RFR: 8269697: JNI_GetPrimitiveArrayCritical() should not
 accept object array
Message-ID: <3haNnfVYr8DFrHCOx3EKMARn3Qs_JTJSVVIxvl1fJYg=.10c1be00-600e-442c-95c7-db705464054e@github.com>

GetPrimitiveArrayCritical() is supposed to be used only with primitive array types, but nothing prevents the current implementation from accepting object arrays (please see the attached test case in the bug).

My proposed fix is not very friendly: it crashes the JVM if a non-primitive array is passed in, but I am not sure what to expect in this scenario.

Specification people, please comment. Thanks!

-------------

Commit messages:
 - v0

Changes: https://git.openjdk.java.net/jdk17/pull/185/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk17&pr=185&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8269697
  Stats: 7 lines in 1 file changed: 0 ins; 5 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk17/pull/185.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk17 pull/185/head:pull/185

PR: https://git.openjdk.java.net/jdk17/pull/185

From kbarrett at openjdk.java.net  Wed Jun 30 21:45:00 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 30 Jun 2021 21:45:00 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
Message-ID: 

On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:

> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.

It turns out that fixing JDK-8269661 involves some code rearrangement that makes it both simple and obvious to not do the gc-locker/pinning in the latin1 case; it would actually make things messier to not include that change.  So unless David's research comes up with something unexpected, I'll probably take over the optimization as part of that bug fix.  Still need to do more testing before sending out a PR.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4637

From david.holmes at oracle.com  Wed Jun 30 22:39:57 2021
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 1 Jul 2021 08:39:57 +1000
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
 
Message-ID: <37db0385-8bc5-b257-e4c1-d24898e0e616@oracle.com>

On 30/06/2021 10:35 pm, Thomas Schatzl wrote:
> On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:
> 
>> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
>> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.
> 
> Actually I think the *String* object can be unlocked regardless of `is_latin1` or not. The code returns the *char array* that the native code is going to process after all - which is not locked *at all* but probably should be. I filed [JDK-8269661](https://bugs.openjdk.java.net/browse/JDK-8269661) for this.

I admit I do not know how String objects are laid out since compact 
strings came along but IIRC before that the char* pointed to an actual 
array embedded in the String's value, and so the String (and its value 
array) had to be pinned/locked.

David

> Which should probably be fixed first, because if the change correctly unconditionally unlocked the String object, Shenandoah would start to fail.
> It probably works at this time because the String object and the char array are typically located in the same region anyway.
> 
> -------------
> 
> Changes requested by tschatzl (Reviewer).
> 
> PR: https://git.openjdk.java.net/jdk/pull/4637
> 

From coleenp at openjdk.java.net  Wed Jun 30 22:56:13 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 30 Jun 2021 22:56:13 GMT
Subject: RFR: 8268364: jmethod clearing should be done during unloading
Message-ID: <4rn1RSGefWZrUjBgkyJFeFe6hg05r3iVmQq-7PcRj1o=.ac03228b-1fea-459d-95eb-5cfde53f803a@github.com>

This patch moves the jmethod clearing to ClassLoaderData::unload() but also adds a check to Method::checked_resolve_jmethod_id() to handle the case where ZGC may be unloading a class but has not gotten to ClassLoaderData::unload() yet.  JVMTI will read a NULL method from checked_resolve_jmethod_id() in this case, and will not get a Method that will shortly be, or has already been, reclaimed in the Metaspace destructor.
Since I was there, I also added a Method::is_valid_method() check to checked_resolve_jmethod_id(). I don't think it's expensive anymore, but it could be added under DEBUG.  Either way, method->method_holder()->is_loader_alive() will crash if !is_valid_method, so we should leave it.  As I wrote in the related issues, the bogus Method may have been because of a previous set of bugs with post_compiled_method_load events.

Tested with tiers 1-6 on linux-x64-debug and 1-3 on windows-x64-debug.

Also ran vmTestbase/nsk/{jdi,jvmti} tests with VM_OPTIONS=-XX:+UseZGC -XX:ZCollectionInterval=0.01 -XX:ZFragmentationLimit=0

-------------

Commit messages:
 - Make is_valid_method() an assert.
 - 8213466: Method::checked_resolve_jmethod_id() could do better checks
 - 8268364: jmethod clearing should be done during unloading

Changes: https://git.openjdk.java.net/jdk/pull/4643/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4643&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8268364
  Stats: 33 lines in 2 files changed: 17 ins; 13 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4643.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4643/head:pull/4643

PR: https://git.openjdk.java.net/jdk/pull/4643

From kbarrett at openjdk.java.net  Wed Jun 30 23:30:59 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 30 Jun 2021 23:30:59 GMT
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
Message-ID: 

On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:

> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.

> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_
> 
> On 30/06/2021 10:35 pm, Thomas Schatzl wrote:
> 
> > On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:
> > > Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
> > > But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.
> > 
> > 
> > Actually I think the *String* object can be unlocked regardless of `is_latin1` or not. The code returns the *char array* that the native code is going to process after all - which is not locked *at all* but probably should be. I filed [JDK-8269661](https://bugs.openjdk.java.net/browse/JDK-8269661) for this.
> 
> I admit I do not know how String objects are laid out since compact
> strings came along but IIRC before that the char* pointed to an actual
> array embedded in the String's value, and so the String (and its value
> array) had to be pinned/locked.
> 
> David

I don't see how that embedding of the char array could work (without bespoke GC support or something like Valhalla).
I also don't see any sign of that in the code history.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4637

From david.holmes at oracle.com  Wed Jun 30 23:58:20 2021
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 1 Jul 2021 09:58:20 +1000
Subject: RFR: JDK-8269650: Optimize gc-locker in
 [Get|Release]StringCritical for latin string
In-Reply-To: 
References: 
 
Message-ID: <32968da5-2661-28f0-e567-3b30a5c7e835@oracle.com>

On 1/07/2021 7:45 am, Kim Barrett wrote:
> On Wed, 30 Jun 2021 11:55:49 GMT, Hamlin Li  wrote:
> 
>> Currently, JNI GetStringCritical holds the GC locker for all strings, Latin-1 and non-Latin-1 alike, until ReleaseStringCritical.
>> But for Latin-1 strings, it is not necessary to keep holding the GC locker after GetStringCritical returns: the characters are copied anyway, whether or not object pinning is supported, so it is fine to release the GC locker at the end of GetStringCritical.
> 
> It turns out that fixing JDK-8269661 involves some code rearrangement that makes it both simple and obvious to not do the gc-locker/pinning in the latin1 case; it would actually make things messier to not include that change.  So unless David's research comes up with something unexpected, I'll probably take over the optimization as part of that bug fix.  Still need to do more testing before sending out a PR.

The is_latin code handling came along with compact strings, and no 
change was made to the GC_locker::lock_critical/unlock_critical pairing.

At that time I can't see that there could have been any change. The GC critical 
lock was global/coarse-grained, and all compact strings did was change 
the type of the array; it didn't change the way the array was allocated 
or embedded within a Java object.

AFAICS, as Kim has stated in JDK-8269661, any problems here have arisen 
when object-pinning was introduced. Though I'm still unclear whether the 
raw array is actually embedded in the java byte[] instance.

David
-----

> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4637
>